-
Notifications
You must be signed in to change notification settings - Fork 639
[WIP] Remove ESPAsyncTCP dependency of OTA code, SSL fixes, better parsing #1716
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
code/espurna/ota.ino
Outdated
|
||
WiFiClient _ota_client; | ||
|
||
#if ASYNC_TCP_SSL_ENABLED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OTA_SECURE_ENABLED?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's even better to use a generic SUPPORT_SSL
flag (or similar)? We can use that in MQTT and other modules as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per pr - we no longer use async client, so why depend on it?
We can use that in MQTT and other modules as well.
Yep, no reason not to enable all secure things at the same time (since they use the same library and we bundle it anyways).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added SSL_ENABLED now - if it looks good I'll make the changes across the whole project in a separate PR
code/espurna/ota.ino
Outdated
#define USING_AXTLS // do not use BearSSL | ||
#endif | ||
|
||
WiFiClient _ota_client; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using single _ota_client object instead of separate non-secure and secure ?
Minor question - does this create _ota_client when secure one is used? (i.e. is it in memory at this point?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about it but it clutters the code a lot if you're using a separate _secure object: all subsequent _ota_client
calls need to be surrounded with if-checks. But maybe this is not the nicest solution. Now that I think about it - not sure what happens when connecting to a HTTPS host first and to a HTTP one later. Do you have any suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad not looking at this closer. Secure client will not fallback to basic one :(
Essentially, what my thinking is
- use separate methods for secure and non-secure update flow
- remove global client objects, create them inplace. We are not async, we expect them to live up until the function ends.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And mqtt /ota handler needs to be aware of that too. All async-mqtt-client callbacks are inside SYS context - we cannot use Core's sync clients inside of it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just pushed some changes, the client objects are created inside the function now. I have a strange crash with SSL requests though (alloc issues in SSL) - can you check on your end as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something that I tried yesterday with HTTPClient & httpUpdate was requiring ~20seconds for connection to happen.
In general, I think we would want to avoid doing connection twice. HTTPClient, for example, wraps around that internally (which btw can be adapted here or straight up used as-is)
Reviewing HTTPClient I'd also noticed that it cleverly wraps around fingerprinting (see Traits classes). But I did rewrite bunch of things to stop axtls thing from happening so no idea about that, but in the end it did actually start connecting.
Another problematic part is shadowing _ota_client instead of having separate methods (which sometimes caused compiler errors for me. might be a false flag, but can be easily separated instead of creating it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remedy for 20 second wait was turning off active webserver (that was not ok...)
if (getSetting("webEnable", 1).toInt() == 0) return;
In webSetup(), wsSetup() and debugWebSetup()
At least now I have a reproducible "conflict" mentioned in the comments (which I did not remember having earlier :/). I wonder if async webserver polling method is broken somehow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion it would be better to remove ESPAsyncTCP/ESPAsyncWebServer in the whole project - it's not maintained anymore and you'll see more and more compatibility issues with future core versions. Rewriting the telnet server is not too much work (I'm testing it now). For web support: the standard ESP8266WebServer is pretty good, you'll only need something like arduinoWebSockets for WS(S) support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likely thing :( Web server part is problematic for some parts though - alexa triggers header parsing bugs, some credentials stuff works differently... Which again puts the Core version as a dependency on some bug or feature. (OT: we also can switch to espressif IDF 🙄)
Websocket is not a huge problem, since it is both an external lib and pretty widely used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another heads up regarding slightly getting distracted and dependencies in general.
What we can do instead of removing the module straight away, is implement another one side by side (just another file) and make them incompatible with each other. That way we can already use new one, and don't need to think about dependencies in a strange way.
edit: ...and if I don't sleep too much tomorrow, I'll prepare bssl client as a pr
#if !defined(ARDUINO_ESP8266_RELEASE_2_5_0) && !defined(ARDUINO_ESP8266_RELEASE_2_5_1) && !defined(ARDUINO_ESP8266_RELEASE_2_5_2) | ||
#define USING_AXTLS // do not use BearSSL on core < 2.5.0 | ||
#endif | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need a way to check this in a better way - maybe ESP.getCoreVersion()
?
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That way bearssl is also off with git versions of the Core.
2.5.0 and up introduced https://github.com/esp8266/Arduino/blob/master/cores/esp8266/core_esp8266_version.h
tag with proper comparison, but we again are hit by multi-Core support :/
In theory, one can depend on some behaviour / implementation instead. Kind of dangerous though, if it changes. For example, see:
dev...mcspr:utils/fragmentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can use ARDUINO_ESP8266_GIT_VER
instead? The type_traits stuff looks pretty rough (not to mention badly readable).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ARDUINO_ESP8266_GIT_VER has the same problem as RELEASE defines. BTW I think we are carried away. Does this work? Shouldn't it be global -D... instead, so framework gets this too? IMO documentation would solve this better. If we sure this breaks horribly, static_assert(condition, "message") can also do this.
Regarding other branch, you'd say that this is a clearer intent?
espurna/code/espurna/config/prototypes.h
Lines 55 to 66 in 9ceca44
#include <cont.h> | |
#if defined(ARDUINO_ESP8266_RELEASE_2_3_0) \ | |
|| defined(ARDUINO_ESP8266_RELEASE_2_4_0) \ | |
|| defined(ARDUINO_ESP8266_RELEASE_2_4_1) | |
extern cont_t g_cont; | |
#define getFreeStack() cont_get_free_stack(&g_cont) | |
#elif defined(ARDUINO_ESP8266_RELEASE_2_4_2) | |
extern cont_t* g_pcont; | |
#define getFreeStack() cont_get_free_stack(g_pcont) | |
#else | |
#define getFreeStack() ESP.getFreeContStack() | |
#endif |
https://en.cppreference.com/w/cpp/language/decltype
https://en.cppreference.com/w/cpp/types/enable_if
It is just an overload detection. We force compiler to pick specific 'detect()' method and store bool based on the one it found (either internal or external return type)
From my limited understanding, to make this shorter 'getHeapStats' function would need different return types (or different set of args)
For some of long-stack-trace-crashes: esp8266/Arduino@d83eabe, esp8266/Arduino#6153, esp8266/Arduino#6156 Still no clue what exactly webserver does differently from telnet server :/ |
Was testing with 2.5.2 release. Running latest GIT Core (esp8266/Arduino@dddc8d2) I no longer see the client hanging when ESPAsyncWebServer is listening for connections. edit: Certs, fingerprinting works. Certs do need MFLN thing, OOM crashing without it |
Closing in favor of #1751 |
This PR changes the following:
Please test carefully, but I had no issues on core 2.4.2 and 2.5.0 on my setup.