Description
I have a server running on an AWS instance (md5.large), and I'm trying to figure out why calls have far more latency than expected.
I'm currently using rapidjson for JSON deserialization. When my listener is called and I have validated the request, I extract the body as a string to pass along to the appropriate handlers. This string is then deserialized with rapidjson.
I do this via
std::string body = request.extract_string().get();
I have found, however, that this call takes roughly 60,000 to 120,000 microseconds (60 to 120 ms) on the AWS instance. On my iMac Pro it takes a fraction of that (50 to 500 microseconds), although there the requests are made over localhost. As a result, the time to service a request is effectively the time it takes to run extract_string, since it eclipses all other processing by quite a bit. I log several time samples while diagnosing this, which is what lets me make the comparison.
One key thing to mention here: for these particular requests, the body is 70 bytes. That's right, only 70 bytes. This is running on a dev instance, and I am the only person making requests.
I tried to use body() directly, but when I do, I end up with an empty string. This leads me to believe that even though the listener has notified me of the request, the body hasn't been loaded yet. I basically took some of the existing code from extract_string to try to get the contents of the body:
auto buf_r = request.body().streambuf();
body.resize((std::string::size_type)buf_r.in_avail());
buf_r.getn(const_cast<uint8_t*>(reinterpret_cast<const uint8_t*>(body.data())), body.size())
.get(); // There is no risk of blocking.
On my iMac Pro I get the body. On the AWS instance, it results in an empty string. From this it would appear that there is still some buffering or something else going on.
This also leads me to believe that pplx tasking is woefully inefficient, which makes me wonder about the overall efficiency of cpprestsdk, since much of it seems reliant on pplx for tasking.
Hoping someone can shed some light here. Is there a proper way to get the body without using pplx? And has there been any benchmarking of how performant this SDK is, in particular how efficient pplx is?