-
Notifications
You must be signed in to change notification settings - Fork 903
Fix for issue 775 #776
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix for issue 775 #776
Conversation
Fix possibly incorrect integer overflow check with an efficient correct check. Not fixing the issue where size == 0, unsure if this is by design, or what error to return if not.
src/unpack.c
Outdated
if (size / sizeof(msgpack_object) != n) { | ||
tmp = (unsigned long long)n * sizeof(msgpack_object); | ||
|
||
if (tmp & 0xffffffff00000000) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for my late reply. I considered many cases.
I wrote a small test code to check sizeof types.
#include <msgpack.hpp>
#include <iostream>
int main() {
std::cout << "msgpack_object : " << sizeof(msgpack_object) << std::endl;
std::cout << "size_t : " << sizeof(size_t) << std::endl;
std::cout << "unsigned int : " << sizeof(unsigned int) << std::endl;
std::cout << "unsigned long long: " <<sizeof(unsigned long long) << std::endl;
}
On 32bit environment, the output is:
msgpack_object : 16
size_t : 4
unsigned int : 4
unsigned long long: 8
On 64bit environment, the output is:
msgpack_object : 24
size_t : 8
unsigned int : 4
unsigned long long: 8
The max value of n
is 0xffffffff
. See https://github.com/msgpack/msgpack/blob/master/spec.md#array-format-family
msgpack_zone_malloc()
's 2nd parameter type is size_t
.
Your check is valid on 32bit environment.
However, on 64bit environment, tmp
could be greater than 0xffffffff
and msgpack_zone_malloc()
should be accept it. So I think that your check is false positive in this case.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type of sizeof(msgpack_object) might be 64-bit, but the value is going to much smaller, far less than 2^32-1. And n is an unsigned int. It is true that a compiler implementation could exist where sizeof(unsigned int) > 4, but we don't have that now. A static_assert for that might be helpful, or we can also just use the SafeInt library, and let it do the right thing.
But if sizeof(unsigned int) == 4, then our maximum possible value going into tmp is this:
(2^32 - 1) * sizeof(msgpack_object), which then expands to:
2^32 * sizeof(msgpack_object) - sizeof(msgpack_object)
The only way this can possibly overrun the capacity of tmp, which is 2^64-1, is for sizeof(msgpack_object) > 2^32 (plus a bit more). If that were true, then the code can't possibly run on 32-bits, and it does, so we can't reach this condition.
Then we get to what I think your objection is, which is that on 64-bit, we could allocate more than 4GB, and this code disallows that. I had assumed that we should disallow that because the type of n is 32-bit. If you are not going to disallow that, then we need to audit all the rest of the code to ensure that nothing else is going to get truncated if it does go over 4GB.
If we're really sure that all of the code properly handles huge packages (do we have a use case for this?), then we should change the type of n to size_t, and if we do that, then I do need a different check for overflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we don't need to care about tmp
overflow case too.
I'm sorry I looked over unsigned int size;
(original code). It is too small type to treat size type. I think that it should be size_t size;
. msgpack_zone_malloc()
's parameter type is size_t
. https://github.com/msgpack/msgpack-c/blob/master/include/msgpack/zone.h#L106
That is important point.
If unsigned int size;
is replaced with size_t size;
, if (tmp & 0xffffffff00000000) {
could be false positive. That is I want to say.
On 64bit environment I mentioned above, size_t
and unsigned long long
has the same size.
In the case n
is 0x1000 0000
and sizeof(msgpack_object)
is 24
, tmp
is 6,442,450,944 (0x1 8000 0000)
. It is about 6GB. It happens on actual usecase. But if (tmp & 0xffffffff00000000) {
is evaluated to true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we change size from unsigned int to size_t, then we also have a lot of other checking to do to ensure that we're not truncating somewhere. That can be done, but do we actually have use cases for 4GB messages? If so, then let's go do the rest of the work. Is there a spec for the overall format?
Also, if we can compile this as C++, we can just use SafeInt to check overflows, but I don't know whether that's OK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, from looking at it, I'm not thrilled with the specification. The size limits can be some trouble - from https://github.com/msgpack/msgpack/blob/master/spec.md#limitation, seems that the size of some individual items can be up to 4GB, and you can have 2^32-1 number of items in an array, so put the two of those together and we need the entire 64-bit space to address them. Which we clearly cannot do in 32 bits, and we actually can't allocate that much on any real system. But from the spec, we just have to be limited by what we can allocate, which can be tricky. We also need to be very careful to check for overflows on 32-bit systems. If we can go to C++, then we can use the new std integer types that specify sizes, and aren't compiler dependent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, if we can use these, that would really help - like so std::int64_t - otherwise, some weird compiler will end up biting you. I just got a bug for SafeInt for version 3.0.18 that was fixed in 3.0.20 because I adopted these standard types. But that means C++.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, from looking at it, I'm not thrilled with the specification. The size limits can be some trouble - from https://github.com/msgpack/msgpack/blob/master/spec.md#limitation, seems that the size of some individual items can be up to 4GB, and you can have 2^32-1 number of items in an array, so put the two of those together and we need the entire 64-bit space to address them.
MessagePack format contains a kind of composite (recursive) types. And the limit of recursion level (depth) is unspecified. That means the format allows ARRAY of ARRAY of ARRAY of ... (infinity). For C++ (not C), user can define limits https://github.com/msgpack/msgpack-c/blob/master/include/msgpack/v1/unpack_decl.hpp#L89
Here are MessagePack types that have up to 32bit length.
type | contant type |
---|---|
ARRAY | MessagePack(all types) |
MAP | MessagePack(all types) |
STR | byte |
BIN | byte |
EXT | byte |
- EXT's maximum bytes is 0xffffffff+1. See - Added unpack limit. #175
I think that only ARRAY and MAP require size_t
type. They have any type of children. But the implementation always allocates sizeof(msgpack_object)
. So we can calculate the total size from n
and sizeof(msgpack_object)
.
+------------+------+---------------+---------------+---------------+---------------+
|ARRAY header|length|msgpack_object |msgpack_object |msgpack_object |msgpack_object |
+------------+------+---------------+---------------+---------------+---------------+
| | | |
| | | |
| | | V
These are allocated | | V +------+
at different timing | V +-------+ | |
V +---------+ | | +------+
+----------+ | | +-------+
| | +---------+
+----------+
I'm not 100% sure but replacing unsigned int
with size_t
at ARRAY
and MAP
and then add appropriate integer overflow checking is a practical approach. In the checking, we don't need to care the (possible) MessagePack format limit. We can depend on OS limit (in C, malloc returns NULL).
Based on our conversation, I like this better, and it is very efficient. However, I am worried that there could be problems elsewhere, and will take a look. BTW, please look at the other issue I opened - I am concerned it could be exploitable. |
the block if it is a C file.
This appears to have changed something - @redboltz , could you please take a look. We have tests that are now failing. |
src/unpack.c
Outdated
// to check for int overflows. | ||
// Note - while n is constrained to 32-bit, the product of n * sizeof(msgpack_object) | ||
// might not be constrained to 4GB on 64-bit systems | ||
if( n > SIZE_MAX/sizeof(msgpack_object)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that the comparison is always false on some 64bit environment.
See https://travis-ci.org/msgpack/msgpack-c/jobs/529103694#L1553
It is a warning but the CI regard it as error.
I don't come up with how to fix it, so far.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed n
is unsigned int
. So the solution is
if ((size_t)n > > SIZE_MAX/sizeof(msgpack_object))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if that will help. Do we happen to detect somewhere whether we have 64 or 32-bit environment, perhaps set a #define that we can check. If the compiler is smart enough, it will sort this out. I'll try fixing the tab, and try this, we can see what it will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if ((size_t)n > SIZE_MAX/sizeof(msgpack_object))
Extra >
is my typo. On typical 64bit environment, SIZE_MAX
is 0xffffffffffffffff
, sizeof(msgpack_object)
is 24
. The max value of unsigned int
is 0xffffffff
.
so 0xffffffff > 0xffffffffffffffff / 24
is always false. Compiler warns it.
If cast is added, the expression becomes 0xffffffffffffffff > 0xffffffffffffffff / 24
. The first 0xffffffffffffffff
means 0x0 .. 0xffffffffffffffff
depends on n
. So comparison result is judged on runtime. As the result of that the warning will be disappeared. That is my guessing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's exactly the problem. We'd prefer to only do this check in 32-bit. The issue is that doing it this way causes the check to only be a compile time constant comparison, which is very efficient, and what we want. But in 64-bit, it is too efficient, and the compiler complains instead of discarding the branch completely.
The size_t cast might fool it, let's see. If this were C++, I can easily fix the problem with a template specialization, but this isn't C++ - though it does seem to compile cleanly as C++. I'm trying not to cause large, possibly disruptive changes.
I have another couple of angles I can try that will get us there, possibly an enum. Do we perhaps have a #defined constant that lets me know if it is 64-bit, and then I can just #ifdef around it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the way the cast is defined, doing (size_t)n can only ever result in a value of 0x00000000fffffffff, because n is unsigned, and must zero extend. So maybe the cast will get the clever compiler to be quiet, but maybe not. Let's see what it does with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now, I'm watching travis-ci result (still working). The warnings on clang are disappeared. But gcc is still outputs warnings.
The operator sizeof()
is evaluated compile time, not preprocess time, but macro SIZE_MAX
is evaluated preprocess time. So something like the following code would help
#if SIZE_MAX > UINT_MAX
src/unpack.c
Outdated
// Note - while n is constrained to 32-bit, the product of n * sizeof(msgpack_object) | ||
// might not be constrained to 4GB on 64-bit systems | ||
|
||
if(n > SIZE_MAX/sizeof(msgpack_object_kv)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar to the comment for template_callback_array.
src/unpack.c
Outdated
@@ -189,20 +189,27 @@ static inline int template_callback_false(unpack_user* u, msgpack_object* o) | |||
|
|||
static inline int template_callback_array(unpack_user* u, unsigned int n, msgpack_object* o) | |||
{ | |||
unsigned int size; | |||
size_t size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use spaces instead of TAB.
src/unpack.c
Outdated
@@ -222,20 +229,28 @@ static inline int template_callback_array_item(unpack_user* u, msgpack_object* c | |||
|
|||
static inline int template_callback_map(unpack_user* u, unsigned int n, msgpack_object* o) | |||
{ | |||
unsigned int size; | |||
size_t size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use spaces instead of TAB.
I think that error on appveyor(Windows) is server trouble. I restarted the failed build on appveyor. |
Thank you for updating. It looks good to me. |
Same fix as msgpack#776 on C.
No description provided.