Skip to content

Fix for issue 775 #776

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
May 12, 2019
Merged

Fix for issue 775 #776

merged 8 commits into from
May 12, 2019

Conversation

dcleblanc
Copy link
Contributor

No description provided.

David LeBlanc added 4 commits April 11, 2019 15:54
Fix possibly incorrect integer overflow check with an efficient correct check. Not fixing the issue where size == 0, unsure if this is by design, or what error to return if not.
src/unpack.c Outdated
if (size / sizeof(msgpack_object) != n) {
tmp = (unsigned long long)n * sizeof(msgpack_object);

if (tmp & 0xffffffff00000000) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for my late reply. I considered many cases.

I wrote a small test code to check sizeof types.

#include <msgpack.hpp>
#include <iostream>

int main() {
    std::cout << "msgpack_object    : " << sizeof(msgpack_object) << std::endl;
    std::cout << "size_t            : " << sizeof(size_t) << std::endl;
    std::cout << "unsigned int      : " << sizeof(unsigned int) << std::endl;
    std::cout << "unsigned long long: " <<sizeof(unsigned long long) << std::endl;
}

On 32bit environment, the output is:

msgpack_object    : 16
size_t            : 4
unsigned int      : 4
unsigned long long: 8

On 64bit environment, the output is:

msgpack_object    : 24
size_t            : 8
unsigned int      : 4
unsigned long long: 8

The max value of n is 0xffffffff. See https://github.com/msgpack/msgpack/blob/master/spec.md#array-format-family
msgpack_zone_malloc()'s 2nd parameter type is size_t.
Your check is valid on 32bit environment.

However, on 64bit environment, tmp could be greater than 0xffffffff and msgpack_zone_malloc() should be accept it. So I think that your check is false positive in this case.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type of sizeof(msgpack_object) might be 64-bit, but the value is going to much smaller, far less than 2^32-1. And n is an unsigned int. It is true that a compiler implementation could exist where sizeof(unsigned int) > 4, but we don't have that now. A static_assert for that might be helpful, or we can also just use the SafeInt library, and let it do the right thing.

But if sizeof(unsigned int) == 4, then our maximum possible value going into tmp is this:

(2^32 - 1) * sizeof(msgpack_object), which then expands to:

2^32 * sizeof(msgpack_object) - sizeof(msgpack_object)

The only way this can possibly overrun the capacity of tmp, which is 2^64-1, is for sizeof(msgpack_object) > 2^32 (plus a bit more). If that were true, then the code can't possibly run on 32-bits, and it does, so we can't reach this condition.

Then we get to what I think your objection is, which is that on 64-bit, we could allocate more than 4GB, and this code disallows that. I had assumed that we should disallow that because the type of n is 32-bit. If you are not going to disallow that, then we need to audit all the rest of the code to ensure that nothing else is going to get truncated if it does go over 4GB.

If we're really sure that all of the code properly handles huge packages (do we have a use case for this?), then we should change the type of n to size_t, and if we do that, then I do need a different check for overflow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we don't need to care about tmp overflow case too.

I'm sorry I looked over unsigned int size; (original code). It is too small type to treat size type. I think that it should be size_t size;. msgpack_zone_malloc()'s parameter type is size_t. https://github.com/msgpack/msgpack-c/blob/master/include/msgpack/zone.h#L106
That is important point.

If unsigned int size; is replaced with size_t size;, if (tmp & 0xffffffff00000000) { could be false positive. That is I want to say.

On 64bit environment I mentioned above, size_t and unsigned long long has the same size.
In the case n is 0x1000 0000 and sizeof(msgpack_object) is 24, tmp is 6,442,450,944 (0x1 8000 0000). It is about 6GB. It happens on actual usecase. But if (tmp & 0xffffffff00000000) { is evaluated to true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we change size from unsigned int to size_t, then we also have a lot of other checking to do to ensure that we're not truncating somewhere. That can be done, but do we actually have use cases for 4GB messages? If so, then let's go do the rest of the work. Is there a spec for the overall format?

Also, if we can compile this as C++, we can just use SafeInt to check overflows, but I don't know whether that's OK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, from looking at it, I'm not thrilled with the specification. The size limits can be some trouble - from https://github.com/msgpack/msgpack/blob/master/spec.md#limitation, seems that the size of some individual items can be up to 4GB, and you can have 2^32-1 number of items in an array, so put the two of those together and we need the entire 64-bit space to address them. Which we clearly cannot do in 32 bits, and we actually can't allocate that much on any real system. But from the spec, we just have to be limited by what we can allocate, which can be tricky. We also need to be very careful to check for overflows on 32-bit systems. If we can go to C++, then we can use the new std integer types that specify sizes, and aren't compiler dependent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, if we can use these, that would really help - like so std::int64_t - otherwise, some weird compiler will end up biting you. I just got a bug for SafeInt for version 3.0.18 that was fixed in 3.0.20 because I adopted these standard types. But that means C++.

Copy link
Contributor

@redboltz redboltz May 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, from looking at it, I'm not thrilled with the specification. The size limits can be some trouble - from https://github.com/msgpack/msgpack/blob/master/spec.md#limitation, seems that the size of some individual items can be up to 4GB, and you can have 2^32-1 number of items in an array, so put the two of those together and we need the entire 64-bit space to address them.

MessagePack format contains a kind of composite (recursive) types. And the limit of recursion level (depth) is unspecified. That means the format allows ARRAY of ARRAY of ARRAY of ... (infinity). For C++ (not C), user can define limits https://github.com/msgpack/msgpack-c/blob/master/include/msgpack/v1/unpack_decl.hpp#L89

Here are MessagePack types that have up to 32bit length.

type contant type
ARRAY MessagePack(all types)
MAP MessagePack(all types)
STR byte
BIN byte
EXT byte

I think that only ARRAY and MAP require size_t type. They have any type of children. But the implementation always allocates sizeof(msgpack_object). So we can calculate the total size from n and sizeof(msgpack_object).

+------------+------+---------------+---------------+---------------+---------------+
|ARRAY header|length|msgpack_object |msgpack_object |msgpack_object |msgpack_object |
+------------+------+---------------+---------------+---------------+---------------+
                           |               |               |               |
                           |               |               |               |
                           |               |               |               V
       These are allocated |               |               V           +------+
       at different timing |               V           +-------+       |      |
                           V          +---------+      |       |       +------+
                     +----------+     |         |      +-------+
                     |          |     +---------+
                     +----------+

I'm not 100% sure but replacing unsigned int with size_t at ARRAY and MAP and then add appropriate integer overflow checking is a practical approach. In the checking, we don't need to care the (possible) MessagePack format limit. We can depend on OS limit (in C, malloc returns NULL).

@dcleblanc
Copy link
Contributor Author

Based on our conversation, I like this better, and it is very efficient. However, I am worried that there could be problems elsewhere, and will take a look.

BTW, please look at the other issue I opened - I am concerned it could be exploitable.

@dcleblanc
Copy link
Contributor Author

This appears to have changed something - @redboltz , could you please take a look. We have tests that are now failing.

src/unpack.c Outdated
// to check for int overflows.
// Note - while n is constrained to 32-bit, the product of n * sizeof(msgpack_object)
// might not be constrained to 4GB on 64-bit systems
if( n > SIZE_MAX/sizeof(msgpack_object))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that the comparison is always false on some 64bit environment.
See https://travis-ci.org/msgpack/msgpack-c/jobs/529103694#L1553
It is a warning but the CI regard it as error.

I don't come up with how to fix it, so far.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed n is unsigned int. So the solution is

if ((size_t)n > > SIZE_MAX/sizeof(msgpack_object))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if that will help. Do we happen to detect somewhere whether we have 64 or 32-bit environment, perhaps set a #define that we can check. If the compiler is smart enough, it will sort this out. I'll try fixing the tab, and try this, we can see what it will do.

Copy link
Contributor

@redboltz redboltz May 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if ((size_t)n > SIZE_MAX/sizeof(msgpack_object))

Extra > is my typo. On typical 64bit environment, SIZE_MAX is 0xffffffffffffffff, sizeof(msgpack_object) is 24. The max value of unsigned int is 0xffffffff.
so 0xffffffff > 0xffffffffffffffff / 24 is always false. Compiler warns it.

If cast is added, the expression becomes 0xffffffffffffffff > 0xffffffffffffffff / 24. The first 0xffffffffffffffff means 0x0 .. 0xffffffffffffffff depends on n. So comparison result is judged on runtime. As the result of that the warning will be disappeared. That is my guessing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's exactly the problem. We'd prefer to only do this check in 32-bit. The issue is that doing it this way causes the check to only be a compile time constant comparison, which is very efficient, and what we want. But in 64-bit, it is too efficient, and the compiler complains instead of discarding the branch completely.

The size_t cast might fool it, let's see. If this were C++, I can easily fix the problem with a template specialization, but this isn't C++ - though it does seem to compile cleanly as C++. I'm trying not to cause large, possibly disruptive changes.

I have another couple of angles I can try that will get us there, possibly an enum. Do we perhaps have a #defined constant that lets me know if it is 64-bit, and then I can just #ifdef around it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the way the cast is defined, doing (size_t)n can only ever result in a value of 0x00000000fffffffff, because n is unsigned, and must zero extend. So maybe the cast will get the clever compiler to be quiet, but maybe not. Let's see what it does with it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, I'm watching travis-ci result (still working). The warnings on clang are disappeared. But gcc is still outputs warnings.

The operator sizeof() is evaluated compile time, not preprocess time, but macro SIZE_MAX is evaluated preprocess time. So something like the following code would help

#if SIZE_MAX > UINT_MAX

src/unpack.c Outdated
// Note - while n is constrained to 32-bit, the product of n * sizeof(msgpack_object)
// might not be constrained to 4GB on 64-bit systems

if(n > SIZE_MAX/sizeof(msgpack_object_kv))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar to the comment for template_callback_array.

src/unpack.c Outdated
@@ -189,20 +189,27 @@ static inline int template_callback_false(unpack_user* u, msgpack_object* o)

static inline int template_callback_array(unpack_user* u, unsigned int n, msgpack_object* o)
{
unsigned int size;
size_t size;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use spaces instead of TAB.

src/unpack.c Outdated
@@ -222,20 +229,28 @@ static inline int template_callback_array_item(unpack_user* u, msgpack_object* c

static inline int template_callback_map(unpack_user* u, unsigned int n, msgpack_object* o)
{
unsigned int size;
size_t size;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use spaces instead of TAB.

@redboltz
Copy link
Contributor

redboltz commented May 9, 2019

I think that error on appveyor(Windows) is server trouble. I restarted the failed build on appveyor.

@redboltz
Copy link
Contributor

Thank you for updating. It looks good to me.
Size 0 case isn't confirmed yet but it is separated issue.
So I merge the PR.

@redboltz redboltz merged commit 9389912 into msgpack:master May 12, 2019
redboltz added a commit to redboltz/msgpack-c that referenced this pull request Jul 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants