Prevent integer overflows in encoded message length computations #1196
Conversation
Add tests showing feasibility of overflowing computations in encoded_len_repeated for messages and certain scalar types.
Force-pushed from f24bb81 to 37c4258.
This is impressive work! I see you thought of a lot of the problems that can occur with overflow. The test suite seems extensive.

As I read the protobuf encoding spec, the limit for length-encoded fields is 2 GB because the length is stored in a 32-bit signed integer. The protobuf limits docs state that “all implementations” limit the total size of a message to “<2 GiB”.

I feel like prost should return some kind of “message encoding failed because message is too large” error type, so that the application can handle that error instead of panicking.

I think we should create a newtype that can only do checked arithmetic and returns this error type on 2 GB overflow. The advantage is that all uses of a protobuf length are easily visible, and it is harder to miss a spot where the code can overflow. That makes maintenance easier in the future.
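A minimal sketch of such a newtype, assuming a cap at i32::MAX (the names EncodedLen and MessageTooLargeError are hypothetical, not prost API):

```rust
/// Hypothetical error type; not part of prost's current API.
#[derive(Debug, PartialEq)]
pub struct MessageTooLargeError;

/// Hypothetical newtype for encoded lengths. The inner value is
/// private, so every length must flow through the checked operations.
#[derive(Clone, Copy, Debug)]
pub struct EncodedLen(u32);

impl EncodedLen {
    /// The length is stored in a 32-bit signed integer on the wire,
    /// so cap at i32::MAX (just under 2 GiB).
    const MAX: u32 = i32::MAX as u32;

    pub fn new(len: usize) -> Result<Self, MessageTooLargeError> {
        match u32::try_from(len) {
            Ok(n) if n <= Self::MAX => Ok(EncodedLen(n)),
            _ => Err(MessageTooLargeError),
        }
    }

    /// Checked addition: surfaces the "too large" error instead of
    /// silently wrapping around.
    pub fn add(self, rhs: Self) -> Result<Self, MessageTooLargeError> {
        match self.0.checked_add(rhs.0) {
            Some(n) if n <= Self::MAX => Ok(EncodedLen(n)),
            _ => Err(MessageTooLargeError),
        }
    }
}
```

Because the inner u32 is private, any place where a length is combined has to go through add and handle the error, which keeps overflow spots easy to audit.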
let mut acc = 0usize #(+ #encoded_len_limited)*;
#(acc = acc.checked_add(#encoded_len_unlimited)
    .expect("encoded length overflows usize");)*
acc
I don't understand what encoded_len_limit does. It seems to only contain a value for fixed-size fields. Is this intended as an optimization?
Yes. For singular numeric fields, this gives an upper bound on their encoded length, so the derive macro can use this information to sum the lengths of such fields without overflow checks, as seen above. The lengths of fields for which a limit cannot be known statically are added with checked_add to catch possible overflows.

So if your protobuf message only has fixed-size fields, its encoded_len method would not need to use checked_add and its performance would not be impacted.
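To make the split concrete, here is a hedged sketch of how the template above might expand for a message with one fixed32 field and one string field (illustrative only, not literal prost-derive output; the real implementation also skips fields holding their default value):

```rust
struct Example {
    id: u32,      // fixed32 id = 1;
    name: String, // string name = 2;
}

impl Example {
    fn encoded_len(&self) -> usize {
        // The fixed32 field has a static upper bound: a 1-byte key for
        // tag 1 plus 4 bytes of payload. Limited fields are summed
        // without checks, since a message cannot declare enough
        // singular numeric fields for this sum to overflow usize.
        let mut acc = 0usize + 5;
        // The string's length has no static limit, so it goes through
        // checked_add to catch a potential overflow.
        acc = acc
            .checked_add(::prost::encoding::string::encoded_len(2, &self.name))
            .expect("encoded length overflows usize");
        acc
    }
}
```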
As mentioned in the description of 4901e1a, the whole limit thing is rather overkill: it would be enough to tell "this field is known to have a small encoding", so lengths of such fields could be summed without checks, and there should never be so many singular fields in a message that this sum would overflow u32.
the whole limit thing is rather overkill
To clarify, I coded that for a more comprehensive approach to determining limits on encoded message lengths, where, e.g., if a message type had a known limit derived from its fields, that limit could be propagated to determine the limit for another message type containing a field of the first type, and so on. But as mentioned in a comment, this raises safety issues when such information is passed to prost-derive via attributes and may be arbitrarily fudged by the macro user.
do
  cargo test -p tests-overflow --target i686-unknown-linux-gnu -- \
    --ignored --exact $test
done
All tests are run on 32-bit x86, and the two memory-exhausting tests are run separately. Do I understand correctly?
That's correct. The two isolated tests do not exhaust all addressable memory by themselves, but the size of their allocations makes them abort when run together with other tests, even in serial runs, possibly due to heap fragmentation.
let len = values.iter().map(|$to_uint64_value| {
    encoded_len_varint($to_uint64) as u64
This reduces the chance of an overflow, but doesn't prevent it, right?
There is no practical chance of overflowing this in u64: a varint encodes to at most 10 bytes, so a valid slice would need more than u64::MAX / 10 elements for the sum to overflow, which is impossibly large.
    encoded_len_varint($to_uint64) as u64
})
.sum::<u64>();
len.try_into().expect("encoded length overflows usize")
This crashes the application when an overflow is detected. I am not convinced that a panic improves the situation.

Should encoded_len return a Result<u64, MessageTooLargeError>?
I am not convinced that panic improves the situation.

This is better than silently returning an overflowed length, which is the status quo with release builds.

Should encoded_len return a Result<u64, MessageTooLargeError>?

Yes, but I meant this PR to be merged for a 0.13.x release, so that current applications could be patched without API breaks. In a later change, we can change the signature of encoded_len, grep for all occurrences of .expect("encoded length overflows usize") (I took care to use this distinctive panic message everywhere), and convert them to error returns.
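A sketch of what that later, fallible shape could look like (hypothetical: MessageTooLargeError and the trait name are placeholders, and the actual signature would be designed when the API break happens):

```rust
use core::fmt;

/// Hypothetical error type for a future fallible API.
#[derive(Debug)]
pub struct MessageTooLargeError;

impl fmt::Display for MessageTooLargeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("message encoding failed because the message is too large")
    }
}

/// Placeholder trait showing the fallible signature: each current
/// `.expect("encoded length overflows usize")` call site would turn
/// into an error return propagated with `?`.
pub trait FallibleEncodedLen {
    fn encoded_len(&self) -> Result<usize, MessageTooLargeError>;
}
```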
prost/src/encoding.rs (outdated)

@@ -519,6 +529,10 @@ fixed_width!(
    get_i64_le
);

const fn max_key_len_fits_in_size_of_elem<T>(_: &[T]) -> bool {
    key_len(MAX_TAG) <= mem::size_of::<T>()
I don't understand what is happening here
This is a check to ensure that the encoding is "non-expanding", that is, holding the data to be encoded in memory takes at least as much space as its encoding would. So the length computation before which this is asserted cannot overflow usize and can be performed with unchecked arithmetic.
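To spell out the reasoning, here is a self-contained restatement (my sketch: key_len and MAX_TAG mirror definitions in prost::encoding, but encoded_len_unpacked is a hypothetical call site, not the actual prost code):

```rust
use std::mem;

// Protobuf's maximum field tag (2^29 - 1), as in prost::encoding.
const MAX_TAG: u32 = (1 << 29) - 1;

// Stand-in for prost's key_len: the varint length of a field key.
const fn key_len(tag: u32) -> usize {
    let key = (tag as u64) << 3;
    ((64 - (key | 1).leading_zeros() as usize) + 6) / 7
}

const fn max_key_len_fits_in_size_of_elem<T>(_: &[T]) -> bool {
    key_len(MAX_TAG) <= mem::size_of::<T>()
}

// When every element's key takes no more bytes than the element itself
// occupies in memory, the encoded length is at most twice the slice's
// in-memory size. A slice never spans more than isize::MAX bytes, so
// the unchecked multiplication below cannot overflow usize.
fn encoded_len_unpacked<T>(tag: u32, values: &[T]) -> usize {
    debug_assert!(max_key_len_fits_in_size_of_elem(values));
    values.len() * (key_len(tag) + mem::size_of::<T>())
}
```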
I have added a comment explaining the purpose of the function in the updated commit 699bd9f.
Yes, a newtype enforcing the length would be more bulletproof, but the performance of length computations would be impacted (which is the general reason why Rust does not do these checks on every arithmetic operation in optimized builds). This is also a breaking change if you want to use that type also in …
Replace unchecked additions and multiplications of usize values depending on encoded message sizes with either checked arithmetic or operations in the u64 domain to prevent integer overflows. In encoding operations reserving buffer capacity, backstop overflows with saturating_add when evaluating the required buffer size.
Generate the Message::encoded_len implementation for messages so that encoded lengths of fields that have no statically known limit are summed with overflow checks. Add Field::encoded_len_limit method and supporting code to return such limits for fields that have them.
Check that the results given by the field::scalar::Ty::encoded_len_limit method in prost-derive are minimal, by providing values that reach the limit in the encoded_len method.
Split the contents of the encoded_len file into submodule files for limit and overflow tests.
To test this, add a repeated empty message field to the testbed message definition, so we can use the zero-sized filler vector to push the encoded length of the message payload to just below u32::MAX; adding the field tag and the length varint then overflows u32 (see the sketch after this commit list).
Use unsorted field order in the codegen computation; this allows better control for overflow testing. Simplify the code splitting limited- and unlimited-length fields by using the partition_map method from itertools.
Add tests for derived Message::encoded_len implementations, organized in a separate module from the tests for functions in prost::encoding. Fix compilation of no_std tests.
Optimize the tests and disable overflow checks to make the overflow tests perform acceptably.
Large contiguous allocations that are used in some of the tests can fail on 32-bit targets when the tests are run in company with other tests making large allocations.
Move the encoded_len_limit tests to the tests-overflow crate, as well as the encoded_len.proto file used to generate the testbed. I was getting carried away with testing these limits, up to adding Kani proofs, but the added CI complexity is not worth it. For all that matters, it could have been a boolean encoded_len_limited method, as the number of fields in a message is limited enough that overflowing unchecked arithmetic on the lengths of the numeric fields should never be a concern in practice. Still, the tests-overflow crate is close enough thematically and is not saddled with a dependency on protobuf, which takes a long time to build.
Add a test Protobuf message that only has unlimited length fields, to check that the derived encoded_len method implementation is correct in this case, confirmed with a few test cases.
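A sketch of the overflow construction described above (hypothetical types: the real testbed is generated from encoded_len.proto, and the overflow only occurs on 32-bit targets where usize is 32 bits wide):

```rust
// Zero-sized stand-in for the empty filler message: a Vec of these
// costs almost no real memory regardless of its element count.
#[derive(Clone, Copy, Default)]
struct Empty;

fn main() {
    // Each repeated Empty element encodes to 2 bytes: a 1-byte field
    // key plus a 1-byte zero length. u32::MAX / 2 elements therefore
    // push the payload's encoded length to just below u32::MAX, and
    // adding the enclosing field's key and length varint overflows a
    // 32-bit usize in an unchecked computation.
    let filler: Vec<Empty> = vec![Empty; (u32::MAX / 2) as usize];
    let payload_len = filler.len() as u64 * 2; // u32::MAX - 1
    println!("encoded payload length: {payload_len}");
}
```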
Force-pushed from 9723f14 to fe03f58.
The encoded_len* functions, as well as encode_packed in the varint field encoding modules, compute the total encoding length with unchecked arithmetic. As the added tests demonstrate, this makes it possible to cause an integer overflow with feasible crafted payloads on 32-bit targets. As programs are normally compiled without overflow checks in release mode, this theoretically makes injection scenarios possible where the length marker on an encoded message field falls short of the data that was actually encoded (by e.g. a proxy working on untrusted input, but encoding Protobuf for trusting parties).

This, however, does not seem easy to exploit in typical usage. An input capable of causing an overflow in evaluating the length of the encoded message requires memory on the order of gigabytes in non-pathological protocol specifications. Furthermore, it would cause the program to exceed the addressable memory (or rather crash trying) when encoded using one of the library-provided encode* functions into a buffer faithfully implementing BufMut. So, in my understanding, the worst practical threat scenario this presents for typical applications is a hard-to-exploit DoS vector.

Change the length computations in prost::encoding and in code generated by prost-derive to prevent integer overflows where they can occur, by either using checked arithmetic operations or computing in the u64 domain, where causing an overflow is not thought feasible with realistic RAM and CPU budgets.
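Condensed, the two strategies look like this (a rough sketch of the patterns, not the exact patched code):

```rust
// Strategy 1: checked arithmetic in the usize domain. The distinctive
// panic message is used everywhere so call sites can later be grepped
// for and converted to error returns.
fn sum_lens_checked(field_lens: &[usize]) -> usize {
    field_lens.iter().fold(0usize, |acc, &len| {
        acc.checked_add(len)
            .expect("encoded length overflows usize")
    })
}

// Strategy 2: compute in the u64 domain, where overflowing would need
// an impossibly large input, and convert back to usize once at the end.
fn sum_lens_u64(field_lens: &[usize]) -> usize {
    field_lens
        .iter()
        .map(|&len| len as u64)
        .sum::<u64>()
        .try_into()
        .expect("encoded length overflows usize")
}
```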