
Prevent integer overflows in encoded message length computations #1196

Open

wants to merge 14 commits into master from encoded-len-integer-overflow
Conversation

@mzabaluev mzabaluev commented Nov 28, 2024

The encoded_len* functions, as well as encode_packed in the varint field encoding modules, compute the total encoding length with unchecked arithmetic. As the added tests demonstrate, this makes it possible to cause an integer overflow with feasible crafted payloads on 32-bit targets. As programs are normally compiled without overflow checks in the release mode, this theoretically makes injection scenarios possible where the length marker on an encoded message field falls short of the data that was actually encoded (by e.g. a proxy working on untrusted input, but encoding Protobuf for trusting parties).

This, however, does not seem easy to exploit in typical usage. An input capable of causing an overflow in evaluating the length of the encoded message requires memory on the order of gigabytes in non-pathological protocol specifications. Furthermore, it would cause the program to exceed the addressable memory (or rather, crash trying) when encoded using one of the library-provided encode* functions into a buffer faithfully implementing BufMut. So, in my understanding, the worst practical threat this presents for typical applications is a hard-to-exploit DoS vector.

Change the length computations in prost::encoding and in code generated by prost-derive to prevent integer overflows where they can occur, by either using checked arithmetic operations, or computing in the u64 domain where causing an overflow is not thought feasible with realistic RAM and CPU budgets.
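As a rough illustration of the two strategies described above (function names are hypothetical, not prost's actual internals):

```rust
// Illustrative sketch of the two overflow-prevention strategies; the
// function names are made up for this example and are not prost API.

// Strategy 1: checked arithmetic in usize, panicking on overflow
// instead of silently wrapping in release builds.
fn total_len_checked(field_lens: &[usize]) -> usize {
    field_lens.iter().fold(0usize, |acc, &len| {
        acc.checked_add(len)
            .expect("encoded length overflows usize")
    })
}

// Strategy 2: sum in the u64 domain, where overflowing would require
// an infeasible amount of input data, then convert back exactly once.
fn total_len_in_u64(field_lens: &[usize]) -> usize {
    let total: u64 = field_lens.iter().map(|&len| len as u64).sum();
    total.try_into().expect("encoded length overflows usize")
}
```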

Add tests showing feasibility of overflowing computations in
encoded_len_repeated for messages and certain scalar types.
@mzabaluev mzabaluev force-pushed the encoded-len-integer-overflow branch from f24bb81 to 37c4258 Compare November 28, 2024 22:52
@caspermeijn (Collaborator) left a comment

This is impressive work! I see you thought of a lot of the problems that can occur with overflow. The test suite seems extensive.

As I read the protobuf encoding spec, the limit for length-delimited fields is 2 GiB, because the length is saved in a 32-bit signed integer. The protobuf limits docs state that “all implementations” limit the total size of a message to “<2 GiB”.

I feel like prost should return some kind of “message encoding failed because message is too large” error type, so that the application can handle the error instead of panicking.

I think we should create a newtype that can only do checked arithmetic and returns this error type on 2 GB overflow. The advantage is that all uses of a protobuf length are easily visible, and it is harder to miss a spot where the code can overflow. That makes maintenance easier in the future.
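A minimal sketch of such a newtype (hypothetical names; EncodedLen and MessageTooLargeError are not existing prost types):

```rust
#[derive(Debug, PartialEq)]
pub struct MessageTooLargeError;

// Protobuf stores the length of a length-delimited field in a 32-bit
// signed integer, so i32::MAX (just under 2 GiB) is the hard ceiling.
const MAX_LEN: u64 = i32::MAX as u64;

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct EncodedLen(u32);

impl EncodedLen {
    pub fn new(len: u64) -> Result<Self, MessageTooLargeError> {
        if len <= MAX_LEN {
            Ok(EncodedLen(len as u32))
        } else {
            Err(MessageTooLargeError)
        }
    }

    // Addition is only available in checked form, so every place that
    // combines protobuf lengths is forced to handle the overflow case.
    pub fn checked_add(self, rhs: EncodedLen) -> Result<Self, MessageTooLargeError> {
        Self::new(self.0 as u64 + rhs.0 as u64)
    }

    pub fn get(self) -> u32 {
        self.0
    }
}
```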

Comment on lines +217 to +220
let mut acc = 0usize #(+ #encoded_len_limited)*;
#(acc = acc.checked_add(#encoded_len_unlimited)
.expect("encoded length overflows usize");)*
acc
Collaborator:

I don't understand what encoded_len_limit does. It seems to only contain a value for fixed-size fields.

Is this intended as an optimization?

Contributor Author:

Yes. For singular numeric fields, this gives an upper bound on their encoded length, so the derive macro can use this information to sum the lengths of such fields without overflow checks, as seen above. The lengths of fields for which a limit cannot be known statically are added with checked_add to catch possible overflows.

So if your protobuf message only has fixed-size fields, its encoded_len method would not need to use checked_add and its performance would not be impacted.

As mentioned in the description of 4901e1a, the whole limit machinery is rather overkill: it would be enough to flag "this field is known to have a small encoding", so that lengths of such fields could be summed without checks; there should never be so many singular fields in a message that this sum would overflow u32.

Contributor Author:

the whole limit thing is rather an overkill

To clarify, I coded that for a more comprehensive approach to determining limits on encoded message lengths, where e.g. if a message type would have a known limit derived from its fields, this could be passed on to determine the limit for another message type containing a field of the first type, and so on. But as mentioned in a comment, this raises safety issues when such information is passed to prost-derive via attributes and may be arbitrarily fudged by the macro user.

do
  cargo test -p tests-overflow --target i686-unknown-linux-gnu -- \
    --ignored --exact $test
done
Collaborator:

All tests are run on 32-bit x86, and the two memory-exhausting tests are run separately. Do I understand correctly?

@mzabaluev (Contributor Author) commented Nov 29, 2024

That's correct. The two isolated tests do not exhaust all addressable memory by themselves, but the size of the allocations makes them abort when selected together with other tests even in serial runs, possibly due to heap fragmentation.

prost/src/encoding.rs (outdated; resolved)
Comment on lines +281 to +282
let len = values.iter().map(|$to_uint64_value| {
encoded_len_varint($to_uint64) as u64
Collaborator:

This reduces the chance of an overflow, but doesn't prevent it, right?

Contributor Author:

There is no practical chance to overflow this in u64, as the valid slice needed to do this would be impossibly large.
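A back-of-the-envelope check of that claim (illustrative arithmetic; assumes the maximum varint encoding of 10 bytes per element):

```rust
// Number of slice elements needed to overflow a u64 sum of per-element
// varint lengths, in the worst case where every element encodes to the
// maximum varint length of 10 bytes. Each element occupies at least one
// byte of memory, so reaching this count would require on the order of
// 1.8e18 bytes (exabytes) of RAM -- far beyond any realistic budget.
fn elements_to_overflow_u64() -> u64 {
    const MAX_VARINT_LEN: u64 = 10;
    u64::MAX / MAX_VARINT_LEN + 1
}
```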

encoded_len_varint($to_uint64) as u64
})
.sum::<u64>();
len.try_into().expect("encoded length overflows usize")
Collaborator:

This crashes the application when an overflow is detected. I am not convinced that panic improves the situation.

Should encoded_len return a Result<u64, MessageTooLargeError>?

Contributor Author:

I am not convinced that panic improves the situation.

This is better than silently returning an overflowed length, which is the status quo in release builds.

Should encoded_len return a Result<u64, MessageTooLargeError>?

Yes, but I meant this PR to be merged for a 0.13.x release so that current applications could be patched without API breaks.

In a later change, we can change the signature of encoded_len and grep for all occurrences of .expect("encoded length overflows usize") (I took care to use this distinctive panic message everywhere) and convert them to error returns.

@@ -519,6 +529,10 @@ fixed_width!(
get_i64_le
);

const fn max_key_len_fits_in_size_of_elem<T>(_: &[T]) -> bool {
key_len(MAX_TAG) <= mem::size_of::<T>()
Collaborator:

I don't understand what is happening here.

Contributor Author:

This is a check to ensure that the encoding is "non-expanding", that is, it takes at least as much memory to allocate the data to be encoded as what it would take to encode it. So the length computation before which this is asserted cannot overflow usize and can be performed with unchecked arithmetic.
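The idea can be reconstructed roughly like this (simplified sketch with an illustrative key_len; not the exact prost source):

```rust
use std::mem;

// A protobuf field key is the varint of (tag << 3 | wire_type); with
// the maximum tag of 2^29 - 1 it occupies at most 5 bytes.
const MAX_TAG: u32 = (1 << 29) - 1;

const fn key_len(tag: u32) -> usize {
    let mut key = (tag as u64) << 3;
    let mut len = 1;
    while key >= 0x80 {
        key >>= 7;
        len += 1;
    }
    len
}

// For a fixed-width field, the encoded form is a key plus
// size_of::<T>() payload bytes. If the largest possible key fits in
// size_of::<T>(), each element encodes to at most 2 * size_of::<T>()
// bytes; since a valid Rust slice occupies at most isize::MAX bytes,
// the total encoded length stays below usize::MAX and can be summed
// with unchecked arithmetic.
const fn max_key_len_fits_in_size_of_elem<T>(_: &[T]) -> bool {
    key_len(MAX_TAG) <= mem::size_of::<T>()
}
```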

Contributor Author:

I have added a comment explaining the purpose of the function in the updated commit 699bd9f.

@mzabaluev (Contributor Author):

I think we should create a newtype that can only do checked arithmetic and returns this error type on 2 GB overflow. The advantage is that all uses of a protobuf length are easily visible, and it is harder to miss a spot where the code can overflow. That makes maintenance easier in the future.

Yes, a newtype enforcing the length would be more bulletproof, but performance of length computations will be impacted (which is the general reason why Rust does not do these checks on every arithmetic operation in optimized builds). This is also a breaking change if you want to use that type also in Message::encoded_len and elsewhere. I constrained this work so as to be able to patch the current release series.

Replace unchecked additions and multiplications of usize values
depending on encoded message sizes with either checked arithmetic
or operations in u64 domain to prevent integer overflows.

In encoding operations reserving buffer capacity, backstop overflows
with saturating_add when evaluating the required buffer size.
Generate the Message::encoded_len implementation for messages so that
encoded lengths of fields that have no statically known limit are
summed with overflow checks.
Add Field::encoded_len_limit method and supporting code to return
such limits for fields that have them.
Check that the results given by the
field::scalar::Ty::encoded_len_limit method in prost-derive are minimal,
by providing values that reach the limit in the encoded_len method.
Split the contents of the encoded_len file into submodule
files for limit and overflow tests.
To test this, add a repeated empty message field to the testbed message
definition, so we can use the zero-sized filler vector to push the
encoded length of the message payload to just below u32::MAX.
Adding the field tag and the length varint then overflows u32.
Use unsorted field order in the codegen computation; this allows better
control for overflow testing. Simplify the code splitting limited- and
unlimited-length fields using the partition_map method from itertools.
Add tests for derived Message::encoded_len implementations, organized
in a separate module from the tests for functions in prost::encoding.

Fix compilation of no_std tests.
Need to optimize tests and disable overflow checks
to make the overflow tests perform acceptably.
Large contiguous allocations that are used in some of the tests can
fail on 32-bit targets when the tests are run in company with other
tests making large allocations.
Move the encoded_len_limit tests to the tests-overflow crate, as well as
the encoded_len.proto file used to generate the testbed.

I was getting carried away with testing these limits, up to adding Kani
proofs, but the added CI complexity is not worth it. For what matters,
a boolean encoded_len_limited method would have sufficed, as the number
of fields in a message is limited enough that overflowing unchecked
arithmetic on the lengths of the numeric fields should never be a
concern in practice.

Still, the tests-overflow crate is close enough thematically
and is not saddled with a dependency on protobuf which takes a long
time to build.
Add a test Protobuf message that only has unlimited length fields,
to check that the derived encoded_len method implementation is correct
in this case, confirmed with a few test cases.
@mzabaluev mzabaluev force-pushed the encoded-len-integer-overflow branch from 9723f14 to fe03f58 Compare November 29, 2024 21:11