
Prevent integer overflows in encoded message length computations #1196

Open

wants to merge 14 commits into master from encoded-len-integer-overflow
Conversation

@mzabaluev mzabaluev commented Nov 28, 2024

The encoded_len* functions, as well as encode_packed in the varint field encoding modules, compute the total encoding length with unchecked arithmetic. As the added tests demonstrate, this makes it possible to cause an integer overflow with feasible crafted payloads on 32-bit targets. As programs are normally compiled without overflow checks in the release mode, this theoretically makes injection scenarios possible where the length marker on an encoded message field falls short of the data that was actually encoded (by e.g. a proxy working on untrusted input, but encoding Protobuf for trusting parties).

This, however, does not seem easy to exploit in typical usage. An input capable of causing an overflow in evaluating the length of the encoded message requires memory on the order of gigabytes in non-pathological protocol specifications. Furthermore, it would cause the program to exceed the addressable memory (or rather, crash trying) when encoded using one of the library-provided encode* functions into a buffer faithfully implementing BufMut. So, in my understanding, the worst practical threat this presents for typical applications is a hard-to-exploit DoS vector.

Change the length computations in prost::encoding and in code generated by prost-derive to prevent integer overflows where they can occur, by either using checked arithmetic operations, or computing in the u64 domain where causing an overflow is not thought feasible with realistic RAM and CPU budgets.
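As a rough illustration of the two strategies described above (function names are hypothetical, not prost's actual internals):

```rust
// Illustrative sketch of the two overflow-prevention strategies; the
// function names are made up for this example and are not prost API.

// Strategy 1: checked arithmetic in usize, panicking on overflow
// instead of silently wrapping in release builds.
fn total_len_checked(field_lens: &[usize]) -> usize {
    field_lens.iter().fold(0usize, |acc, &len| {
        acc.checked_add(len)
            .expect("encoded length overflows usize")
    })
}

// Strategy 2: sum in the u64 domain, where overflowing would require
// an infeasible amount of input data, then convert back exactly once.
fn total_len_in_u64(field_lens: &[usize]) -> usize {
    let total: u64 = field_lens.iter().map(|&len| len as u64).sum();
    total.try_into().expect("encoded length overflows usize")
}
```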

Add tests showing feasibility of overflowing computations in
encoded_len_repeated for messages and certain scalar types.
@mzabaluev mzabaluev force-pushed the encoded-len-integer-overflow branch from f24bb81 to 37c4258 Compare November 28, 2024 22:52
@caspermeijn (Collaborator) left a comment

This is impressive work! I see you thought of a lot of the problems that can occur with overflow. The test suite seems extensive.

As I read the protobuf encoding spec, the limit for length-delimited fields is 2 GiB, because the length is saved in a 32-bit signed integer. The protobuf limits docs state that “all implementations” limit the total size of a message to “<2 GiB”.

I feel like prost should return some kind of “message encoding failed because message is too large” error type, so that the application can handle the error instead of panicking.

I think we should create a newtype that can only do checked arithmetic and returns this error type on 2 GB overflow. The advantage is that all uses of a protobuf length are easily visible, and it is harder to miss a spot where the code can overflow. That makes maintenance easier in the future.
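A minimal sketch of such a newtype (hypothetical names; EncodedLen and MessageTooLargeError are not existing prost types):

```rust
#[derive(Debug, PartialEq)]
pub struct MessageTooLargeError;

// Protobuf stores the length of a length-delimited field in a 32-bit
// signed integer, so i32::MAX (just under 2 GiB) is the hard ceiling.
const MAX_LEN: u64 = i32::MAX as u64;

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct EncodedLen(u32);

impl EncodedLen {
    pub fn new(len: u64) -> Result<Self, MessageTooLargeError> {
        if len <= MAX_LEN {
            Ok(EncodedLen(len as u32))
        } else {
            Err(MessageTooLargeError)
        }
    }

    // Addition is only available in checked form, so every place that
    // combines protobuf lengths is forced to handle the overflow case.
    pub fn checked_add(self, rhs: EncodedLen) -> Result<Self, MessageTooLargeError> {
        Self::new(self.0 as u64 + rhs.0 as u64)
    }

    pub fn get(self) -> u32 {
        self.0
    }
}
```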

Comment on lines +217 to +220
let mut acc = 0usize #(+ #encoded_len_limited)*;
#(acc = acc.checked_add(#encoded_len_unlimited)
.expect("encoded length overflows usize");)*
acc
Collaborator:

I don't understand what encoded_len_limit does. It seems to only contain a value for fixed-size fields.

Is this intended as an optimization?

Contributor Author:

Yes. For singular numeric fields, this gives an upper bound on their encoded length, so the derive macro can use this information to sum the lengths of such fields without overflow checks, as seen above. The lengths of fields for which a limit cannot be known statically are added with checked_add to catch possible overflows.

So if your protobuf message only has fixed-size fields, its encoded_len method would not need to use checked_add and its performance would not be impacted.

As mentioned in the description of 4901e1a, the whole limit machinery is rather overkill: it would be enough to flag "this field is known to have a small encoding", so that lengths of such fields could be summed without checks; there should never be so many singular fields in a message that this sum would overflow u32.

Contributor Author:

the whole limit thing is rather an overkill

To clarify, I coded that for a more comprehensive approach to determining limits on encoded message lengths, where e.g. if a message type would have a known limit derived from its fields, this could be passed on to determine the limit for another message type containing a field of the first type, and so on. But as mentioned in a comment, this raises safety issues when such information is passed to prost-derive via attributes and may be arbitrarily fudged by the macro user.

do
  cargo test -p tests-overflow --target i686-unknown-linux-gnu -- \
    --ignored --exact $test
done
Collaborator:

All tests are run on 32-bit x86, and the two memory-exhausting tests are run separately. Do I understand correctly?

@mzabaluev (Contributor Author) commented Nov 29, 2024

That's correct. The two isolated tests do not exhaust all addressable memory by themselves, but the size of the allocations makes them abort when selected together with other tests even in serial runs, possibly due to heap fragmentation.

prost/src/encoding.rs (outdated; resolved)
Comment on lines +281 to +282
let len = values.iter().map(|$to_uint64_value| {
encoded_len_varint($to_uint64) as u64
Collaborator:

This reduces the chance of an overflow, but doesn't prevent it, right?

Contributor Author:

There is no practical chance to overflow this in u64, as the valid slice needed to do this would be impossibly large.
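A back-of-the-envelope check of that claim (illustrative arithmetic; assumes the maximum varint encoding of 10 bytes per element):

```rust
// Number of slice elements needed to overflow a u64 sum of per-element
// varint lengths, in the worst case where every element encodes to the
// maximum varint length of 10 bytes. Each element occupies at least one
// byte of memory, so reaching this count would require on the order of
// 1.8e18 bytes (exabytes) of RAM -- far beyond any realistic budget.
fn elements_to_overflow_u64() -> u64 {
    const MAX_VARINT_LEN: u64 = 10;
    u64::MAX / MAX_VARINT_LEN + 1
}
```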

encoded_len_varint($to_uint64) as u64
})
.sum::<u64>();
len.try_into().expect("encoded length overflows usize")
Collaborator:

This crashes the application when an overflow is detected. I am not convinced that panic improves the situation.

Should encoded_len return a Result<u64, MessageTooLargeError>?

Contributor Author:

I am not convinced that panic improves the situation.

This is better than silently returning an overflowed length, which is the status quo in release builds.

Should encoded_len return a Result<u64, MessageTooLargeError>?

Yes, but I meant this PR to be merged for a 0.13.x release so that current applications could be patched without API breaks.

In a later change, we can change the signature of encoded_len and grep for all occurrences of .expect("encoded length overflows usize") (I took care to use this distinctive panic message everywhere) and convert them to error returns.

@@ -519,6 +529,10 @@ fixed_width!(
get_i64_le
);

const fn max_key_len_fits_in_size_of_elem<T>(_: &[T]) -> bool {
key_len(MAX_TAG) <= mem::size_of::<T>()
Collaborator:

I don't understand what is happening here.

Contributor Author:

This is a check to ensure that the encoding is "non-expanding", that is, it takes at least as much memory to allocate the data to be encoded as what it would take to encode it. So the length computation before which this is asserted cannot overflow usize and can be performed with unchecked arithmetic.
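The idea can be reconstructed roughly like this (simplified sketch with an illustrative key_len; not the exact prost source):

```rust
use std::mem;

// A protobuf field key is the varint of (tag << 3 | wire_type); with
// the maximum tag of 2^29 - 1 it occupies at most 5 bytes.
const MAX_TAG: u32 = (1 << 29) - 1;

const fn key_len(tag: u32) -> usize {
    let mut key = (tag as u64) << 3;
    let mut len = 1;
    while key >= 0x80 {
        key >>= 7;
        len += 1;
    }
    len
}

// For a fixed-width field, the encoded form is a key plus
// size_of::<T>() payload bytes. If the largest possible key fits in
// size_of::<T>(), each element encodes to at most 2 * size_of::<T>()
// bytes; since a valid Rust slice occupies at most isize::MAX bytes,
// the total encoded length stays below usize::MAX and can be summed
// with unchecked arithmetic.
const fn max_key_len_fits_in_size_of_elem<T>(_: &[T]) -> bool {
    key_len(MAX_TAG) <= mem::size_of::<T>()
}
```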

Contributor Author:

I have added a comment explaining the purpose of the function in the updated commit 699bd9f.

@mzabaluev (Contributor Author):

I think we should create a newtype that can only do checked arithmetic and returns this error type on 2 GB overflow. The advantage is that all uses of a protobuf length are easily visible, and it is harder to miss a spot where the code can overflow. That makes maintenance easier in the future.

Yes, a newtype enforcing the length would be more bulletproof, but performance of length computations will be impacted (which is the general reason why Rust does not do these checks on every arithmetic operation in optimized builds). This is also a breaking change if you want to use that type also in Message::encoded_len and elsewhere. I constrained this work so as to be able to patch the current release series.

Replace unchecked additions and multiplications of usize values
depending on encoded message sizes with either checked arithmetic
or operations in u64 domain to prevent integer overflows.

In encoding operations reserving buffer capacity, backstop overflows
with saturating_add when evaluating the required buffer size.
Generate the Message::encoded_len implementation for messages so that
encoded lengths of fields that have no statically known limit are
summed with overflow checks.
Add Field::encoded_len_limit method and supporting code to return
such limits for fields that have them.
Check that the results given by the
field::scalar::Ty::encoded_len_limit method in prost-derive are minimal,
by providing values that reach the limit in the encoded_len method.
Split the contents of the encoded_len file into submodule
files for limit and overflow tests.
To test this, add a repeated empty message field to the testbed message
definition, so we can use the zero-sized filler vector to push the
encoded length of the message payload to just below u32::MAX.
Adding the field tag and the length varint then overflows u32.
Use unsorted field order in the codegen computation; this allows better
control for overflow testing. Simplify the code splitting limited- and
unlimited-length fields using the partition_map method from itertools.
Add tests for derived Message::encoded_len implementations, organized
in a separate module from the tests for functions in prost::encoding.

Fix compilation of no_std tests.
Need to optimize tests and disable overflow checks
to make the overflow tests perform acceptably.
Large contiguous allocations that are used in some of the tests can
fail on 32-bit targets when the tests are run in company with other
tests making large allocations.
Move the encoded_len_limit tests to the tests-overflow crate, as well as
the encoded_len.proto file used to generate the testbed.

I was getting carried away with testing these limits, up to adding Kani
proofs, but the added CI complexity is not worth it. For what matters,
a boolean encoded_len_limited method would have sufficed, as the number
of fields in a message is limited enough that overflowing unchecked
arithmetic on the lengths of the numeric fields should never be a
concern in practice.

Still, the tests-overflow crate is close enough thematically
and is not saddled with a dependency on protobuf which takes a long
time to build.
Add a test Protobuf message that only has unlimited length fields,
to check that the derived encoded_len method implementation is correct
in this case, confirmed with a few test cases.
@mzabaluev mzabaluev force-pushed the encoded-len-integer-overflow branch from 9723f14 to fe03f58 Compare November 29, 2024 21:11