Prevent integer overflows in encoded message length computations #1196
base: master
@@ -6,7 +6,7 @@ extern crate alloc;
 extern crate proc_macro;
 
 use anyhow::{bail, Error};
-use itertools::Itertools;
+use itertools::{Either, Itertools};
 use proc_macro2::{Span, TokenStream};
 use quote::quote;
 use syn::{

@@ -103,9 +103,25 @@ fn try_message(input: TokenStream) -> Result<TokenStream, Error> {
         )
     };
 
-    let encoded_len = fields
+    // For encoded_len, split the fields into those that have a known length limit
+    // and those that don't. The sum of the known lengths should not overflow usize.
+    // For purposes of testing, we want both lists to be in declaration order.
+    let mut total_limit = 0usize;
+    let (encoded_len_limited, encoded_len_unlimited): (Vec<_>, Vec<_>) = unsorted_fields
         .iter()
-        .map(|(field_ident, field)| field.encoded_len(quote!(self.#field_ident)));
+        .partition_map(move |(field_ident, field)| {
+            let encoded_len_expr = field.encoded_len(quote!(self.#field_ident));
+            match field
+                .encoded_len_limit()
+                .and_then(|limit| total_limit.checked_add(limit))
+            {
+                Some(sum) => {
+                    total_limit = sum;
+                    Either::Left(encoded_len_expr)
+                }
+                None => Either::Right(encoded_len_expr),
+            }
+        });
 
     let encode = fields
         .iter()

@@ -196,8 +212,12 @@ fn try_message(input: TokenStream) -> Result<TokenStream, Error> {
             }
 
             #[inline]
+            #[allow(unused_mut)]
             fn encoded_len(&self) -> usize {
-                0 #(+ #encoded_len)*
+                let mut acc = 0usize #(+ #encoded_len_limited)*;
+                #(acc = acc.checked_add(#encoded_len_unlimited)
+                    .expect("encoded length overflows usize");)*
+                acc
Comment on lines +217 to +220

I don't understand what [...] Is this intended as an optimization?

Yes. For singular numeric fields, this gives an upper bound on their encoded length, so the derive macro can use this information to sum the lengths of such fields without overflow checks, as seen above. The lengths of fields for which a limit cannot be known statically are added with `checked_add`. So if your protobuf message only has fixed-size fields, its [...]

As mentioned in the description of 4901e1a, the whole limit mechanism is rather overkill: it would be enough to tell "this field is known to have a small encoding", so that the lengths of such fields could be summed without checks, and there should never be so many singular fields in a message that this sum would overflow `usize`.

To clarify, I coded that for a more comprehensive approach to determining limits on encoded message lengths. For example, if a message type had a known limit derived from its fields, that limit could be passed on to determine the limit for another message type containing a field of the first type, and so on. But as mentioned in a comment, this raises safety issues when such information is passed to prost-derive via attributes and may be arbitrarily fudged by the macro user.

             }
 
             fn clear(&mut self) {

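To make the strategy discussed in the review thread concrete, here is a minimal runtime sketch in plain Rust. It is not the prost-derive code: the real macro performs the split at compile time on token streams, and the generated `encoded_len` panics on overflow instead of returning `None`. `FieldLen`, `split_by_limit`, and `total_encoded_len` are hypothetical names introduced for illustration, and the sketch assumes a field's actual length never exceeds its declared limit.

```rust
/// Hypothetical stand-in for one message field: its actual encoded length
/// and, if statically known, an upper bound on that length.
struct FieldLen {
    actual: usize,
    limit: Option<usize>,
}

/// Splits fields into "limited" and "unlimited" groups, keeping declaration
/// order. A field is limited only while the running sum of limits still fits
/// in `usize`; a missing limit or an overflowing sum puts it in the other group.
fn split_by_limit(fields: &[FieldLen]) -> (Vec<usize>, Vec<usize>) {
    let mut total_limit = 0usize;
    let mut limited = Vec::new();
    let mut unlimited = Vec::new();
    for field in fields {
        match field.limit.and_then(|l| total_limit.checked_add(l)) {
            Some(sum) => {
                total_limit = sum;
                limited.push(field.actual);
            }
            None => unlimited.push(field.actual),
        }
    }
    (limited, unlimited)
}

/// Sums the limited lengths with plain addition and the unlimited ones with
/// `checked_add`. Returns `None` on overflow (assumption: the sketch reports
/// overflow this way, whereas the generated code in the diff uses `expect`).
fn total_encoded_len(fields: &[FieldLen]) -> Option<usize> {
    let (limited, unlimited) = split_by_limit(fields);
    // Cannot overflow as long as each actual length is within its limit,
    // because the sum of the limits was already shown to fit in `usize`.
    let mut acc: usize = limited.iter().sum();
    for len in unlimited {
        acc = acc.checked_add(len)?;
    }
    Some(acc)
}
```

A message made up only of bounded fields takes the unchecked path for every field, which is the optimization the reply describes; any field without a known bound pays for one `checked_add`.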
All tests are run on 32-bit x86, and the two memory-exhausting tests are run separately. Do I understand correctly?

That's correct. The two isolated tests do not exhaust all addressable memory by themselves, but their allocations are large enough that they abort when selected together with other tests, even in serial runs, possibly due to heap fragmentation.