Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PE: fix address size incompatibility on 32-bit builds #441

Merged
merged 3 commits into from
Jan 5, 2025

Conversation

kkent030315
Copy link
Contributor

@kkent030315 kkent030315 commented Dec 16, 2024

Since PE::entry, PE::image_base and StandardFields::address_of_entry_point are all pubs this PR is not backward compatible.

@kkent030315 kkent030315 mentioned this pull request Dec 16, 2024
17 tasks
@kkent030315
Copy link
Contributor Author

@m4b Hi there, could I get your review for this PR?
btw happy new year!

Copy link
Owner

@m4b m4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems generally ok but i'd like confirmation on the bug location; this is also a pretty big breaking change (Fields in PE), so i'd like to make sure this doesn't add regressions and it's going to be right for the foreseeable future, etc.

@@ -119,7 +119,7 @@ pub struct StandardFields {
/// * For device drivers, this is the address of the initialization function.
///
/// The entry point function is optional for DLLs. When no entry point is present, this member is zero.
pub address_of_entry_point: u64,
pub address_of_entry_point: u32,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems like the only actually required fix, but even that is somewhat unclear to me? everything else seems to push the usize <-> u64/u32 casting down into the pe::utils::find_offset, which I admit does seem more elegant.

However, there was a certain clarity of passing a usize into these parse functions since they are ultimately used to compute 32/64-bit addresses/offsets.

Could you elaborate more on why this gets downgraded to a u32 though?

Copy link
Contributor Author

@kkent030315 kkent030315 Jan 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theoretically, this u64 is fine. However,

  1. Its 4 bytes half is always wasted because this is defined as u32 as per PE specification1.

  2. And as you mentioned, the AddressOfEntryPoint is always an RVA (u32) and consumers are eventually forced to make unnecessary downcast if it is defined as u64, one way or another.

Footnotes

  1. https://learn.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-image_optional_header64

/// The entry point RVA of the binary
pub entry: u32,
/// The binary's VA, or image base - useful for computing virtual addreses
pub image_base: u64,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this the actual bug? (that we used a usize but on 32-bit systems, this can be u64?)

If yes, is it really required for entry to be fixed to u32? If there's a compelling reason (For consistency perhaps) then it seems fine, but this is a pretty big breaking change, but it may be worth it if it prevents future issues.

Copy link
Contributor Author

@kkent030315 kkent030315 Jan 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this has been a serious hidden bug for long.

Fortunately, RVA calculations (i.e., TLS callbacks map) work fine even it overflows. (i.e., 0x140000000u64 overflows and trimmed into 0x40000000u32, then subtract the u32 RVA)

On the other hand, this is a significant correctness issue likewise whenever the consumer wants to use it for something like verification.

I guess this has been invisible for this long because no one really cares about 32-bit builds.

@m4b
Copy link
Owner

m4b commented Jan 3, 2025

@kkent030315 thanks for this, and happy new year to you! Apologies for radio silence, i've been offline for a bit, but should be able to get a lot of these patches reviewed and merged in next few days, appreciate your patience as always

// Check if the callback is in the image
if utils::find_offset(callback_rva, sections, file_alignment, opts).is_none() {
if utils::find_offset(callback_rva as usize, sections, file_alignment, opts)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't there technically still an issue on 32-bit systems if callback_rva is > u32, and then it gets casted to a usize (hence truncating a potentially 64 bit number to 32 bits (sizeof pointer))?

tbh i'm not sure we ever supported analyzing 64-bit binaries on on 32-bit systems, simply due to rust &[u8] requiring usize to index, which means if that slice is larger than 32-bits, you cannot access the upper bits in rust.

Copy link
Contributor Author

@kkent030315 kkent030315 Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, where the case the image must be broken. We shouldn't assume any of TLS callback addresses to be a valid VA/RVA. We can maybe put the check there.

Copy link
Owner

@m4b m4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can merge this, and while it's nice to solidify the fields as their explicit bit sizes, i don't think fully supporting 64-bit binary analysis on 32-bit machines has ever been a goal, but we can try our best.

I'm hoping this upcoming release will be one of the last in terms of major field breakage and other things in goblin, and I know i've said this many times in the past, but I'm hoping to put out a 1.0 release in the near future, but first would like to fixup and merge issues like this for better future proofing.

since you've made so many great changes in the PE side of things, if you could keep this in mind (a potential 1.0 and hence no api breakage after) that would be great.

@m4b m4b merged commit 8885422 into m4b:master Jan 5, 2025
6 checks passed
@m4b
Copy link
Owner

m4b commented Jan 5, 2025

NOTE: breaking

@m4b
Copy link
Owner

m4b commented Jan 5, 2025

ref #434

@kkent030315 kkent030315 deleted the usize64 branch January 10, 2025 21:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants