PE: Change Section Table Real Name Handling #438

Open
wants to merge 1 commit into base: master

Conversation

prettyroseslover

While trying to parse a particular PE file, I ran into an "invalid utf8" error coming from the Section Table real name. However hard I tried, I couldn't find a problem with the file itself. This small change to the parse() function makes the file parse successfully.

prettyroseslover changed the title from "Change Section Table Real Name Handling" to "PE: Change Section Table Real Name Handling" on Nov 28, 2024
@@ -76,7 +76,9 @@ impl SectionTable {
         table.characteristics = bytes.gread_with(offset, scroll::LE)?;
 
         if let Some(idx) = table.name_offset()? {
-            table.real_name = Some(bytes.pread::<&str>(string_table_offset + idx)?.to_string());
+            if let Ok(real_name) = bytes.pread::<&str>(string_table_offset + idx) {
+                table.real_name = Some(real_name.to_string());
Owner

i think it would be more idiomatic to do:

table.real_name = bytes.pread::<&str>(string_table_offset + idx).ok().map(str::to_owned);

the real question is whether:

  1. It should be considered a fatal error if the name is not utf8
  2. we should log on error, in which case keeping the if let Ok is better
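
For comparison, a minimal sketch of the two forms side by side, showing that they produce the same result. It is only an illustration: `std::str::from_utf8` stands in for `bytes.pread::<&str>(...)`, and the function names `real_name_one_liner` and `real_name_if_let` are hypothetical, not part of the crate.

```rust
/// One-liner: `ok()` turns `Result<&str, _>` into `Option<&str>`,
/// and `map(str::to_owned)` converts the borrowed name into a `String`.
fn real_name_one_liner(raw: &[u8]) -> Option<String> {
    std::str::from_utf8(raw).ok().map(str::to_owned)
}

/// Equivalent `if let Ok` form: the name stays `None` when decoding fails.
fn real_name_if_let(raw: &[u8]) -> Option<String> {
    let mut real_name = None;
    if let Ok(name) = std::str::from_utf8(raw) {
        real_name = Some(name.to_string());
    }
    real_name
}

fn main() {
    let good = b"superlongsectionname";
    let bad = [0xFF, 0xFE, 0xFD];
    // Both forms agree on valid and invalid input.
    assert_eq!(real_name_one_liner(good), real_name_if_let(good));
    assert_eq!(real_name_one_liner(&bad), None);
    assert_eq!(real_name_if_let(&bad), None);
}
```

Neither form treats invalid UTF-8 as fatal; the difference is only whether there is a natural place to hang a log statement.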

Author

Sorry it took me so long to reply!
As far as I understand, real_name is stored in the COFF String Table, but nothing is stated about its encoding. However, for Symbol Name the following is stated:

By convention, the names are treated as zero-terminated UTF-8 encoded strings.

Maybe, it can be applied to real_name, too?

However, I do not consider it to be a fatal error, so maybe we should stick to if let Ok?
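
To make the "zero-terminated UTF-8" convention concrete, here is a small sketch of reading a NUL-terminated entry at a given offset and then attempting a strict UTF-8 decode. The table contents are a toy example rather than a real COFF layout, and `nul_terminated` is a hypothetical helper, not something taken from the crate.

```rust
/// Hypothetical helper: return the bytes from `offset` up to (not including)
/// the first NUL, or `None` if `offset` is out of range or no NUL follows.
fn nul_terminated(table: &[u8], offset: usize) -> Option<&[u8]> {
    let tail = table.get(offset..)?;
    let len = tail.iter().position(|&b| b == 0)?;
    Some(&tail[..len])
}

fn main() {
    // Toy string table: two NUL-terminated entries back to back.
    let table = b".debug_info\0\xFF\xFE\xFD\0";

    // The first entry is valid UTF-8.
    let first = nul_terminated(table, 0).unwrap();
    assert_eq!(std::str::from_utf8(first), Ok(".debug_info"));

    // The second entry is properly terminated but is not valid UTF-8,
    // so a strict decode reports an error instead of a name.
    let second = nul_terminated(table, 12).unwrap();
    assert!(std::str::from_utf8(second).is_err());
}
```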

Owner

yes i think it's ok; the idiomatic suggestion i wrote above is equivalent to the if let Ok (ok() maps the Result to an Option, and then map returns the owned string if it is Some); my only question was whether we should log it or not. if we log it, then keeping the if let Ok is better. i'll leave it up to you which to do, i.e.:

  1. do the single line ok map
  2. do if let with logging on failure side.
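
Option 2 could look roughly like the sketch below. It rests on several assumptions: `std::str::from_utf8` stands in for `bytes.pread::<&str>(...)`, `eprintln!` stands in for whatever logging macro the crate would actually use, and `read_real_name` is a hypothetical helper. It is written with `match` rather than `if let` so that the error value is available to log.

```rust
/// Hypothetical helper: decode a section's real name, logging instead of
/// failing when the bytes are not valid UTF-8.
fn read_real_name(raw: &[u8], offset: usize) -> Option<String> {
    match std::str::from_utf8(raw) {
        Ok(name) => Some(name.to_string()),
        Err(err) => {
            // Stand-in for a real logging call; parsing continues either way.
            eprintln!("section name at offset {offset:#x} is not valid UTF-8: {err}");
            None
        }
    }
}

fn main() {
    assert_eq!(read_real_name(b".text", 0x200), Some(".text".to_string()));
    // Invalid bytes yield `None` plus a message on stderr, not an error.
    assert_eq!(read_real_name(&[0xFF, 0xFE, 0xFD], 0x200), None);
}
```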

Owner

m4b left a comment

as suggested, just needs tiny fixup then ready to go, thank you for this!

Contributor

kkent030315 commented Jan 10, 2025

Excuse me for interrupting.

I ran into a similar issue before, and this is the minimal reproducible test case I wrote at the time. Adding a test along these lines would be great; I hope it helps!

    static PE64_INDIRECT_SECTION_NAME: &[u8; 552] = &[
        0x4D, 0x5A, 0x78, 0x00, 0x01, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x78, 0x00, 0x00, 0x00, 0x0E, 0x1F, 0xBA, 0x0E, 0x00, 0xB4, 0x09, 0xCD, 0x21, 0xB8, 0x01,
        0x4C, 0xCD, 0x21, 0x54, 0x68, 0x69, 0x73, 0x20, 0x70, 0x72, 0x6F, 0x67, 0x72, 0x61, 0x6D,
        0x20, 0x63, 0x61, 0x6E, 0x6E, 0x6F, 0x74, 0x20, 0x62, 0x65, 0x20, 0x72, 0x75, 0x6E, 0x20,
        0x69, 0x6E, 0x20, 0x44, 0x4F, 0x53, 0x20, 0x6D, 0x6F, 0x64, 0x65, 0x2E, 0x24, 0x00, 0x00,
        0x50, 0x45, 0x00, 0x00, 0x64, 0x86, 0x01, 0x00, 0xE1, 0x91, 0x81, 0x67, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0xF0, 0x00, 0x22, 0x20, 0x0B, 0x02, 0x0E, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80, 0x01, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00,
        0x08, 0x00, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x06, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x20, 0x02, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x03, 0x00, 0x60, 0x01, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x2F, 0x35, 0x31, 0x32, 0x00, 0x00,
        0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 0x20,
        0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x40, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x73, 0x75, 0x70, 0x65, 0x72, 0x6C, 0x6F, 0x6E, 0x67, 0x73, 0x65, 0x63, 0x74,
        0x69, 0x6F, 0x6E, 0x6E, 0x61, 0x6D, 0x65, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    ];
    #[test]
    fn test_indirect_section_name() {
        let pe = PE::parse(&*PE64_INDIRECT_SECTION_NAME).unwrap();
        assert_eq!(pe.sections[0].name_offset().unwrap(), Some(0x200));
        assert_eq!(pe.sections[0].name().unwrap(), "superlongsectionname");

        // try to insert bad utf-8 at offset 0x200
        let mut data = PE64_INDIRECT_SECTION_NAME.to_vec();
        data[0x200..0x200 + 4].copy_from_slice(&[0xFF, 0xFE, 0xFD, 0x00]);
        match PE::parse(&data) {
            Ok(_) => unreachable!(),
            Err(err) => match err {
                crate::error::Error::Scroll(ref why) => {
                    assert_eq!(why.to_string(), "bad input invalid utf8 (40)");
                }
                _ => unreachable!(),
            },
        }
    }
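
The fixture stores the section's short name as `/512` (bytes `0x2F, 0x35, 0x31, 0x32` followed by NUL padding), which is why `name_offset()` is expected to return `Some(0x200)`. As a rough sketch of that convention, a slash followed by an ASCII decimal offset into the string table, and independent of how the crate actually implements it (`string_table_index` is a hypothetical name):

```rust
/// Hypothetical helper: interpret an 8-byte section name of the form
/// "/<decimal>" as an offset into the string table.
fn string_table_index(short_name: &[u8; 8]) -> Option<usize> {
    // Only names starting with '/' are indirect.
    let (first, digits) = short_name.split_first()?;
    if *first != b'/' {
        return None;
    }
    // The name is NUL-padded; keep only the bytes before the first NUL.
    let end = digits.iter().position(|&b| b == 0).unwrap_or(digits.len());
    std::str::from_utf8(&digits[..end]).ok()?.parse().ok()
}

fn main() {
    // "/512" points at byte 512 (0x200) of the string table.
    assert_eq!(string_table_index(b"/512\0\0\0\0"), Some(0x200));
    // An ordinary inline name has no string-table offset.
    assert_eq!(string_table_index(b".text\0\0\0"), None);
}
```

In the fixture the COFF header lists zero symbols and a zero symbol-table pointer, so the string table offset works out to 0 and the name lives at file offset 0x200, which is the range the test overwrites with invalid bytes.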
