Error in identifying the start of a UTF-16 string when there is a printable character before it. #89

@itai-delphos

Description

When a UTF-16 string is preceded by a printable character, the detected string is missing its first character, and the reported offset is shifted by two bytes (3 instead of the expected 1).

import rust_strings

enc = ["utf-16le"]
bstr = "test".encode("utf-16le") + b"\x00\x00"
print(rust_strings.strings(bytes=b"\xD0" + bstr, min_length=2, encodings=enc))
print(rust_strings.strings(bytes=b"A" + bstr, min_length=2, encodings=enc))

Output:

[('test', 1)]
[('est', 3)]
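
For comparison, here is a minimal pure-Python sketch of the behavior I would expect. It is not how rust_strings is implemented; `utf16le_strings` is a hypothetical helper written only for this report, and it handles printable ASCII characters only. The point is that when a candidate start is rejected, the scan advances by a single byte, so a string starting at an odd offset is still found in full.

```python
def utf16le_strings(data: bytes, min_length: int = 2):
    """Return (string, offset) pairs for printable-ASCII UTF-16LE runs.

    Hypothetical reference helper, not part of rust_strings.
    """
    results = []
    i = 0
    while i + 1 < len(data):
        j = i
        chars = []
        # Consume (printable byte, 0x00) pairs starting at offset i.
        while j + 1 < len(data) and 0x20 <= data[j] <= 0x7E and data[j + 1] == 0:
            chars.append(chr(data[j]))
            j += 2
        if len(chars) >= min_length:
            results.append(("".join(chars), i))
            i = j  # resume right after the extracted run
        else:
            i += 1  # step one byte only, so no character of a later string is skipped
    return results


bstr = "test".encode("utf-16le") + b"\x00\x00"
print(utf16le_strings(b"\xD0" + bstr))  # [('test', 1)]
print(utf16le_strings(b"A" + bstr))     # [('test', 1)]
```

With this reference, both inputs give the same result, which is what I would expect from the library as well.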
