@Pragnasri-363

This PR introduces a minimal, developer-run fuzzing harness for BHV.

Why Fuzz?

Fuzzing feeds the program malformed input images to check whether anything crashes and whether the program has any vulnerabilities.

What does this do?

The harness intentionally truncates valid image seeds at the byte level and passes the resulting files through existing image verification logic.

In practice, the flow is:

  1. Take a valid image
  2. Truncate the image by removing a random number of bytes from the end
  3. Feed the program the truncated image
  4. Observe and handle parsing failures

In simple terms: verify --> truncate --> test
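
The verify --> truncate --> test flow can be sketched roughly as follows. This is a minimal illustration, not the harness in this PR: `make_seed_png`, `verify_png`, `truncate`, and `fuzz` are hypothetical names, and the stand-in verifier only walks PNG chunk lengths and CRCs using the standard library, whereas the real harness reuses BHV's existing image verification logic.

```python
import random
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_png_chunk(kind: bytes, payload: bytes) -> bytes:
    """Build one PNG chunk: length, type, payload, CRC over type + payload."""
    crc = zlib.crc32(kind + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + kind + payload + struct.pack(">I", crc)


def make_seed_png() -> bytes:
    """Hypothetical seed: a structurally valid 1x1 grayscale PNG."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # one filter byte plus one pixel
    return (PNG_SIGNATURE + make_png_chunk(b"IHDR", ihdr)
            + make_png_chunk(b"IDAT", idat) + make_png_chunk(b"IEND", b""))


def verify_png(data: bytes) -> bool:
    """Stand-in verifier (assumption): True if well formed, False otherwise."""
    if not data.startswith(PNG_SIGNATURE):
        return False
    pos = len(PNG_SIGNATURE)
    while pos < len(data):
        if pos + 8 > len(data):
            return False  # chunk header cut short
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        kind = data[pos + 4:pos + 8]
        end = pos + 8 + length + 4
        if end > len(data):
            return False  # payload or CRC cut short
        payload = data[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", data[end - 4:end])
        if crc != zlib.crc32(kind + payload) & 0xFFFFFFFF:
            return False
        if kind == b"IEND":
            return end == len(data)
        pos = end
    return False  # ran out of data before reaching IEND


def truncate(data: bytes, rng: random.Random) -> bytes:
    """Drop between 1 and len(data) - 1 bytes from the end."""
    cut = rng.randint(1, len(data) - 1)
    return data[:-cut]


def fuzz(seed: bytes, iterations: int, rng: random.Random) -> dict:
    """Verify the seed, then truncate and test it repeatedly."""
    if not verify_png(seed):
        raise ValueError("seed image is not valid")
    results = {"rejected": 0, "accepted": 0, "crashed": 0}
    for _ in range(iterations):
        sample = truncate(seed, rng)
        try:
            ok = verify_png(sample)
        except Exception:
            results["crashed"] += 1  # an unhandled parser error is a finding
        else:
            results["accepted" if ok else "rejected"] += 1
    return results
```

A run such as `fuzz(make_seed_png(), 200, random.Random(0))` should count every truncated sample as rejected. Any nonzero `crashed` or `accepted` count would point at an unsafe assumption in the verifier.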

How does this help?

This helps us find unsafe assumptions in image parsing and ensures malformed uploads are handled gracefully, without introducing production dependencies or coverage-guided fuzzing infrastructure.

@gemini-code-assist
Contributor

Summary of Changes

Hello @Pragnasri-363, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new, minimal fuzzing harness designed for developers to manually test the robustness of BHV's image handling logic. By systematically truncating valid image files and feeding these malformed inputs into the system, the harness aims to uncover potential vulnerabilities, crashes, or unexpected behaviors in image parsing and validation. This initiative strengthens the application's security posture by ensuring graceful error handling for corrupted or adversarial image uploads, without relying on complex, production-level fuzzing infrastructure.

Highlights

  • New Fuzzing Harness: Introduces a minimal, developer-run fuzzing harness specifically for BHV's image processing.
  • Truncation-Based Fuzzing: The harness employs a truncation method to create malformed image inputs from valid seeds, testing how the system handles incomplete data.
  • Security Enhancement: Aims to identify crashes, hangs, or vulnerabilities in image parsing and validation logic when processing malformed inputs.
  • Graceful Error Handling: Ensures that the application handles corrupted or adversarial image uploads gracefully, preventing unexpected failures.
  • Local and Minimal Scope: Designed as a local, developer-run script, intentionally avoiding complex production-grade fuzzing infrastructure or coverage guarantees.

Contributor

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a minimal but effective truncation-based fuzzing harness for image uploads. This is a great initiative to improve the robustness and security of image parsing logic. The implementation is straightforward and well-documented in the new security/README.md.

My review includes a critical fix for the instructions in the README to ensure they are runnable, and a refactoring suggestion for the truncate_image function in the Python script to improve its clarity, documentation, and adherence to Python conventions. Overall, this is a valuable addition.

print(f"[!] Bad seed image {image_path}: {e}")
return None

def truncate_image(ip_path,op_path):

medium

The truncate_image function can be improved for clarity and adherence to Python conventions:

  • Inconsistent Return Types: The function returns False on one path and bytes on another. Since the caller doesn't use the return value, it's more idiomatic for functions that primarily cause side effects (like writing a file) to have no return value.
  • Code Formatting: There are extra blank lines that can be removed for conciseness.
  • Documentation: The explanatory comment on line 64 is very helpful but contains a typo ('alway'). It would be better as a docstring inside the truncate_image function. This makes the documentation directly part of the function and accessible via tools like help().

Here is a suggested refactoring that addresses these points and would replace lines 48-64:

def truncate_image(ip_path, op_path):
    """Truncates an image by removing a random number of bytes from the end.

    This helps find cases where the program assumes data will always be a
    certain length. It can also be used to shrink massive inputs that crash
    the program.
    """
    with open(ip_path, "rb") as f:
        data = f.read()

    if len(data) < 2:
        return

    cut = random.randint(1, len(data) - 1)
    truncated = data[:-cut]

    with open(op_path, "wb") as f:
        f.write(truncated)
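
One way to sanity-check the suggestion is to run it against a throwaway file and confirm the properties the docstring implies. The sketch below repeats `truncate_image` so that it runs standalone; `check_truncation` and `writes_nothing_for_tiny_input` are hypothetical helpers for illustration and are not part of this PR.

```python
import os
import random
import tempfile


def truncate_image(ip_path, op_path):
    """Copy of the suggested refactoring, repeated so this block runs alone."""
    with open(ip_path, "rb") as f:
        data = f.read()

    if len(data) < 2:
        return

    cut = random.randint(1, len(data) - 1)
    truncated = data[:-cut]

    with open(op_path, "wb") as f:
        f.write(truncated)


def check_truncation(payload, rounds=100):
    """Return True if every output is a non-empty, strictly shorter prefix."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "seed.bin")
        dst = os.path.join(tmp, "fuzzed.bin")
        with open(src, "wb") as f:
            f.write(payload)
        for _ in range(rounds):
            truncate_image(src, dst)
            with open(dst, "rb") as f:
                out = f.read()
            if not (0 < len(out) < len(payload) and payload.startswith(out)):
                return False
    return True


def writes_nothing_for_tiny_input(payload):
    """Return True if a seed shorter than two bytes produces no output file."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "seed.bin")
        dst = os.path.join(tmp, "fuzzed.bin")
        with open(src, "wb") as f:
            f.write(payload)
        truncate_image(src, dst)
        return not os.path.exists(dst)
```

Because `cut` ranges from 1 to `len(data) - 1`, the output always keeps at least one byte and always loses at least one. A caller that loops over seeds may want to note that inputs shorter than two bytes are skipped silently.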

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@mdxabu added the "on hold" label (Not merging this PR now.) on Jan 8, 2026