@cosmicexplorer cosmicexplorer commented Sep 30, 2025

Problem

Our current fuzzing instructions do not initiate an afl fuzzing analysis, nor can they be used for one-off testing over stdin.

We need the fuzz crates to be in a distinct workspace from the top-level zip crate, because the cargo afl dependency requires the Cargo 2024 edition functionality. However, we don't need each fuzz crate to be in its own workspace.

As we already know #235 will require creating new workspaces for the CLI tools, it would make sense to use this opportunity to do some prep work to support multiple workspaces in CI separately from the CLI workflow.

Solution

  • Add categories strings to our Cargo.toml for greater discoverability on crates.io.
  • Add a stdin/stdout implementation for fuzz_read and fuzz_write, so the fuzzing logic can be triggered for specific inputs, and not just from AFL itself.
  • Expand on how to trigger the fuzzing logic in the README.
  • Reformat the README to use #-style headers in order to enable subsection nesting.
  • Use cfg(feature = "arbitrary") to trigger Arbitrary implementations and to disable deprecation warnings, as that usually signals being employed in randomized testing.

This next part contributed to the very large number of changed files, and I'm highlighting it here because it was an intentional decision:

  • Unify fuzz_read and fuzz_write into a single workspace, so that dependencies like afl are declared in one place.
    • This required moving those crates into a subdirectory of the new workspace directory fuzz/, which is not a package in itself but just a Cargo.toml.
    • This required modifying CI to incorporate the fuzz workspace separate from the main library workspace.
    • For #235 (zip-cli binary crate), we are going to want to introduce two new workspaces for zip-cli and zip-clite, which will also require this sort of special handling.
      • This change takes the step of supporting separate workspaces in CI so that #235 can use this framework instead of having to solve that problem itself.
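
To illustrate the stdin entry point described for fuzz_read and fuzz_write above, here is a minimal sketch, not the code in this PR. The names `exercise_input` and `run_from_reader` are hypothetical, and the body of `exercise_input` is a stand-in for whatever the real fuzz target does with its bytes. The idea is that the logic lives in a function taking `&[u8]`, so the same function can be fed by AFL or by a one-off input piped over stdin.

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for the logic a fuzz target exercises.
/// It counts ZIP local file header signatures ("PK\x03\x04") in the input.
fn exercise_input(data: &[u8]) -> usize {
    data.windows(4)
        .filter(|w| w[..] == b"PK\x03\x04"[..])
        .count()
}

/// Drain any reader into memory and run the shared logic on the bytes.
/// Taking a generic reader keeps the stdin path testable without a terminal.
fn run_from_reader<R: Read>(mut input: R) -> io::Result<usize> {
    let mut data = Vec::new();
    input.read_to_end(&mut data)?;
    Ok(exercise_input(&data))
}

fn main() -> io::Result<()> {
    // One-off testing: pipe a specific input into the binary over stdin.
    let hits = run_from_reader(io::stdin().lock())?;
    println!("local file header signatures seen: {hits}");
    Ok(())
}
```

With this shape, a crashing input found by the fuzzer could be replayed by piping the file into the binary, assuming the real harness exposes a similar stdin path.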

Result

Our fuzzing instructions actually work to fuzz! And we are more prepared for adding the multiple CLI workspaces from #235.

Summary of Changes

Hello @cosmicexplorer, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refines the project's fuzzing infrastructure and documentation. It consolidates the main zip crate and its fuzzing sub-crates into a unified workspace, streamlining dependency management. A key improvement is the introduction of stdin/stdout support for the fuzzing targets, allowing developers to easily test specific inputs and debug fuzzing logic locally. The accompanying documentation has been thoroughly updated to reflect these changes, making the fuzzing process more accessible and effective.

Highlights

  • Workspace Consolidation: The zip crate and its associated fuzzing sub-crates (fuzz_read, fuzz_write) have been integrated into a single Cargo workspace, centralizing dependency management and simplifying the project structure.
  • Enhanced Fuzzing Documentation: The README.md has been significantly updated with clearer and more comprehensive instructions for setting up and running fuzzing, including new examples for one-off testing via stdin/stdout.
  • Stdin/Stdout Fuzzing Support: Both fuzz_read and fuzz_write now support receiving input via stdin, enabling easier local testing and debugging of fuzzing logic without requiring full AFL integration.
  • Feature-Gated arbitrary Derives: The arbitrary crate derives are now explicitly enabled via an arbitrary feature flag in the main zip crate, providing more granular control over its inclusion and usage.
  • Crates.io Discoverability: The main Cargo.toml now includes categories metadata, which will improve the discoverability of the crate on crates.io.

@amazon-q-developer amazon-q-developer bot left a comment

Summary

This PR successfully consolidates the fuzzing infrastructure into a unified workspace and significantly improves the fuzzing documentation. The changes are well-structured and follow most project guidelines.

Key Improvements

  • Workspace Consolidation: Properly unifies the main crate and fuzzing crates into a single workspace, eliminating duplicate dependency declarations
  • Feature Flag Implementation: Correctly introduces the arbitrary feature flag to gate fuzzing-related code, following project guidelines
  • Documentation Enhancement: Substantially improves fuzzing instructions with clear, actionable commands and stdio testing capabilities
  • Code Organization: Good refactoring of fuzzing code to support both AFL fuzzing and stdin input for debugging

Critical Issue

There's a compilation error in both fuzzing crates where use afl::fuzz; is conditionally compiled with #[cfg(fuzzing)] but the fuzz! macro is still referenced in the conditional blocks. This needs to be fixed before merging.
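
For reference, a minimal sketch of gating the import and every use of the macro behind the same cfg. This is not the code under review: `starts_like_zip` is a hypothetical stand-in for the fuzzed logic, and the sketch assumes `cargo afl` builds with `--cfg fuzzing`, so the `afl` crate is only needed in that configuration. A plain build compiles only the stdin path.

```rust
// Only the stdin path needs these imports.
#[cfg(not(fuzzing))]
use std::io::{self, Read};

// The import and every use of `fuzz!` share one cfg, so neither is
// compiled without the other.
#[cfg(fuzzing)]
use afl::fuzz;

/// Hypothetical shared check: does the input start with the "PK" magic bytes?
fn starts_like_zip(data: &[u8]) -> bool {
    data.starts_with(b"PK")
}

// Built only when the fuzzer sets `--cfg fuzzing` (assumed here).
#[cfg(fuzzing)]
fn main() {
    fuzz!(|data: &[u8]| {
        let _ = starts_like_zip(data);
    });
}

// Built in every other configuration: read one input from stdin.
#[cfg(not(fuzzing))]
fn main() -> io::Result<()> {
    let mut data = Vec::new();
    io::stdin().read_to_end(&mut data)?;
    println!("starts like a zip: {}", starts_like_zip(&data));
    Ok(())
}
```

Keeping the two `main` functions mutually exclusive means a mismatch between the import and its uses shows up as a compile error in exactly one configuration, which is easy to check in CI.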

Minor Note

The PR title should follow Conventional Commits format (e.g., chore: incorporate fuzzing crates into single workspace and improve docs) as required by project guidelines.

Overall, this is a valuable contribution that will make fuzzing more accessible and maintainable. Once the compilation issue is resolved, this should be ready to merge.

# Any change to rust-version must be reflected also in `README.md` and `.github/workflows/ci.yaml`.
# The MSRV policy is documented in `README.md`.
rust-version = "1.83.0"
categories = ["compression", "filesystem", "parser-implementations"]

Good addition of categories for better discoverability on crates.io. The selected categories "compression", "filesystem", and "parser-implementations" are appropriate for a ZIP library.

Cargo.toml Outdated
[workspace]
members = [".", "fuzz_read", "fuzz_write"]
default-members = ["."]
resolver = "2"

The workspace configuration looks well-structured. Using default-members = ["."] ensures that cargo build in the root only builds the main crate by default, which is the expected behavior for users.

clap = { version = "=4.4.18", features = ["derive"] }
tempfile = "3.15"

[features]

The new arbitrary feature flag follows the project guidelines for feature-gating new functionality. This properly gates the Arbitrary trait implementations behind a feature flag as required.

/// [`crate::write::FileOptions::compression_method`]
#[derive(Copy, Clone, PartialEq, Eq, Debug)]
- #[cfg_attr(fuzzing, derive(arbitrary::Arbitrary))]
+ #[cfg_attr(feature = "arbitrary", derive(arbitrary::Arbitrary))]

Good change to use the feature flag instead of the cfg(fuzzing) attribute. This properly gates the Arbitrary derive behind the arbitrary feature as intended by the workspace changes.
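
A small sketch of what this kind of feature gate looks like in isolation. The enum `Method` and the helper `method_from_bytes` are hypothetical and are not taken from the zip crate. The sketch assumes an optional `arbitrary` dependency with its derive support enabled, exposed through a Cargo feature named `arbitrary`. Without the feature, neither the derive nor the helper is compiled, so the crate builds with no reference to `arbitrary` at all.

```rust
/// Hypothetical enum standing in for a type that fuzzers want to generate.
#[derive(Copy, Clone, PartialEq, Eq, Debug)]
#[cfg_attr(feature = "arbitrary", derive(arbitrary::Arbitrary))]
pub enum Method {
    Stored,
    Deflated,
}

/// Only exists when the `arbitrary` feature is on: build a `Method`
/// from raw fuzzer bytes via the derived implementation.
#[cfg(feature = "arbitrary")]
pub fn method_from_bytes(data: &[u8]) -> arbitrary::Result<Method> {
    use arbitrary::{Arbitrary, Unstructured};
    Method::arbitrary(&mut Unstructured::new(data))
}

fn main() {
    // `cfg!` evaluates at compile time to whether the feature is enabled.
    println!(
        "arbitrary support compiled in: {}",
        cfg!(feature = "arbitrary")
    );
    println!("default-looking value: {:?}", Method::Stored);
}
```

Compared with `cfg(fuzzing)`, a Cargo feature can be switched on by any downstream crate, which is why its name and stability matter for the public API.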

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request effectively unifies the fuzzing crates into a single workspace, which simplifies dependency management and the overall build process. The improvements to the fuzzing documentation, including instructions for using stdin, are a valuable addition for developers. The code changes are generally well-structured and align with the PR's goals. I've identified a few areas for improvement, including a potential panic in a drop implementation, an opportunity to simplify a function signature for better readability, and a minor formatting issue in the README. Overall, this is a solid contribution that enhances the project's testing infrastructure.

@cosmicexplorer
Contributor Author

Want to note that the changes to ci.yaml which generate separate shards for the fuzz and top-level workspaces now seem to result in over 100 total CI jobs (the fix I just made shaved off a handful though). I am by no means a github actions expert, but I don't think that combinatorial explosion is frivolous, since the rust environment and even the cargo edition differs between the top-level zip crate and the fuzz workspace. But I don't want to start racking up someone else's CI bill.

And as an aside, I understand changes to CI will require more careful review, so I'm not at all pushing for this PR to be merged soon. This PR is mostly about documenting how to fuzz (after fixing the documentation we link to), and the workspace separation is I think within scope.

I don't believe this workspace separation reduces any unwanted dependencies or anything--the current code did that already. The fuzz subdirectory just makes it possible to specify shared dependencies for fuzz testing like afl, while maintaining the current separation from the top-level zip library.

- add publish = false to fuzz subcrates
- move fuzzing to a subdirectory in order to share a workspace
- break out separate shards for specific named features, but only for the top-level workspace
@Pr0methean Pr0methean changed the title from "incorporate fuzzing crates into a single workspace, and improve fuzzing docs" to "test(fuzz): incorporate fuzzing crates into a single workspace, and improve fuzzing docs" on Oct 9, 2025
@Pr0methean Pr0methean left a comment

Looks good; just one question.

tempfile = "3.15"

[features]
arbitrary = ["dep:arbitrary"]

@Pr0methean Pr0methean Oct 9, 2025

Naming this feature without an underscore prefix will make it part of the public API, meaning, for example, that we can't delete it without bumping the major version. Are you sure that's wise? It seems to me that even if someone needs the implementation for the purpose of fuzzing a downstream crate, they should either copy it or accept the risk of an incompatible change.

@Pr0methean Pr0methean left a comment

See my previous review comment.
