perf: parallelize ConcatSource hashing #213

SyMind · 2026-02-09T07:13:43Z

No description provided.

gemini-code-assist · 2026-02-09T07:13:58Z

Summary of Changes

Hello @SyMind, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on optimizing the ConcatSource hashing process by introducing parallelism. By leveraging the rayon crate, the computation of individual child hashes is now distributed across multiple threads, which is expected to yield performance improvements, especially for sources composed of many children. The change also includes a new test to confirm that the parallel hashing remains deterministic and consistent.

Highlights

Parallel Hashing: The ConcatSource hashing mechanism has been refactored to utilize parallel processing for its children, significantly improving performance for complex source structures.
Dependency Update: The rayon crate, a library for data parallelism, has been added as a direct dependency to enable parallel execution.
Test Coverage: A new test case has been introduced to specifically verify the deterministic nature of the ConcatSource hashing when multiple children are involved, ensuring correctness after parallelization.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog

Cargo.toml
- Added rayon = "1.11.0" as a direct dependency.
- Removed rayon from the criterion development dependencies section, promoting it to a top-level dependency.
src/concat_source.rs
- Imported rayon::prelude::* and rustc_hash::FxHasher to support parallel hashing.
- Modified the impl Hash for ConcatSource to iterate over children in parallel using par_iter().
- Each child's hash is now computed independently using FxHasher and then combined sequentially into a final hash.
- Added a new test test_hash_is_deterministic_for_many_children to ensure the parallel hashing produces consistent results.

Activity

No specific activity (comments, reviews, etc.) has been recorded for this pull request yet.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request parallelizes the hashing of ConcatSource children to improve performance, using the rayon crate. The implementation computes hashes of children in parallel and then combines them. While this is a good performance optimization for sources with many children, I've pointed out a potential performance regression for sources with few children due to the overhead of parallelization. I've suggested introducing a threshold to switch between sequential and parallel hashing to address this.

gemini-code-assist · 2026-02-09T07:15:41Z

src/concat_source.rs

+    let child_hashes: Vec<u64> = children
+      .par_iter()
+      .map(|child| {
+        let mut hasher = FxHasher::default();
+        child.hash(&mut hasher);
+        hasher.finish()
+      })
+      .collect();


Using par_iter unconditionally can lead to performance degradation for ConcatSource with a small number of children due to the overhead of parallelization. This could be a performance regression in those cases.

Consider adding a threshold to switch between sequential and parallel hashing. For example, you could use iter() for a small number of children and par_iter() for a larger number.

To maintain hash consistency, the sequential path must use the same hashing logic as the parallel one (i.e., creating an FxHasher for each child and then combining the hashes). A possible implementation could look like this:

const PAR_HASH_THRESHOLD: usize = 16; // Should be benchmarked let child_hashes: Vec<u64> = if children.len() < PAR_HASH_THRESHOLD { children .iter() .map(|child| { let mut hasher = FxHasher::default(); child.hash(&mut hasher); hasher.finish() }) .collect() } else { children .par_iter() .map(|child| { let mut hasher = FxHasher::default(); child.hash(&mut hasher); hasher.finish() }) .collect() };

codspeed-hq · 2026-02-09T07:15:54Z

Merging this PR will not alter performance

✅ 11 untouched benchmarks
⏩ 6 skipped benchmarks¹

_{Comparing compute-hash-by-rayon (27c3f66) with main (74beb79)}

6 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩

Copilot

Pull request overview

This PR parallelizes ConcatSource’s Hash implementation by hashing each optimized child source concurrently and then combining those hashes, requiring rayon as a normal dependency.

Changes:

Add rayon and use par_iter() to compute per-child hashes in parallel for ConcatSource.
Combine per-child hashes into a single value that is fed into the caller-provided Hasher.
Add a new unit test intended to assert hash determinism for many children.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
`src/concat_source.rs`	Parallelizes `ConcatSource` hashing via Rayon and adds a determinism-oriented unit test.
`Cargo.toml`	Moves/adds `rayon` under `[dependencies]` to support use in library code.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-09T07:17:50Z

src/concat_source.rs

+    let mut hasher1 = FxHasher::default();
+    source.hash(&mut hasher1);
+    let hash1 = hasher1.finish();
+
+    let mut hasher2 = FxHasher::default();
+    source.hash(&mut hasher2);
+    let hash2 = hasher2.finish();
+


FxHasher::finish() is a method from the std::hash::Hasher trait, but Hasher isn’t imported in this test module. As written, the calls to hasher1.finish() / hasher2.finish() won’t compile unless std::hash::Hasher is brought into scope (or std::hash::Hasher::finish(&hasher) is used).

Copilot · 2026-02-09T07:17:51Z

src/concat_source.rs

+    let mut combined = FxHasher::default();
+    for child_hash in child_hashes {
+      child_hash.hash(&mut combined);
    }
+    combined.finish().hash(state);


After computing child_hashes, the extra combined FxHasher pass adds another hashing round and a finalization before writing into state. You can hash the u64 values directly into state in order (after the parallel collection) to reduce work and simplify the implementation.

Suggested change

let mut combined = FxHasher::default();

for child_hash in child_hashes {

child_hash.hash(&mut combined);

}

combined.finish().hash(state);

for child_hash in child_hashes {

child_hash.hash(state);

}

perf: using rayon to compute hash

27c3f66

Copilot AI review requested due to automatic review settings February 9, 2026 07:13

Copilot started reviewing on behalf of SyMind February 9, 2026 07:14 View session

gemini-code-assist bot reviewed Feb 9, 2026

View reviewed changes

Copilot AI reviewed Feb 9, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: parallelize ConcatSource hashing #213

perf: parallelize ConcatSource hashing #213

Uh oh!

SyMind commented Feb 9, 2026

Uh oh!

gemini-code-assist bot commented Feb 9, 2026

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot Feb 9, 2026

Uh oh!

codspeed-hq bot commented Feb 9, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Feb 9, 2026

Uh oh!

Copilot AI Feb 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

perf: parallelize ConcatSource hashing #213

Are you sure you want to change the base?

perf: parallelize ConcatSource hashing #213

Uh oh!

Conversation

SyMind commented Feb 9, 2026

Uh oh!

gemini-code-assist bot commented Feb 9, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Feb 9, 2026

Choose a reason for hiding this comment

Uh oh!

codspeed-hq bot commented Feb 9, 2026

Merging this PR will not alter performance

Footnotes

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant