Conversation


@alpe alpe commented Nov 20, 2025

Overview

Speed up cache writes and loads via parallel execution.

Pulled from #2836


github-actions bot commented Nov 20, 2025

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

Build      Format      Lint       Breaking   Updated (UTC)
✅ passed   ⏩ skipped   ✅ passed   ✅ passed   Nov 21, 2025, 9:37 AM


claude bot commented Nov 20, 2025

Claude finished @alpe's task.


Code Review Complete ✓

I've reviewed the parallel cache serialization/deserialization changes. This is a solid performance optimization with good concurrency patterns. Here are my findings:


Summary

This PR successfully parallelizes cache I/O operations using errgroup.Group, which should provide meaningful performance improvements for disk operations. The implementation is generally well-designed with three key changes:

  1. Parallel SaveToDisk: Three goroutines now save different cache components concurrently
  2. Parallel LoadFromDisk: Three goroutines load cache components concurrently
  3. Enhanced saveMapGob: Added buffered I/O and improved error handling

Issues & Recommendations

🔴 Critical: Variable Shadowing in saveMapGob (block/internal/cache/generic_cache.go:155)

Problem: The inner err variable shadows the named return parameter, which is confusing and error-prone:

func saveMapGob[K comparable, V any](filePath string, data map[K]V) (err error) {
    // ...
    if err := gob.NewEncoder(writer).Encode(data); err != nil {  // shadows named return
        return fmt.Errorf("failed to encode to file %s: %w", filePath, err)
    }
    return nil
}

Impact: While this works correctly (the shadowed variable is returned immediately), it creates maintenance risk and reduces code clarity.

Recommendation: Use a different variable name:

if encodeErr := gob.NewEncoder(writer).Encode(data); encodeErr != nil {
    return fmt.Errorf("failed to encode to file %s: %w", filePath, encodeErr)
}
return nil

🟡 Medium: Inconsistent Error Message Formatting (block/internal/cache/generic_cache.go:254, 265, 276)

Problem: Extra spaces before colons in LoadFromDisk error messages:

return fmt.Errorf("failed to load %s : %w", itemsByHeightFilename, err)  // space before :

Impact: Inconsistent with SaveToDisk error messages which use save %s: %w (no space).

Recommendation: Remove extra spaces for consistency:

return fmt.Errorf("failed to load %s: %w", itemsByHeightFilename, err)

🟡 Medium: Non-Atomic maxDAHeight Update During Load

Observation: The LoadFromDisk function updates maxDAHeight during the daIncluded load (lines 281-284), but uses a simple Store instead of the compare-and-swap logic used in setDAIncluded.

Current Code:

for k, v := range daIncludedMap {
    c.daIncluded.Store(k, v)
    // Update max DA height during load
    current := c.maxDAHeight.Load()
    if v > current {
        c.maxDAHeight.Store(v)  // Not atomic with the check above
    }
}

Issue: This creates a race condition if the cache is being accessed during load. The check and store are not atomic. While loading is likely done during initialization before concurrent access, this pattern is inconsistent with the robust compare-and-swap logic in setDAIncluded.

Recommendation: Either:

  1. Use compare-and-swap for consistency and safety:
for k, v := range daIncludedMap {
    c.daIncluded.Store(k, v)
    // Atomic update of max DA height
    for {
        current := c.maxDAHeight.Load()
        if v <= current || c.maxDAHeight.CompareAndSwap(current, v) {
            break
        }
    }
}
  2. Or document that LoadFromDisk must be called before concurrent access

🟢 Minor: Error Message Context Change in SaveToDisk

Observation: The new error wrapping in SaveToDisk changes error messages. Before:

return saveMapGob(...) // returns "failed to create file..." or "failed to encode..."

After:

return fmt.Errorf("save %s: %w", itemsByHeightFilename, err)

Impact: The wrapped errors are more specific, not less. A file creation error now reads "save items_by_height.gob: failed to create file..." instead of just "failed to create file...", which identifies the operation that failed.

Verdict: This is an improvement, not an issue.


Positive Observations

Excellent concurrency pattern: errgroup.Group is the right choice for parallel I/O with error aggregation

Proper buffered I/O: The addition of bufio.Writer in saveMapGob is a performance win

Robust error handling: The errors.Join pattern in the deferred cleanup is excellent Go practice

Safe concurrent access: Using sync.Map.Range to snapshot data before serialization prevents holding locks during I/O

Good test coverage: Tests cover save/load roundtrips, error paths, and large datasets

No goroutine leaks: errgroup.Group.Wait() ensures all goroutines complete

Maintains backward compatibility: File formats unchanged


Performance Analysis

Expected speedup: With three independent I/O operations:

  • Best case: ~3x faster (if CPU-bound or SSD with high parallelism)
  • Realistic: 1.5-2x faster (typical HDD or moderate SSD)
  • Worst case: Similar speed (single slow HDD with heavy contention)

The buffered writes should provide an additional 10-20% improvement.


Security & Reliability

✅ No security issues identified
✅ No data race issues (beyond the minor maxDAHeight load issue noted)
✅ Proper file permissions (0o755 for directories)
✅ Error propagation is correct
✅ No resource leaks


Test Coverage

The codecov report shows 83.58% patch coverage with 11 lines missing. Based on the test file review:

Covered paths:

  • Normal save/load roundtrip ✓
  • Corrupt gob file handling ✓
  • Invalid directory path ✓
  • Large dataset handling ✓
  • maxDAHeight persistence ✓

Likely uncovered paths (11 lines):

  • Some error return paths in the parallel goroutines
  • File sync/flush errors in the defer
  • Type assertion failures in Range callbacks

Recommendation: Consider adding tests for:

  1. Concurrent writes during save/load (though this may be unsupported by design)
  2. Partial failure scenarios (one goroutine fails in errgroup)

Final Verdict

This is a well-implemented performance optimization that follows Go best practices. The parallelization strategy is sound and the code quality is high.

Required changes before merge:

  1. Fix variable shadowing in saveMapGob (line 155)
  2. Fix inconsistent error message spacing (lines 254, 265, 276)
  3. Address the maxDAHeight race condition during load (lines 281-284) or document the initialization-only constraint

Nice to have:

  • Additional test coverage for error paths
  • Performance benchmarks to quantify the speedup

@gemini-code-assist

Summary of Changes

Hello @alpe, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces significant performance improvements to the cache's persistence mechanisms by implementing parallel execution for both saving and loading cache data to and from disk. By concurrently handling different parts of the cache, the aim is to reduce the overall time taken for these operations, leading to a more responsive system.

Highlights

  • Parallel Cache Serialization: The SaveToDisk function has been refactored to save different components of the cache (items by height, hashes, and included data) concurrently using errgroup.Group, significantly speeding up the cache writing process.
  • Parallel Cache Deserialization: Similarly, the LoadFromDisk function now loads the various cache components in parallel, leveraging errgroup.Group to accelerate the cache loading operation.
  • Improved File I/O for saveMapGob: The saveMapGob utility function has been enhanced to use bufio.NewWriter for buffered writes, which can improve performance for disk I/O, and errors.Join for more robust deferred error handling during file operations.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces parallel execution for cache serialization and deserialization using errgroup, which should improve performance for these I/O-bound operations. The implementation is solid, including robust error handling for file operations in saveMapGob. I've made a few suggestions to improve code clarity in saveMapGob and to ensure consistency in error message formatting within LoadFromDisk.

Comment on lines +133 to 136
if err := gob.NewEncoder(writer).Encode(data); err != nil {
return fmt.Errorf("failed to encode to file %s: %w", filePath, err)
}
return nil

medium

The use of if err := ... shadows the named return parameter err. While this works correctly because a return <value> statement sets the named return parameter before defer is executed, it is confusing and can be brittle. It's better to avoid shadowing the named return parameter for improved code clarity and maintainability.

    if encodeErr := gob.NewEncoder(writer).Encode(data); encodeErr != nil {
        err = fmt.Errorf("failed to encode to file %s: %w", filePath, encodeErr)
    }
    return

wg.Go(func() error {
    itemsByHeightMap, err := loadMapGob[uint64, *T](filepath.Join(folderPath, itemsByHeightFilename))
    if err != nil {
        return fmt.Errorf("failed to load %s : %w", itemsByHeightFilename, err)

medium

There's an extra space before the colon in the error message. For consistency with other error messages in this file (e.g., in SaveToDisk), it should be removed.

            return fmt.Errorf("failed to load %s: %w", itemsByHeightFilename, err)

wg.Go(func() error {
    hashesMap, err := loadMapGob[string, bool](filepath.Join(folderPath, hashesFilename))
    if err != nil {
        return fmt.Errorf("failed to load %s : %w", hashesFilename, err)

medium

There's an extra space before the colon in the error message. For consistency with other error messages in this file (e.g., in SaveToDisk), it should be removed.

            return fmt.Errorf("failed to load %s: %w", hashesFilename, err)

wg.Go(func() error {
    daIncludedMap, err := loadMapGob[string, uint64](filepath.Join(folderPath, daIncludedFilename))
    if err != nil {
        return fmt.Errorf("failed to load %s : %w", daIncludedFilename, err)

medium

There's an extra space before the colon in the error message. For consistency with other error messages in this file (e.g., in SaveToDisk), it should be removed.

            return fmt.Errorf("failed to load %s: %w", daIncludedFilename, err)


codecov bot commented Nov 20, 2025

Codecov Report

❌ Patch coverage is 83.58209% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.94%. Comparing base (d7eda60) to head (901c488).
⚠️ Report is 4 commits behind head on main.

Files with missing lines                Patch %   Lines
block/internal/cache/generic_cache.go   83.58%    5 Missing and 6 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2868      +/-   ##
==========================================
- Coverage   64.97%   64.94%   -0.04%     
==========================================
  Files          81       81              
  Lines        7250     7268      +18     
==========================================
+ Hits         4711     4720       +9     
- Misses       1998     2006       +8     
- Partials      541      542       +1     
Flag       Coverage Δ
combined   64.94% <83.58%> (-0.04%) ⬇️


@alpe alpe changed the title from "Parallel cache de/serialization" to "chore: parallel cache de/serialization" Nov 20, 2025
tac0turtle previously approved these changes Nov 20, 2025
julienrbrt previously approved these changes Nov 20, 2025
* main:
  refactor: use state da height as well (#2872)
  refactor: retrieve highest da height in cache (#2870)
  chore: change from event count to start and end height (#2871)
@alpe alpe dismissed stale reviews from julienrbrt and tac0turtle via 837a887 November 21, 2025 08:56
@alpe alpe enabled auto-merge November 21, 2025 08:57
* main:
  refactor(syncer,cache): use compare and swap loop and add comments (#2873)
@alpe alpe disabled auto-merge November 21, 2025 09:33
@julienrbrt julienrbrt added this pull request to the merge queue Nov 21, 2025
Merged via the queue into main with commit 1644d37 Nov 21, 2025
30 checks passed
@julienrbrt julienrbrt deleted the alex/cache_io branch November 21, 2025 10:49
alpe added a commit that referenced this pull request Nov 24, 2025
* main:
  chore: remove extra github action yml file (#2882)
  fix(execution/evm): verify payload status (#2863)
  feat: fetch included da height from store (#2880)
  chore: better output on errors (#2879)
  refactor!: create da client and split cache interface (#2878)
  chore!: rename `evm-single` and `grpc-single` (#2839)
  build(deps): Bump golang.org/x/crypto from 0.42.0 to 0.45.0 in /tools/da-debug in the go_modules group across 1 directory (#2876)
  chore: parallel cache de/serialization (#2868)
  chore: bump blob size (#2877)