
feat: complete reorg handling #401

Open
zannis wants to merge 31 commits into joshstevens19:master from zannis:feat/reorg-handling

@zannis zannis commented Apr 10, 2026

Reactive Reorg Handling for rindexer

rindexer is now reactive to chain reorganizations. Instead of relying solely on safe distance offsets to avoid reorged data, the indexer actively monitors the block hash chain and catches reorgs the moment they happen. When one is detected, indexing pauses, stale state is corrected, and indexing resumes — all within seconds.

The existing reorg_safe_distance configuration remains fully supported. Users who prefer the conservative approach of only indexing finalized blocks can continue using it as before — the two mechanisms are complementary. Safe distance prevents indexing unconfirmed blocks; reorg handling corrects state if a reorg slips through. They can be used independently or together.

This means three operating modes for downstream consumers:

For latency-sensitive streams (webhooks, Kafka, SNS, etc.) — events are delivered instantly as before, but now with a safety net. If a reorg invalidates previously-delivered events, a reorg notification is published through the same channels so consumers can reconcile. This is the instant delivery mode (default).

For correctness-critical streams — a new finalized delivery mode buffers events until they're past the chain's finality window before publishing. Events that survive the reorg safe distance are guaranteed canonical. No reconciliation needed.

For maximum safety — combine reorg_safe_distance on the contract with reorg_handling on the network. The safe distance delays indexing by N blocks, and reorg handling catches anything that slips past that window. Belt and suspenders.

networks:
  - name: ethereum
    rpc: https://...
    reorg_handling:
      enabled: true       # active detection + correction
      window_size: 256

contracts:
  - name: MyContract
    reorg_safe_distance: true  # also wait N blocks before indexing (still works as before)
    # ...

streams:
  - type: webhook
    endpoint: https://example.com/realtime
    delivery: instant      # fast, with reorg notifications

  - type: kafka
    brokers: [...]
    delivery: finalized    # delayed, but guaranteed correct
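
The finalized delivery mode described above can be pictured with a small sketch. This is not the PR's FinalizedBuffer (its API is not shown here); `FinalizedBufferSketch`, `Event`, and all method names are hypothetical, and the release rule (an event is published once its block is at least `safe_distance` behind the head) is an assumption about how "past the finality window" might be interpreted.

```rust
use std::collections::BTreeMap;

/// Hypothetical event payload; a real stream message would carry more fields.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Event {
    pub block_number: u64,
    pub tx_hash: String,
}

/// Holds events until their block is `safe_distance` blocks behind the head.
pub struct FinalizedBufferSketch {
    safe_distance: u64,
    pending: BTreeMap<u64, Vec<Event>>,
}

impl FinalizedBufferSketch {
    pub fn new(safe_distance: u64) -> Self {
        Self { safe_distance, pending: BTreeMap::new() }
    }

    /// Buffer an event instead of publishing it immediately.
    pub fn push(&mut self, event: Event) {
        self.pending.entry(event.block_number).or_default().push(event);
    }

    /// On a reorg, drop buffered events above the fork point. They were
    /// never published, so consumers have nothing to reconcile.
    pub fn discard_after(&mut self, fork_point: u64) {
        self.pending.retain(|block, _| *block <= fork_point);
    }

    /// Remove and return, in block order, every event whose block number
    /// is at most `head - safe_distance`.
    pub fn drain_finalized(&mut self, head: u64) -> Vec<Event> {
        let cutoff = match head.checked_sub(self.safe_distance) {
            Some(c) => c,
            None => return Vec::new(), // chain is shorter than the window
        };
        let mut ready = Vec::new();
        for (block, events) in std::mem::take(&mut self.pending) {
            if block <= cutoff {
                ready.extend(events);
            } else {
                self.pending.insert(block, events);
            }
        }
        ready
    }
}
```

With a safe distance of 3, an event from block 10 stays buffered while the head is at 12 and is released once the head reaches 13. A reorg that forks below a buffered block simply discards that block's events.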

How it works

A per-network ReorgCoordinator validates the parent hash of every new block against a persisted sliding window of recent block hashes. On mismatch:

  1. Fork point is identified via a single batch RPC call against the window
  2. Stale events are deleted and affected tx hashes collected (single atomic PG transaction)
  3. Block hash window and indexing checkpoints are corrected
  4. on_reorg callback fires with invalidated tx hashes for user-side reconciliation
  5. Reorg notification published through instant-mode streams
  6. Indexing filter rewinds to the fork point — the next loop iteration re-fetches corrected events naturally

The entire recovery is blocking — no callbacks or dependent events see stale data.
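
To make the detection and fork-point steps concrete, here is a minimal in-memory sketch. It is not the PR's BlockChainWindow: `HashWindow`, `BlockRef`, `Check`, and the method names are hypothetical, persistence is omitted, and the `canonical` slice stands in for whatever the batch RPC call returns.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a block header; only the fields needed here.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct BlockRef {
    pub number: u64,
    pub hash: [u8; 32],
    pub parent_hash: [u8; 32],
}

/// Outcome of checking a new block against the window.
#[derive(Debug, PartialEq, Eq)]
pub enum Check {
    /// Parent hash matches the stored hash (or the parent is not tracked).
    Extends,
    /// Parent hash differs from the hash stored for the parent's number.
    ReorgSuspected,
}

/// Bounded window of recent (number, hash) pairs, oldest first.
pub struct HashWindow {
    capacity: usize,
    blocks: VecDeque<(u64, [u8; 32])>,
}

impl HashWindow {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, blocks: VecDeque::with_capacity(capacity) }
    }

    fn hash_at(&self, number: u64) -> Option<[u8; 32]> {
        self.blocks.iter().find(|(n, _)| *n == number).map(|(_, h)| *h)
    }

    /// Compare the new block's parent hash with the hash stored for its parent.
    pub fn check(&self, block: &BlockRef) -> Check {
        let parent_number = match block.number.checked_sub(1) {
            Some(n) => n,
            None => return Check::Extends, // genesis has no parent to validate
        };
        match self.hash_at(parent_number) {
            Some(stored) if stored != block.parent_hash => Check::ReorgSuspected,
            _ => Check::Extends,
        }
    }

    /// Record a validated block, evicting the oldest entry when full.
    pub fn push(&mut self, block: &BlockRef) {
        if self.blocks.len() == self.capacity {
            self.blocks.pop_front();
        }
        self.blocks.push_back((block.number, block.hash));
    }

    /// Highest stored block whose hash still matches the canonical chain.
    pub fn fork_point(&self, canonical: &[(u64, [u8; 32])]) -> Option<u64> {
        self.blocks
            .iter()
            .rev()
            .find(|(n, h)| canonical.iter().any(|(cn, ch)| cn == n && ch == h))
            .map(|(n, _)| *n)
    }

    /// Drop every stored entry above the fork point.
    pub fn truncate_after(&mut self, fork_point: u64) {
        while matches!(self.blocks.back(), Some((n, _)) if *n > fork_point) {
            self.blocks.pop_back();
        }
    }
}
```

In this sketch a mismatch only signals that a reorg is suspected. The fork point comes from comparing the whole window against canonical hashes, after which the window is truncated and indexing can resume from the block after the fork point.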

Three detection paths converge on the same recovery flow:

  • Parent hash validation (RPC polling) — catches reorgs on every new block
  • Removed logs (log.removed == true from RPC) — catches reorgs surfaced during log fetching
  • Reth ExEx notifications — instant detection when running an embedded reth node
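
One way to picture the convergence of these paths is a single signal type that every detector produces. The sketch below is an illustration only: `DetectionSource`, `ReorgSignal`, `LogSketch`, and the label strings are hypothetical names, not taken from the PR. The label strings merely suggest what the `source` label on rindexer_reorg_detection_source_total might carry.

```rust
/// Hypothetical tag for the path that noticed the reorg.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DetectionSource {
    ParentHashMismatch,
    RemovedLog,
    RethExEx,
}

impl DetectionSource {
    /// Assumed metric label value; the real label strings may differ.
    pub fn label(self) -> &'static str {
        match self {
            DetectionSource::ParentHashMismatch => "parent_hash",
            DetectionSource::RemovedLog => "removed_log",
            DetectionSource::RethExEx => "reth_exex",
        }
    }
}

/// Minimal stand-in for an RPC log; only the fields used here.
pub struct LogSketch {
    pub block_number: u64,
    pub removed: bool,
}

/// Common input to the shared recovery flow.
#[derive(Debug, PartialEq, Eq)]
pub struct ReorgSignal {
    pub source: DetectionSource,
    /// Lowest block known to be affected.
    pub from_block: u64,
}

/// Turn a batch of fetched logs into a signal if any log was removed.
pub fn signal_from_logs(logs: &[LogSketch]) -> Option<ReorgSignal> {
    logs.iter()
        .filter(|log| log.removed)
        .map(|log| log.block_number)
        .min()
        .map(|from_block| ReorgSignal {
            source: DetectionSource::RemovedLog,
            from_block,
        })
}
```

Each detector would build a `ReorgSignal` in its own way and hand it to the same recovery routine, so the rollback logic does not need to know which path fired.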

On restart, startup validation compares the persisted window against the canonical chain and handles any reorgs that occurred while the indexer was offline.
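
A rough sketch of that startup comparison follows. The function name, the `StartupCheck` enum, and the `canonical_hash` closure (standing in for an RPC lookup by block number) are all hypothetical. The `NoCommonAncestor` case is an assumption about what could happen when an offline reorg runs deeper than the persisted window; the PR text does not say how that is handled.

```rust
/// Result of comparing the persisted window with the canonical chain.
#[derive(Debug, PartialEq, Eq)]
pub enum StartupCheck {
    /// The newest persisted block is still canonical.
    Consistent,
    /// Blocks above `fork_point` were reorged while offline.
    Diverged { fork_point: u64 },
    /// No persisted block matches; the reorg is deeper than the window.
    NoCommonAncestor,
}

/// `persisted` is ordered oldest first. `canonical_hash` returns the hash
/// the chain currently reports for a block number, if the block exists.
pub fn validate_on_startup<F>(persisted: &[(u64, [u8; 32])], canonical_hash: F) -> StartupCheck
where
    F: Fn(u64) -> Option<[u8; 32]>,
{
    // Walk from the newest entry down to find the first canonical block.
    for (i, (number, hash)) in persisted.iter().enumerate().rev() {
        if canonical_hash(*number) == Some(*hash) {
            return if i + 1 == persisted.len() {
                StartupCheck::Consistent
            } else {
                StartupCheck::Diverged { fork_point: *number }
            };
        }
    }
    if persisted.is_empty() {
        StartupCheck::Consistent // nothing persisted, nothing to validate
    } else {
        StartupCheck::NoCommonAncestor
    }
}
```

A `Diverged` result would feed the same recovery flow as a live detection, using the returned fork point.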

What changed architecturally

The monolithic reorg.rs (1,790 lines) is replaced by a modular reorg/ module:

  • coordinator.rs — Per-network reorg detection and orchestration
  • task.rs — Rollback execution (delete, correct, rewind, notify)
  • window.rs — In-memory BlockChainWindow backed by a persisted latest_blocks table
  • persistence.rs — DB read/write for the block hash window

A ReorgContext struct bundles the DB clients, callback registry, and stream clients, and is passed through the detection → recovery chain.

Crash recovery

The latest_blocks update runs inside the same PG transaction that deletes stale events during rollback. If the process crashes mid-recovery, that transaction never commits, so the database keeps its pre-recovery state and startup validation re-detects the reorg on the next start.
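
The all-or-nothing behavior can be modeled without a database. The sketch below is a toy in-memory model, not the PR's persistence code: `DbState`, `rollback_tx`, and the `crash_before_commit` flag are hypothetical, and a cloned struct stands in for an uncommitted transaction.

```rust
/// Toy model of the two pieces of state the rollback must keep in sync.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct DbState {
    /// (block number, tx hash) of indexed events.
    pub events: Vec<(u64, String)>,
    /// Block numbers present in the persisted hash window.
    pub latest_blocks: Vec<u64>,
}

/// Delete events and window entries above `fork_point` as one unit.
/// Returns the invalidated tx hashes on success.
pub fn rollback_tx(
    state: &mut DbState,
    fork_point: u64,
    crash_before_commit: bool,
) -> Result<Vec<String>, &'static str> {
    // All changes are staged on a copy, mimicking an open transaction.
    let mut staged = state.clone();
    let invalidated: Vec<String> = staged
        .events
        .iter()
        .filter(|(block, _)| *block > fork_point)
        .map(|(_, tx)| tx.clone())
        .collect();
    staged.events.retain(|(block, _)| *block <= fork_point);
    staged.latest_blocks.retain(|block| *block <= fork_point);

    if crash_before_commit {
        // Nothing was written: `state` is exactly as it was before.
        return Err("crashed before commit");
    }
    *state = staged; // commit: both updates become visible together
    Ok(invalidated)
}
```

Because the window and the events change together or not at all, a crash leaves the stale window in place, which is what allows the next startup to notice the mismatch against the canonical chain.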

New metrics

  • rindexer_reorg_handling_duration_seconds{network}
  • rindexer_reorg_events_deleted_total{network}
  • rindexer_reorg_events_reindexed_total{network}
  • rindexer_reorg_detection_source_total{network, source}
  • rindexer_reorg_cascade_total{network}

Deferred

  • FinalizedBuffer wiring — Infrastructure ready, pipeline integration in follow-up
  • on_reorg callback wiring from no-code setup — Types and firing logic implemented, needs registry threading
  • CSV invalidation file — Writer implemented, needs project path threading

zannis added 25 commits April 10, 2026 00:04

vercel bot commented Apr 10, 2026

@zannis is attempting to deploy a commit to the joshaavecom's projects Team on Vercel.

A member of the Team first needs to authorize it.
