hypercore-indexer

Blazing-fast Rust indexer for Hyperliquid's HyperCore EVM. Ingests block data directly from S3 at 450 blocks/sec, decodes transactions with dual phantom hash computation, and stores everything in SQLite, PostgreSQL, or ClickHouse.

Built for HIP4 prediction market indexing. HIP4 is live on testnet today; when it launches on mainnet, a single config change enables everything.

Pipeline

Installation

Docker (recommended)

Pre-built image, no Rust toolchain needed:

docker pull ghcr.io/exomonk/hypercore-indexer:latest
docker run --rm ghcr.io/exomonk/hypercore-indexer --help

Run with config + AWS credentials:

# Generate a config file
docker run --rm ghcr.io/exomonk/hypercore-indexer init > hypercore.toml

# Live index (mounts config + AWS credentials)
docker run --rm \
  -v $(pwd)/hypercore.toml:/app/hypercore.toml \
  -v $HOME/.aws:/root/.aws:ro \
  ghcr.io/exomonk/hypercore-indexer live

Install from source

Requires Rust 1.91+:

git clone https://github.com/ExoMonk/hypercore-indexer.git
cd hypercore-indexer
cargo install --path .
hypercore-indexer init

Or use the install script:

curl -sSL https://raw.githubusercontent.com/ExoMonk/hypercore-indexer/main/install.sh | bash

After installation, hypercore-indexer is available as a CLI:

hypercore-indexer --help
hypercore-indexer init                    # generate hypercore.toml
hypercore-indexer live                    # start indexing from chain tip
hypercore-indexer backfill --from 5000000 # backfill from block, then live

Examples

| Example | What | How |
|---|---|---|
| live-sqlite | Live indexing, zero infra | ./run.sh |
| backfill-postgres | Backfill into PostgreSQL | docker compose up |
| live-hip4-clickhouse | HIP4 prediction markets + ClickHouse | docker compose up |

Quick Start

# CLI: live index into SQLite (from chain tip, zero setup)
hypercore-indexer init
hypercore-indexer live

# CLI: backfill a range then auto-switch to live
hypercore-indexer backfill --from 5000000

# Docker Compose: HIP4 prediction markets with ClickHouse
cd examples/live-hip4-clickhouse
docker compose up

Terminal Examples

Commands

hypercore-indexer init                                     # Generate hypercore.toml
hypercore-indexer fetch-block <N>                          # Fetch + decode a single block (--raw, --system-txs)
hypercore-indexer decode-block <N>                         # Full decode with hash computation
hypercore-indexer backfill --from <N> --to <N>             # Parallel S3 ingestion + storage
hypercore-indexer live                                     # Follow chain tip, index new blocks continuously
hypercore-indexer fills-backfill --from-date YYYYMMDD --to-date YYYYMMDD   # Backfill trade fills from S3
hypercore-indexer hip4-poll                                # Run HIP4 API poller (markets + prices)

Global flags: --network testnet|mainnet, --config <path>, --region <aws-region>

Resume is automatic -- omit --from and the indexer picks up from the last cursor.

Config

Generated by hypercore-indexer init. Lives at hypercore.toml. CLI flags and env vars (DATABASE_URL, HL_NETWORK) override config values.

[network]
name = "mainnet"                    # "mainnet" (chain 999) or "testnet" (chain 998, HIP4)
region = "ap-northeast-1"           # AWS region for S3 bucket

[storage]
url = "sqlite:./hypercore.db"       # SQLite (zero setup), PostgreSQL, or ClickHouse
# url = "postgres://user:pass@host/db"              # production
# url = "http://localhost:8123"                      # analytics (ClickHouse)
# url = "postgres://${PGUSER}:${PGPASSWORD}@host/db" # env var refs for credentials
batch_size = 100                    # blocks buffered before flushing

[pipeline]
workers = 64                        # concurrent S3 fetches (increase to 256-512 for max speed)
channel_size = 1024                 # backpressure buffer
retry_attempts = 3                  # exponential backoff on S3 errors
retry_delay_ms = 1000

[live]
poll_interval_ms = 1000             # ~1s matches Hyperliquid block time
min_poll_interval_ms = 200          # adaptive backoff floor
poll_decay = 0.67                   # shrink factor when no new block
gap_threshold = 100                 # blocks behind → switch to parallel backfill
backfill_workers = 64

[hip4]                              # disabled by default
enabled = false
# contest_address = "0x4fd772e5708da2a7f097f51b3127e515a72744bd"  # testnet
# api_url = "https://api.hyperliquid-testnet.xyz/info"
# meta_poll_interval_s = 60         # market metadata refresh
# price_poll_interval_s = 5         # implied probability refresh

[fills]                             # disabled by default
enabled = false
# bucket = "hl-mainnet-node-data"
# mirror_hip4 = true                # auto-copy #-prefixed fills to hip4_trades
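The adaptive polling described by the `[live]` settings can be sketched as follows. This is an illustration of the configured behavior, not the indexer's actual code; the struct and method names are hypothetical, and the reset-on-new-block behavior is an assumption:

```rust
// Sketch of the adaptive poll loop implied by the [live] settings above.
// When no new block arrives, the interval shrinks by poll_decay down to
// min_poll_interval_ms; seeing a new block resets it to poll_interval_ms
// (the reset behavior is assumed, not documented).

struct PollState {
    poll_interval_ms: f64,     // base interval (~1s matches block time)
    min_poll_interval_ms: f64, // adaptive backoff floor
    poll_decay: f64,           // shrink factor when no new block
    current_ms: f64,           // interval to use for the next poll
}

impl PollState {
    fn new() -> Self {
        Self {
            poll_interval_ms: 1000.0,
            min_poll_interval_ms: 200.0,
            poll_decay: 0.67,
            current_ms: 1000.0,
        }
    }

    // Compute the next sleep duration in milliseconds.
    fn next_interval(&mut self, saw_new_block: bool) -> f64 {
        self.current_ms = if saw_new_block {
            self.poll_interval_ms
        } else {
            (self.current_ms * self.poll_decay).max(self.min_poll_interval_ms)
        };
        self.current_ms
    }
}

fn main() {
    let mut state = PollState::new();
    // No new block: 1000 ms shrinks to 670 ms, then toward the 200 ms floor.
    assert!((state.next_interval(false) - 670.0).abs() < 1e-6);
    // A new block resets the interval to the base 1000 ms.
    assert!((state.next_interval(true) - 1000.0).abs() < 1e-6);
}
```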

Recommended progression: Start with SQLite for prototyping → PostgreSQL for production → ClickHouse when you need sub-second aggregation over millions of rows. Switch by changing one URL.

See hypercore.example.toml for the full annotated reference.

Storage

Three backends, same Storage trait. Auto-detected from the URL:

| Backend | URL | Best for |
|---|---|---|
| SQLite | sqlite:./hypercore.db | Local dev, single-user, zero setup |
| PostgreSQL | postgres://user:pass@host/db | Production, concurrent queries |
| ClickHouse | http://localhost:8123 | Analytics, large-scale aggregation |

All backends use batch inserts, idempotent writes (ON CONFLICT DO NOTHING / ReplacingMergeTree), and atomic cursor updates.
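URL-based auto-detection can be pictured as a simple scheme match. A minimal sketch mirroring the table above, with hypothetical names; the indexer's actual detection logic may differ:

```rust
// Illustrative sketch of URL-scheme-based backend detection,
// mirroring the backend table above. Names are hypothetical.

#[derive(Debug, PartialEq)]
enum Backend {
    Sqlite,
    Postgres,
    ClickHouse,
}

fn detect_backend(url: &str) -> Option<Backend> {
    if url.starts_with("sqlite:") {
        Some(Backend::Sqlite)
    } else if url.starts_with("postgres://") {
        Some(Backend::Postgres)
    } else if url.starts_with("http://") || url.starts_with("https://") {
        // ClickHouse speaks HTTP (port 8123 by default)
        Some(Backend::ClickHouse)
    } else {
        None
    }
}

fn main() {
    assert_eq!(detect_backend("sqlite:./hypercore.db"), Some(Backend::Sqlite));
    assert_eq!(detect_backend("postgres://user:pass@host/db"), Some(Backend::Postgres));
    assert_eq!(detect_backend("http://localhost:8123"), Some(Backend::ClickHouse));
}
```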

HIP4 Prediction Markets

HIP4 is Hyperliquid's binary prediction market system. Markets trade on the HyperCore L1 order book (perps with #-prefixed coin names like #90), while deposits, claims, and refunds happen on the HyperEVM contest contract.

hypercore-indexer captures both layers:

| Data | Source | Table |
|---|---|---|
| Contest deposits | HyperEVM event logs (Deposit event) | hip4_deposits |
| Contest claims/refunds | HyperEVM event logs (Claimed event) | hip4_claims |
| Market metadata | HyperCore outcomeMeta API | hip4_markets |
| Price snapshots | HyperCore allMids API (# coins) | hip4_prices |
| Trade fills | S3 node_fills / mirrored from fills | hip4_trades |

Status: HIP4 is live on testnet. When it launches on mainnet, change [network] name = "mainnet" and update the contest_address -- everything else works automatically.

Enable HIP4:

[hip4]
enabled = true
contest_address = "0x4fd772e5708da2a7f097f51b3127e515a72744bd"
api_url = "https://api.hyperliquid-testnet.xyz/info"

Standalone poller: Run hip4-poll to poll market metadata and prices without running the full indexer pipeline. Useful for keeping hip4_markets and hip4_prices up to date independently.

Trade Fills

The fills-backfill command ingests all Hyperliquid trade history from S3 node_fills data. Every perp, spot, commodity, and HIP4 trade is stored in the fills table.

# Backfill all trades for a date range
hypercore-indexer fills-backfill --from-date 20260301 --to-date 20260315

When [fills] mirror_hip4 = true (the default), fills with #-prefixed coins (HIP4 prediction markets) are automatically mirrored to the hip4_trades table. When HIP4 goes mainnet, prediction market trade data appears in hip4_trades with zero additional configuration.
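The mirror predicate itself is just a prefix check on the coin name. A minimal sketch (the function name is illustrative, not the indexer's API):

```rust
// Sketch of the HIP4 mirror predicate: fills whose coin name starts
// with '#' (HIP4 prediction markets) are copied to hip4_trades.
// is_hip4_coin is an illustrative name, not the indexer's API.

fn is_hip4_coin(coin: &str) -> bool {
    coin.starts_with('#')
}

fn main() {
    // "BTC" is a perp, "@230" is a spot pair, "#90" is a HIP4 market
    let fills = ["BTC", "@230", "#90"];
    let mirrored: Vec<&str> = fills.iter().copied().filter(|c| is_hip4_coin(c)).collect();
    assert_eq!(mirrored, vec!["#90"]);
}
```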

Why

  • S3-native: Hyperliquid publishes block data to S3 as LZ4+MessagePack, not via RPC. Standard EVM indexers (rindexer, Ponder, etc.) can't ingest this format. hypercore-indexer reads it natively.
  • Dual phantom hashes: System transactions (HyperCore-to-HyperEVM bridge) have two valid transaction hashes depending on who computed them. We store both so lookups work against the official RPC and block explorers.
  • Parallel ingestion: Configurable worker pool (up to 512 concurrent S3 fetches) with backpressure, exponential retry, and cursor-based resume. Crash-safe with atomic batch writes.

Architecture

Pipeline

Three data sources flow into the same storage:

S3 EVM blocks → parallel fetch workers → LZ4 decompress → MessagePack decode → RLP hash → batch insert

S3 node_fills → JSONL parse → fills table → hip4_trades mirror (for #-prefixed coins)

HyperCore API → background poller → hip4_markets + hip4_prices

Key design choices:

  • Custom serde instead of reth types — the S3 MessagePack format has quirks (camelCase fields, positional signatures, string tx_type) that don't match reth's serde
  • Contiguous cursor — tracks the highest block where all preceding blocks are complete, not just the max seen. Safe to resume after crashes
  • Crash recovery — cursor and data commit atomically. Restart = resume. Idempotent inserts across all backends
  • rustls everywhere — no OpenSSL dependency
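The contiguous-cursor idea can be sketched as follows: out-of-order completions are buffered, and the cursor only advances while the next expected block is present. An illustration of the design choice, with hypothetical names; the real implementation differs:

```rust
use std::collections::BTreeSet;

// Sketch of a contiguous cursor: parallel workers complete blocks out of
// order, but the cursor tracks the highest block with no gaps before it,
// so a crash-restart never skips an unfinished block. Illustrative only.

struct Cursor {
    last_contiguous: u64,   // highest block where all predecessors are done
    pending: BTreeSet<u64>, // completed blocks ahead of the cursor
}

impl Cursor {
    fn new(start: u64) -> Self {
        Self { last_contiguous: start, pending: BTreeSet::new() }
    }

    // Record a completed block; advance the cursor across any filled gap.
    fn complete(&mut self, block: u64) -> u64 {
        self.pending.insert(block);
        while self.pending.remove(&(self.last_contiguous + 1)) {
            self.last_contiguous += 1;
        }
        self.last_contiguous
    }
}

fn main() {
    let mut cursor = Cursor::new(100);
    assert_eq!(cursor.complete(103), 100); // 101, 102 still missing
    assert_eq!(cursor.complete(101), 101); // 102 still missing
    assert_eq!(cursor.complete(102), 103); // gap filled, cursor jumps to 103
}
```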

Performance

Remote benchmarks (MacBook to ap-northeast-1)

| Pipeline | Workers | Throughput | 10k blocks |
|---|---|---|---|
| S3 fetch + decode | 128 | 308 blocks/sec | 32s |
| S3 fetch + decode | 256 | 388 blocks/sec | 25s |
| S3 + decode + PostgreSQL | 256 | 388 blocks/sec | 25s |
| S3 + decode (no storage) | 512 | 450 blocks/sec | 22s |

Storage adds near-zero overhead -- UNNEST batch inserts (5 SQL statements per 100 blocks) complete faster than a single S3 round-trip.

Worker scaling

| Workers | Blocks/sec | Notes |
|---|---|---|
| 64 | 130 | Conservative default |
| 128 | 308 | Good for remote |
| 256 | 388 | Recommended remote |
| 512 | 450 | Sweet spot (remote) |
| 1024 | 444 | Plateau -- TCP overhead |
| 2048 | 433 | Diminishing returns |

Where the time goes

S3 GET round-trip:   ~1.1s per request  <- 99% of wall time
LZ4 decompress:       <1ms/block
MessagePack decode:    <1ms/block
RLP + keccak256 hash:  <1ms/block
PostgreSQL insert:     <1ms/block

The indexer is S3-latency-limited, not compute-limited. Each block is a separate 2-4KB S3 object -- there's no multi-object GET or batch download API. Throughput = workers_in_flight / avg_round_trip.
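Plugging the measured numbers into that model gives a quick sanity check (a back-of-envelope sketch, assuming the ~1.1 s round-trip above):

```rust
// Back-of-envelope check of throughput = workers_in_flight / avg_round_trip,
// using the ~1.1 s S3 round-trip measured above.

fn throughput(workers: u32, round_trip_secs: f64) -> f64 {
    workers as f64 / round_trip_secs
}

fn main() {
    // 512 workers / 1.1 s ≈ 465 blocks/sec, close to the observed 450;
    // the shortfall is connection setup and scheduling overhead.
    println!("{:.0} blocks/sec", throughput(512, 1.1));
}
```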

Deployment recommendation

Run the initial backfill from EC2 in ap-northeast-1 (same region as the S3 bucket) for maximum throughput. Once caught up, incremental indexing from anywhere keeps pace easily — Hyperliquid produces ~1 block/sec.

Networks

| | Mainnet | Testnet |
|---|---|---|
| Chain ID | 999 | 998 |
| S3 Bucket | hl-mainnet-evm-blocks | hl-testnet-evm-blocks |
| RPC | rpc.hyperliquid.xyz/evm | rpc.hyperliquid-testnet.xyz/evm |
| HIP4 | Not yet | Live |
| S3 start block | 0 | 18,000,000 |

Cost

The S3 bucket is requester-pays -- you pay for GET requests and data transfer. Pricing is based on ap-northeast-1 rates.

Full backfill (~30M blocks, ~27GB)

| Cost Type | Remote | EC2 same-region |
|---|---|---|
| S3 GET requests (30M) | $11.10 | $11.10 |
| Data transfer (27GB) | $3.08 | $0 (free same-region) |
| Total | ~$14 | ~$11 |

Ongoing (after backfill)

Hyperliquid produces ~1 block/sec = ~86,400 blocks/day:

| Cost Type | Monthly |
|---|---|
| S3 GET requests (~2.6M/month) | ~$0.96 |
| Data transfer (~7.5GB/month) | ~$0.86 |
| Total | ~$2/month |

Full backfill costs about the price of a coffee. Ongoing indexing is negligible.
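The estimates above can be reproduced from the per-unit rates. The rates here are assumptions backed out of the tables ($0.00037 per 1,000 GET requests and ~$0.114/GB internet egress, consistent with ap-northeast-1 S3 pricing at time of writing); check current AWS pricing before relying on them:

```rust
// Reproduces the cost estimates, assuming ap-northeast-1 S3 rates of
// $0.00037 per 1,000 GET requests and ~$0.114/GB internet egress.
// These rates are an assumption here and may change over time.

fn get_cost(requests: f64) -> f64 {
    requests / 1000.0 * 0.00037
}

fn egress_cost(gb: f64) -> f64 {
    gb * 0.114
}

fn main() {
    // Full backfill: 30M GETs + 27GB transfer
    println!("GET:      ${:.2}", get_cost(30_000_000.0));    // ≈ $11.10
    println!("Transfer: ${:.2}", egress_cost(27.0));         // ≈ $3.08
    // Ongoing: ~86,400 blocks/day ≈ 2.6M GETs/month
    println!("Monthly:  ${:.2}", get_cost(86_400.0 * 30.0)); // ≈ $0.96
}
```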

Requirements

  • Rust 1.91+
  • AWS credentials configured (S3 bucket is requester-pays, region ap-northeast-1)
  • Docker (optional, for PostgreSQL dev environment via make dev)

Data Model

Core (EVM blocks from S3)

blocks
├── block_number (PK)
├── block_hash, parent_hash
├── timestamp, gas_used, gas_limit, base_fee_per_gas
└── tx_count, system_tx_count

transactions
├── block_number, tx_index (PK)
├── tx_hash                          ← computed via RLP + keccak256
├── tx_type (Legacy/Eip2930/Eip1559)
├── from, to, value, input
├── gas_limit, gas_used, success
└── FK → blocks

event_logs
├── block_number, tx_index, log_index (PK)
├── address, topic0..topic3, data
└── FK → transactions

system_transfers
├── block_number, tx_index (PK)
├── official_hash, explorer_hash     ← dual phantom hashes
├── system_address, asset_type, asset_index
├── recipient, amount_wei
└── FK → blocks

Trade Fills (from S3 node_fills_by_block)

fills
├── trade_id, user_address (PK)
├── block_number, block_time
├── coin                              ← "BTC", "ETH", "@230", "#90"
├── price, size, side, direction
├── closed_pnl, fee, fee_token
└── fill_time

hip4_trades                           ← mirror of fills where coin starts with #
└── same schema as fills

HIP4 Prediction Markets

hip4_deposits                         ← decoded from EVM Deposit events
├── block_number, tx_index, log_index (PK)
├── contest_id, side_id
├── depositor, amount_wei
└── FK → event_logs

hip4_claims                           ← decoded from EVM Claimed events
├── same structure as hip4_deposits
└── claimer, amount_wei

hip4_contest_creations                ← decoded from createContest() calls
├── block_number, tx_index (PK)
└── contest_id, param2

hip4_refunds                          ← decoded from refund() calls
├── block_number, tx_index (PK)
└── contest_id, side_id, user_address

hip4_sweeps                           ← decoded from sweepUnclaimed() calls
├── block_number, tx_index (PK)
└── contest_id

hip4_markets                          ← polled from outcomeMeta API
├── outcome_id (PK)
├── name, description, side_specs (JSON)
└── question_id, question_name

hip4_prices                           ← polled from allMids API
├── coin, timestamp (PK)              ← "#90", "#91", "#11760"
└── mid_price                         ← 0.0 to 1.0 (implied probability)

indexer_cursor
├── network (PK)                      ← "mainnet" or "testnet"
└── last_block, updated_at

Data Flow

S3 EVM blocks ──→ blocks → transactions → event_logs → system_transfers
                                       └→ hip4_deposits, hip4_claims (decoded)
                                       └→ hip4_contest_creations, hip4_refunds, hip4_sweeps

S3 node_fills ──→ fills ──→ hip4_trades (# coins mirrored)

HyperCore API ──→ hip4_markets (metadata)
               └→ hip4_prices (implied probability snapshots)
