Solana Snapshot ETL 📸

solana-snapshot-etl efficiently extracts all accounts in a snapshot to load them into an external system.

Motivation

Solana nodes periodically back up their account database into a .tar.zst "snapshot" stream. If you run a node yourself, you have probably already seen a snapshot file such as this one:

snapshot-139240745-D17vR2iksG5RoLMfTX7i5NwSsr4VpbybuX1eqzesQfu2.tar.zst

A full snapshot file contains a copy of all accounts at a specific slot state (in this case slot 139240745).
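
As a rough illustration of that naming scheme, the slot and hash can be read back out of the file name with plain shell parameter expansion. This is only a sketch: `parse_snapshot_name`, `snapshot_slot`, and `snapshot_hash` are hypothetical names that are not part of solana-snapshot-etl, and the code assumes the `snapshot-<slot>-<hash>.tar.zst` pattern of the example above.

```shell
# Hypothetical helper (not part of solana-snapshot-etl): split a full-snapshot
# file name into its slot and hash components.
# Assumes the pattern snapshot-<slot>-<hash>.tar.zst from the example above.
parse_snapshot_name() {
  _name="${1##*/}"              # strip any leading directory components
  _base="${_name%.tar.zst}"     # drop the archive extension
  _base="${_base#snapshot-}"    # drop the fixed "snapshot-" prefix
  snapshot_slot="${_base%%-*}"  # everything before the first dash
  snapshot_hash="${_base#*-}"   # everything after the first dash
}

parse_snapshot_name "/path/to/snapshot-139240745-D17vR2iksG5RoLMfTX7i5NwSsr4VpbybuX1eqzesQfu2.tar.zst"
echo "slot=${snapshot_slot} hash=${snapshot_hash}"
```

Reading the slot this way can help when choosing between several archives in one directory, since a `snapshot-*.tar.zst` glob may match more than one file.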

Historical account data is relevant to blockchain analytics use cases and event tracing. Although snapshot archives are readily available, the ecosystem was missing an easy-to-use tool for accessing the data inside them.

Building

cargo install --git https://github.com/rpcpool/solana-snapshot-etl --bins

Usage

The ETL tool can extract snapshots from a variety of streaming sources and load them into one of the supported storage backends.

The basic command-line usage is as follows:

$ solana-snapshot-etl --help
Efficiently unpack Solana snapshots

Usage: solana-snapshot-etl --source <SOURCE> <COMMAND>

Commands:
  noop   Load accounts and do nothing
  kafka  Filter accounts with gRPC plugin filter and send them to Kafka
  help   Print this message or the help of the given subcommand(s)

Options:
      --source <SOURCE>  Snapshot source (unpacked snapshot, archive file, or HTTP link)
  -h, --help             Print help
  -V, --version          Print version

Sources

Extract from a local snapshot file:

solana-snapshot-etl --source /path/to/snapshot-*.tar.zst noop

Extract from an unpacked snapshot:

# Example unarchive command
# -C selects the extraction directory, which must already exist
mkdir -p ./unpacked_snapshot/
tar -I zstd -xvf snapshot-*.tar.zst -C ./unpacked_snapshot/

solana-snapshot-etl --source ./unpacked_snapshot/ noop

Stream snapshot from HTTP source or S3 bucket:

solana-snapshot-etl --source 'https://my-solana-node.bdnodes.net/snapshot.tar.zst?auth=xxx' noop

Targets

noop

Load the snapshot and parse its accounts, but do nothing with them.

kafka

solana-snapshot-etl --source /path/to/snapshot-*.tar.zst kafka --config kafka-config.json

Load the snapshot, parse its accounts, filter them with the Solana Geyser gRPC Plugin filter, and send the accounts that pass the filter to Kafka.
