The solana-geyser-plugin-bigtable
crate implements a plugin storing
account, block, transaction data to a Bigtable database using the
Geyser Plugin Framework.
The plugin is configured using the input configuration file. An example configuration file looks like the following:
{
"libpath": "/solana/target/release/libsolana_geyser_plugin_bigtable.so",
"credential_path": "/home/solana/geyser-big-table-creds.json",
"instance": "geyser-bigtable",
"threads": 80,
"batch_size": 20,
"panic_on_db_errors": true,
"accounts_selector" : {
"accounts" : ["*"]
},
}
credential_path
specifies the path of the Bigtable credential JSON file.
Please refer to https://cloud.google.com/docs/authentication/getting-started
on creating a service account key.
It must be in the format specified by https://github.com/durch/rust-goauth. For example:
{
"type": "service_account",
"project_id": "dummy",
"private_key_id": "dummy",
"private_key": "-----BEGIN PRIVATE KEY-----\nMIIEvwIBADANBgkqhkiG9w0BAQEFAASCBKkwggSlAgEAAoIBAQDNk6cKkWP/4NMu\nWb3s24YHfM639IXzPtTev06PUVVQnyHmT1bZgQ/XB6BvIRaReqAqnQd61PAGtX3e\n8XocTw+u/ZfiPJOf+jrXMkRBpiBh9mbyEIqBy8BC20OmsUc+O/YYh/qRccvRfPI7\n3XMabQ8eFWhI6z/t35oRpvEVFJnSIgyV4JR/L/cjtoKnxaFwjBzEnxPiwtdy4olU\nKO/1maklXexvlO7onC7CNmPAjuEZKzdMLzFszikCDnoKJC8k6+2GZh0/JDMAcAF4\nwxlKNQ89MpHVRXZ566uKZg0MqZqkq5RXPn6u7yvNHwZ0oahHT+8ixPPrAEjuPEKM\nUPzVRz71AgMBAAECggEAfdbVWLW5Befkvam3hea2+5xdmeN3n3elrJhkiXxbAhf3\nE1kbq9bCEHmdrokNnI34vz0SWBFCwIiWfUNJ4UxQKGkZcSZto270V8hwWdNMXUsM\npz6S2nMTxJkdp0s7dhAUS93o9uE2x4x5Z0XecJ2ztFGcXY6Lupu2XvnW93V9109h\nkY3uICLdbovJq7wS/fO/AL97QStfEVRWW2agIXGvoQG5jOwfPh86GZZRYP9b8VNw\ntkAUJe4qpzNbWs9AItXOzL+50/wsFkD/iWMGWFuU8DY5ZwsL434N+uzFlaD13wtZ\n63D+tNAxCSRBfZGQbd7WxJVFfZe/2vgjykKWsdyNAQKBgQDnEBgSI836HGSRk0Ub\nDwiEtdfh2TosV+z6xtyU7j/NwjugTOJEGj1VO/TMlZCEfpkYPLZt3ek2LdNL66n8\nDyxwzTT5Q3D/D0n5yE3mmxy13Qyya6qBYvqqyeWNwyotGM7hNNOix1v9lEMtH5Rd\nUT0gkThvJhtrV663bcAWCALmtQKBgQDjw2rYlMUp2TUIa2/E7904WOnSEG85d+nc\norhzthX8EWmPgw1Bbfo6NzH4HhebTw03j3NjZdW2a8TG/uEmZFWhK4eDvkx+rxAa\n6EwamS6cmQ4+vdep2Ac4QCSaTZj02YjHb06Be3gptvpFaFrotH2jnpXxggdiv8ul\n6x+ooCffQQKBgQCR3ykzGoOI6K/c75prELyR+7MEk/0TzZaAY1cSdq61GXBHLQKT\nd/VMgAN1vN51pu7DzGBnT/dRCvEgNvEjffjSZdqRmrAVdfN/y6LSeQ5RCfJgGXSV\nJoWVmMxhCNrxiX3h01Xgp/c9SYJ3VD54AzeR/dwg32/j/oEAsDraLciXGQKBgQDF\nMNc8k/DvfmJv27R06Ma6liA6AoiJVMxgfXD8nVUDW3/tBCVh1HmkFU1p54PArvxe\nchAQqoYQ3dUMBHeh6ZRJaYp2ATfxJlfnM99P1/eHFOxEXdBt996oUMBf53bZ5cyJ\n/lAVwnQSiZy8otCyUDHGivJ+mXkTgcIq8BoEwERFAQKBgQDmImBaFqoMSVihqHIf\nDa4WZqwM7ODqOx0JnBKrKO8UOc51J5e1vpwP/qRpNhUipoILvIWJzu4efZY7GN5C\nImF9sN3PP6Sy044fkVPyw4SYEisxbvp9tfw8Xmpj/pbmugkB2ut6lz5frmEBoJSN\n3osZlZTgx+pM3sO6ITV6U4ID2Q==\n-----END PRIVATE KEY-----\n",
"client_email": "dummy@developer.gserviceaccount.com",
"client_id": "dummy",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://accounts.google.com/o/oauth2/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/457015483506-compute%40developer.gserviceaccount.com"
}
The instance
specifies the Bigtable instance name.
To improve the throughput to the database, the plugin supports connection pooling
using multiple threads, each maintaining a connection to the PostgreSQL database.
The count of the threads is controlled by the threads
field. A higher thread
count usually offers better performance.
To further improve performance when saving large numbers of accounts at
startup, the plugin uses bulk inserts. The batch size is controlled by the
batch_size
parameter. This can help reduce the round trips to the database.
The panic_on_db_errors
can be used to panic the validator in case of database
errors to ensure data consistency.
The accounts_selector
can be used to filter the accounts that should be persisted.
For example, one can use the following to persist only the accounts with particular Base58-encoded Pubkeys,
"accounts_selector" : {
"accounts" : ["pubkey-1", "pubkey-2", ..., "pubkey-n"],
}
Or use the following to select accounts with certain program owners:
"accounts_selector" : {
"owners" : ["pubkey-owner-1", "pubkey-owner-2", ..., "pubkey-owner-m"],
}
To select all accounts, use the wildcard character (*):
"accounts_selector" : {
"accounts" : ["*"],
}
The Cloud BigTable emulator can be used during development/test. See https://cloud.google.com/bigtable/docs/emulator for general setup information.
Process:
- Make sure install GCP CLI, see https://cloud.google.com/sdk/docs/install-sdk.
- Install the Bigtable CLI CBT https://cloud.google.com/bigtable/docs/cbt-overview.
- Run
gcloud beta emulators bigtable start
in the background - Run
$(gcloud beta emulators bigtable env-init)
to establish theBIGTABLE_EMULATOR_HOST
environment variable - Run
./scripts/init-bigtable.sh
to configure the emulator - Develop/test
Export a standard GOOGLE_APPLICATION_CREDENTIALS
environment variable to your
service account credentials. The project should contain a BigTable instance
configured in the config file must have been initialized by running the ./init-bigtable.sh
script.
Depending on what operation mode is required, either the
https://www.googleapis.com/auth/bigtable.data
or
https://www.googleapis.com/auth/bigtable.data.readonly
OAuth scope will be
requested using the provided credentials.
Account and slot metadata are supported with plan to support transaction data, block metadata and account secondary indexes.
The storage-proto contains the gRPC models for the objects. For example for accounts:
message account {
bytes pubkey = 1;
bytes owner = 2;
uint64 lamports = 3;
uint64 slot = 4;
bool executable = 5;
uint64 rent_epoch = 6;
bytes data = 7;
uint64 write_version = 8;
UnixTimestamp updated_on = 9;
}
The following are the tables in the Postgres database
Table | Description |
---|---|
account | Account data |
slot | Slot metadata |
The model data is encoded into binary format and then compressed using compress_best
src/compression.rs.