jam-etl transforms raw JAM telemetry `.log` files into JSONL files compatible with both Dune Analytics and Google BigQuery ingestion.
This is currently a proof of concept, based on JAM DUNA's tinytestnet data.
```
.
├── logs/               # Raw .log files (each covering ~1 minute of telemetry across multiple msg types)
├── dune_exports/       # Output JSONL files, one per table -- these go into Dune
├── schema/             # BigQuery table schemas -- used to set up BQ tables and ingest into them
├── jam-to-bigquery.js
├── jam-to-dune.js
└── README.md
```
| Table | msg_type | Description | Elapsed | Samples |
|---|---|---|---|---|
| blocks | 128 | Block headers and extrinsics metadata | author time | Sample |
| validatorstatistics | 13 | Per-validator block stats | - | Sample |
| corestatistics | 13 | Per-core statistics | - | Sample |
| servicestatistics | 13 | Per-service processing metrics | - | Sample |
| workreports | 255 | Work report generation (both refining and auditing roles) | refining+audit time | Sample |
| preimages | 142 | Preimages submitted | - | Sample |
| assurances | 141 | Validator assurances | assurance time | Sample |
| tickets | 131 | Validator-generated tickets | generation time | Sample |
| workpackages | 133 | Work package bundles | - | Sample |
| services | 143 | Registered service metadata | - | Sample |
| segments | 135 | Segment payloads from the erasure-coded DAG | - | Sample |
All tables contain these common fields (an example row follows):
- `time`
- `team`
- `sender_id`
- `elapsed`
- `codec_encoded` (raw binary for audit)
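For illustration only (all field values below are invented), a single exported row might look like this, with `elapsed` in microseconds and `codec_encoded` shown hex-encoded, plus whatever table-specific fields the handler extracts:

```json
{"time": "2025-01-01T12:00:00Z", "team": "exampleteam", "sender_id": "0xabc123", "elapsed": 420000, "codec_encoded": "0x00ff01"}
```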
- Node.js v16+
- Input `.log` files in `./logs/`, each line-delimited JSON
`jam-to-dune.js` reads logs and exports telemetry to `dune_exports/*.jsonl`.

This will (see the sketch below):
- Read all `.log` files in `./logs`, in sorted order.
- Parse each line as a JAM telemetry blob.
- Dispatch to the appropriate handler based on `msg_type`.
- Append one row per record to the appropriate table in `dune_exports/`.
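A minimal sketch of that loop (not the actual implementation; the handler map and pass-through row builders below are illustrative):

```js
// Sketch of jam-to-dune.js's dispatch loop. Handler bodies are placeholders;
// the real script extracts table-specific fields per msg_type.
const fs = require('fs');
const path = require('path');

// msg_type -> output table (pass-through row builders for illustration)
const handlers = {
  128: { table: 'blocks', toRow: (blob) => blob },
  141: { table: 'assurances', toRow: (blob) => blob },
  // ...one entry per supported msg_type
};

fs.mkdirSync('./dune_exports', { recursive: true });
for (const file of fs.readdirSync('./logs').sort()) {
  if (!file.endsWith('.log')) continue;
  for (const line of fs.readFileSync(path.join('./logs', file), 'utf8').split('\n')) {
    if (!line.trim()) continue;
    const blob = JSON.parse(line); // each line is one JAM telemetry blob
    const h = handlers[blob.msg_type];
    if (!h) continue; // skip msg_types without a handler
    fs.appendFileSync(path.join('./dune_exports', `${h.table}.jsonl`),
                      JSON.stringify(h.toRow(blob)) + '\n');
  }
}
```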
Run with:

```bash
node jam-to-dune.js

# View exported files
ls dune_exports/*.jsonl
```
The generated `.jsonl` files can be ingested into Dune via their ingestion pipelines, or uploaded manually via the Dune UI or API.
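As a rough illustration of the API route (the endpoint shape is our assumption; verify against Dune's current Tables API docs, and replace `my_namespace` and the API key placeholder), an NDJSON insert might look like:

```bash
# Assumed shape of a Dune Tables API insert -- check Dune's current docs
curl -X POST "https://api.dune.com/api/v1/table/my_namespace/blocks/insert" \
  -H "X-DUNE-API-KEY: $DUNE_API_KEY" \
  -H "Content-Type: application/x-ndjson" \
  --data-binary @dune_exports/blocks.jsonl
```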
Each message is parsed based on its `msg_type` and streamed into one of 11 BigQuery tables within the `crypto_jam` dataset:

| Message Type | Table Name | Description |
|---|---|---|
| 128 | `blocks` | Full block header + extrinsics |
| 13 | `validatorstatistics`, `corestatistics`, `servicestatistics` | Breakdown of statistics |
| 255 | `workreports` | Results of refinement/guaranteeing |
| 142 | `preimages` | Preimages |
| 141 | `assurances` | Validator assurances |
| 131 | `tickets` | Tickets submitted |
| 0 | `workpackages` | Raw work packages |
| 143 | `services` | New services created from bootstrap services |
| 3 | `segments` | D3L segments exported |
To stream data into BigQuery:

```bash
node jam-to-bigquery.js [filename]
```

This will (see the sketch below):
- Read `.log` files from `./logs/`
- Parse each line into structured rows
- Stream the rows into BigQuery
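A minimal sketch of the streaming step, assuming the official `@google-cloud/bigquery` client with default application credentials (the row contents are invented for illustration):

```js
// Sketch: stream parsed rows into a crypto_jam table via the streaming
// insert API. Assumes GOOGLE_APPLICATION_CREDENTIALS points at a service key.
const { BigQuery } = require('@google-cloud/bigquery');
const bigquery = new BigQuery();

async function streamRows(table, rows) {
  // rows is an array of plain objects matching the table's schema
  await bigquery.dataset('crypto_jam').table(table).insert(rows);
}

// Example (field values invented):
// streamRows('assurances', [{ time: '2025-01-01T12:00:00Z', team: 'exampleteam', elapsed: 420000 }]);
```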
To create the tables, use the following CLI commands (requires a configured `bq` CLI):
```bash
bq mk --table crypto_jam.blocks blocks.json
bq mk --table crypto_jam.validatorstatistics validatorstatistics.json
bq mk --table crypto_jam.corestatistics corestatistics.json
bq mk --table crypto_jam.servicestatistics servicestatistics.json
bq mk --table crypto_jam.workreports workreports.json
bq mk --table crypto_jam.preimages preimages.json
bq mk --table crypto_jam.assurances assurances.json
bq mk --table crypto_jam.tickets tickets.json
bq mk --table crypto_jam.workpackages workpackages.json
bq mk --table crypto_jam.services services.json
bq mk --table crypto_jam.segments segments.json
```
Each JSON schema file corresponds to a table with the same name.
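For reference, a schema file covering just the common fields could look like the sketch below (the actual files in `schema/` also define the table-specific columns; the BigQuery types shown are our assumption):

```json
[
  { "name": "time",          "type": "TIMESTAMP", "mode": "REQUIRED" },
  { "name": "team",          "type": "STRING",    "mode": "REQUIRED" },
  { "name": "sender_id",     "type": "STRING",    "mode": "REQUIRED" },
  { "name": "elapsed",       "type": "INTEGER",   "mode": "NULLABLE" },
  { "name": "codec_encoded", "type": "BYTES",     "mode": "NULLABLE" }
]
```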
Our telemetry server implementation is essentially just syslog-ng:
```
@version: 3.27
@include "scl.conf"

# systemctl restart syslog-ng.service

# Global options.
options { chain_hostnames(off); flush_lines(0); use_dns(yes); use_fqdn(no);
          dns_cache(yes); owner("root"); group("adm"); perm(0640);
          stats_freq(0); keep_hostname(yes); log_fifo_size(100000);
          log_msg_size(25000000); # Support 12MB work packages and preimages.
        };

source s_network {
    tcp(ip(0.0.0.0) port(5000) keep_hostname(yes) use-dns(yes));
};

destination d_logs {
    file("/var/log/jam/${YEAR}/${MONTH}/${DAY}/${HOUR}${MIN}.log" template("${MESSAGE}\n") create-dirs(yes));
};

filter f_jamduna { match("^jamtart" value("PROGRAM")); };

log {
    source(s_network);
    filter(f_jamduna);
    destination(d_logs);
};
```
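To smoke-test the server, any syslog sender tagged `jamtart` will do; for example, with the util-linux `logger` utility (the hostname is the placeholder used elsewhere in this README, and the message body is invented):

```bash
# Send one test message over TCP; the tag must match the f_jamduna PROGRAM filter
logger -n tbd.jamduna.org -P 5000 -T -t jamtart '{"msg_type":131,"team":"exampleteam"}'
```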
A minimal Go client example:

```go
package main

import (
	"log"
	"log/syslog"
)

// TODO: put the JSON envelope here

func main() {
	// Establish a connection to the remote syslog server over TCP.
	writer, err := syslog.Dial("tcp", "TODO.jamduna.org:5000", syslog.LOG_INFO|syslog.LOG_USER, "jamtart")
	if err != nil {
		log.Fatalf("Failed to connect to syslog server: %v", err)
	}
	defer writer.Close()

	// Send an informational log message.
	err = writer.Info("Telemetry event: workpackage submitted")
	if err != nil {
		log.Printf("Syslog write error: %v", err)
	}
}
```
A minimal Rust client example:

```toml
[dependencies]
syslog = "6.0"
```

```rust
use syslog::{Facility, Formatter3164};

fn main() {
    // Define the syslog configuration.
    let formatter = Formatter3164 {
        facility: Facility::LOG_USER,
        hostname: None,
        process: "jamtart".into(),
        pid: 0,
    };

    // TODO: put the JSON envelope here

    // Connect to the syslog server (TCP with RFC 3164 format).
    match syslog::tcp(formatter, "tbd.jamduna.org:5000") {
        Err(e) => eprintln!("Failed to connect to syslog: {}", e),
        Ok(mut logger) => {
            if let Err(e) = logger.info("Telemetry event: workpackage submitted") {
                eprintln!("Failed to send syslog message: {}", e);
            }
        }
    }
}
```
To get on the Dune JAM Implementer leaderboard, use the following endpoint:
- JAM DUNA Endpoint: TBD (use tbd.jamduna.org as a placeholder for now)

Each implementation should follow these conventions:
- `sender_id` - use the dev public keys (see key) for now; a more sophisticated scheme will follow
- `team` - use the name you supplied in clients
- `elapsed` - measured in microseconds (1 s = 1,000,000 microseconds; see the conversion sketch after this list):
  - blocks (128), statistics (13): microseconds taken to make a block and compute the new state root, which includes all ordered accumulation
  - workreports (255): microseconds taken to refine a work package. No communications.
  - assurances (141): time to assure the data, from the moment a new block is known to the time when a CE141 is submitted
  - tickets (131): time to generate the ticket. No communications.
  - preimages (142): zero for now
  - workpackages (0): zero for now
  - services (143): zero for now
  - segments (3): zero for now
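For instance (a Node.js sketch for consistency with the ETL scripts; real clients will use their own language's monotonic clock, and `doWork` is a hypothetical stand-in for the timed operation):

```js
// Measure elapsed in microseconds using a monotonic clock.
function doWork() { /* placeholder for the operation being timed, e.g. ticket generation */ }

const start = process.hrtime.bigint();                             // nanoseconds, monotonic
doWork();
const elapsed = Number((process.hrtime.bigint() - start) / 1000n); // ns -> microseconds
// 1 s = 1,000,000 microseconds, so a 0.42 s operation reports elapsed = 420000
console.log({ elapsed });
```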
Find us on the JAM DAO Discord in #implementers or #tinytestnets (100% public; note that all communications are archived) to start sending telemetry data.