From ea795440aea068d80b79f07407c9e904583be078 Mon Sep 17 00:00:00 2001
From: Paul Steckler
Date: Wed, 4 Oct 2023 16:54:28 -0700
Subject: [PATCH 1/4] Internal docs for archive db apps

---
 src/app/archive_blocks/archive_blocks.ml      | 21 ++++++++++++++++++-
 src/app/extract_blocks/extract_blocks.ml      | 14 +++++++++++++
 .../missing_blocks_auditor.ml                 | 11 +++++++++-
 3 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/src/app/archive_blocks/archive_blocks.ml b/src/app/archive_blocks/archive_blocks.ml
index caa4203939e..4dc565f9ff2 100644
--- a/src/app/archive_blocks/archive_blocks.ml
+++ b/src/app/archive_blocks/archive_blocks.ml
@@ -1,4 +1,23 @@
-(* archive_blocks.ml -- archive precomputed or extensional blocks to Postgresql *)
+(* archive_blocks.ml *)
+
+(* archive_blocks adds blocks in either "precomputed" or "extensional" format
+   to the archive database.
+
+   Precomputed blocks are stored in the bucket `mina_network_block_data`
+   on Google Cloud Storage. Blocks are named NETWORK-HEIGHT-STATEHASH.json.
+   Example: mainnet-100000-3NKLvMCimUjX1zjjiC3XPMT34D1bVQGzkKW58XDwFJgQ5wDQ9Tki.json.
+
+   Extensional blocks are extracted from other archive databases using
+   the `extract_blocks` app.
+
+   As many blocks as are available can be added at a time, but all blocks
+   must be in the same format.
+
+   Except for blocks from the original mainnet, both precomputed and
+   extensional blocks have a version in their JSON
+   representation. That version must match the corresponding OCaml
+   type in the code when this app was built.
+*)
 
 open Core_kernel
 open Async
diff --git a/src/app/extract_blocks/extract_blocks.ml b/src/app/extract_blocks/extract_blocks.ml
index f4fa0ac2e77..a97cf7b712f 100644
--- a/src/app/extract_blocks/extract_blocks.ml
+++ b/src/app/extract_blocks/extract_blocks.ml
@@ -1,4 +1,18 @@
 (* extract_blocks.ml -- dump extensional blocks from archive db *)
+
+(* extract_blocks pulls out individual blocks from an archive database
+   in "extensional" format
+
+   Such blocks can be added to other archive databases using the
+   `archive_blocks` app
+
+   Blocks are extracted into files with name .json
+
+   The app offers the choice to extract all canonical blocks,
+   or a subchain specified with starting state hash,
+   or a subchain specified with starting and ending state hashes
+*)
+
 [@@@coverage exclude_file]
 
 open Core_kernel
diff --git a/src/app/missing_blocks_auditor/missing_blocks_auditor.ml b/src/app/missing_blocks_auditor/missing_blocks_auditor.ml
index 5bce995ea28..5ed06a0be22 100644
--- a/src/app/missing_blocks_auditor/missing_blocks_auditor.ml
+++ b/src/app/missing_blocks_auditor/missing_blocks_auditor.ml
@@ -1,4 +1,13 @@
-(* missing_blocks_auditor.ml -- report missing blocks from an archive db *)
+(* missing_blocks_auditor.ml *)
+
+(* missing_blocks_auditor looks for blocks without parent blocks in an
+   archive database.
+
+   The app also looks for blocks marked as pending that are lower
+   (have a lesser height) than the highest (most recent) canonical
+   block. There can be such blocks if blocks are added when there are
+   missing blocks in the database.
+*)
 
 open Core_kernel
 open Async

From 0ce11c82371c9ee5ababca9b9f33dbc55ff9fdc0 Mon Sep 17 00:00:00 2001
From: Paul Steckler
Date: Wed, 4 Oct 2023 17:50:25 -0700
Subject: [PATCH 2/4] Add README for replayer cron jobs

---
 helm/cron_jobs/README.md | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)
 create mode 100644 helm/cron_jobs/README.md

diff --git a/helm/cron_jobs/README.md b/helm/cron_jobs/README.md
new file mode 100644
index 00000000000..19fedb0cf55
--- /dev/null
+++ b/helm/cron_jobs/README.md
@@ -0,0 +1,18 @@
+Replayer cron jobs
+==================
+
+There are replayer cron jobs for mainnet, devnet, and berkeley. These
+jobs are run daily, to replayer a day's worth of transactions.
+
+Each cron job downloads the most recent archive dump corresponding to
+a network, and loads the data into Postgresql. That results in an
+archive database. The most recent replayer checkpoint file is
+downloaded, which provides the starting point for the replayer. When
+the replayer runs, it creates new checkpoint files every 50
+blocks. When the replayer finishes, it uploads the most recent
+checkpoint file, so it can be used in the following day's run. If
+there are any errors, the replayer logs are also uploaded.
+
+There is a separate checkpoint file bucket for each network. Bot the
+checkpoint files and error files for a given network are uploaded to
+the same bucket.

From 27bac985e5789cff8acf589a4c8669ed1316dad0 Mon Sep 17 00:00:00 2001
From: Paul Steckler
Date: Thu, 5 Oct 2023 12:03:48 -0700
Subject: [PATCH 3/4] move comment to READMEs

---
 helm/cron_jobs/README.md                      |  4 ++--
 src/app/archive_blocks/README.md              | 20 +++++++++++++++++++
 src/app/archive_blocks/archive_blocks.ml      | 19 ------------------
 src/app/extract_blocks/README.md              | 12 +++++++++++
 src/app/extract_blocks/extract_blocks.ml      | 13 ------------
 src/app/missing_blocks_auditor/README.md      | 10 ++++++++++
 .../missing_blocks_auditor.ml                 |  9 ---------
 7 files changed, 44 insertions(+), 43 deletions(-)
 create mode 100644 src/app/archive_blocks/README.md
 create mode 100644 src/app/extract_blocks/README.md
 create mode 100644 src/app/missing_blocks_auditor/README.md

diff --git a/helm/cron_jobs/README.md b/helm/cron_jobs/README.md
index 19fedb0cf55..1ae37152a80 100644
--- a/helm/cron_jobs/README.md
+++ b/helm/cron_jobs/README.md
@@ -2,7 +2,7 @@ Replayer cron jobs
 ==================
 
 There are replayer cron jobs for mainnet, devnet, and berkeley. These
-jobs are run daily, to replayer a day's worth of transactions.
+jobs are run daily, to replay a day's worth of transactions.
 
 Each cron job downloads the most recent archive dump corresponding to
 a network, and loads the data into Postgresql. That results in an
@@ -13,6 +13,6 @@ blocks. When the replayer finishes, it uploads the most recent
 checkpoint file, so it can be used in the following day's run. If
 there are any errors, the replayer logs are also uploaded.
 
-There is a separate checkpoint file bucket for each network. Bot the
+There is a separate checkpoint file bucket for each network. Both the
 checkpoint files and error files for a given network are uploaded to
 the same bucket.
diff --git a/src/app/archive_blocks/README.md b/src/app/archive_blocks/README.md
new file mode 100644
index 00000000000..718e7a48467
--- /dev/null
+++ b/src/app/archive_blocks/README.md
@@ -0,0 +1,20 @@
+archive_blocks
+==============
+
+The `archive_blocks` app adds blocks in either "precomputed" or
+"extensional" format to the archive database.
+
+Precomputed blocks are stored in the bucket `mina_network_block_data`
+on Google Cloud Storage. Blocks are named NETWORK-HEIGHT-STATEHASH.json.
+Example: mainnet-100000-3NKLvMCimUjX1zjjiC3XPMT34D1bVQGzkKW58XDwFJgQ5wDQ9Tki.json.
+
+Extensional blocks are extracted from other archive databases using
+the `extract_blocks` app.
+
+As many blocks as are available can be added at a time, but all blocks
+must be in the same format.
+
+Except for blocks from the original mainnet, both precomputed and
+extensional blocks have a version in their JSON representation. That
+version must match the corresponding OCaml type in the code when this
+app was built.
diff --git a/src/app/archive_blocks/archive_blocks.ml b/src/app/archive_blocks/archive_blocks.ml
index 4dc565f9ff2..7851ca04518 100644
--- a/src/app/archive_blocks/archive_blocks.ml
+++ b/src/app/archive_blocks/archive_blocks.ml
@@ -1,24 +1,5 @@
 (* archive_blocks.ml *)
 
-(* archive_blocks adds blocks in either "precomputed" or "extensional" format
-   to the archive database.
-
-   Precomputed blocks are stored in the bucket `mina_network_block_data`
-   on Google Cloud Storage. Blocks are named NETWORK-HEIGHT-STATEHASH.json.
-   Example: mainnet-100000-3NKLvMCimUjX1zjjiC3XPMT34D1bVQGzkKW58XDwFJgQ5wDQ9Tki.json.
-
-   Extensional blocks are extracted from other archive databases using
-   the `extract_blocks` app.
-
-   As many blocks as are available can be added at a time, but all blocks
-   must be in the same format.
-
-   Except for blocks from the original mainnet, both precomputed and
-   extensional blocks have a version in their JSON
-   representation. That version must match the corresponding OCaml
-   type in the code when this app was built.
-*)
-
 open Core_kernel
 open Async
 open Archive_lib
diff --git a/src/app/extract_blocks/README.md b/src/app/extract_blocks/README.md
new file mode 100644
index 00000000000..bbaecdbb29b
--- /dev/null
+++ b/src/app/extract_blocks/README.md
@@ -0,0 +1,12 @@
+extract_blocks
+==============
+
+The `extract_blocks` app pulls out individual blocks from an archive
+database in "extensional" format. Such blocks can be added to other
+archive databases using the `archive_blocks` app.
+
+Blocks are extracted into files with name .json.
+
+The app offers the choice to extract all canonical blocks, or a
+subchain specified with starting state hash, or a subchain specified
+with starting and ending state hashes.
diff --git a/src/app/extract_blocks/extract_blocks.ml b/src/app/extract_blocks/extract_blocks.ml
index a97cf7b712f..e7f58718cba 100644
--- a/src/app/extract_blocks/extract_blocks.ml
+++ b/src/app/extract_blocks/extract_blocks.ml
@@ -1,18 +1,5 @@
 (* extract_blocks.ml -- dump extensional blocks from archive db *)
 
-(* extract_blocks pulls out individual blocks from an archive database
-   in "extensional" format
-
-   Such blocks can be added to other archive databases using the
-   `archive_blocks` app
-
-   Blocks are extracted into files with name .json
-
-   The app offers the choice to extract all canonical blocks,
-   or a subchain specified with starting state hash,
-   or a subchain specified with starting and ending state hashes
-*)
-
 [@@@coverage exclude_file]
 
 open Core_kernel
diff --git a/src/app/missing_blocks_auditor/README.md b/src/app/missing_blocks_auditor/README.md
new file mode 100644
index 00000000000..e5db4ee64a3
--- /dev/null
+++ b/src/app/missing_blocks_auditor/README.md
@@ -0,0 +1,10 @@
+missing_blocks_auditor
+======================
+
+The `missing_blocks_auditor` app looks for blocks without parent
+blocks in an archive database.
+
+The app also looks for blocks marked as pending that are lower (have a
+lesser height) than the highest (most recent) canonical block. There
+can be such blocks if blocks are added when there are missing blocks
+in the database.
diff --git a/src/app/missing_blocks_auditor/missing_blocks_auditor.ml b/src/app/missing_blocks_auditor/missing_blocks_auditor.ml
index 5ed06a0be22..e6fd538f1b5 100644
--- a/src/app/missing_blocks_auditor/missing_blocks_auditor.ml
+++ b/src/app/missing_blocks_auditor/missing_blocks_auditor.ml
@@ -1,14 +1,5 @@
 (* missing_blocks_auditor.ml *)
 
-(* missing_blocks_auditor looks for blocks without parent blocks in an
-   archive database.
-
-   The app also looks for blocks marked as pending that are lower
-   (have a lesser height) than the highest (most recent) canonical
-   block. There can be such blocks if blocks are added when there are
-   missing blocks in the database.
-*)
-
 open Core_kernel
 open Async
 

From dbd5895eaa6970f9b8bebe5bfbd8f876a5321b9b Mon Sep 17 00:00:00 2001
From: Paul Steckler
Date: Thu, 5 Oct 2023 12:12:07 -0700
Subject: [PATCH 4/4] Address PR comments

---
 helm/cron_jobs/README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/helm/cron_jobs/README.md b/helm/cron_jobs/README.md
index 1ae37152a80..edd63378621 100644
--- a/helm/cron_jobs/README.md
+++ b/helm/cron_jobs/README.md
@@ -1,11 +1,11 @@
 Replayer cron jobs
 ==================
 
-There are replayer cron jobs for mainnet, devnet, and berkeley. These
+There are replayer cron jobs for Mainnet, Devnet, and Berkeley. These
 jobs are run daily, to replay a day's worth of transactions.
 
 Each cron job downloads the most recent archive dump corresponding to
-a network, and loads the data into Postgresql. That results in an
+a network, and loads the data into PostgreSQL. That results in an
 archive database. The most recent replayer checkpoint file is
 downloaded, which provides the starting point for the replayer. When
 the replayer runs, it creates new checkpoint files every 50