Internal docs for archive db apps #14269

Merged
merged 7 commits into from
Oct 6, 2023
Changes from 2 commits
18 changes: 18 additions & 0 deletions helm/cron_jobs/README.md
@@ -0,0 +1,18 @@
Replayer cron jobs
==================

There are replayer cron jobs for mainnet, devnet, and berkeley. These
jobs are run daily, to replayer a day's worth of transactions.
Contributor

Suggested change
- There are replayer cron jobs for mainnet, devnet, and berkeley. These
- jobs are run daily, to replayer a day's worth of transactions.
+ There are replayer cron jobs for Mainnet, Devnet, and Berkeley. These
+ jobs are run daily to replay a day's worth of transactions.

Member Author

Fixed in my own commit. I'm keeping the comma, though :-)


Each cron job downloads the most recent archive dump corresponding to
a network, and loads the data into Postgresql. That results in an
archive database. The most recent replayer checkpoint file is
downloaded, which provides the starting point for the replayer. When
Contributor

Suggested change
- a network, and loads the data into Postgresql. That results in an
- archive database. The most recent replayer checkpoint file is
- downloaded, which provides the starting point for the replayer. When
+ a network, and loads the data into PostgreSQL. That results in an
+ archive database. The most recent replayer checkpoint file is
+ downloaded, which provides the starting point for the replayer. When

Member Author

Fixed in my commit.

the replayer runs, it creates new checkpoint files every 50
blocks. When the replayer finishes, it uploads the most recent
checkpoint file, so it can be used in the following day's run. If
there are any errors, the replayer logs are also uploaded.

There is a separate checkpoint file bucket for each network. Bot the
checkpoint files and error files for a given network are uploaded to
the same bucket.
Contributor

Suggested change
- There is a separate checkpoint file bucket for each network. Bot the
- checkpoint files and error files for a given network are uploaded to
- the same bucket.
+ There is a separate checkpoint file bucket for each network. Both the
+ checkpoint files and error files for a given network are uploaded to
+ the same bucket.

Member Author

Fixed in my own commit.
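
The checkpoint cadence described in the README can be illustrated with a small sketch. It assumes checkpoints are counted from the starting block of a run (the README says only that they are created every 50 blocks), and the names `checkpoint_interval`, `checkpoint_heights`, and `last_checkpoint` are hypothetical, not taken from the replayer.

```ocaml
(* Illustrative sketch only; none of these names come from the replayer. *)

(* The README states that a checkpoint file is created every 50 blocks. *)
let checkpoint_interval = 50

(* Heights at which a run from start_height to end_height would write a
   checkpoint, ASSUMING the count begins at the starting block. *)
let checkpoint_heights ~start_height ~end_height =
  let rec go h acc =
    if h > end_height then List.rev acc
    else go (h + checkpoint_interval) (h :: acc)
  in
  go (start_height + checkpoint_interval) []

(* The most recent checkpoint, which is the one uploaded for the next run.
   None means the run was too short to write a checkpoint. *)
let last_checkpoint ~start_height ~end_height =
  match List.rev (checkpoint_heights ~start_height ~end_height) with
  | [] -> None
  | h :: _ -> Some h
```

Under these assumptions, a run covering heights 100 through 260 would write checkpoints at 150, 200, and 250, and the checkpoint at 250 would seed the following day's run.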

21 changes: 20 additions & 1 deletion src/app/archive_blocks/archive_blocks.ml
@@ -1,4 +1,23 @@
(* archive_blocks.ml -- archive precomputed or extensional blocks to Postgresql *)
(* archive_blocks.ml *)

(* archive_blocks adds blocks in either "precomputed" or "extensional" format
to the archive database.

Precomputed blocks are stored in the bucket `mina_network_block_data`
on Google Cloud Storage. Blocks are named NETWORK-HEIGHT-STATEHASH.json.
Example: mainnet-100000-3NKLvMCimUjX1zjjiC3XPMT34D1bVQGzkKW58XDwFJgQ5wDQ9Tki.json.

Extensional blocks are extracted from other archive databases using
the `extract_blocks` app.

As many blocks as are available can be added at a time, but all blocks
must be in the same format.

Except for blocks from the original mainnet, both precomputed and
extensional blocks have a version in their JSON
representation. That version must match the corresponding OCaml
type in the code when this app was built.
*)

open Core_kernel
open Async
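
The `NETWORK-HEIGHT-STATEHASH.json` naming scheme from the comment above can be made concrete with a small parser. This is a sketch using only the OCaml standard library; the type `block_file` and the function `parse_block_file_name` are hypothetical and not part of `archive_blocks`. It assumes the state hash contains no hyphens and splits from the right, so a network name containing hyphens would still parse.

```ocaml
(* Hypothetical helper for illustration; not part of archive_blocks. *)
type block_file = { network : string; height : int; state_hash : string }

(* Split s at the last hyphen, returning the text before and after it. *)
let split_last_hyphen (s : string) : (string * string) option =
  match String.rindex_opt s '-' with
  | None -> None
  | Some i ->
      Some (String.sub s 0 i, String.sub s (i + 1) (String.length s - i - 1))

(* Parse a name of the form NETWORK-HEIGHT-STATEHASH.json. *)
let parse_block_file_name (name : string) : block_file option =
  if not (Filename.check_suffix name ".json") then None
  else
    let stem = Filename.chop_suffix name ".json" in
    match split_last_hyphen stem with
    | None -> None
    | Some (rest, state_hash) -> (
        match split_last_hyphen rest with
        | None -> None
        | Some (network, height_str) -> (
            match int_of_string_opt height_str with
            | Some height
              when height >= 0 && network <> "" && state_hash <> "" ->
                Some { network; height; state_hash }
            | _ -> None ) )
```

Applied to the example name in the comment, the parser yields the network `mainnet`, the height `100000`, and the state hash that follows.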
14 changes: 14 additions & 0 deletions src/app/extract_blocks/extract_blocks.ml
@@ -1,4 +1,18 @@
(* extract_blocks.ml -- dump extensional blocks from archive db *)

(* extract_blocks pulls out individual blocks from an archive database
in "extensional" format

Such blocks can be added to other archive databases using the
`archive_blocks` app

Blocks are extracted into files with name <state-hash>.json

The app offers the choice to extract all canonical blocks,
or a subchain specified with starting state hash,
or a subchain specified with starting and ending state hashes
*)

[@@@coverage exclude_file]

open Core_kernel
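
The three extraction choices listed in the comment above could be modeled as a variant type. The sketch below is illustrative only: the type `selection`, the function `selection_of_args`, and its labeled arguments are hypothetical and do not reflect the app's actual command-line flags. Only the `<state-hash>.json` output naming comes from the comment.

```ocaml
(* Hypothetical model of the three choices; names are not from the app. *)
type selection =
  | All_canonical
  | From_start of { start_state_hash : string }
  | Start_to_end of { start_state_hash : string; end_state_hash : string }

(* Turn loosely specified inputs into exactly one selection, rejecting
   combinations that match none of the three documented choices. *)
let selection_of_args ~(all_blocks : bool) ~(start_state_hash : string option)
    ~(end_state_hash : string option) : (selection, string) result =
  match (all_blocks, start_state_hash, end_state_hash) with
  | true, None, None -> Ok All_canonical
  | false, Some s, None -> Ok (From_start { start_state_hash = s })
  | false, Some s, Some e ->
      Ok (Start_to_end { start_state_hash = s; end_state_hash = e })
  | _ ->
      Error "choose all canonical blocks, a start hash, or a start and end hash"

(* Each extracted block is written to a file named after its state hash. *)
let output_file_name (state_hash : string) : string = state_hash ^ ".json"
```

Encoding the choice as a variant keeps invalid combinations, such as an ending hash without a starting hash, out of the rest of the program.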
11 changes: 10 additions & 1 deletion src/app/missing_blocks_auditor/missing_blocks_auditor.ml
@@ -1,4 +1,13 @@
(* missing_blocks_auditor.ml -- report missing blocks from an archive db *)
(* missing_blocks_auditor.ml *)

(* missing_blocks_auditor looks for blocks without parent blocks in an
archive database.

The app also looks for blocks marked as pending that are lower
Contributor

Suggested change
- The app also looks for blocks marked as pending that are lower
+ The missing blocks auditor app also looks for blocks marked as pending that are lower

Contributor

"The app" is the ML file that is the missing blocks auditor app?

Member Author

The .ml file gets compiled to an executable program. "The app" is the executable.

(have a lesser height) than the highest (most recent) canonical
block. There can be such blocks if blocks are added when there are
missing blocks in the database.
*)

open Core_kernel
open Async
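
The two checks described in the comment above can be sketched over an in-memory list of blocks. This is not the app's implementation, which queries an archive database. The record type `block`, the functions `blocks_missing_parent` and `pending_below_canonical`, and the use of `None` as the parent of the genesis block are all assumptions made for illustration.

```ocaml
(* Illustrative in-memory model; the real app works against PostgreSQL. *)
type chain_status = Canonical | Pending | Orphaned

type block =
  { state_hash : string
  ; parent_hash : string option (* None for the genesis block (assumption) *)
  ; height : int
  ; status : chain_status
  }

(* Blocks whose parent hash does not appear among the known blocks. *)
let blocks_missing_parent (blocks : block list) : block list =
  let known = Hashtbl.create 16 in
  List.iter (fun b -> Hashtbl.replace known b.state_hash ()) blocks;
  List.filter
    (fun b ->
      match b.parent_hash with
      | None -> false
      | Some p -> not (Hashtbl.mem known p))
    blocks

(* Height of the highest canonical block, if any block is canonical. *)
let highest_canonical_height (blocks : block list) : int option =
  List.fold_left
    (fun acc b ->
      match (b.status, acc) with
      | Canonical, None -> Some b.height
      | Canonical, Some h -> Some (max h b.height)
      | _ -> acc)
    None blocks

(* Pending blocks that sit below the highest canonical block. *)
let pending_below_canonical (blocks : block list) : block list =
  match highest_canonical_height blocks with
  | None -> []
  | Some top ->
      List.filter (fun b -> b.status = Pending && b.height < top) blocks
```

In this model, a database that holds blocks at heights 1, 2, and 4 but not the block at height 3 would report the height-4 block as missing its parent.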