Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: ADR 001 general parser architecture #111

Closed
wants to merge 9 commits into from
Binary file added .docs/.img/adr-001-general-architecture.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
56 changes: 56 additions & 0 deletions .docs/architecture/PROCESS.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
# ADR Creation Process

1. Copy the `adr-template.md` file. Use the following filename pattern: `adr-next_number-title.md`
2. Create a draft Pull Request if you want to get an early feedback.
3. Make sure the context and a solution is clear and well documented.
4. Add an entry to a list in the [README](./README.md) file.
5. Create a Pull Request to propose a new ADR.

## ADR life cycle

ADR creation is an **iterative** process. Instead of trying to solve all decisions in a single ADR pull request, we MUST firstly understand the problem and collect feedback through a GitHub Issue.

1. Every proposal SHOULD start with a new GitHub Issue or be a result of existing Issues. The Issue should contain just a brief proposal summary.

2. Once the motivation is validated, a GitHub Pull Request (PR) is created with a new document based on the `adr-template.md`.

3. An ADR doesn't have to arrive to `master` with an _accepted_ status in a single PR. If the motivation is clear and the solution is sound, we SHOULD be able to merge it and keep a _proposed_ status. It's preferable to have an iterative approach rather than long, not merged Pull Requests.

4. If a _proposed_ ADR is merged, then it should clearly document outstanding issues either in ADR document notes or in a GitHub Issue.

5. The PR SHOULD always be merged. In the case of a faulty ADR, we still prefer to merge it with a _rejected_ status. The only time the ADR SHOULD NOT be merged is if the author abandons it.

6. Merged ADRs SHOULD NOT be pruned.

### ADR status

Status has two components:

```
{CONSENSUS STATUS} {IMPLEMENTATION STATUS}
```

IMPLEMENTATION STATUS is either `Implemented` or `Not Implemented`.

#### Consensus Status

```
DRAFT -> PROPOSED -> LAST CALL yyyy-mm-dd -> ACCEPTED | REJECTED -> SUPERSEEDED by ADR-xxx
\ |
\ |
v v
ABANDONED
```

+ `DRAFT`: [optional] an ADR which is work in progress, not being ready for a general review. This is to present an early work and get an early feedback in a Draft Pull Request form.
+ `PROPOSED`: an ADR covering a full solution architecture and still in the review - project stakeholders haven't reached an agreed yet.
+ `LAST CALL <date for the last call>`: [optional] clear notify that we are close to accept updates. Changing a status to `LAST CALL` means that social consensus (of Juno maintainers) has been reached and we still want to give it a time to let the community react or analyze.
+ `ACCEPTED`: ADR which will represent a currently implemented or to be implemented architecture design.
+ `REJECTED`: ADR can go from PROPOSED or ACCEPTED to rejected if the consensus among project stakeholders will decide so.
+ `SUPERSEEDED by ADR-xxx`: ADR which has been superseded by a new ADR.
+ `ABANDONED`: the ADR is no longer pursued by the original authors.

## Language used in ADR

+ The context/background should be written in the present tense.
+ Avoid using a first, personal form.
43 changes: 43 additions & 0 deletions .docs/architecture/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
# Architecture Decision Records (ADR)

This is a location to record all high-level architecture decisions in Juno.

An Architectural Decision (**AD**) is a software design choice that addresses a functional or non-functional requirement that is architecturally significant.
An Architecturally Significant Requirement (**ASR**) is a requirement that has a measurable effect on a software system’s architecture and quality.
An Architectural Decision Record (**ADR**) captures a single AD, such as often done when writing personal notes or meeting minutes; the collection of ADRs created and maintained in a project constitute its decision log. All these are within the topic of Architectural Knowledge Management (AKM).

You can read more about the ADR concept in this [blog post](https://product.reverb.com/documenting-architecture-decisions-the-reverb-way-a3563bb24bd0#.78xhdix6t).

## Rationale

ADRs are intended to be the primary mechanism for proposing new feature designs and new processes, for collecting community input on an issue, and for documenting the design decisions.
An ADR should provide:

- Context on the relevant goals and the current state
- Proposed changes to achieve the goals
- Summary of pros and cons
- References
- Changelog

Note the distinction between an ADR and a spec. The ADR provides the context, intuition, reasoning, and
justification for a change in architecture, or for the architecture of something
new. The spec is much more compressed and streamlined summary of everything as
it stands today.

If recorded decisions turned out to be lacking, convene a discussion, record the new decisions here, and then modify the code to match.

## Creating new ADR

Read about the [PROCESS](./PROCESS.md).

#### Use RFC 2119 Keywords

When writing ADRs, follow the same best practices for writing RFCs. When writing RFCs, key words are used to signify the requirements in the specification. These words are often capitalized: "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL. They are to be interpreted as described in [RFC 2119](https://datatracker.ietf.org/doc/html/rfc2119).

## ADR Table of Contents

### Accepted

### Proposed

### Draft
227 changes: 227 additions & 0 deletions .docs/architecture/adr-001-general-parser-architecture.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,227 @@
# ADR-001: General parser architecture


## Changelog

- March 11th, 2024: Init draft;
- March 13th, 2024: First draft;

## Status

PROPOSED

## Abstract

This ADR outlines the general parser architecture designed for Juno, addressing the need for enhanced interfaces to support various chains.

## Context

Juno currently lacks the necessary architecture documents and interfaces to effectively support multiple chains. Additionally, it faces challenges in handling messages and transactions, especially when dealing with different blockchain protocols and versions. There is a pressing need to establish a robust and adaptable architecture that can accommodate the diverse requirements of various chains while maintaining efficiency and reliability.

## Decision

To address these challenges, the decision is made to redesign Juno's parser architecture. The architecture will prioritize modularity and extensibility, allowing seamless integration with different blockchain networks and protocols. Key components such as enqueuers, workers, and modules will be defined with clear responsibilities and interfaces to facilitate easy extension and customization. Moreover, the architecture will incorporate mechanisms to handle transactions, messages, and other blockchain operations efficiently.

The provided image offer a comprehensive overview of the Juno parser, detailing its core components and functionalities.

![Architecture](../.img/adr-001-general-architecture.png)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had something like this in mind.

Indexer

Copy link
Contributor Author

@dadamu dadamu Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it is great, we can make RabbitMQ or Kafka or other good queue services implement BlockQueue interface if needed.

Besides, cosmos.Tx is different from solana.Tx and ethereum.TX, in addition, transactions must be included inside a block, so I would avoid from transaction stuff inside Juno, make it only as block parser.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, also I think that the Indexer and Parser part should be separated into their program/process. It will allow for less complexity in the code and some freedom in terms of the technologies used.

For instance, later we find that the Indexer wrote in Go has latency spikes due to garbage collection. Then it will be far easier to replace only that part with Haskell (just kidding obviously 😂). Another use-case would be if, I don't know, Bitcoin has SDKs written in C++, we would be able to write an Indexer just for that and re-use all the rest of the architecture.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, also I think that the Indexer and Parser part should be separated into their program/process.

I agree with you. If we apply RabbitMQ or Kafka, we can make them sperately. That's why I starts from interfaces, and it brings us more possibilities to try and scale.

For instance, later we find that the Indexer wrote in Go has latency spikes due to garbage collection.

The Juno task would be Network IO bounding, most of time would be waiting for RPC server response, which takes about 200ms if Juno is not the same instance as RPC server. Therefore, gc spikes can be ignores in the case from my side.

Another use-case would be if, I don't know, Bitcoin has SDKs written in C++, we would be able to write an Indexer just for that and re-use all the rest of the architecture.

We can use other languages if it is worth than Go, especially as you said, the native language of the chain provides SDK then we don't rewrite anything.


Let's break down each element briefly by following topics.

### Context

The Context interface acts as a central hub shared among all elements within the parser, including enqueuers, workers, and other components.

```go
type Context interface {
// Context represents the manager for the concurrent jobs
context.Context

// Database returns the database used to operate blocks by workers
Database() Database
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason not to use a singleton for the Database client?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we should use singleton on Context, Node and Database, inherited the previous design.

In golang, Singleton is in the implementation level, interface just a way to abstract the methods.
https://thedevelopercafe.com/articles/singleton-in-golang-839d8610958b

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh understood 👍.

Why do we need the Context interface though? Can't we just use a function to get the database client like in the article you linked?

Copy link
Contributor Author

@dadamu dadamu Mar 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Context is mainly a concept to manage concurrency program, and it is shared to every goroutine tasks. Then we can manage all the workers in a easy way. Here is the brief reference.

In addition, putting Database, Node and etc. it follows the original concept from current Juno that including the shared singleton objects among workers.

Do you think using Start(context.Context, Database, Node) would be better?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think Start(context.Context, Database, Node) is any better 😄.

I suggested retrieving the database client by the Singleton getter method like GetCache in the article you linked. I mean from within the function body.

But I think your approach is the recommended one using Golang. I am more used to doing things like that https://medium.com/learnfazz/flawed-singleton-pattern-for-mocking-go-e90794891faf.

Again I think you have the right approach 👍.


// Node returns the client used to listen to blocks by enqueuers
Node() Node

// Modules returns a set of the extensional operations that should be executed inside a worker
Modules() []Module
}
```

### Block

The Block interface serves as a representation of the aggregation of operations on the blockchain and is the fundamental element transferred within the queue for parsing by workers within Juno.

```go
type Block interface {
// Height returns the height of the block.
Height() uint64

// Hash returns the hash of the block.
Hash() string

// Timestamp returns the timestamp of the block.
Timestamp() time.Time

// Proposer returns the address of the block proposer.
Proposer() string
}
```

### Enqueuer

The Enqueuer interface orchestrates the process of listening for available blocks and enqueueing them into the queue for parsing by workers.

In the current version, two types of enqueuers are implemented:

* Missing Blocks Enqueuer: This enqueuer queries missing blocks from the given start block to the current latest block and enqueues them into the queue.
* New Blocks Enqueuer: This enqueuer listens for new blocks from the node and enqueues them into the queue.

```go
type Enqueuer interface {
// ListenAndEnqueueBlocks listens for available blocks and enqueues them into the queue.
ListenAndEnqueueBlocks(ctx Context, queue BlockQueue) (err error)
}
```

### Block Queue

The BlockQueue interface serves as a communication channel between enqueuers and workers within the Juno architecture.

```go
type BlockQueue interface {
// Enqueue enqueues a block into the queue.
Enqueue(block Block)

// Listen returns a channel to listen for blocks.
Listen() <-chan Block
}
```

### Node

The Node interface serves as an abstraction layer for interacting with RPC servers or other clients to obtain block data efficiently.

```go
type Node interface {
// LatestBlock retrieves the latest block from the connected node.
LatestBlock() Block

// SubscribeBlocks returns a channel to subscribe to new blocks from the connected node.
SubscribeBlocks() <-chan Block
}
```

### Database

The Database interface encapsulates the necessary operations for interacting with the underlying storage system.

```go
type Database interface {
// SaveBlock saves a block into the database.
SaveBlock() error

// LatestBlock returns the latest block from the database.
LatestBlock() (Block, error)
}
```

### Worker

The Worker interface defines the functionality for parsing blocks from the queue within Juno's parser architecture.

```go
type Worker interface {
// Start starts the worker to parse blocks from the queue.
Start(ctx Context, queue BlockQueue)
}
```

### Scheduler

The Scheduler interface facilitates the execution of periodic operations within Juno's parser architecture.

```go
type PeriodicOperationsModule interface {
// RegisterPeriodicOperations allows to register all the operations that will be run on a periodic basis.
// The given scheduler can be used to define the periodicity of each task.
// NOTE. This method will only be run ONCE during the module initialization.
RegisterPeriodicOperations(scheduler Scheduler) error
Comment on lines +149 to +152
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What kind of process did we have to launch on a periodic basis, in the last iteration of Juno?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is actually for Callisto for custom module like gov module, I think It would be better to remove it from Juno interfaces as it is actually related to business logic.

Thanks for pointing it out.

}

type Scheduler interface {
// Register registers a periodic operations module.
Register(module PeriodicOperationsModule)

// Start starts the scheduler.
Start(ctx Context)
}
```

### Module

The Module interface serves as the foundation for defining and extending functionality within Juno's parser architecture.

Within Juno, a fundamental module called BlockModule is provided, dedicated to handling blocks.

Additionally, developers have the flexibility to create custom modules tailored to specific business requirements, such as TransactionModule, GovModule, and others.

```go
type Module interface {
// Name returns the name of the module.
Name() string
}

type BlockModule interface {
// HandleBlock handles a block parsed by workers.
HandleBlock(block Block) error
}
```

### Additional operator

The Additional Operator interface provides functionality for executing additional operations on a additional basis within Juno's parser architecture.

Currently, we utilize the additional operation module to host a Prometheus server.

```go
type AdditionalOperationModule interface {
// RegisterAdditionalOperation allows to register the operation that will be run on a periodic basis.
// The given scheduler can be used to define the periodicity of each task.
// NOTE. This method will only be run ONCE during the module initialization.
RegisterAdditionalOperation(operator AdditionalOperator) error
}

type AdditionalOperator interface {
// Register registers an additional operation module.
Register(module AdditionalOperationModule)

// Start starts the additional operator.
Start(ctx Context)
}
```

## Consequences

### Backwards Compatibility

The decision to redesign Juno's parser architecture is not backwards compatible, as the current implementation only supports Cosmos SDK chains. The introduction of a new architecture with enhanced interfaces may require adjustments to existing codebases and integrations. However, this trade-off is necessary to accommodate the future expansion of Juno to support a wider range of blockchain networks beyond Cosmos SDK chains. Despite the lack of backwards compatibility, the long-term benefits of the redesigned architecture justify the need for this transition.

### Positive

* Improved Flexibility: The redesigned architecture will enable Juno to support a wider range of blockchain networks and adapt to evolving protocols more effectively.
* Enhanced Scalability: With a modular and extensible design, Juno will be better equipped to handle increasing volumes of transactions and messages from multiple chains.


### Negative

* Increased Development Complexity: Implementing a more sophisticated architecture may require additional development effort and resources.
* Potential Performance Overhead: Depending on the implementation, there may be a slight performance overhead associated with the modular design. However, this is outweighed by the benefits of improved flexibility and scalability.

### Neutral

unknown

## References

- {reference link}
60 changes: 60 additions & 0 deletions .docs/architecture/adr-template.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
# ADR {ADR-NUMBER}: {TITLE}

## Changelog

- {date}: {changelog}

## Status

{DRAFT | PROPOSED} Not Implemented

> Please have a look at the [PROCESS](./PROCESS.md#adr-status) page.
> Use DRAFT if the ADR is in a draft stage (draft PR) or PROPOSED if it's in review.

## Abstract

> "If you can't explain it simply, you don't understand it well enough." Provide a simplified and layman-accessible explanation of the ADR.
> A short (~200 word) description of the issue being addressed.

## Context

> This section describes the forces at play, including technological, political, social, and project local. These forces are probably in tension, and should be called out as such. The language in this section is value-neutral. It is simply describing facts. It should clearly explain the problem and motivation that the proposal aims to resolve.
> {context body}

## Decision

> This section describes our response to these forces. It is stated in full sentences, with active voice. "We will ..."
> {decision body}

## Consequences

> This section describes the resulting context, after applying the decision. All consequences should be listed here, not just the "positive" ones. A particular decision may have positive, negative, and neutral consequences, but all of them affect the team and project in the future.

### Backwards Compatibility

> All ADRs that introduce backwards incompatibilities must include a section describing these incompatibilities and their severity. The ADR must explain how the author proposes to deal with these incompatibilities. ADR submissions without a sufficient backwards compatibility treatise may be rejected outright.

### Positive

{positive consequences}

### Negative

{negative consequences}

### Neutral

{neutral consequences}

## Further Discussions

While an ADR is in the DRAFT or PROPOSED stage, this section should contain a summary of issues to be solved in future iterations (usually referencing comments from a pull-request discussion).
Later, this section can optionally list ideas or improvements the author or reviewers found during the analysis of this ADR.

## Test Cases [optional]

Test cases for an implementation are mandatory for ADRs that are affecting consensus changes. Other ADRs can choose to include links to test cases if applicable.

## References

- {reference link}
Loading