Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: tracks block part size in consensus block parts table #1067

Closed
wants to merge 5 commits into from

Conversation

staheri14
Copy link
Contributor

@staheri14 staheri14 commented Aug 25, 2023

Description

Part of #1055. This allows us to calculate the data transmission rate (bytes per second) incurred in the consensus reactor.

@staheri14 staheri14 changed the base branch from main to v0.34.x-celestia August 25, 2023 17:47
@staheri14 staheri14 self-assigned this Aug 25, 2023
schema.WriteBlockPart(conR.traceClient, rs.Height, rs.Round, peer.ID(), part.Index, schema.TransferTypeUpload)
schema.WriteBlockPart(conR.traceClient, rs.Height,
rs.Round, peer.ID(), part.Index,
schema.TransferTypeUpload, int64(parts.Size()))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To the reviewers: While an alternative approach could involve utilizing a helper function to measure the part size, it's worth noting that the Size() method within the protobuf message type Part already accomplishes the same task.

@staheri14 staheri14 changed the title feat: tracks block part size in Influx DB feat: tracks block part size in consensus block parts table Aug 25, 2023
@staheri14 staheri14 marked this pull request as ready for review August 25, 2023 20:06
rootulp
rootulp previously approved these changes Aug 25, 2023
})
}

const (
// BlockTable is the name of the table that stores metadata about consensus blocks.
// following schema:
//
// | time | height | timestamp |
// | time | height | unix_millisecond_timestamp | tx_count | square_size | block_size | proposer | last_commit_round |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 341 to 342
part, err := msg.Part.ToProto() // this conversion is only needed to get the part size
// consistent with how it is measured in the gossipDataRoutine
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[uber nit] since this comment is split across multiple lines, proposal to do something like this so that it's clear both lines apply to the code line on 341:

// The .ToProto conversion is only needed to get the part size.
// This is consistent with how it is measured in the gossipDataRoutine.
part, err := msg.Part.ToProto()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorporated you suggestion a82c574

Comment on lines +343 to +346
partSize := -1
if err != nil {
partSize = part.Size()
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[not blocking] Does this intentionally avoid checking for an error?

Copy link
Contributor Author

@staheri14 staheri14 Aug 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is intentional, because the earlier conversion (.toProto()) is merely for measuring the size and tracing it in influx DB, which is not a critical part of the consensus operation. Nevertheless, it is not expected to get any error.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is intentional, because the earlier conversion (.toProto()) is merely for measuring the size and tracing it in influx DB, which is not a critical part of the consensus operation.

Makes sense! Since it is not critical to the consensus operation, I think this could not propagate the error and log it instead but this feels kinda hacky:

part, err := msg.Part.ToProto()
if err != nil {
    // log the error
} else {
    schema.WriteBlockPart(conR.traceClient, msg.Height, msg.Round, e.Src.ID(), msg.Part.Index, schema.TransferTypeDownload, int64(part.Size()))
}

I'm not familiar with this code so defer to what you already have.

@cmwaters
Copy link
Contributor

We do have something similar that already exists:

MessageReceiveBytesTotal: prometheus.NewCounterFrom(stdprometheus.CounterOpts{
Namespace: namespace,
Subsystem: MetricsSubsystem,
Name: "message_receive_bytes_total",
Help: "Number of bytes of each message type received.",
}, append(labels, "message_type")).With(labelsAndValues...),
MessageSendBytesTotal: prometheus.NewCounterFrom(stdprometheus.CounterOpts{
Namespace: namespace,
Subsystem: MetricsSubsystem,
Name: "message_send_bytes_total",
Help: "Number of bytes of each message type sent.",
}, append(labels, "message_type")).With(labelsAndValues...),
which measures bytes sent and received per message type. If we're looking into networking efficiency, it may be better just to take this existing work and apply it to InfluxDB

@staheri14
Copy link
Contributor Author

We do have something similar that already exists:

MessageReceiveBytesTotal: prometheus.NewCounterFrom(stdprometheus.CounterOpts{
Namespace: namespace,
Subsystem: MetricsSubsystem,
Name: "message_receive_bytes_total",
Help: "Number of bytes of each message type received.",
}, append(labels, "message_type")).With(labelsAndValues...),
MessageSendBytesTotal: prometheus.NewCounterFrom(stdprometheus.CounterOpts{
Namespace: namespace,
Subsystem: MetricsSubsystem,
Name: "message_send_bytes_total",
Help: "Number of bytes of each message type sent.",
}, append(labels, "message_type")).With(labelsAndValues...),

which measures bytes sent and received per message type. If we're looking into networking efficiency, it may be better just to take this existing work and apply it to InfluxDB

Nice, thanks for mentioning it! 👍 very helpful

@staheri14
Copy link
Contributor Author

Converting this PR to draft as it seems we can derive data transmission rate using the already available Prometheus metrics.

@staheri14 staheri14 marked this pull request as draft September 7, 2023 16:42
@staheri14
Copy link
Contributor Author

staheri14 commented Sep 15, 2023

@evan-forbes The traffic rate analysis has been successfully conducted utilizing Prometheus metrics. Given this, do you think it is would be useful to monitor the block part sizes in InfluxDB? If not, I can proceed to close this pull request (we can always reopen it later if necessary).

@staheri14
Copy link
Contributor Author

Closing it for now, may reopen it in the future if need be.

@staheri14 staheri14 closed this Oct 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants