Data size SKU for billing #2098
base: master
Conversation
Definition of `Page::bytes_used_by_rows` to follow. This change seemed to stand on its own enough to deserve a separate commit.
We intend to bill based on these predictable metrics, rather than the somewhat-unpredictable actual heap memory usage of the system. As such, we need a way to compute them (duh). This commit adds `Table` methods for computing the number of resident rows, and the number of bytes stored by those rows.
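For illustration, here is a minimal sketch of the shape of that computation, with made-up `Table`/`Page` definitions rather than the real ones from the `table` crate: per-page counts are maintained by the page layer, and the `Table` methods derive their totals by summing over pages.

```rust
// Illustrative sketch only; field and method names are assumptions,
// not the actual `Table`/`Page` definitions from the `table` crate.
struct Page {
    /// Bytes occupied by live rows in this page (free space excluded).
    bytes_used_by_rows: usize,
    /// Number of live rows resident in this page.
    num_rows: usize,
}

struct Table {
    pages: Vec<Page>,
}

impl Table {
    /// Number of rows currently resident in the table.
    fn num_rows(&self) -> usize {
        self.pages.iter().map(|p| p.num_rows).sum()
    }

    /// Number of bytes stored by those rows, summed across all pages.
    fn bytes_used_by_rows(&self) -> usize {
        self.pages.iter().map(|p| p.bytes_used_by_rows).sum()
    }
}
```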
Still a draft, but the overall strategy here makes sense. :)
Per out-of-band discussion, I am not sure this computation will actually be useful to us, but it is the thing I can compute at this time. See comment on `BTreeIndex::num_key_bytes` in btree_index.rs for the specific counting implemented here.
This looks good overall. Adding testing is important if we are going to use these for billing.
Is it possible to write a function that computes the size of a row s.t. we can assert that `bytes_used_by_rows()` is equal to the sum of the size of each row? If so, that would give us a good path toward being able to write tests.
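Something like the following could work as a sketch of that test, reusing the illustrative `Table`/`Page` above; `row_size` and `insert_row` here are stand-ins, not real APIs from the codebase:

```rust
#[cfg(test)]
mod data_size_tests {
    use super::*;

    // Hypothetical per-row size oracle; the real test would need a function
    // that independently computes how many bytes a row occupies in a Page.
    fn row_size(row: &[u8]) -> usize {
        row.len()
    }

    // Stand-in for the real insertion path, which updates page accounting.
    fn insert_row(table: &mut Table, row: &[u8]) {
        if table.pages.is_empty() {
            table.pages.push(Page { bytes_used_by_rows: 0, num_rows: 0 });
        }
        let page = table.pages.last_mut().unwrap();
        page.bytes_used_by_rows += row_size(row);
        page.num_rows += 1;
    }

    #[test]
    fn bytes_used_by_rows_matches_sum_of_row_sizes() {
        let rows: Vec<Vec<u8>> = vec![vec![0u8; 16], vec![1u8; 32], vec![2u8; 64]];
        let mut table = Table { pages: Vec::new() };
        for row in &rows {
            insert_row(&mut table, row);
        }
        // The aggregate metric should equal the sum of the size of each row.
        let expected: usize = rows.iter().map(|r| row_size(r)).sum();
        assert_eq!(table.bytes_used_by_rows(), expected);
    }
}
```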
@@ -644,6 +644,39 @@ impl CommittedState {
        let index = table.indexes.get(col_list)?;
        Some(&index.key_type)
    }

    pub(super) fn report_data_size(&self, database_identity: Identity) {
If this causes a performance issue because of time spent in `with_label_values`, we could store these metric handles in the `Table` and/or `ExecutionContext`, so reporting values would just be an atomic number operation.
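A sketch of that caching idea using the `prometheus` crate directly; the metric and label names below are assumptions derived from this PR's `spacetime_data_size` prefix, not necessarily the exact names in the codebase:

```rust
use once_cell::sync::Lazy;
use prometheus::{register_int_gauge_vec, IntGauge, IntGaugeVec};

// Assumed metric/label names, based on the prefix described in this PR.
static TABLE_BYTES_USED_BY_ROWS: Lazy<IntGaugeVec> = Lazy::new(|| {
    register_int_gauge_vec!(
        "spacetime_data_size_table_bytes_used_by_rows",
        "Bytes used by rows resident in this table",
        &["database", "table"]
    )
    .unwrap()
});

/// Handles cached on the `Table` (or `ExecutionContext`), so reporting skips
/// the `with_label_values` label lookup on every commit.
struct TableDataSizeMetrics {
    bytes_used_by_rows: IntGauge,
}

impl TableDataSizeMetrics {
    fn new(database: &str, table: &str) -> Self {
        Self {
            bytes_used_by_rows: TABLE_BYTES_USED_BY_ROWS
                .with_label_values(&[database, table]),
        }
    }

    /// Just an atomic store on the pre-resolved gauge.
    fn report(&self, bytes: u64) {
        self.bytes_used_by_rows.set(bytes as i64);
    }
}
```

The tradeoff is that cached handles would need to be dropped (and the labelled series removed) when a table is dropped, so stale gauges don't keep getting scraped.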
@@ -681,6 +681,10 @@ pub(super) fn record_metrics(
            .inc_by(deletes.len() as u64);
        }
    }

    if let Some(committed_state) = committed_state {
In the code immediately above this, we are also updating a bunch of stats (counters and table size gauges). We should probably unify those at some point.
Slow reconstructions of `num_rows` and `bytes_used_by_rows`. Still to follow: index usage reporting.
@jsdt I have written some unit proptests in
These tests look good, thanks for adding them. For the blob store, do you think recomputing it every time we report is going to be a performance issue? If so, it might be worth keeping track of it as modifications happen.
I would assume that the majority of modules have very small blob stores. If this turns out not to be the case, we could easily memoize it in the same way as this PR is doing for other measures.
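For illustration, a toy sketch of the two options discussed here (recompute on every report vs. maintain a running total); the types and field names are made up, not the real `BlobStore` API:

```rust
use std::collections::HashMap;

// Toy blob store keyed by a content hash; names are illustrative only.
struct BlobStore {
    blobs: HashMap<[u8; 32], Box<[u8]>>,
    /// Running total maintained on every modification (the memoized option).
    bytes_used_by_blobs: u64,
}

impl BlobStore {
    /// Option 1: recompute on every report. O(number of blobs), which stays
    /// cheap as long as blob stores remain small.
    fn bytes_used_by_blobs_recomputed(&self) -> u64 {
        self.blobs.values().map(|b| b.len() as u64).sum()
    }

    /// Option 2: keep the total up to date as modifications happen.
    fn insert(&mut self, hash: [u8; 32], blob: Box<[u8]>) {
        self.bytes_used_by_blobs += blob.len() as u64;
        if let Some(old) = self.blobs.insert(hash, blob) {
            // Replacing an existing blob: don't double-count it.
            self.bytes_used_by_blobs -= old.len() as u64;
        }
    }
}
```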
Description of Changes
This PR adds the ability to compute and report the number of rows in memory, and the number of bytes used by those rows.
Part of https://github.com/clockworklabs/SpacetimeDBPrivate/issues/1229.
Currently, reporting is accomplished by a new group of metrics, all of which are prefixed with `spacetime_data_size`. Specifically, for each database, we report:

- `blob_store_num_blobs`, the number of blobs in the `BlobStore`.
- `blob_store_bytes_used_by_blobs`, the number of bytes used by large blobs in the `BlobStore`.
  - A blob holding a string `str` is counted as `len(str)`.

For each table in each database, we report:
- `table_num_rows`, the number of rows in the table.
- `table_bytes_used_by_rows`, the number of bytes in `Page`s used by rows in the table.
  - The overhead of `Page`s is also not included here.
- `table_num_rows_in_indexes`, the number of rows in indexes in the table.
  - This should be equal to `table_size * num_indices`.
- `table_bytes_used_by_index_keys`, the bytes used to store keys in indexes in the table.
  - See the `KeySize` trait for a precise definition of this metric.

In this PR, the new metrics are reported when committing a mutable TX, as that's when their values change. However, it's not necessary to report them this often; unlike our existing metrics, they are not incremental. (Or rather, the incremental maintenance is confined to the `table` crate, and not visible to the `core` crate where they are read and reported.) It would be reasonable to report them every N transactions for some choice of N, or every t seconds for some choice of t, or in response to an external request, or in any number of other ways.
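As a hedged sketch of the "every N transactions" option (the constant, counter, and function names below are illustrative, not part of this PR):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Illustrative threshold; any N would do.
const REPORT_EVERY_N_COMMITS: u64 = 1024;

static COMMITS_SINCE_LAST_REPORT: AtomicU64 = AtomicU64::new(0);

/// Call on every mutable-TX commit; `report` stands in for the actual
/// data-size reporting and only runs every N commits.
fn maybe_report_data_size(report: impl FnOnce()) {
    let n = COMMITS_SINCE_LAST_REPORT.fetch_add(1, Ordering::Relaxed) + 1;
    if n >= REPORT_EVERY_N_COMMITS {
        COMMITS_SINCE_LAST_REPORT.store(0, Ordering::Relaxed);
        report();
    }
}
```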
API and ABI breaking changes
N/A, unless adding metrics breaks Prometheus in some way I don't understand.
Expected complexity level and risk
3: it would be unfortunate if we misreported these, since we intend to use them for billing, and the computations for some of the new metrics are non-trivial. It's also possible (I haven't checked) that computing and reporting these metrics will have meaningful overhead, causing a performance regression. That said, these changes are very unlikely to break any existing functionality.
Testing