Implemented database optimization routine #721
base: next
Some questions.
This optimises the query planner i.e. index selection and ordering. Do you think our queries are complex enough for this to be beneficial? Or put differently - we still need to create indices for it to choose from, but more often than not we'll only have a single index for it to choose from per query. This doesn't automatically create new indices.
There is also the issue of reproducibility:
Some developers prefer that once the design of an application is frozen, SQLite will always pick the same query plans as it did during development and testing. Then if millions of copies of the application are shipped to customers, the developers are assured that all of those millions of copies are running the same query plans regardless of what data the individual customers insert into their particular databases. This can help in reproducing complaints of performance problems coming back from the field.
To achieve this objective, never run a full ANALYZE nor the "PRAGMA optimize" command in the application. Rather, only run ANALYZE during development, manually using ...
This also does not shrink the database. That is done using VACUUM, but VACUUM can take a very long time on large databases and should not be run automatically. Database compaction should really only be done manually, on demand, because the node will have significant downtime while it runs.
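To make the trade-off above concrete, here is a minimal sketch of the policy being discussed, not code from this PR. The `Maintenance` enum, `allowed_in_background`, and `background_statements` are hypothetical names, and treating a full ANALYZE as manual-only is an assumption made for the sketch. Only the SQL strings themselves (`PRAGMA optimize`, `ANALYZE`, `VACUUM`) come from SQLite.

```rust
/// SQLite maintenance statements mentioned in this thread.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Maintenance {
    Optimize,
    Analyze,
    Vacuum,
}

impl Maintenance {
    /// The SQL text that would be sent to SQLite.
    fn sql(self) -> &'static str {
        match self {
            Maintenance::Optimize => "PRAGMA optimize;",
            Maintenance::Analyze => "ANALYZE;",
            Maintenance::Vacuum => "VACUUM;",
        }
    }

    /// Whether a periodic background task may run this statement.
    fn allowed_in_background(self, frozen_query_plans: bool) -> bool {
        match self {
            // Refreshing planner statistics can change query plans, so it is
            // only allowed when plan reproducibility is not required.
            Maintenance::Optimize => !frozen_query_plans,
            // Assumption: a full ANALYZE stays a manual, development-time step.
            Maintenance::Analyze => false,
            // Compaction means downtime on large databases: manual only.
            Maintenance::Vacuum => false,
        }
    }
}

/// Statements a periodic task would be allowed to issue under this policy.
fn background_statements(frozen_query_plans: bool) -> Vec<&'static str> {
    [Maintenance::Optimize, Maintenance::Analyze, Maintenance::Vacuum]
        .iter()
        .copied()
        .filter(|m| m.allowed_in_background(frozen_query_plans))
        .map(Maintenance::sql)
        .collect()
}
```

Under this sketch, `background_statements(false)` yields only the `PRAGMA optimize;` statement, and `background_statements(true)` yields nothing, which corresponds to the "frozen query plans" option described in the SQLite quote.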
I think, yes, we run pretty complex queries for account states computing with lots of
I didn't expect SQLite to create indices; I'm adding new indices in another PR as part of #712.
Yes, and we should probably discuss which is better: performance over reproducibility of performance issues, or reproducibility over performance. To be honest, I can't imagine how
Ah, my bad! I was confused by the documentation and thought that
Co-authored-by: Mirko <48352201+Mirko-von-Leipzig@users.noreply.github.com>
Code LGTM - just need to decide whether we want determinism.
I've had it make a bad decision once or twice where it picked the wrong ordering. In a way it's like any optimiser/compiler: sometimes it does go the wrong way. Mostly these issues were related to changes in data composition over time, e.g. more output notes per tx and block.
    info!(target: COMPONENT, "Starting database optimization");

    match self.state.optimize_db().await {
        Ok(_) => info!(target: COMPONENT, "Finished database optimization"),
        Err(err) => error!(target: COMPONENT, %err, "Database optimization failed"),
    }
}
Instead of using trace events to mark the start and end, prefer a span that covers the whole operation.
This essentially means we want a new root span to cover the self.state.optimize_db call.
-    info!(target: COMPONENT, "Starting database optimization");
-    match self.state.optimize_db().await {
-        Ok(_) => info!(target: COMPONENT, "Finished database optimization"),
-        Err(err) => error!(target: COMPONENT, %err, "Database optimization failed"),
-    }
-}
+    let root_span = tracing::info_span!("optimize_database", interval = self.optimization_interval.as_secs_f32());
+    self.state.optimize_db()
+        // Clone the span: `instrument` takes it by value, and the closure below still needs it.
+        .instrument(root_span.clone())
+        .inspect_err(|err| root_span.set_error(err))
+        .await;
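As a rough illustration of why a span is preferred over separate start/finish events, here is a standard-library-only sketch. It is not the `tracing` API and not the suggestion above: `SpanRecord` and `in_span` are hypothetical names that only mimic the idea of one record covering the whole call, carrying its duration and any error.

```rust
use std::fmt::Display;
use std::time::{Duration, Instant};

/// One record describing a whole operation (a stand-in for a tracing span).
#[derive(Debug)]
struct SpanRecord {
    name: &'static str,
    elapsed: Duration,
    error: Option<String>,
}

/// Runs `op` and returns its result together with a single record that
/// covers the call from start to finish.
fn in_span<T, E: Display>(
    name: &'static str,
    op: impl FnOnce() -> Result<T, E>,
) -> (Result<T, E>, SpanRecord) {
    let start = Instant::now();
    let result = op();
    let record = SpanRecord {
        name,
        // The duration comes from the record itself, so nobody has to
        // subtract the timestamps of two separate log lines.
        elapsed: start.elapsed(),
        // A failure is attached to the same record as the timing.
        error: result.as_ref().err().map(|e| e.to_string()),
    };
    (result, record)
}
```

A call such as `in_span("optimize_database", || run_optimize())`, where `run_optimize` is a placeholder for the real operation, returns both the result and one record holding the name, the elapsed time, and the error if there was one. That is the property the review comment asks for from a real root span.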
In general the store component still needs a telemetry make-over, but this seems like an important one to trace so that we can measure its actual performance impact.
Implemented database optimization routine:
This is the recommended approach according to the docs: https://www.sqlite.org/pragma.html#pragma_optimize
Such an optimization routine should help the query planner make better decisions about which index to use when it has a choice.