
Implemented database optimization routine #721

Open
wants to merge 4 commits into base: next

Conversation

@polydez (Contributor) commented Feb 26, 2025

Implemented database optimization routine:

  1. Full optimization with shrinking and analysis on node start.
  2. Limited index analysis in the background once a day (default value, configurable).

This is the approach recommended by the docs: https://www.sqlite.org/pragma.html#pragma_optimize

This optimization routine should help the query planner make better decisions about which index to use when it has a choice.
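For context, here is a minimal sketch of what these two steps can look like, assuming a rusqlite connection (the function names, the analysis_limit value, and the 0x10002 mask below are illustrative and not taken from this PR's diff):

use rusqlite::Connection;

/// Full optimization, run once on node start. The 0x10002 mask asks SQLite to
/// consider all tables rather than only the ones queried on this connection
/// (see the pragma docs linked above).
fn optimize_on_startup(conn: &Connection) -> rusqlite::Result<()> {
    conn.execute_batch("PRAGMA optimize = 0x10002;")
}

/// Cheap periodic optimization for the daily background task. `analysis_limit`
/// caps how many rows ANALYZE inspects per index so the call stays fast even
/// on large tables.
fn optimize_in_background(conn: &Connection) -> rusqlite::Result<()> {
    conn.execute_batch("PRAGMA analysis_limit = 400; PRAGMA optimize;")
}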

@polydez marked this pull request as ready for review February 26, 2025 17:36
@Mirko-von-Leipzig (Contributor) left a comment
Some questions.

This optimises the query planner, i.e. index selection and ordering. Do you think our queries are complex enough for this to be beneficial? Or put differently: we still need to create indices for it to choose from, and more often than not there will only be a single candidate index per query. This doesn't automatically create new indices.

There is also the issue of reproducibility:

Some developers prefer that once the design of an application is frozen, SQLite will always pick the same query plans as it did during development and testing. Then if millions of copies of the application are shipped to customers, the developers are assured that all of those millions of copies are running the same query plans regardless of what data the individual customers insert into their particular databases. This can help in reproducing complaints of performance problems coming back from the field.

To achieve this objective, never run a full ANALYZE nor the "PRAGMA optimize" command in the application. Rather, only run ANALYZE during development, manually using ...


This also does not shrink the database. That is done using VACUUM but it will take forever on large databases and should not be done automatically. Database compaction should really only be done manually on demand as the node will have significant downtime during it.
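To make those two alternatives concrete, here is a hypothetical sketch (again assuming rusqlite; neither helper is part of this PR) of the deterministic approach the docs describe and of manual compaction:

use rusqlite::Connection;

/// Deterministic-plans alternative from the quoted docs: run ANALYZE only from
/// development tooling, so the shipped statistics (sqlite_stat1) and therefore
/// the chosen query plans never change in production.
fn analyze_during_development(conn: &Connection) -> rusqlite::Result<()> {
    conn.execute_batch("ANALYZE;")
}

/// Manual, on-demand compaction. VACUUM rewrites the whole database file, so
/// it can take a long time on large databases and implies node downtime.
fn compact_database(conn: &Connection) -> rusqlite::Result<()> {
    conn.execute_batch("VACUUM;")
}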

@polydez (Contributor, Author) commented Feb 27, 2025

This optimises the query planner, i.e. index selection and ordering. Do you think our queries are complex enough for this to be beneficial?

I think so, yes: we run pretty complex queries for computing account states, with lots of ANDs, which require the query planner to choose which index to use.

Or put differently: we still need to create indices for it to choose from, and more often than not there will only be a single candidate index per query. This doesn't automatically create new indices.

I didn't expect SQLite to create indices; I'm adding new indices in another PR as part of #712.

There is also the issue of reproducibility:

Yes, and we should probably discuss which we prefer: performance, or reproducibility of performance issues. To be honest, I can't imagine how PRAGMA optimize could ruin performance.

This also does not shrink the database. That is done using VACUUM but it will take forever on large databases and should not be done automatically. Database compaction should really only be done manually on demand as the node will have significant downtime during it.

Ah, my bad! I was confused by the documentation and thought that the 0x10000 flag would cause the database to shrink. I checked this with a small PoC and now I agree: it doesn't shrink it.
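For reference, a hypothetical PoC along those lines (not the actual code used here; it assumes rusqlite and a throwaway database file) that shows PRAGMA optimize leaving the file size unchanged while VACUUM reclaims the space:

use rusqlite::Connection;

fn main() -> rusqlite::Result<()> {
    let path = "poc.sqlite3";
    let conn = Connection::open(path)?;

    // Create roughly 10 MB of data, then delete it so the file contains free pages.
    conn.execute_batch(
        "CREATE TABLE t(x BLOB);
         WITH RECURSIVE c(i) AS (SELECT 1 UNION ALL SELECT i + 1 FROM c LIMIT 10000)
         INSERT INTO t(x) SELECT randomblob(1024) FROM c;
         DELETE FROM t;",
    )?;
    let before = std::fs::metadata(path).unwrap().len();

    // Refreshes planner statistics only; the file size stays the same.
    conn.execute_batch("PRAGMA optimize;")?;
    let after_optimize = std::fs::metadata(path).unwrap().len();

    // Rewrites the file and actually reclaims the free pages.
    conn.execute_batch("VACUUM;")?;
    let after_vacuum = std::fs::metadata(path).unwrap().len();

    println!("before: {before}, after optimize: {after_optimize}, after vacuum: {after_vacuum}");
    Ok(())
}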

polydez and others added 2 commits February 27, 2025 16:33
Co-authored-by: Mirko <48352201+Mirko-von-Leipzig@users.noreply.github.com>
@Mirko-von-Leipzig (Contributor) left a comment

Code LGTM - just need to decide whether we want determinism.

@Mirko-von-Leipzig (Contributor) commented

There is also the issue of reproducibility:

Yes, and we should probably discuss which we prefer: performance, or reproducibility of performance issues. To be honest, I can't imagine how PRAGMA optimize could ruin performance.

I've had it make a bad decision once or twice where it picked the wrong ordering. In a way it's like any optimiser/compiler: sometimes it does go the wrong way. Mostly these issues were related to changes in data composition over time, e.g. more output notes per tx and block.

Comment on lines +22 to +28
info!(target: COMPONENT, "Starting database optimization");

match self.state.optimize_db().await {
Ok(_) => info!(target: COMPONENT, "Finished database optimization"),
Err(err) => error!(target: COMPONENT, %err, "Database optimization failed"),
}
}
@Mirko-von-Leipzig (Contributor) commented Mar 3, 2025

Instead of using trace events to mark the start and end times, prefer covering the operation with a span.

This essentially means we want a new root span to cover the self.state.optimize_db call.

Suggested change
info!(target: COMPONENT, "Starting database optimization");
match self.state.optimize_db().await {
Ok(_) => info!(target: COMPONENT, "Finished database optimization"),
Err(err) => error!(target: COMPONENT, %err, "Database optimization failed"),
}
}
// Requires the `tracing::Instrument` and `futures::TryFutureExt` traits in scope;
// `set_error` is assumed to be the project's span extension for recording errors.
let root_span = tracing::info_span!("optimize_database", interval = self.optimization_interval.as_secs_f32());
self.state
    .optimize_db()
    // Clone the span: it is moved into `instrument` but still used in `inspect_err`.
    .instrument(root_span.clone())
    .inspect_err(|err| root_span.set_error(err))
    .await;

In general the store component still needs a telemetry make-over, but this seems like an important one to trace so that we can see the actual performance impact of it.
