chore: iavl/v2 alpha6 [DNM] #1043
base: master
Conversation
**Important**

Review skipped: draft detected. Please check the settings in the CodeRabbit UI. This status message can be disabled in the CodeRabbit configuration.

**Walkthrough**

This pull request introduces major updates across the IAVL v2 codebase. The documentation has been completely rewritten to detail new features such as Checkpoints, Leaf Changelog, Replay, Sharding, BTree on Disk, Cache, and Pruning. Several CLI commands (bench, rollback, scan, snapshot) are enhanced with refactored logging (using zerolog) and improved error handling. Core methods for exporting/importing nodes, iterator functionality, and cryptographic proof generation have been updated. Furthermore, the SQL layer is reworked with new connection factories and pruning routines, dependencies and Go versions are updated, and obsolete files are removed.

**Changes**
**Sequence Diagram(s)**

```mermaid
sequenceDiagram
    participant User
    participant CLI
    participant BenchCmd
    participant TreeInitializer
    participant MetricsProxy
    User->>CLI: Execute "iavl bench std"
    CLI->>BenchCmd: Parse flags and subcommand
    BenchCmd->>TreeInitializer: Initialize tree options and optionally load snapshot
    TreeInitializer-->>BenchCmd: Return tree instance
    BenchCmd->>MetricsProxy: Initialize Prometheus metrics proxy & start HTTP server (/metrics)
    BenchCmd->>User: Display benchmark results
```

```mermaid
sequenceDiagram
    participant Tree
    participant Exporter
    participant Importer
    participant Storage
    Tree->>Exporter: Export(tree, version, order)
    Exporter->>Storage: Iterate nodes (using connectionFactory)
    Storage-->>Exporter: Return node data or error
    Exporter-->>Tree: Provide export data/results
    Tree->>Importer: Import(version) with exported nodes
    Importer->>Storage: Batch write nodes to storage and flush
    Importer-->>Tree: Commit import and load new version
```
Actionable comments posted: 10
🔭 Outside diff range comments (2)
v2/internal/encoding.go (2)

100-112: ⚠️ Potential issue — update function visibility to unexported.

According to the PR objectives, this function should be unexported. Please rename `EncodeBytesSlice` to `encodeBytesSlice` to match the intended visibility.

```diff
-func EncodeBytesSlice(bz []byte) ([]byte, error) {
+func encodeBytesSlice(bz []byte) ([]byte, error) {
```
100-117: 💡 Verification agent — 🧩 Analysis chain

Verify no external usage of these functions. Let's verify that these functions are not used outside the `encoding` package before making them unexported.

🏁 Script executed:

```bash
#!/bin/bash
# Description: Check for external usage of EncodeBytesSlice and EncodeBytesSize functions

echo "Searching for EncodeBytesSlice usage..."
rg --type go "EncodeBytesSlice" --glob '!v2/internal/encoding.go'

echo "Searching for EncodeBytesSize usage..."
rg --type go "EncodeBytesSize" --glob '!v2/internal/encoding.go'
```

Length of output: 1077
External references found for `EncodeBytesSize`

- `EncodeBytesSlice`: no usage was detected outside its own file, so making it unexported appears safe.
- `EncodeBytesSize`: this function is referenced in external files (e.g., `fastnode/fast_node.go`, `node.go`, and `mutable_tree_test.go`). Changing its visibility to unexported would break existing usages outside the `encoding` package.
🧹 Nitpick comments (55)
v2/pool.go (1)

10-10: Follow Go naming conventions for acronyms. Rename `poolId` to `poolID` so that initialisms are consistently capitalized, per Go's standard naming conventions.

```diff
- poolId uint64
+ poolID uint64
```

v2/range.go (4)
27-34: Consider adding a return value to indicate removal success or failure. Currently, this method silently returns when it finds and removes the first occurrence of `version`. It might be helpful to return a boolean (or an `error`) to indicate whether a matching version was found. Additionally, if `versions` is typically large, consider using binary search to locate the index, leveraging the sorted ordering.
76-83: Consider using binary search to avoid linear complexity. For large `versions` slices, searching linearly for the first element greater than `version` can be time-consuming. A binary search could find the cutoff index in O(log n), especially if `TruncateTo` is called frequently.
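The suggestion above can be sketched with the standard library's `sort.Search`. This is a hypothetical stand-in, not the actual `VersionRange` implementation: `truncateTo` keeps only the versions less than or equal to `v`, finding the cutoff in O(log n).

```go
package main

import (
	"fmt"
	"sort"
)

// truncateTo is an illustrative sketch: keep only versions <= v, locating
// the cutoff with sort.Search instead of a linear scan. Assumes the slice
// is sorted ascending, as the review comment notes.
func truncateTo(versions []int64, v int64) []int64 {
	// idx is the first index whose element is strictly greater than v.
	idx := sort.Search(len(versions), func(i int) bool { return versions[i] > v })
	return versions[:idx]
}

func main() {
	vs := []int64{1, 10, 100, 200, 300}
	fmt.Println(truncateTo(vs, 150)) // [1 10 100]
}
```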
85-97: Ensure that the loop in `FindRecent` cannot degrade performance. By repeatedly calling `FindPrevious` and then decrementing by 1, the method could iterate many times across large gaps. If performance is critical, consider a strategy that uses binary search to jump multiple versions, or an approach that tracks the appropriate index in a single pass.
135-140: Clarify the sentinel value. When the slice is empty, returning `-1` from `First()` is a common sentinel approach. Document this so that callers know to handle `-1` appropriately.

v2/range_test.go (2)
17-17: Explain the role of the new `recent` field in documentation. `recent` represents the expected result for `FindRecent`. A short comment in the test struct definition would make the test intention clearer for future maintainers.
66-74: Test multiple distance values for `FindRecent`. These two cases demonstrate only `recent: -1` and `recent: 301`. Consider adding more diverse scenarios that test varying distances (the second parameter to `FindRecent`) and confirm coverage of corner cases (e.g., exactly equal to or just over the limit).

v2/export.go (3)
37-38: Channel lifecycle and error handling. The channels `exporter.out` and `exporter.errCh` are never explicitly closed on error. If the exporter's user stops calling `Next()`, the goroutine may block attempting to send errors. Consider closing `errCh` once traversal completes or an error occurs, ensuring consistent, clean shutdowns.

Also applies to: 42-44, 48-48
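A minimal sketch of the close-on-completion pattern described above, with hypothetical names (not the actual exporter API): the producing goroutine owns both channels, closes them when traversal ends, and uses a buffered error channel so the send never blocks even if the consumer has walked away.

```go
package main

import (
	"errors"
	"fmt"
)

// traverse illustrates the pattern: the goroutine closes out and errCh on
// exit, so a consumer that stops reading never leaves the producer blocked.
func traverse(items []int, failAt int) (<-chan int, <-chan error) {
	out := make(chan int)
	errCh := make(chan error, 1) // buffered: the single error send never blocks
	go func() {
		defer close(out)
		defer close(errCh)
		for _, it := range items {
			if it == failAt {
				errCh <- errors.New("traversal error")
				return
			}
			out <- it
		}
	}()
	return out, errCh
}

func main() {
	out, errCh := traverse([]int{1, 2, 3}, -1)
	for v := range out {
		fmt.Println(v)
	}
	// errCh is closed after traversal; receiving yields nil on success.
	if err := <-errCh; err == nil {
		fmt.Println("done")
	}
}
```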
74-93: Consistent naming of parameters. The `preOrderNext` method uses the parameter name `cf` similarly to `postOrderNext`; this consistency helps readability. Ensure that future extensions maintain these naming conventions.
95-108: Panic on nil node might obscure errors. Panicking on a `nil` node could complicate debugging if the underlying cause is an unexpected SQL or synchronization issue. Consider returning an explicit error instead of panicking, so that upstream callers can handle it gracefully.

```diff
-	if node == nil {
-		panic("nil node")
-	}
-	return node, nil
+	if node == nil {
+		return nil, fmt.Errorf("encountered nil node in exporter")
+	}
+	return node, nil
```

v2/sqlite_connection.go (5)
56-63: Graceful connection teardown. `close()` iterates all tracked connections and closes them. This is adequate for read-only use. If concurrency is possible, document that external synchronization or usage patterns are required to avoid mid-query closures.
71-74: Creating a hot connection factory. `newHotConnectionFactory` sets up an empty map for shard connections. Consider doc comments clarifying that `hub` remains open until `close()` is called on the returned factory.
76-81: Error from unregistered version. `make(version int64)` returns an error if no shard connection is available. This is straightforward, but consider whether auto-initialization or a more descriptive error message (e.g., “missing shard for version=...”) would be appropriate.
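A sketch of the more descriptive error suggested above. The types and fields here are illustrative stand-ins, not the real `hotConnectionFactory`:

```go
package main

import "fmt"

// hotConnectionFactory is a hypothetical miniature of the reviewed type;
// shards stands in for the per-shard connection map.
type hotConnectionFactory struct {
	shards map[int64]bool
}

// make returns a descriptive error naming the missing version, so the
// failure is actionable rather than a bare "no connection" message.
func (f *hotConnectionFactory) make(version int64) error {
	if !f.shards[version] {
		return fmt.Errorf("missing shard for version=%d; attach it before querying", version)
	}
	return nil
}

func main() {
	f := &hotConnectionFactory{shards: map[int64]bool{1: true}}
	fmt.Println(f.make(2))
}
```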
83-108: Shard attach logic. Attaching a new shard using a single main connection can be efficient. However, note that preparing statements for the attached shard might be expensive. If shards are frequently attached and detached, consider a more dynamic approach or caching.
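The caching idea can be sketched as a memoized prepare step. This is an assumption-laden miniature (no real SQLite involved); `execSQL`-style details are elided and `stmt` merely represents a prepared statement handle:

```go
package main

import "fmt"

// stmt stands in for a prepared statement handle.
type stmt struct{ sql string }

// shardConn caches prepared statements by SQL text, so re-attaching a
// recently used shard skips the expensive re-prepare the review mentions.
type shardConn struct {
	prepared map[string]*stmt
	prepares int // counts actual (non-cached) prepares, for illustration
}

func (c *shardConn) prepare(sql string) *stmt {
	if s, ok := c.prepared[sql]; ok {
		return s // cache hit: no re-prepare
	}
	c.prepares++
	s := &stmt{sql: sql}
	c.prepared[sql] = s
	return s
}

func main() {
	c := &shardConn{prepared: map[string]*stmt{}}
	c.prepare("SELECT * FROM leaf WHERE version = ?")
	c.prepare("SELECT * FROM leaf WHERE version = ?") // cache hit
	fmt.Println(c.prepares)
}
```

A real cache would also need an eviction/close path so detached shards release their statements.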
127-137: Close logic with open shards. `(f *hotConnectionFactory) close()` carefully closes each shard's statements but doesn't explicitly detach them from the database before closing `main`. This could leave behind attached shards if the database handles remain persistent. If desired, consider an explicit detach call for each open shard.

v2/metrics/metrics.go (3)
48-83: Key-based counter increments. `IncrCounter` routes metrics based on string keys. This is flexible but might mask typos (e.g., a misspelled key is silently ignored). Consider logging or failing on unknown keys to catch potential errors early.
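A sketch of the guarded key-switch suggested above. The counter fields and key names are hypothetical; the point is only that the `default` branch surfaces unknown keys instead of dropping them:

```go
package main

import "fmt"

// counters is an illustrative stand-in for a string-keyed metrics struct.
type counters struct {
	dbGet, dbWrite int64
	unknown        []string // collected typos/unroutable keys
}

func (c *counters) IncrCounter(val int64, keys ...string) {
	if len(keys) == 0 {
		return
	}
	switch keys[0] {
	case "db_get":
		c.dbGet += val
	case "db_write":
		c.dbWrite += val
	default:
		// Surface the key rather than silently ignoring it; a real
		// implementation might emit a zerolog warning here instead.
		c.unknown = append(c.unknown, keys[0])
	}
}

func main() {
	c := &counters{}
	c.IncrCounter(1, "db_get")
	c.IncrCounter(1, "db_gett") // typo is recorded, not lost
	fmt.Println(c.dbGet, c.unknown)
}
```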
87-102: Measuring durations for DB operations. `MeasureSince` handles timing metrics effectively. Note that if future keys arise (e.g., “db_read” vs. “db_get”), they'll be silently ignored. As with `IncrCounter`, consider logging unknown keys.
162-165: Skipping durations > 50µs. Filtering out durations above 50 microseconds may skew the histogram if real-world latencies often exceed that threshold. If that's intentional, consider clarifying the reason in code comments.
v2/import.go (2)
157-191: Validate node assembly logic for out-of-order or invalid inputs. While the stack-based approach for building parent nodes appears correct for post-order (LRN) imports, consider adding validation for out-of-order or malformed sequences (e.g., verifying subtree height consistency). This helps fail early when importing invalid node sequences.
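One possible shape for the height-consistency check, using hypothetical types (not the actual importer's): when a branch node arrives in a post-order stream, its two children sit on top of the stack, and its height should be one more than the taller child.

```go
package main

import (
	"errors"
	"fmt"
)

// importNode is an illustrative stand-in for an imported AVL node.
type importNode struct {
	height int8
	isLeaf bool
}

func maxHeight(a, b int8) int8 {
	if a > b {
		return a
	}
	return b
}

// validateStack checks a branch node against the two children currently on
// the post-order stack, failing early on malformed sequences.
func validateStack(stack []*importNode, parent *importNode) error {
	if parent.isLeaf {
		return nil
	}
	if len(stack) < 2 {
		return errors.New("branch node without two children on stack")
	}
	l, r := stack[len(stack)-2], stack[len(stack)-1]
	want := maxHeight(l.height, r.height) + 1
	if parent.height != want {
		return fmt.Errorf("height mismatch: node has %d, children imply %d", parent.height, want)
	}
	return nil
}

func main() {
	leafA := &importNode{height: 0, isLeaf: true}
	leafB := &importNode{height: 0, isLeaf: true}
	branch := &importNode{height: 1}
	fmt.Println(validateStack([]*importNode{leafA, leafB}, branch))
}
```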
209-224: Handle empty or multi-root imports more gracefully. Currently, the commit logic errors out if the stack size is not exactly one. If valid use cases ever require importing an empty dataset or multiple roots, consider a more flexible approach. Otherwise, document these constraints clearly.

v2/sqlite_batch.go (2)
13-16: Review the necessity of four new fields in `sqliteBatch`. Adding multiple fields (`queue`, `version`, `storeLatestLeaves`, and `conn`) simplifies database interactions but ties the batch strongly to a single version and queue. This design is fine if there's only ever one queue/connection per batch. Otherwise, consider scoping these fields to smaller helper structs to keep the batch more generic.
222-225: Check potential indexing overhead with frequent CREATE INDEX calls. Regularly recreating indexes after each batch could impose overhead. If the index already exists, this is a no-op, but if the code runs in tight loops, consider creating the index once, outside of high-frequency operations.

v2/iterator.go (1)
46-47: Avoid mixing database logic too deeply into the iterator. By directly storing `sql *SqliteDb` and `cf connectionFactory` in `TreeIterator`, the iterator depends on the database layer. For a more modular design, consider keeping iteration logic separate from low-level DB connections, possibly via an abstraction layer.

v2/sqlite_prune.go (2)
26-36: Verify and address TODO items. The lower-priority to-dos (e.g., count leaf orphans, ensure correct orphan joins) may affect correctness of partial or advanced pruning scenarios. Consider adding reference tickets or tasks to track these if they are still relevant.
66-85: Caution: `PRAGMA synchronous=OFF` may compromise durability. While this boosts performance for batch operations, it also increases the risk of data corruption on unexpected system failures. Verify that this trade-off is acceptable for your production needs.

v2/sqlite_test.go (2)
49-50: Add explicit error-checking or a fallback for large data inserts. This test inserts large batches of nodes (up to 600k). Consider verifying disk space, writing in smaller bursts, or skipping if resources are limited.

Also applies to: 63-65
246-300: Concurrent queries can panic. `Test_ConcurrentQuery` is also skipped. If the concurrency issues here arise from shared statement usage, consider using separate statements per goroutine or a transaction-limited scope.

v2/proof.go (2)
160-234: Chunk commits more aggressively if needed. Inside `pruneShard`, you commit every 200k rows. If data sets become much larger, consider smaller batch commits or ephemeral transactions to reduce lock durations on large-scale prunes.
278-341: Consider indexing orphan columns. `orphanJoins` does repeated full scans for orphans. If performance becomes a bottleneck, build indexes on `(version, sequence)` in the orphan tables or chunk the queries.

v2/multitree.go (3)
45-94: Shift to global logger in `ImportMultiTree`. The function now uses `log.Info()` instead of a passed-in `logger`. This is fine, but if logging control or customization is needed, consider an optional logger in `TreeOptions` or a similar pattern.
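A minimal sketch of the optional-logger pattern: the logger rides along in the options struct and defaults to a no-op, so callers that don't care keep the simple behavior. All names here are illustrative, not the actual iavl/v2 API (which uses zerolog globally).

```go
package main

import "fmt"

// Logger is a deliberately tiny interface for the sketch.
type Logger interface {
	Info(msg string)
}

type nopLogger struct{}

func (nopLogger) Info(string) {}

// TreeOptions carries an optional Logger; nil means "no logging".
type TreeOptions struct {
	Logger Logger
}

// logger resolves the effective logger, defaulting to a no-op.
func (o TreeOptions) logger() Logger {
	if o.Logger == nil {
		return nopLogger{}
	}
	return o.Logger
}

type printLogger struct{}

func (printLogger) Info(msg string) { fmt.Println("INFO:", msg) }

func main() {
	def := TreeOptions{}
	def.logger().Info("dropped silently") // no-op default: no output
	custom := TreeOptions{Logger: printLogger{}}
	custom.logger().Info("import finished")
}
```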
96-130: Path concatenation for cross-platform support. Using `mt.rootPath + "/" + storeKey` might break on Windows. Prefer `filepath.Join(mt.rootPath, storeKey)` to ensure platform-agnostic paths.

```diff
-	Path: mt.rootPath + "/" + storeKey,
+	Path: filepath.Join(mt.rootPath, storeKey),
```
308-446: Large test build method. `TestBuild` is extensive, printing progress and metrics. Consider breaking the inline functions (`report`, memory-usage retrieval, iteration logic) into smaller helpers, or using `t.Logf()` instead of `fmt.Printf` for better test integration.

v2/tree_test.go (1)
290-413: Hard-coded temp directory in `Test_Replay`. Using `"/tmp/replay"` can lead to conflicts in parallel tests or ephemeral environments. Prefer `t.TempDir()` or `os.MkdirTemp` to avoid collisions.

```diff
-	tmpDir := "/tmp/replay"
+	tmpDir := t.TempDir()
```

v2/node.go (2)
76-83: Renamed fields `evict` & `poolID`. Consolidating fields from `dirty` to `evict` is fine if it clarifies the node lifecycle. The changed `String()` method looks consistent, but please confirm that `evict` usage is documented.
99-145: `getLeftNode` and `getRightNode` panic on error. Currently, if there's an error from SQL retrieval, the code calls `panic(err)`. This can be delicate when you want robust error handling. Consider returning the error upward if the rest of the code can handle it gracefully.

v2/snapshot.go (3)
41-52: Consider wrapping connection handling in a small helper for clarity. In `Snapshot(ctx, tree)`, you explicitly open and close the root connection via `rootConnection()`. While the defer-based cleanup is correct (lines 47-51), you might wrap this logic in a helper method to reduce repetition and potential leaks in other snapshots. It will also make error handling more uniform.
259-271: Write checkpoint is not yet implemented. Lines 269-271 return an error if `WriteCheckpoint` is set. If this feature is on your roadmap, consider adding a TODO or removing the flag until implemented, to avoid confusion among consumers of this API.
501-513: Placeholder panics for unimplemented logic. Within `flush()` (lines 511-513) you have `panic("writeTree not implemented")`. It's helpful to keep placeholders explicit, but consider either removing them or converting them into errors once development is underway, to avoid unexpected runtime panics in production.

v2/tree.go (5)
8-9: Avoid direct usage of `os` within library code if possible. Here you import `"os"` in lines 8-9. While this is standard in application-level code, in a library context you might consider injecting paths and environmental details from the outside to keep the library more portable. This is not strictly an error, just a design preference.
267-272: Evaluate checkpoint triggers in high-throughput scenarios. Lines 267-272 conditionally set `shouldCheckpoint`. In a busy environment with frequent writes, you might see many near-simultaneous triggers, or large memory usage if intervals are too long. Consider additional safeguards or making this configurable at runtime.
295-303: Improve error context in SaveRoot. Here (lines 297-303), if `SaveRoot` fails, the error message is generic: `"save root failed"`. For debugging in production, including the version number or additional context (e.g., version, path) would make logs more descriptive.
316-326: Pruning concurrency risk. Lines 316-326 check and begin pruning within `versionLock.Lock()`. Pruning can be expensive; ensure it doesn't hold locks for an extended period or block other essential version updates. Consider offloading the final prune step to a background routine if performance tests reveal blocking issues.
406-460: Full-tree traversal for checkpoint eviction. In `deepHash` (lines 406-460), you traverse the entire tree to compute hashes and handle eviction. This is functionally sound but might be time-consuming for large trees. Profiling might reveal whether chunked or layered eviction is beneficial, especially under heavy load.

v2/sqlite.go (4)
8-9: Use consistent path handling. You use `filepath` at line 8 while also frequently concatenating paths with string operations. To avoid subtle bugs (especially on Windows), build all paths with `filepath.Join` where feasible.
272-319: Shard creation approach duplicates schema logic. Lines 272-319 create a tree shard DB with repeated statements for `tree`, `leaf`, `orphan`, etc. If you expect to update the schema in the future, consider centralizing or generating these statements to avoid duplication across multiple creation points.
483-502: Graceful shutdown while pruning. Lines 483-502 block `Close()` until pruning completes or times out after 30s. That's a good safeguard but could still leave potential orphan states. Confirm the rest of the system can handle partial or interrupted prune states if the 30s window is insufficient.
919-1040: Changelog replay correctness checks. The `replayChangelog` method (lines 919-1040) is thorough, verifying sequences and comparing the final `rootHash` to `targetHash`. This is a solid safety net. Just be mindful that very large replays could be slow; consider partial or streaming validations if performance becomes an issue.

v2/cmd/rollback/rollback.go (1)
26-26: Consider adding context to error logs. While the error handling is solid, consider enriching error logs with additional context using zerolog's structured logging capabilities.

Example enhancement:

```diff
-log.Info().Msgf("revert db %s to version %d", dbPath, version)
+log.Info().
+	Str("db_path", dbPath).
+	Int("target_version", version).
+	Msg("reverting database")
```

Also applies to: 32-32
v2/cmd/snapshot/snapshot.go (1)
33-69
: Consider using errgroup for concurrent operations.The current implementation using channels works but could be simplified using
sync/errgroup
for better error handling and goroutine management.Example refactor:
-var ( - pool = iavl.NewNodePool() - done = make(chan struct{}) - errors = make(chan error) - cnt = 0 -) +import "golang.org/x/sync/errgroup" + +var ( + pool = iavl.NewNodePool() + g, ctx = errgroup.WithContext(cmd.Context()) +) for _, path := range paths { - cnt++ sqlOpts := iavl.SqliteDbOptions{Path: path} sql, err := iavl.NewSqliteDb(pool, sqlOpts) if err != nil { return err } tree := iavl.NewTree(sql, pool, iavl.TreeOptions{}) if err = tree.LoadVersion(version); err != nil { return err } - go func() { - snapshotErr := sql.Snapshot(cmd.Context(), tree) - if snapshotErr != nil { - errors <- snapshotErr - } - snapshotErr = sql.Close() - if snapshotErr != nil { - errors <- snapshotErr - } - done <- struct{}{} - }() + g.Go(func() error { + if err := sql.Snapshot(ctx, tree); err != nil { + return err + } + return sql.Close() + }) } -for i := 0; i < cnt; i++ { - select { - case <-done: - continue - case err := <-errors: - return err - } -} -return nil +return g.Wait()v2/import_test.go (1)
37-59: Consider extracting export logic into a helper function. The export logic is complex and could be reused. Extracting it into a helper would improve readability and maintainability.

Example:

```go
func exportTrees(t *testing.T, multiTree *MultiTree) map[string][]*Node {
	exported := make(map[string][]*Node)
	for sk, tree := range multiTree.Trees {
		nodes, err := exportTree(t, tree, sk)
		require.NoError(t, err)
		exported[sk] = nodes
	}
	for sk, nodes := range exported {
		require.Equal(t, int64(len(nodes)), (multiTree.Trees[sk].root.size*2)-1)
	}
	return exported
}

func exportTree(t *testing.T, tree *Tree, sk string) ([]*Node, error) {
	var nodes []*Node
	exporter, err := tree.Export(tree.Version(), PostOrder)
	if err != nil {
		return nil, err
	}
	defer exporter.Close()
	for {
		n, err := exporter.Next()
		if errors.Is(err, ErrorExportDone) {
			return nodes, nil
		}
		if err != nil {
			return nil, err
		}
		if n == nil {
			t.Errorf("nil node for %s", sk)
		}
		nodes = append(nodes, n)
	}
}
```

v2/cmd/bench/bench.go (1)
124-125: Consider implementing the empty metric methods. The `IncrCounter` and `MeasureSince` methods are empty. Consider implementing them or adding comments explaining why they're intentionally empty.

Also applies to: 137-137
v2/README.md (4)
3-4: Consider adding benchmark details to support performance claims. The claim of a "10-20x improvement in throughput" is significant. Consider adding:

- links to benchmark results or test data
- specific scenarios where these improvements are observed
- performance comparison tables
43-44: Fix grammatical error and enhance the BTree explanation. There's a grammatical error, and the BTree explanation could be more detailed.

Apply this diff:

```diff
-would be to lay out nodes in subtrees chunks in the same format as the in-memory AVL tree. A B+Tree is a
+would be to lay out nodes in subtrees chunks in the same format as the in-memory AVL tree. A B+Tree is an
 as close approximation to this as possible.
```

Consider adding:

- B+Tree vs AVL tree performance characteristics
- specific SQLite configuration recommendations
- memory/disk trade-offs

🧰 Tools

🪛 LanguageTool

[misspelling] ~43-~43: Use “an” instead of ‘a’ if the following word starts with a vowel sound, e.g. ‘an article’, ‘an hour’.
Context: ...as the in-memory AVL tree. A B+Tree is a as close an approximation to this as po...(EN_A_VS_AN)
50-51: Clarify the invalidated-ratio explanation. The explanation of how the ratio correlates to disk size is complex and could be clearer.

Consider restructuring as:

```diff
-- invalidated ratio: the ratio of invalidated nodes to total nodes in a shard that triggers a
-  pruning event. The default is 1.5. Roughly correleates to disk size of a complete tree, where (2 * ratio) is the size of the pre preuned, tree on disk. A ratio of 1.5 means that 3x the initial size should be provisioned.
+- invalidated ratio: The ratio of invalidated nodes to total nodes in a shard that triggers a
+  pruning event. Default: 1.5
+
+  Disk Size Impact:
+  - Pre-pruned tree size = 2 * ratio * initial size
+  - Example: With ratio 1.5, provision 3x the initial tree size
```
58-59: Add a missing comma in the pruning event description, for better readability.

Apply this diff:

```diff
-On prune the latest shard is locked (readonly) and a new shard is created. The new shard is now
+On prune, the latest shard is locked (readonly) and a new shard is created. The new shard is now
```

🧰 Tools

🪛 LanguageTool

[uncategorized] ~58-~58: Possible missing comma found.
Context: ...ual to the requested prune version. On prune the latest shard is locked (readonly) a...(AI_HYDRA_LEO_MISSING_COMMA)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

⛔ Files ignored due to path filters (2)

- `v2/cmd/gen/gen.go` is excluded by `!**/gen/**`
- `v2/go.sum` is excluded by `!**/*.sum`
📒 Files selected for processing (35)

- `v2/README.md` (1 hunks)
- `v2/cmd/bench/bench.go` (1 hunks)
- `v2/cmd/rollback/rollback.go` (2 hunks)
- `v2/cmd/root.go` (2 hunks)
- `v2/cmd/scan/scan.go` (3 hunks)
- `v2/cmd/snapshot/snapshot.go` (2 hunks)
- `v2/export.go` (2 hunks)
- `v2/go.mod` (2 hunks)
- `v2/import.go` (1 hunks)
- `v2/import_test.go` (1 hunks)
- `v2/internal/encoding.go` (2 hunks)
- `v2/iterator.go` (10 hunks)
- `v2/iterator_test.go` (2 hunks)
- `v2/logger.go` (0 hunks)
- `v2/metrics/metrics.go` (4 hunks)
- `v2/migrate/go.mod` (2 hunks)
- `v2/migrate/v0/migrate_v0.go` (1 hunks)
- `v2/multitree.go` (13 hunks)
- `v2/node.go` (12 hunks)
- `v2/pool.go` (3 hunks)
- `v2/pool_test.go` (0 hunks)
- `v2/proof.go` (1 hunks)
- `v2/range.go` (3 hunks)
- `v2/range_test.go` (3 hunks)
- `v2/snapshot.go` (16 hunks)
- `v2/sqlite.go` (20 hunks)
- `v2/sqlite_batch.go` (11 hunks)
- `v2/sqlite_connection.go` (1 hunks)
- `v2/sqlite_metadata.go` (1 hunks)
- `v2/sqlite_prune.go` (1 hunks)
- `v2/sqlite_test.go` (7 hunks)
- `v2/sqlite_writer.go` (0 hunks)
- `v2/testutil/util.go` (3 hunks)
- `v2/tree.go` (19 hunks)
- `v2/tree_test.go` (8 hunks)
💤 Files with no reviewable changes (3)
- v2/pool_test.go
- v2/logger.go
- v2/sqlite_writer.go
🧰 Additional context used
🪛 golangci-lint (1.62.2)
v2/cmd/rollback/rollback.go
27-27: undefined: iavl
(typecheck)
v2/cmd/bench/bench.go
48-48: undefined: iavl
(typecheck)
58-58: undefined: iavl
(typecheck)
60-60: undefined: iavl
(typecheck)
62-62: undefined: iavl
(typecheck)
65-65: undefined: iavl
(typecheck)
🪛 LanguageTool
v2/README.md
[misspelling] ~43-~43: Use “an” instead of ‘a’ if the following word starts with a vowel sound, e.g. ‘an article’, ‘an hour’.
Context: ...as the in-memory AVL tree. A B+Tree is a as close an approximation to this as po...
(EN_A_VS_AN)
[uncategorized] ~58-~58: Possible missing comma found.
Context: ...ual to the requested prune version. On prune the latest shard is locked (readonly) a...
(AI_HYDRA_LEO_MISSING_COMMA)
🔇 Additional comments (79)
v2/pool.go (3)
13-22: LGTM! The constructor is well-implemented, using `sync.Pool` correctly with a proper `New` function for creating Node instances.

35-37: Consider the performance implications of not reusing nodes. The `Get()` method creates new Node instances instead of reusing them from the pool. This could increase memory allocation and GC pressure, especially under high load. Could you clarify whether this is an intentional design choice for specific performance reasons? The PR objectives mention performance improvements, but creating new nodes instead of reusing them seems counterintuitive.
56-56: Clarify the purpose of the no-op Put method. The relationship between `Put()` and `syncPut()` is unclear. Having a no-op `Put()` method alongside a functional `syncPut()` could confuse maintainers. Could you provide documentation explaining:

- Why are both methods needed?
- When should each method be used?
- Are there specific scenarios where ignoring node cleanup (no-op Put) is safe?
v2/range.go (3)
36-54: Validate off-by-one cases in the binary search logic. The binary search approach for finding the next version appears logically correct. However, be mindful of off-by-one scenarios: if `version` is just one less than the first element in `vs`, the result should be the first element. Ensure thorough tests for boundary cases such as an empty array or queries that exactly match the first or last version.
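A compact way to pin down those boundary cases, using a hypothetical `findNext` stand-in (smallest stored version >= v, or -1), not the actual implementation under review:

```go
package main

import (
	"fmt"
	"sort"
)

// findNext is an illustrative stand-in: the smallest version >= v, or -1.
func findNext(versions []int64, v int64) int64 {
	i := sort.Search(len(versions), func(i int) bool { return versions[i] >= v })
	if i == len(versions) {
		return -1
	}
	return versions[i]
}

func main() {
	vs := []int64{10, 20, 30}
	// The boundary cases worth asserting in tests:
	fmt.Println(findNext(nil, 5)) // -1: empty slice
	fmt.Println(findNext(vs, 9))  // 10: one below the first element
	fmt.Println(findNext(vs, 20)) // 20: exact match
	fmt.Println(findNext(vs, 31)) // -1: past the last element
}
```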
99-110: Double-check shard semantics for minimum versions. `FindShard` returns `-1` if no versions exist. Otherwise, it uses `FindPrevious`, returning the first version if the given `version` is less than the earliest. Confirm that this behavior is correct for your shard logic; some call sites may expect a different fallback than the earliest version.
119-119: Confirm that switching from `Find` to `FindNext` is intentional. `FindMemoized` used to rely on a different lookup strategy. Replacing it with `FindNext` changes the returned value when the provided `version` is not explicitly present. Validate downstream logic to ensure it can handle being given the next available version rather than a direct match.

v2/range_test.go (2)
97-97: Replacing `Find` with `FindNext` changes the test's semantics. Since `FindNext` may skip to the next higher version, confirm that the test logic aligns with the new method's behavior (especially if a test previously expected an exact match).
105-110: Validate the zero-value check for `recent`. The test checks `tc.recent != 0` to decide whether to call `FindRecent`. If version `0` is a possible valid input or test scenario, this may inadvertently skip the method. Ensure this is intentional.

v2/export.go (5)
3-3: Dependency import acknowledgement. The addition of `"fmt"` is necessary for formatting error messages. This change is straightforward and aligns with the new error-handling approach.
19-29: Consider concurrency and version-loading checks. The updated `Export` method correctly handles retrieving or loading the specified version. However, if multiple goroutines invoke `Export` concurrently on the same tree and version, there's a risk of inconsistent reads or partial version loads. Consider adding synchronization, or make it clear in documentation that concurrent calls to `Export` on the same tree require external synchronization.
51-70: Traversal logic validation. The updated post-order traversal (`postOrderNext`) and pre-order traversal (`preOrderNext`) correctly retrieve child nodes via the connection factory and recurse in the expected order. Both methods push leaf or branch nodes into a channel, and any SQL errors are reported on `errCh`. This design looks sound.
110-112: Ensure deterministic exporter termination. `Close()` only attempts to close the underlying tree and does not coordinate with the background traversal goroutine. In certain edge cases, if the goroutine is not yet finished, calling `Close()` might lead to race conditions. Consider joining or signaling the goroutine before closing the tree for safer termination.
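The join-before-teardown idea can be sketched with a stop channel plus a `sync.WaitGroup`. The exporter shape below is illustrative only, not the actual API: `Close()` signals the goroutine and waits for it before releasing resources.

```go
package main

import (
	"fmt"
	"sync"
)

// exporter is a hypothetical miniature: a background goroutine that a
// caller must be able to stop deterministically.
type exporter struct {
	stop   chan struct{}
	wg     sync.WaitGroup
	closed bool
}

func newExporter() *exporter {
	e := &exporter{stop: make(chan struct{})}
	e.wg.Add(1)
	go func() {
		defer e.wg.Done()
		// A real traversal loop would select on e.stop between nodes.
		<-e.stop
	}()
	return e
}

// Close signals the goroutine, then joins it before tearing anything down.
func (e *exporter) Close() {
	close(e.stop)
	e.wg.Wait()
	e.closed = true // safe: the goroutine has exited
	fmt.Println("closed cleanly")
}

func main() {
	e := newExporter()
	e.Close()
}
```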
114-114: Constant usage acknowledgment. Defining `ErrorExportDone` as a package-level `fmt.Errorf` is acceptable. If additional contextual information is ever needed, consider returning a wrapped error.

v2/sqlite_connection.go (5)
1-12: Package and imports. Defining a dedicated package for SQLite connections helps segregate database-specific logic from tree logic. The imports are minimal and relevant.

14-29: `sqliteConnection` struct lifecycle handling. This struct centralizes shard information and prepared statements, promoting a maintainable design. However, if partial initialization occurs, make sure all references (like `queryLeaf`, `queryBranch`) are valid or nil-checked during `Close()`.
31-38: `readConnectionFactory` initialization. Storing an internal map of shardID to `sqliteConnection` ensures repeated lookups reuse cached connections. This approach is beneficial for read-only scenarios.
65-69: `hotConnectionFactory` design scope. Storing a single `main` connection plus multiple shard connections is a clean approach for “hot” usage scenarios. Ensure concurrency safety if used by multiple goroutines.
110-125: Shard removal steps are well-defined. `removeShard` closes prepared statements, detaches the shard, and removes it from the map. This is an appropriate cleanup flow. Ensure that no queries are ongoing on a shard when it is removed.
139-145: Shallow clone for shared connections. `clone()` duplicates the map entries, but each `sqliteConnection` struct still references the same underlying `conn`. If the clone and original are used simultaneously, concurrent usage might cause unexpected conflicts.

v2/metrics/metrics.go (7)
23-26: Interfaces satisfied. Ensuring `_ Proxy = &StructMetrics{}` and `_ Proxy = &NilMetrics{}` confirms these types implement the `Proxy` interface at compile time, avoiding runtime surprises.
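The pattern in miniature, with a deliberately reduced `Proxy` interface (the real one has more methods): assigning a typed value to a blank interface-typed variable makes the build fail the moment the type stops satisfying the interface.

```go
package main

import "fmt"

// Proxy is a reduced, illustrative version of the metrics interface.
type Proxy interface {
	IncrCounter(val int64, keys ...string)
}

// NilMetrics is the no-op implementation.
type NilMetrics struct{}

func (NilMetrics) IncrCounter(int64, ...string) {}

// Compile-time assertion: build breaks if NilMetrics loses a Proxy method.
var _ Proxy = (*NilMetrics)(nil)

func main() {
	var p Proxy = NilMetrics{}
	p.IncrCounter(1, "db_get") // no-op, but proves the interface holds
	fmt.Println("NilMetrics satisfies Proxy")
}
```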
28-35: Nil metrics design. `NilMetrics` provides a clean way to disable metrics collection with no overhead. This pattern is commonly used and is well-structured here.
36-46: Compose `TreeMetrics` and `DbMetrics`. Having `StructMetrics` embed two distinct metrics structs is a good approach for grouping their functionality behind a single interface. This leads to a cleaner overall design.
114-114: New `TreeHash` metric. Adding `TreeHash` to `TreeMetrics` is straightforward. This can help track how often and how intensively hashing operations occur. Ensure it's incremented at the correct points in the codebase.
145-157: Immediate reset after query report. `QueryReport` calls `s.SetQueryZero()` at the end, discarding the data. This design is well suited to a "report and reset" usage pattern. Be sure it is documented so users don't expect cumulative metrics after calling `QueryReport`.
183-189: Query metrics reset. `SetQueryZero()` correctly resets all query-related fields. The operation is safe on its own but not thread-safe if other goroutines are still updating metrics. Document concurrency expectations if relevant.
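If concurrent resets ever become a requirement, one option is atomic counters, sketched below with illustrative field names (the real `StructMetrics` fields differ):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// QueryMetrics sketch using atomics so a reset can race safely with
// writers; field names are illustrative, not the real struct's.
type QueryMetrics struct {
	QueryCount    atomic.Int64
	QueryDuration atomic.Int64 // nanoseconds
}

func (m *QueryMetrics) Observe(durNs int64) {
	m.QueryCount.Add(1)
	m.QueryDuration.Add(durNs)
}

// SetQueryZero resets counters; with atomics, concurrent Observe calls
// are never lost mid-reset, only attributed to the next window.
func (m *QueryMetrics) SetQueryZero() {
	m.QueryCount.Store(0)
	m.QueryDuration.Store(0)
}

func main() {
	m := &QueryMetrics{}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				m.Observe(10)
			}
		}()
	}
	wg.Wait()
	fmt.Println(m.QueryCount.Load()) // 4000
	m.SetQueryZero()
	fmt.Println(m.QueryCount.Load()) // 0
}
```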
191-202: Aggregate metrics with `Add()`. Combining another `StructMetrics` object helps accumulate metrics for aggregated reporting. This is straightforward; just ensure no concurrency issues arise if these structures are shared among threads.

v2/import.go (2)
68-68: Consider validating or bounding the `version` size before allocating `i.nonces`. If `version` is extremely large, allocating a slice of length `version+1` could lead to high memory usage or out-of-memory errors. Consider either guarding against large versions or using a more memory-efficient allocation strategy.
109-125: Double-check concurrency and data races during batch commits. The code spawns a goroutine (line 117) to commit the batch while the main goroutine continues. Although `waitBatch()` enforces sequencing, ensure that no other shared fields are mutated concurrently. If needed, add synchronization or a more explicit concurrency pattern to prevent subtle data races.

v2/sqlite_batch.go (2)
76-99: Confirm transactional atomicity in `changelogBatchCommit`. Committing the transaction here means a partial commit happens whenever `leafCount % size == 0`, so a crash mid-batch can leave partial data in the database. If partial data is acceptable, please confirm it is documented and tested. Otherwise, consider more robust rollback hooks or finalize everything in one block.
105-116: Ensure consistent error handling for `BEGIN` failures. When `conn.Begin()` fails, the subsequent statements and prepared statements might remain in an undefined state. Confirm upstream callers handle this case and that leftover statements are closed properly, mitigating any resource leaks.

v2/iterator.go (2)
73-74: Great use of metrics timing via `MeasureSince`. Measuring iterator steps can provide valuable performance insights, especially when scanning large datasets. These lines effectively gather timing data without significantly impacting logic flow.

Also applies to: 254-255
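The defer-based timing idiom behind a `MeasureSince`-style helper looks roughly like this (the real implementation reports into the metrics proxy; here the duration is just recorded in a package variable for inspection):

```go
package main

import (
	"fmt"
	"time"
)

// lastElapsed stands in for the metrics sink in this sketch.
var lastElapsed time.Duration

// measureSince captures the elapsed time since `start`; calling it with
// `defer measureSince(time.Now(), ...)` evaluates time.Now() at function
// entry and records the duration at exit.
func measureSince(start time.Time, key string) {
	lastElapsed = time.Since(start)
	fmt.Printf("%s took %v\n", key, lastElapsed)
}

func step() {
	defer measureSince(time.Now(), "iterator_step")
	time.Sleep(time.Millisecond) // simulated work
}

func main() {
	step()
	fmt.Println(lastElapsed > 0) // true
}
```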
381-399: Confirm correctness of `IterateRecent` usage. If `version` doesn't exist, the function returns `(false, nil)`. Ensure calling code verifies this return scenario, as iterating over a nil iterator can lead to runtime panics. Consider adding a doc comment clarifying the behavior when the version is missing.

v2/sqlite_prune.go (1)
13-24: Use atomics or locking for `globalPruneLimit` initialization if concurrent calls are expected. Right now, the channel initialization is performed in `init()` and reused globally. Confirm that multiple concurrent tests or package initializations cannot re-invoke `SetGlobalPruneLimit` in conflicting ways. Otherwise, consider adding package-level locking or ensuring this call is made exactly once.

v2/sqlite_test.go (2)
191-244: Skipped concurrency test might hide real issues. `Test_ConcurrentDetach` is skipped, but it reveals concurrency hazards. Enable it or create a dedicated method for concurrency corner cases to ensure you catch panics or database-level concurrency conflicts early.
302-357: Ensure read-write concurrency correctness. `Test_ConcurrentIndexRead` creates an index on `foo` while a background goroutine queries `bar`. Concurrent index creation can cause lock contention or errors in certain SQLite configurations. Routinely test with different PRAGMA settings.

v2/proof.go (2)
16-20: `sync.Pool` usage. Using a `sync.Pool` for temporary buffers can help reduce allocations. Ensure no unsafe usage across goroutines by returning buffers once done, as you already do with `defer proofBufPool.Put(buf)`.
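The pool-and-defer pattern can be sketched as below; `bufPool` and `encodeProof` are illustrative names, not the actual proof encoder:

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// A proofBufPool-style pool: Get a buffer, Reset it, and always return
// it with defer so concurrent goroutines never share a live buffer.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

func encodeProof(payload []byte) []byte {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)

	buf.WriteByte(byte(len(payload))) // toy one-byte length prefix
	buf.Write(payload)
	// Copy out before the buffer returns to the pool; the pooled
	// backing array may be reused by another goroutine afterwards.
	out := make([]byte, buf.Len())
	copy(out, buf.Bytes())
	return out
}

func main() {
	p := encodeProof([]byte("abc"))
	fmt.Println(len(p), p[0]) // 4 3
}
```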
64-73: Check for large version numbers and boundary conditions. `getExistenceProof` and `getNonMembershipProof` rely on versioned logic but do not explicitly guard against extremely large or invalid version numbers. Ensure upstream code is validated, or add a check here if needed.

Also applies to: 75-122
v2/multitree.go (4)
8-11: Imports look good. No immediate issues with adding `"runtime"`, `"testing"`, and `"time"`: these standard libraries support memory usage tracking, test methods, and time-based operations.
34-43: Constructor signature simplified. Removing the `logger` parameter in `NewMultiTree` and switching to channel-based async error handling (`doneCh`, `errorCh`) is consistent with the updated logging approach used throughout the file. Be sure all code paths that previously relied on injected loggers are now safely handled.
161-215: Concurrent error handling. The updated `SaveVersionConcurrently` now collects multiple errors via `errors.Join(errs)`. This is good for surfacing all concurrency issues. However, confirm any infrastructure or higher-level code supports Go 1.20+, where `errors.Join` is available.
247-261: Pre-allocating `storeKeys`. Creating `storeKeys` with `make([]string, 0, len(mt.Trees))` is a small performance win. This is a neat micro-optimization for repeated sorting scenarios.

v2/tree_test.go (7)
7-15: New imports for bench and os. Bringing in `iavl-bench/bench` is appropriate for generating changesets. Ensure the external library version is pinned to avoid unexpected changes from upstream.
27-37: Additional tree options in `TestTree_Hash`. Introducing `PruneRatio` and `MinimumKeepVersions` extends pruning strategies. Confirm they are tested adequately in further test cases to ensure correct pruning behavior.
54-103: `TestTree_Rehash` correctness. Clearing node hashes and re-hashing ensures integrity after removing in-memory references. This is a robust approach to catch subtle node-hash caching bugs, if any. Good job!
106-122: Snapshot concurrency usage. `mt.SnapshotConcurrently()` followed by `mt.Close()` ensures a stable snapshot of the final version. Verify any concurrency edge cases, like partial snapshots or racing `MountTrees`, are handled.
125-136: Re-importing multi-tree with `ImportMultiTree`. Importing from a snapshot at version 10,000 and continuing the build to 12,000 is a logical test. Please confirm that any partial or corrupt snapshots cause graceful errors.
200-206: Checking staged root size. `require.Equal(t, int64(105030), tree.stagedRoot.size)` is a strict test. Great for guaranteeing reproducible state. If the environment or data for this test changes, these constants might need updating.
213-257: `Test_TrivialTree` iteration. Iterating versions from 1..10 with a small number of changes is a good sanity check. This pattern is straightforward for verifying correctness.

v2/node.go (8)
23-31: Bit-shift approach for `Version()`. This explicit manual unpacking is correct but easy to invert by mistake. If the codebase uses big-endian elsewhere, confirm consistent usage to avoid subtle version mismatches.
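A round-trip check is the easiest way to keep pack and unpack in sync. The 12-byte layout below (8 big-endian version bytes followed by 4 sequence bytes) is an assumption for illustration, not necessarily the exact node-key layout:

```go
package main

import "fmt"

// NodeKey sketch: 8 big-endian version bytes, then 4 sequence bytes.
type NodeKey [12]byte

func NewNodeKey(version int64, sequence uint32) NodeKey {
	var nk NodeKey
	nk[0] = byte(version >> 56)
	nk[1] = byte(version >> 48)
	nk[2] = byte(version >> 40)
	nk[3] = byte(version >> 32)
	nk[4] = byte(version >> 24)
	nk[5] = byte(version >> 16)
	nk[6] = byte(version >> 8)
	nk[7] = byte(version)
	nk[8] = byte(sequence >> 24)
	nk[9] = byte(sequence >> 16)
	nk[10] = byte(sequence >> 8)
	nk[11] = byte(sequence)
	return nk
}

// Version must invert NewNodeKey exactly; a round-trip test catches any
// mismatch in shift order or endianness.
func (nk NodeKey) Version() int64 {
	return int64(nk[0])<<56 | int64(nk[1])<<48 | int64(nk[2])<<40 | int64(nk[3])<<32 |
		int64(nk[4])<<24 | int64(nk[5])<<16 | int64(nk[6])<<8 | int64(nk[7])
}

func main() {
	v := int64(1<<40 + 123456789)
	nk := NewNodeKey(v, 77)
	fmt.Println(nk.Version() == v) // true
}
```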
38-49: `NewNodeKey` writes version bytes individually. This matches the shifted `Version()` logic. Double-check the ordering is consistent with `Version()` for round-trip correctness.
146-261: AVL rotation logic. The methods `calcHeightAndSize`, `balance`, `rotateRight`, and `rotateLeft` appear correct for an AVL tree. The usage of staged nodes is consistent. Ensure you have full coverage of edge cases (like single-child subtrees).
309-345: Refined `get` logic with new error returns. The early return on a `nil` node is helpful. Verify no code path inadvertently calls `get` when `node` can be nil in normal usage.
358-414: `_hash()` no longer returns a value. Eliminating the return promotes clarity, as it sets `node.hash` in place. Meanwhile, `writeHashBytes` uses better error wrapping. Good improvement in error granularity.
553-554: Memory usage in `varSize()`. Adding `len(node.key)` and `len(node.value)` is correct for dynamic sizing. No issues noted.
564-586: `clone`, `isDirty`, `Version`, `Key`, `Value`, `Height`. These getters and helpers are straightforward. `clone` calls `tree.pool.Get()`, so confirm the node pool usage remains consistent in concurrency scenarios.
603-609: `NewImportNode` constructor. Creates a node with a zero sequence for imports. This is simpler than enforcing unique sequences. Just verify version collisions cannot break the import logic.

v2/snapshot.go (4)
17-38: Introduce concurrency safeguards around the logger usage. You've added a `zerolog.Logger` field (`log`) to `sqliteSnapshot` (lines 17 & 38). This is great for structured logging, but note that multiple goroutines might share this snapshot. If concurrency is possible, ensure the logger usage is thread-safe or wrap mutable shared data with synchronization where needed.
75-82: Index creation may block large tables. Creating the index on `snapshot_<version>` within the same transaction (lines 81-82) could introduce heavy locking if the snapshot table is large. Consider verifying that your environment can handle this safely, or use a more granular approach to avoid excessive lock contention.
331-343: Defer usage for error accumulation looks good. In `ImportSnapshotFromTable` (lines 331-343), the deferred closure merges errors via `errors.Join(topErr, err)`, which is a neat approach to unify errors from multiple stages. This pattern is consistent; good job!
567-593: Transactional pattern is clear and consistent. `prepareWrite()` (lines 567-593) executes a `Begin()` on `snap.conn` and prepares statements, then finalizes them in `flush()`. This is a clean approach. Make sure any future logic to handle partial commits or rollbacks is well tested to avoid partial snapshot corruption.

v2/tree.go (2)
13-15: Transition to zerolog for consistent logging. You import and configure `zerolog` (lines 13-15). This aligns with the new logging strategy in `snapshot.go`; ensuring uniform logging across modules is a big plus for maintainability.
43-49: Recheck staged version logic in multi-branch flows. You've introduced `stagedVersion` and separate references to `previousRoot` and `stagedRoot` (lines 43-49). This is a powerful pattern, but verify that concurrent modifications (e.g., from checkTx paths) don't alter staged versions unexpectedly. Ensure every code path sets and reads them consistently.

v2/sqlite.go (4)
29-30: Metrics injection is well aligned with the new design. Introducing `Readonly bool` and `Metrics metrics.Proxy` in `SqliteDbOptions` clarifies the database usage patterns. This unifies performance tracking with the iavl v2 model. Nicely done.
41-46: Hot vs. write connections: watch out for simultaneous usage. Having `hotConnectionFactory` for reads and a distinct `writeConn` for writes can risk unexpected row locking if both are used concurrently in the same shard. Ensure consistent transaction isolation levels or locks if partial updates are possible from multiple places.
461-481: Shard auto-detection might skip locked shards. `loadShards()` (lines 461-481) logs a warning and skips any locked shards. If a shard was locked for a legitimate reason (e.g., an incomplete migration), skipping it might produce incomplete data. Review whether skipping is truly correct, or whether you should attempt to recover or warn more aggressively.
649-664: Empty root sentinel logic is robust. `LoadRoot(version int64)` sets a sentinel child row with `nil` node bytes for an empty root (lines 649-664). This approach is straightforward and properly differentiates an "empty root" from missing data. A nice way to keep chain state consistent for edge cases.

v2/cmd/root.go (1)
4-4: LGTM! Addition of benchmarking command aligns with PR objectives. The addition of the bench package and its command enhances the testing capabilities, which is crucial for validating the performance improvements mentioned in the PR objectives.
Also applies to: 22-22
v2/cmd/rollback/rollback.go (1)
13-16: LGTM! Enhanced logging with structured format. The switch to zerolog with a console writer improves log readability and structure. The configuration with timestamp format and stderr output follows best practices.
v2/cmd/snapshot/snapshot.go (1)
13-16: LGTM! Consistent logging setup. The zerolog configuration matches other commands, maintaining consistency across the codebase.
v2/sqlite_metadata.go (1)
22-22: LGTM! Good improvement in error handling by using `fmt.Errorf`. This maintains consistency with the codebase's error handling patterns.

v2/cmd/scan/scan.go (2)
24-24: LGTM! Good practice using blank identifiers for unused parameters in the RunE function.
85-85: LGTM! Improved error handling for flag requirements by checking for errors when marking flags as required.
Also applies to: 100-105
v2/iterator_test.go (1)
13-13: LGTM! Good improvements:
- Using `t.TempDir()` for the test database ensures proper cleanup.
- Adding version saving ensures proper state management.
Also applies to: 28-29
v2/migrate/v0/migrate_v0.go (1)
114-114: LGTM! Improved error message formatting. The change from `errors.New` to `fmt.Errorf` is a good practice for error message formatting.

v2/testutil/util.go (3)
101-101: LGTM! Improved function naming. The function name has been updated to follow Go naming conventions by removing underscores.
172-175: LGTM! Improved code formatting. The return statement in the `OsmoLike` function has been formatted consistently.
443-445: LGTM! Simplified return statement. The return statement in the `CompactedChangelogs` function has been simplified while maintaining the same functionality.

v2/go.mod (1)
11-16: Verify dependency version updates. Several dependencies have been updated or added:
- New: `github.com/cosmos/ics23/go v0.11.0`
- New: `github.com/rs/zerolog v1.30.0`

Please verify these versions are stable and compatible.
✅ Verification successful
🌐 Web query:
What is the latest stable version of zerolog?
💡 Result:
The latest stable version of zerolog (the Go logging library) as of February 2025 is v1.33.0, based on Debian packaging updates indicating this as a new upstream release[6]. This supersedes earlier versions like v1.20.0 listed in historical changelogs[1].
Key details:
- The library reached major version v1 in May 2024[4][7], marking its stability
- Maintains zero-allocation JSON logging with features like contextual fields and leveled logging[4][5]
- Requires Go ≥1.23 for some configurations[3]
For .NET developers encountering similar naming confusion:

| Library | Latest Version | Platform |
|---|---|---|
| ZeroLog | 2.2.0 | .NET |
| zerolog | 1.33.0 | Go |

Citations:
- 1: https://go.libhunt.com/zerolog-changelog
- 2: https://www.nuget.org/packages/ZeroLog
- 3: https://github.com/tozd/go-zerolog
- 4: https://pkg.go.dev/github.com/rs/zerolog
- 5: https://github.com/rs/zerolog/tree/master
- 6: https://tracker.debian.org/pkg/golang-github-rs-zerolog
- 7: https://pkg.go.dev/github.com/rs/zerolog/log
- 8: https://zerolog.readthedocs.io/en/latest/
- 9: https://go.libhunt.com/compare-slog-vs-zerolog
- 10: https://azure.microsoft.com/en-us/resources/cloud-computing-dictionary/what-is-vpn
Dependency Versions Verified
- The updated dependency `github.com/cosmos/ics23/go v0.11.0` is a stable release.
- The newly added dependency `github.com/rs/zerolog v1.30.0` is stable and compatible. Although the latest stable version is v1.33.0 (per the recent web query), v1.30.0 continues to be a supported release without any known compatibility concerns.

v2/migrate/go.mod (1)
3-5: Verify Go version and toolchain compatibility. The Go version has been updated to 1.21.0, and a newer toolchain (go1.23.2) has been added. This matches the changes in the main go.mod file. Please ensure:
- All migration code is compatible with Go 1.21 features.
- The toolchain version is appropriate for the migration module.
// encodeBytesSize returns the byte size of the given slice including length-prefixing.
func EncodeBytesSize(bz []byte) int {
	return EncodeUvarintSize(uint64(len(bz))) + len(bz)
}
Update function visibility to unexported.
According to the PR objectives, this function should be unexported. Please rename `EncodeBytesSize` to `encodeBytesSize` to match the intended visibility.
-func EncodeBytesSize(bz []byte) int {
+func encodeBytesSize(bz []byte) int {
func (r *VersionRange) Copy() *VersionRange {
	vs := make([]int64, len(r.versions))
	copy(vs, r.versions)
	return &VersionRange{versions: vs, cache: make(map[int64]int64)}
}
🛠️ Refactor suggestion
Decide whether you want to copy the cache contents.
Currently, `Copy()` re-initializes a new empty cache. This is fine if you'd like a fresh manipulation space, but if you intend to reuse cached lookups, consider deep-copying the existing `cache` entries to keep the same performance benefits on the new instance.
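A deep-copying variant would look like the sketch below, at the cost of one extra allocation and a map walk:

```go
package main

import "fmt"

type VersionRange struct {
	versions []int64
	cache    map[int64]int64
}

// Copy with a deep-copied cache, so the clone keeps prior memoized
// lookups instead of starting cold.
func (r *VersionRange) Copy() *VersionRange {
	vs := make([]int64, len(r.versions))
	copy(vs, r.versions)
	c := make(map[int64]int64, len(r.cache))
	for k, v := range r.cache {
		c[k] = v
	}
	return &VersionRange{versions: vs, cache: c}
}

func main() {
	r := &VersionRange{versions: []int64{1, 5, 9}, cache: map[int64]int64{3: 5}}
	cp := r.Copy()
	cp.cache[7] = 9 // mutating the clone leaves the original untouched
	fmt.Println(len(r.cache), len(cp.cache)) // 1 2
}
```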
func (f *readConnectionFactory) make(version int64) (*sqliteConnection, error) {
	shardID := f.sql.shards.FindMemoized(version)
	if shardID == -1 {
		return nil, fmt.Errorf("shard not found version=%d shards=%v", version, f.sql.shards.versions)
	}
	if conn, ok := f.conns[shardID]; ok {
		return conn, nil
	}
	var err error
	f.conns[version], err = f.sql.newShardReadConnection(shardID)
	if err != nil {
		return nil, err
	}
	return f.conns[version], nil
}
Potential key mismatch in the `conns` map.
The new connection is inserted via `f.conns[version], err = ...`, yet the shard ID is distinct from the version. If `shardID` doesn't always equal `version`, it could overwrite or conflict. Adopting `shardID` as the map key is safer and more intuitive.
- f.conns[version], err = f.sql.newShardReadConnection(shardID)
+ f.conns[shardID], err = f.sql.newShardReadConnection(shardID)
Committable suggestion skipped: line range outside the PR's diff.
func (sql *SqliteDb) beginPrune(pruneTo int64) {
	if sql.pruning {
		sql.logger.Warn().Int64("pruneTo", pruneTo).Msg("pruning already in progress")
	}
	sql.pruning = true
Potential concurrency race on `sql.pruning`.
The code sets `sql.pruning = true` but doesn't use a lock or atomic operation. If two goroutines call `beginPrune` at nearly the same time, it may cause unintended side effects. Consider protecting `sql.pruning` with a mutex or atomic to ensure only one prune starts.
func (sql *SqliteDb) beginPrune(pruneTo int64) {
- if sql.pruning {
+ sql.mutex.Lock()
+ if sql.pruning {
sql.logger.Warn().Int64("pruneTo", pruneTo).Msg("pruning already in progress")
}
+ sql.pruning = true
+ sql.mutex.Unlock()
go func() {
Also applies to: 278-282
	// prune branches
	join, err := sql.orphanJoins(conn, pruneShards, false)
	if err != nil {
		return err
	}
	for _, shard := range pruneShards {
		if err := sql.pruneShard(shard, conn, join, false); err != nil {
			return err
		}
		err = conn.Exec("PRAGMA wal_checkpoint(TRUNCATE)")
		if err != nil {
			return err
		}
	}

	// prune leaves
	join, err = sql.orphanJoins(conn, pruneShards, true)
	if err != nil {
		return err
	}
	for _, shard := range pruneShards {
		if err := sql.pruneShard(shard, conn, join, true); err != nil {
			return err
		}
🛠️ Refactor suggestion
Potential partial state if error occurs mid-pruning
When pruning branches or leaves, an error leads to returning immediately. Partial data may remain in the new shard and attachments. Consider a rollback or partial cleanup to maintain a consistent state.
func (sql *SqliteDb) checkPruning() error {
	select {
	case res := <-sql.pruneCh:
		defer func() { sql.pruning = false }()
		if res.err != nil {
			if errors.Is(res.err, errShardExists) {
				sql.logger.Warn().Err(res.err).Msg("prune failed")
				return nil
			}
			return res.err
		}

		newShards := &VersionRange{versions: []int64{res.pruneTo}}
		newFactory := sql.hotConnectionFactory.clone()
		if err := newFactory.addShard(res.pruneTo); err != nil {
			return err
		}

		var dropped []int64
		for _, v := range sql.shards.versions {
			if v > res.pruneTo {
				err := newShards.Add(v)
				if err != nil {
					return err
				}
			} else {
				// Delete shard:
				if err := newFactory.removeShard(v); err != nil {
					return fmt.Errorf("failed to remove shard %d path=%s: %w", v, sql.opts.shortPath(), err)
				}
				if err := sql.deleteShardFiles(v); err != nil {
					return fmt.Errorf("failed to delete shard %d files path=%s: %w", v, sql.opts.shortPath(), err)
				}
				dropped = append(dropped, v)
			}
		}
		conn, err := sql.rootConnection()
		if err != nil {
			return err
		}
		defer func() {
			err = errors.Join(err, conn.Close())
		}()
		err = conn.Exec("UPDATE root SET pruned = true WHERE version < ?", res.pruneTo)
		if err != nil {
			return err
		}
		if err := sql.unlockShard(res.pruneTo); err != nil {
			return err
		}
		sql.logger.Info().
			Str("took", res.took.String()).
			Str("wait", res.wait.String()).
			Ints64("dropped", dropped).
			Msg("prune completed")
		// TODO
		// possible race condition, these should be boxed in the same pointer
		sql.shards = newShards
		sql.hotConnectionFactory = newFactory
		return err
	default:
		return nil
	}
}
`checkPruning` re-assigns `sql.pruning` only on success.
If an error happens at any point, we may never reset `sql.pruning` to false or unlock the shard. Ensure we handle unlocking shards and resetting state in error pathways to prevent deadlocks.
defer func() {
+ sql.pruning = false
if dErr := conn.Close(); dErr != nil {
err = errors.Join(err, dErr)
}
}()
	// testTreeBuild(t, multiTree, opts)
	multiTree.TestBuild(t, opts)
🛠️ Refactor suggestion
Remove commented code.
The commented line should be removed as it's replaced by the line below it.
-// testTreeBuild(t, multiTree, opts)
multiTree.TestBuild(t, opts)
	tmpDir := t.TempDir()
	require.NoError(t, err)
	opts := testutil.BigTreeOptions100_000()
🛠️ Refactor suggestion
Remove unnecessary error check.
Line 18 checks for an error that can't occur, since `t.TempDir()` doesn't return an error.
tmpDir := t.TempDir()
-require.NoError(t, err)
	go func() {
		err := http.ListenAndServe(":2112", nil)
		if err != nil {
			panic(err)
		}
	}()
Add proper HTTP server shutdown handling.
The HTTP server is started in a goroutine without proper shutdown handling. This could lead to resource leaks.
Consider implementing graceful shutdown:
+var srv *http.Server
+
func newPrometheusMetricsProxy() *prometheusMetricsProxy {
// ... existing code ...
+ srv = &http.Server{
+ Addr: ":2112",
+ Handler: nil,
+ }
go func() {
- err := http.ListenAndServe(":2112", nil)
+ err := srv.ListenAndServe()
if err != nil && err != http.ErrServerClosed {
panic(err)
}
}()
return p
}
+
+func (p *prometheusMetricsProxy) Close() error {
+ if srv != nil {
+ return srv.Close()
+ }
+ return nil
+}
go 1.21

toolchain go1.23.2
💡 Verification agent
🧩 Analysis chain
Verify Go version and toolchain compatibility.
The Go version has been updated to 1.21, and a newer toolchain (go1.23.2) has been added. Please verify:
- The application is compatible with Go 1.21 features and changes.
- The toolchain version (1.23.2) is appropriate for use with Go 1.21.
🌐 Web query:
What is the latest stable version of Go, and is Go 1.21 still supported?
💡 Result:
The latest stable version of Go as of February 2025 is 1.23.6, released on February 4, 2025[4][7]. This version includes security fixes for the crypto/elliptic
package and general bug fixes[7].
Go 1.21 (released August 2023) is no longer supported - its official support ended on August 13, 2024[4]. According to Go's support policy:
- Each major release receives support until two subsequent major releases are available
- Critical security fixes are only provided for supported versions
- The final minor release for Go 1.21 was version 1.21.13 in August 2024[4]
Current supported versions are:
Version | Release Date | Supported Until | Latest Patch |
---|---|---|---|
1.23 | August 13, 2024 | ~February 2026 | 1.23.6 |
1.22 | February 6, 2024 | August 2025 | 1.22.12 |
For production environments, developers should use either Go 1.23.x or 1.22.x, as these are the only versions currently receiving security updates and critical bug fixes[4][7]. The Go team recommends always using the latest minor release within a supported major version for optimal security and stability[5][8].
Citations:
- 1: https://blog.jetbrains.com/go/2025/01/16/the-goland-2025-1-early-access-program-has-started/
- 2: https://tip.golang.org/doc/go1.21
- 3: https://github.com/golang/tools/releases
- 4: https://endoflife.date/go
- 5: https://softwaremind.com/blog/examining-go-1-21-the-latest-golang-release/
- 6: https://go.dev/wiki/MinimumRequirements
- 7: https://go.dev/doc/devel/release
- 8: https://go.dev/blog/compat
- 9: https://antonz.org/which-go/
- 10: https://go.dev/dl/
Action Required: Update Module Go Version
- The module's `go 1.21` directive points to a version that is no longer supported (support ended in August 2024).
- The specified toolchain (`go1.23.2`) is from a supported release, so while the toolchain is appropriate, there's a discrepancy with the module's declared version.
- Recommendation: Update the module's Go version (e.g., change the directive to `go 1.23`) to align with current support and ensure best practices going forward.
Contains performance improvements and code simplification for iavl/v2, particularly around pruning and write operations. The core design and architecture is unchanged. This build was tested extensively against server/v2 architecture in cosmos/cosmos-sdk with the benchmark module.
SaveVersion
Note: most (but not all) of this work is in https://github.com/cosmos/iavl/tree/nightly/v2.0.x
Update:
This PR is quite large so I'm breaking it up.
Summary by CodeRabbit
Documentation
New Features
Refactor
Chores
Tests