Direct JSON writer + class pickle cache (R3-R4) #7
Merged
Eliminates serde_json::Value intermediate allocations in the PG decode path (decode_zodb_record_for_pg_json). The new pipeline writes JSON tokens directly from the PickleValue AST to a String buffer in Rust with the GIL released, replacing the two-step allocate-then-serialize approach.

Key changes:
- json_writer.rs: JsonWriter with fast-path string escaping, ryu floats
- json.rs: pickle_value_to_json_string_pg() recursive direct writer
- known_types.rs: try_write_reduce_typed/try_write_instance_typed
- btrees.rs: btree_state_to_json_writer() for all BTree variants
- Thread-local JSON buffer reuse (same pattern as encode ENCODE_BUF)

PG path speedup: 1.3-3.3x faster than dict+json.dumps(), wide_dict -55%.
FileStorage pipeline: 1.4x faster at median (28.3 vs 40.4 µs/record).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
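To illustrate the direct-writer idea described in this commit, here is a minimal sketch, not the code from this PR. The `Value` enum is a hypothetical stand-in for the PickleValue AST, `JSON_BUF`, `write_json_str`, `write_value` and `to_json` are made-up names, and floats (which the commit formats with ryu) are left out to keep the sketch dependency-free. It shows the two techniques named above: escaping that copies unescaped runs in one call, and a thread-local buffer that is cleared and reused instead of reallocated.

```rust
use std::cell::RefCell;
use std::fmt::Write;

/// Hypothetical stand-in for the PickleValue AST (the real enum has more variants).
enum Value {
    None,
    Bool(bool),
    Int(i64),
    Str(String),
    List(Vec<Value>),
    Dict(Vec<(String, Value)>),
}

/// Append `s` as a JSON string literal. Runs of bytes that need no
/// escaping are copied with a single push_str (the "fast path").
fn write_json_str(out: &mut String, s: &str) {
    out.push('"');
    let mut start = 0;
    for (i, &b) in s.as_bytes().iter().enumerate() {
        let esc: Option<&str> = match b {
            b'"' => Some("\\\""),
            b'\\' => Some("\\\\"),
            b'\n' => Some("\\n"),
            b'\r' => Some("\\r"),
            b'\t' => Some("\\t"),
            0x00..=0x1f => None, // other control bytes become \u00XX below
            _ => continue,       // nothing to escape, keep scanning
        };
        // `i` always points at an ASCII byte here, so slicing is on a char boundary.
        out.push_str(&s[start..i]);
        match esc {
            Some(e) => out.push_str(e),
            None => write!(out, "\\u{:04x}", b).unwrap(),
        }
        start = i + 1;
    }
    out.push_str(&s[start..]);
    out.push('"');
}

/// Recursively write JSON tokens straight into `out`, with no intermediate tree.
fn write_value(out: &mut String, v: &Value) {
    match v {
        Value::None => out.push_str("null"),
        Value::Bool(b) => out.push_str(if *b { "true" } else { "false" }),
        Value::Int(n) => write!(out, "{}", n).unwrap(),
        Value::Str(s) => write_json_str(out, s),
        Value::List(items) => {
            out.push('[');
            for (i, item) in items.iter().enumerate() {
                if i > 0 {
                    out.push(',');
                }
                write_value(out, item);
            }
            out.push(']');
        }
        Value::Dict(pairs) => {
            out.push('{');
            for (i, (k, val)) in pairs.iter().enumerate() {
                if i > 0 {
                    out.push(',');
                }
                write_json_str(out, k);
                out.push(':');
                write_value(out, val);
            }
            out.push('}');
        }
    }
}

thread_local! {
    // Reused across calls on the same thread; clear() keeps the capacity.
    static JSON_BUF: RefCell<String> = RefCell::new(String::with_capacity(4096));
}

/// Serialize `v` using the thread-local scratch buffer and return a copy of the result.
fn to_json(v: &Value) -> String {
    JSON_BUF.with(|cell| {
        let mut buf = cell.borrow_mut();
        buf.clear();
        write_value(&mut buf, v);
        buf.clone()
    })
}
```

In this sketch the only per-call allocation is the final `clone()`. A real binding would more likely hand the buffer contents to Python directly, but that part is outside what the commit message describes.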
Thread-local Vec cache avoids re-encoding identical class pickles for every ZODB record. With ~6 distinct classes in a typical database, the cache hits ~99.6% after warmup, replacing 7 opcode writes with a single memcpy of ~50 bytes.

Uses linear search (faster than HashMap for ~6 entries, avoids string allocation on cache hits). Extracts build_class_pickle() pub(crate) helper reused by both production and test encode paths.

FileStorage encode: -2 to -4% (mean 4.9→4.8, median 4.1→4.0 µs).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
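The caching pattern in this commit can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `build_class_pickle` here is a simplified stand-in that emits PROTO 2, GLOBAL and STOP rather than the real 7-opcode sequence, and `Entry`, `CLASS_CACHE` and `write_class_pickle` are hypothetical names. The point is the lookup: a linear scan over a small thread-local `Vec`, comparing borrowed `&str` arguments so that a hit allocates nothing.

```rust
use std::cell::RefCell;

/// Simplified stand-in encoder (assumption): PROTO 2, GLOBAL module/name, STOP.
/// The helper in the PR writes a different, longer opcode sequence.
fn build_class_pickle(module: &str, name: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(module.len() + name.len() + 6);
    out.extend_from_slice(&[0x80, 0x02]); // PROTO 2
    out.push(b'c'); // GLOBAL
    out.extend_from_slice(module.as_bytes());
    out.push(b'\n');
    out.extend_from_slice(name.as_bytes());
    out.push(b'\n');
    out.push(b'.'); // STOP
    out
}

/// One cached class pickle, keyed by (module, name).
struct Entry {
    module: String,
    name: String,
    bytes: Vec<u8>,
}

thread_local! {
    // A plain Vec: with only a handful of distinct classes, scanning it
    // avoids both hashing and building an owned key for the lookup.
    static CLASS_CACHE: RefCell<Vec<Entry>> = RefCell::new(Vec::new());
}

/// Append the class pickle for (module, name) to `out`.
/// Returns true on a cache hit, false when the pickle had to be built.
fn write_class_pickle(out: &mut Vec<u8>, module: &str, name: &str) -> bool {
    CLASS_CACHE.with(|cell| {
        let mut cache = cell.borrow_mut();
        if let Some(e) = cache.iter().find(|e| e.module == module && e.name == name) {
            out.extend_from_slice(&e.bytes); // hit: one bulk copy
            return true;
        }
        // Miss: encode once, emit, and remember the bytes for next time.
        let bytes = build_class_pickle(module, name);
        out.extend_from_slice(&bytes);
        cache.push(Entry {
            module: module.to_owned(),
            name: name.to_owned(),
            bytes,
        });
        false
    })
}
```

This sketch never evicts entries, which is only reasonable while the number of distinct classes stays small. Whether the PR bounds the cache is not stated in the commit message.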
- BENCHMARKS.md: updated all numbers to R4+PGO, PGO as standard build
- PERF_REPORT_ROUND3.md: direct JSON writer results (-55% wide_dict)
- PERF_REPORT_ROUND4.md: class pickle cache results (-2 to -4% FS)
- PERF_REPORT_COMPOUND.md: cumulative R1-R4 comparison

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- Direct PickleValue → JSON string writer (R3): Eliminates all `serde_json::Value` intermediate allocations in the PG decode path. Writes JSON tokens directly from the Rust AST to a `String` buffer with the GIL released. PG path is 1.3-3.3x faster than dict+json.dumps(), `wide_dict` -55%, FileStorage pipeline 1.4x faster at median.
- Class pickle cache (R4): Thread-local cache of class pickle bytes per `(module, name)` pair. With ~6 distinct classes in a typical ZODB database, replaces 7 opcode writes with a single memcpy on ~99.6% of records. FileStorage encode -2 to -4%.
- Benchmark docs: All numbers updated to R4+PGO. PGO is now the standard build for benchmarking. Added decompression step for `Data.fs.gz`.

Key numbers (R4+PGO vs CPython pickle)
Test plan
- `cargo test`
- `pytest tests/`

🤖 Generated with Claude Code