feat(gnovm): parameterize and calibrate gas model (from Xeon 8168 benchmarks) #5291
Open
Replace the start/stop/start/stop benchmarking pattern with a gap-free SwitchOpCode model that atomically finalizes one op and starts the next using a single time.Now() call, ensuring all CPU time is accounted for.
Key changes:
- New 4-primitive API: BeginOpCode/SwitchOpCode/ResumeOpCode/StopOpCode
- SwitchOpCode returns old op code; Go call stack replaces explicit stack
- Unified StartStore/StopStore suspend VM op timer and track store duration
- Add dedicated RealmDidUpdate/RealmFinalizeTx store operation codes
- Add Enabled const (OpsEnabled || StorageEnabled || NativeEnabled) to gate all collection uniformly; specific flags gate only export
- Fix FinishStore exporting with TypeNative instead of TypeStore
- Remove OpStaticTypeOf exclusion from benchmarking
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
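The gap-free idea in this commit can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `opTimer` type, its fields, the injectable clock, and the `fakeClock` helper are hypothetical and are not the benchops implementation. ResumeOpCode and the store timers are left out.

```go
package main

import (
	"fmt"
	"time"
)

// opTimer is a hypothetical, simplified recorder; field names are illustrative.
type opTimer struct {
	now   func() time.Time   // clock source; time.Now outside of tests
	cur   byte               // op code currently being timed
	start time.Time          // when cur started
	accum [256]time.Duration // accumulated time per op code
	count [256]int64         // finalized executions per op code
}

// BeginOpCode starts timing op.
func (t *opTimer) BeginOpCode(op byte) {
	t.cur, t.start = op, t.now()
}

// SwitchOpCode finalizes the current op and starts next from the same
// timestamp, so no time falls between the two. It returns the old op code.
func (t *opTimer) SwitchOpCode(next byte) byte {
	ts := t.now() // the only clock read in this call
	old := t.cur
	t.accum[old] += ts.Sub(t.start)
	t.count[old]++
	t.cur, t.start = next, ts
	return old
}

// StopOpCode finalizes the current op without starting another.
func (t *opTimer) StopOpCode() {
	ts := t.now()
	t.accum[t.cur] += ts.Sub(t.start)
	t.count[t.cur]++
}

// totalNs sums the time attributed to all op codes, in nanoseconds.
func (t *opTimer) totalNs() int64 {
	var sum time.Duration
	for _, d := range t.accum {
		sum += d
	}
	return int64(sum)
}

// fakeClock returns a clock that advances by stepNs nanoseconds on every read.
func fakeClock(stepNs int64) func() time.Time {
	cur := time.Unix(0, 0)
	return func() time.Time {
		cur = cur.Add(time.Duration(stepNs))
		return cur
	}
}

func main() {
	tm := &opTimer{now: fakeClock(10)}
	tm.BeginOpCode(1)         // clock reads 10ns
	old := tm.SwitchOpCode(2) // clock reads 20ns: op 1 gets 10ns
	tm.StopOpCode()           // clock reads 30ns: op 2 gets 10ns
	fmt.Println(old, tm.totalNs())
}
```

With three clock reads 10 ns apart, the time attributed to the two ops adds up to the full 20 ns between Begin and Stop, which is the property the start/stop/start/stop pattern could not guarantee.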
- StartNative: finalize previous op BEFORE runtime.GC() so GC time is not attributed to any opcode
- Export format expanded from 10 to 14 bytes: now exports (totalDuration, totalSize, count) instead of pre-averaged values
- stats.go computes proper weighted averages and per-run stddev
- Add GAS_TODO.md with full gas metering audit findings
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
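Why exporting totals instead of pre-averaged values matters can be shown with a small sketch. The `runSample` type and both function names are hypothetical, and defining "per-run stddev" as the sample standard deviation of per-run means is an assumption, not something the commit message specifies.

```go
package main

import (
	"fmt"
	"math"
)

// runSample is one exported record for an op in a single benchmark run.
// The type and field names are hypothetical.
type runSample struct {
	TotalNs int64 // summed duration of all executions in the run
	Count   int64 // number of executions in the run
}

// weightedMeanNs divides total time by total executions, so runs with
// more executions carry proportionally more weight.
func weightedMeanNs(runs []runSample) float64 {
	var total, count int64
	for _, r := range runs {
		total += r.TotalNs
		count += r.Count
	}
	if count == 0 {
		return 0
	}
	return float64(total) / float64(count)
}

// perRunStddevNs returns the sample standard deviation of per-run means
// (an assumed definition of "per-run stddev").
func perRunStddevNs(runs []runSample) float64 {
	var means []float64
	for _, r := range runs {
		if r.Count > 0 {
			means = append(means, float64(r.TotalNs)/float64(r.Count))
		}
	}
	if len(means) < 2 {
		return 0
	}
	var sum float64
	for _, m := range means {
		sum += m
	}
	avg := sum / float64(len(means))
	var ss float64
	for _, m := range means {
		ss += (m - avg) * (m - avg)
	}
	return math.Sqrt(ss / float64(len(means)-1))
}

func main() {
	// Synthetic data: 10 executions averaging 10ns, 30 executions averaging 30ns.
	runs := []runSample{{TotalNs: 100, Count: 10}, {TotalNs: 900, Count: 30}}
	fmt.Println(weightedMeanNs(runs), perRunStddevNs(runs))
}
```

For this synthetic input the weighted mean is 1000/40 = 25 ns, whereas averaging the two pre-averaged values would give 20 ns. That discrepancy is what exporting `(totalDuration, count)` avoids.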
Pre-load all benchmark packages and stdlibs before enabling recording, so init-phase store operations don't contaminate benchmark data.
- Add explicit Recording flag to gate export (not implicit nil check)
- Extract loadBenchPackages() to separate loading from benchmarking
- Benchmark functions now take pre-loaded *PackageValue directly
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Clarify that Fork() intentionally omits gasMeter and GC callback. The caller is responsible for setting these when needed (e.g. the Machine constructor does this for transactions). Query contexts intentionally omit the gasMeter. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…izes
The _alloc* constants represent unsafe.Sizeof for each GnoVM value type, but many were stale: structs grew as fields were added (ObjectInfo, FuncValue.Crossing, etc.). This caused systematic under-charging of memory gas.
Changes:
- Update all _alloc* constants to match current unsafe.Sizeof values
- Replace _allocBase+_allocPointer with unified _allocHeap constant
- Fix allocPointer to include PointerValue size (was just _allocBase)
- Fix allocHeapItem to use actual HeapItemValue size (was _allocTypedValue)
- Fix allocMapItem from *3 to *2 (key + value TypedValues)
- Add detailed comments explaining the allocation model
- Update test golden values for new allocation sizes
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Panics at startup if any _alloc* constant doesn't match unsafe.Sizeof, so stale values are caught immediately when struct fields change. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
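A startup check of this kind might look like the following sketch. The `exampleValue` struct, the `_allocExample` constant, and the `sizeCheck`/`verifySizes` helpers are hypothetical stand-ins for the real GnoVM types, and the constant's value of 40 assumes a 64-bit platform.

```go
package main

import (
	"fmt"
	"unsafe"
)

// exampleValue is a hypothetical stand-in for a GnoVM value struct.
type exampleValue struct {
	ID    uint64
	Data  []byte
	Extra *int
}

// _allocExample is a hand-maintained size constant, like the _alloc* constants.
// 8 (uint64) + 24 (slice header) + 8 (pointer) = 40 bytes on a 64-bit platform.
const _allocExample = 40

// sizeCheck pairs a declared constant with the size the compiler reports.
type sizeCheck struct {
	name     string
	declared uintptr
	actual   uintptr
}

// verifySizes returns an error naming the first stale constant, or nil.
func verifySizes(checks []sizeCheck) error {
	for _, c := range checks {
		if c.declared != c.actual {
			return fmt.Errorf("%s = %d, but unsafe.Sizeof reports %d",
				c.name, c.declared, c.actual)
		}
	}
	return nil
}

func main() {
	checks := []sizeCheck{
		{"_allocExample", _allocExample, unsafe.Sizeof(exampleValue{})},
	}
	// Fail loudly at startup if a struct grew without its constant being updated.
	if err := verifySizes(checks); err != nil {
		panic(err)
	}
	fmt.Println("all allocation size constants match")
}
```

Adding a field to `exampleValue` without updating `_allocExample` makes the program panic immediately, so gas is never silently under-charged.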
Replace the linear per-byte gas model (1 gas/byte) with a calibrated power-law model based on actual Go malloc benchmarks on a dedicated DigitalOcean instance (Intel Xeon 8168, 2-core).
The new allocGasTable uses 6 exact benchmark points (1B-32B) plus a power-law fit (ns = 0.47 × size^0.925) for larger sizes. At runtime, allocGas() does O(1) lookup via bits.Len64 + linear interpolation.
Key changes:
- Small allocs (e.g. 208B struct) drop from 208 gas to ~8 gas
- Large allocs use sublinear scaling instead of linear
- Allocation-heavy tests see 50-98% gas reduction
- CPU/store-dominated tests change <2%
Adds gnovm/cmd/calibrate/ with benchmarks, data, and tooling for recalibration on different hardware.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
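The lookup described here (a table indexed by `bits.Len64`, with linear interpolation inside each power-of-two size class) can be sketched as below. This is not the PR's `allocGasTable`. The table is generated from the quoted power-law fit alone, the six exact small-size benchmark points are not reproduced, and costs are kept in picoseconds because the ns-to-gas conversion is not given in the commit message. The names `costTable`, `numClasses`, and `allocCostPs` are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

const numClasses = 32 // size classes 2^0 .. 2^31 bytes

// costTable[i] is the modelled cost, in picoseconds, of allocating 2^i bytes.
var costTable [numClasses]uint64

func init() {
	for i := range costTable {
		size := math.Ldexp(1, i)           // 2^i
		ns := 0.47 * math.Pow(size, 0.925) // power-law fit quoted in the commit
		costTable[i] = uint64(math.Round(ns * 1000))
	}
}

// allocCostPs returns the interpolated cost for size bytes in O(1).
func allocCostPs(size uint64) uint64 {
	if size == 0 {
		return 0
	}
	i := bits.Len64(size) - 1 // floor(log2(size)) selects the size class
	if i >= numClasses-1 {
		return costTable[numClasses-1] // clamp beyond the last class
	}
	base := uint64(1) << uint(i)
	lo, hi := costTable[i], costTable[i+1]
	// Interpolate lo + (hi-lo)*(size-base)/base using a 128-bit product
	// so the multiplication cannot overflow for large classes.
	prodHi, prodLo := bits.Mul64(hi-lo, size-base)
	frac, _ := bits.Div64(prodHi, prodLo, base)
	return lo + frac
}

func main() {
	for _, size := range []uint64{1, 32, 208, 4096, 1 << 20} {
		fmt.Printf("%8d bytes -> %d ps\n", size, allocCostPs(size))
	}
}
```

Because the exponent is below 1, the modelled cost grows more slowly than the allocation size, which is the sublinear scaling the commit refers to.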
shlAssign/shrAssign allocated closures on every call for overflow checking that only runs during preprocessing (StagePre). This caused unnecessary GC pressure at runtime.
- Remove checkOverflow closure pattern; inline StagePre guard
- Add shlCheckOverflow helper that short-circuits for shift > 64 (any non-zero value overflows any fixed-width type)
- Cap UntypedBigintType shifts at 10000 bits to prevent DoS via huge big.Int allocations (e.g., 1 << 1_000_000_000)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
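The two guards described in this commit can be sketched as follows. The function names `shlOverflows` and `shlUntyped` and the constant name `maxUntypedShift` are hypothetical, and the overflow check covers unsigned values only, which is a simplification of whatever shlCheckOverflow does.

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
	"math/bits"
)

// maxUntypedShift is the 10000-bit cap quoted in the commit message.
const maxUntypedShift = 10000

var errShiftTooLarge = errors.New("shift exceeds limit for untyped bigint")

// shlOverflows reports whether v << shift no longer fits in an unsigned
// integer of the given bit width. Signed types are left out of this sketch.
func shlOverflows(v uint64, shift, width uint) bool {
	if v == 0 {
		return false // zero never overflows
	}
	if shift > 64 {
		return true // short-circuit: any non-zero value overflows
	}
	return uint(bits.Len64(v))+shift > width
}

// shlUntyped shifts an arbitrary-precision value, rejecting oversized
// shifts before any big.Int memory is allocated for the result.
func shlUntyped(x int64, shift uint) (*big.Int, error) {
	if shift > maxUntypedShift {
		return nil, errShiftTooLarge
	}
	return new(big.Int).Lsh(big.NewInt(x), shift), nil
}

func main() {
	fmt.Println(shlOverflows(255, 1, 8), shlOverflows(127, 1, 8))
	_, err := shlUntyped(1, 1_000_000_000)
	fmt.Println(err)
}
```

Checking the shift amount first means a request such as `1 << 1_000_000_000` is rejected without the VM ever building the roughly 125 MB result.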
…audit
Add comprehensive microbenchmarks for gas calibration:
- bench_ops_test.go: 234 benchmarks covering all 90 GnoVM op handlers, parameterized at powers of 10 (1, 10, 100, 1000, 10000)
- bench_gc_test.go: GC visitor traversal benchmarks with realistic object graphs (100 to 10M objects), reports ns/visit for VisitCpuFactor calibration
- op_handler_gas_audit.md: audit of every doOpXxx handler documenting cost-varying parameters, pessimistic inputs, and benchmark coverage gaps
- benchops: add OpAccumDur/OpCount accessor functions
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace flat VisitCpuFactor=8 with gcVisitGasTable, a 25-entry lookup indexed by log2(visitCount). Per-visit cost scales with heap size due to CPU cache effects: 6 gas/visit for small heaps (L2-resident) up to 135 gas/visit for huge heaps (DRAM+TLB bound). Calibrated from BenchmarkGCVisit on DigitalOcean Dedicated 2-core, Intel Xeon Platinum 8168 @ 2.70GHz, cpuBaseNs=5.2. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
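A lookup indexed by log2 of the visit count might be structured like this sketch. Only the endpoints (6 and 135 gas/visit) and the 25-entry size come from the commit message. The intermediate table values, the table name `visitGasPerVisit`, and the choice to charge `visits * perVisit` are illustrative assumptions, not the calibrated `gcVisitGasTable`.

```go
package main

import (
	"fmt"
	"math/bits"
)

// visitGasPerVisit[i] is the gas charged per visit when the visit count is
// in [2^i, 2^(i+1)). Values between the endpoints are made up for illustration.
var visitGasPerVisit = [25]int64{
	6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, // small heaps, cache-resident
	8, 10, 14, 20, 28, 40, 56, 76, 98, 118, 135, // growing cache and TLB misses
}

// gcVisitGas returns the total gas for a GC pass that visits the given
// number of objects.
func gcVisitGas(visits int64) int64 {
	if visits <= 0 {
		return 0
	}
	idx := bits.Len64(uint64(visits)) - 1 // floor(log2(visits))
	if idx >= len(visitGasPerVisit) {
		idx = len(visitGasPerVisit) - 1 // clamp for very large heaps
	}
	return visits * visitGasPerVisit[idx]
}

func main() {
	for _, n := range []int64{1, 1000, 1 << 20, 1 << 24} {
		fmt.Printf("%9d visits -> %d gas\n", n, gcVisitGas(n))
	}
}
```

Compared with a flat factor, the per-visit price rises with heap size, so a pass over a huge heap is charged for the cache misses it actually incurs.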
Add parameterized benchmarks for accurate gas calibration:
- String ops (Add, Eql, Lss, Convert, Index1_MapStringKey, Slice) × 4 lengths
- BigInt ops (14 ops) × 4 bit-lengths (64, 256, 1024, 4096)
- BigDec ops (7 ops) × 4 precisions (10, 100, 1000, 10000 digits)
- Byte-array variants (Eql, ArrayLit, Index1, Slice) × 4 sizes
- Asymmetric BigInt benchmarks (11 cross-size combinations)
Add bytes.Equal fast path for byte array comparison in isEql, replacing element-by-element DataByteValue wrapping (~850x speedup).
Total benchmark count: 337.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
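The byte-array fast path can be pictured with the sketch below. The `arrayValue` type is a hypothetical simplification of a dual-representation array (compact bytes or a generic element list), with `int` standing in for the generic element type. It is not the GnoVM ArrayValue or the real isEql.

```go
package main

import (
	"bytes"
	"fmt"
)

// arrayValue is a hypothetical array with two representations:
// Data for compact byte arrays, List for everything else.
type arrayValue struct {
	Data []byte
	List []int
}

func (a arrayValue) length() int {
	if a.Data != nil {
		return len(a.Data)
	}
	return len(a.List)
}

func (a arrayValue) at(i int) int {
	if a.Data != nil {
		return int(a.Data[i])
	}
	return a.List[i]
}

// arrayEql compares two arrays, using bytes.Equal when both sides are
// stored as raw bytes and falling back to element-wise comparison otherwise.
func arrayEql(a, b arrayValue) bool {
	if a.Data != nil && b.Data != nil {
		return bytes.Equal(a.Data, b.Data) // fast path: one optimized call
	}
	if a.length() != b.length() {
		return false
	}
	for i := 0; i < a.length(); i++ {
		if a.at(i) != b.at(i) {
			return false
		}
	}
	return true
}

func main() {
	x := arrayValue{Data: []byte("gno")}
	y := arrayValue{Data: []byte("gno")}
	fmt.Println(arrayEql(x, y))
}
```

The fast path avoids wrapping each byte in a value object before comparing, which is where the commit attributes its speedup.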
Add benchmarks for call dispatch variants:
- BenchmarkOpPrecall_BoundMethod (method precall via BoundMethodValue)
- BenchmarkOpCall_Method (doOpCall with receiver)
- BenchmarkOpIfCond_FalseBranch (else branch dispatch)
- BenchmarkOpSelector_VPInterface_{1,10,100} (interface method dispatch)
- BenchmarkOpTypeAssert2_Interface_{Hit,Miss} (interface type assertion)
- BenchmarkOpTypeSwitch_Interface_{1,10,100} (interface type switch)
Extract shared helpers to reduce duplication:
- benchInterfaceAndImpl: builds interface + implementing type
- benchMethodSetup: builds method type/value/receiver
Fix pre-existing issues:
- Hoist FieldType construction out of hot loop in benchOpStructType
and benchOpInterfaceType (was rebuilding strings + structs per iter)
- Standardize method naming in benchOpTypeAssert1_Interface (M%d)
- Remove redundant BenchmarkOpCall_Closure (duplicate of benchOpCall)
Total benchmark count: 350.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add benchmark findings section to op_handler_gas_audit.md with data on byte array equality fix (850x), BigInt asymmetric costs, BigDec precision scaling, interface method dispatch scaling, GC visit cache effects, and ArrayValue dual representation. Mark completed items in GAS_TODO.md and MEM_TODO.md. Add BENCH.md noting benchops superseded by Go-level benchmarks. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…on data
Add allocation gas metering to op handler benchmarks: benchMachine() now creates a real Allocator with a GasMeter so reportBenchops() can report exact alloc-gas/op alongside ns/op(pure). This enables precise separation of CPU gas from allocation gas in calibration analysis.
Save DO Xeon 8168 benchmark results and analysis report with per-op alloc gas breakdown, proposed gas formulas, and mismatch identification.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ation
Add higher-N benchmarks for parameterized ops to improve least-squares fit accuracy: Define, Assign, StructType, InterfaceType, TypeSwitch, TypeSwitch_Interface, ReturnCallDefers, Call (params and captures) at N=1000.
Add StructLitNamed benchmarks for the named-field code path.
Update analysis report with worst-case input verification section. Update DO Xeon benchmark data with alloc-gas/op metrics.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
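The least-squares fit mentioned above recovers a `base + slope * N` cost model from measurements at several values of N. The PR's analysis tooling is written in Python (`gen_analysis.py`). The Go sketch below only illustrates the ordinary least-squares arithmetic, with a hypothetical `fitLine` function and synthetic data that are not benchmark results.

```go
package main

import "fmt"

// fitLine returns the base and slope that minimize the squared error of
// y = base + slope*n over the given points. ok is false if no fit exists.
func fitLine(ns, ys []float64) (base, slope float64, ok bool) {
	if len(ns) != len(ys) || len(ns) < 2 {
		return 0, 0, false
	}
	var sumN, sumY, sumNN, sumNY float64
	for i := range ns {
		sumN += ns[i]
		sumY += ys[i]
		sumNN += ns[i] * ns[i]
		sumNY += ns[i] * ys[i]
	}
	k := float64(len(ns))
	den := k*sumNN - sumN*sumN
	if den == 0 {
		return 0, 0, false // all N identical: the slope is undetermined
	}
	slope = (k*sumNY - sumN*sumY) / den
	base = (sumY - slope*sumN) / k
	return base, slope, true
}

func main() {
	// Synthetic points generated from y = 37 + 9*N.
	ns := []float64{1, 10, 100, 1000}
	ys := []float64{46, 127, 937, 9037}
	base, slope, ok := fitLine(ns, ys)
	fmt.Println(base, slope, ok)
}
```

Measuring at a larger N such as 1000 gives the fit a longer lever arm, so noise in the small-N points has less influence on the estimated slope.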
Replace fixed BenchmarkOpFuncType with benchOpFuncType(b, nParams, nResults) measuring params and results independently at 0/1/10/100/1000 points each. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…arameterized ops
Calibrate all 68 OpCPU* constants against Xeon 8168 benchmarks (cpuBaseNs=5.2). Add per-N CPU gas charging in 20 parameterized op handlers so gas scales with input size (e.g., map entries, struct fields, array elements, function params).
Key changes:
- machine.go: update flat constants from benchmarks, add 22 slope constants
- op_assign.go: Define (15/LHS), Assign (17/LHS)
- op_expressions.go: ArrayLit (9/elt), SliceLit (4/elt), SliceLit2 (5/size), MapLit (60/entry), StructLit (9/field), FuncLit (7/capture), TypeAssert1/2 (67/method), Convert str↔runes (3/2 per char)
- op_call.go: Call (9/param + 5/capture)
- op_exec.go: TypeSwitch (49/clause), ForLoop heap (8/var), RangeIter (2/elt)
- op_binary.go: Eql array (27/elt), Eql struct (26/field)
- op_types.go: FuncType (4/param+result), StructType (6/field), InterfaceType (5/method)
- op_decl.go: ValueDecl (6/name)
- Add calibrate.analysis and calibrate.plot Makefile targets
- Add gen_analysis.py, plot_fits.py, op_gas_formulas.md
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
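Per-N charging amounts to adding a slope term to each op's flat cost. The sketch below uses the ArrayLit figures quoted in this PR (base 37, 9 per element). The `gasMeter` type, the constant names, and the `chargeArrayLit` helper are hypothetical and do not reflect the GnoVM machine API.

```go
package main

import (
	"errors"
	"fmt"
)

var errOutOfGas = errors.New("out of gas")

// gasMeter is a hypothetical minimal meter with a fixed limit.
type gasMeter struct {
	limit, used int64
}

// consume charges amount, or fails without charging if the limit would be exceeded.
func (m *gasMeter) consume(amount int64) error {
	if amount < 0 || amount > m.limit-m.used {
		return errOutOfGas
	}
	m.used += amount
	return nil
}

const (
	opCPUArrayLit      = 37 // flat base cost (figure from the PR summary)
	opCPUSlopeArrayLit = 9  // additional gas per array element
)

// chargeArrayLit charges base + slope*N for an array literal of n elements.
func chargeArrayLit(m *gasMeter, n int) error {
	return m.consume(opCPUArrayLit + opCPUSlopeArrayLit*int64(n))
}

func main() {
	m := &gasMeter{limit: 1000}
	fmt.Println(chargeArrayLit(m, 10), m.used)   // 37 + 9*10 = 127 gas
	fmt.Println(chargeArrayLit(m, 1000), m.used) // 9037 gas exceeds the limit
}
```

With a flat cost alone, a 1000-element literal would be charged the same as a 1-element one. The slope term ties the charge to the work the handler performs.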
…r depth
Add calibrated per-bit-width gas charging for BigInt and BigDec arithmetic operations, and per-depth charging for NameExpr block traversal. BigInt uses per-kilobit slopes, BigDec uses per-100-digit slopes, and Mul/Quo use quadratic models.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…as target
Add per-N gas charging for remaining BigInt ops:
- Rem: quadratic model (same as Mul/Quo)
- Shl: per-kilobit of shift amount (output growth)
- Shr: per-kilobit of input bit width
Add unified `make calibrate.gas` target that runs op benchmarks, allocation benchmarks, analysis, and plot generation end-to-end. Also add `make calibrate.alloc` for allocation-only calibration.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The compound assignment handlers (+=, -=, *=, etc.) in op_assign.go were missing per-N BigInt/BigDec gas charging, allowing users to bypass per-N gas by using `x += bigval` instead of `x = x + bigval`. Also: use GetBigInt()/GetBigDec() consistently in unary helpers, use max() builtin, remove stale Makefile dependency. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The 3B gas-wanted was a leftover from an intermediate state where the loop reduction (100→2 iterations) landed before alloc gas recalibration. Actual gas used is ~22M, so 100M provides adequate headroom. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update expected gas values in txtar tests after per-N gas charging for BigInt/BigDec ops and compound assign handlers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixes thelper lint errors in bench_ops_test.go (71 functions), bench_gc_test.go (1 function), and alloc_bench_test.go (1 function). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- … asserts, conversions, equality, BigInt/BigDec arithmetic, NameExpr depth, compound assigns)
- Replace the linear per-byte allocation gas model (1 gas/byte) with a calibrated size-class lookup table (`allocGasTable`)
- Replace the flat GC visit factor (`VisitCpuFactor = 8`) with a cache-aware lookup table (`gcVisitGasTable`)

Depends on #5091 and #5289
Gas impact
Tests: `gas/const.gno`, `gas/nested_alloc.gno`, `gas/slice_alloc.gno`

The old model massively overcharged allocation gas: a 208-byte struct cost 208 gas (~624 ns equivalent), while the actual malloc takes ~30 ns. The new model charges ~8 gas for the same allocation. CPU base costs decreased because the old values were overestimated, but per-N slopes were added to prevent DoS via large inputs (big arrays, deep scopes, wide BigInts). GC visit gas increased for large heaps to model cache effects (6 gas/visit in L2 vs. 135 gas/visit when hitting DRAM).
Per-N charging formulas
- `base + slope * N` (e.g., ArrayLit: 37 + 9/element)
- `bits * slope / 1024` (e.g., Add: 9/kb)
- `(bits/32)^2 * slope / 32` (e.g., Mul, Rem: slope=1)
- `digits * slope / 100` (e.g., Add: 72/100digits)
- `(digits/10)^2 * slope / 10` (e.g., Mul, Quo: slope=1)
- `1 * block_depth`

Benchmarks (5,500+ lines)
- `bench_ops_test.go`: 100+ parameterized op handler benchmarks covering all arithmetic, comparison, assignment, literal construction, type operations, conversions, BigInt/BigDec at multiple bit widths, and NameExpr depth scaling
- `bench_gc_test.go`: GC visit cost benchmarks with realistic object graphs at varying heap sizes (1K to 16M objects)
- `cmd/calibrate/alloc_bench_test.go`: Go heap allocation cost benchmarks across 32 size classes (1B to 1GB)

Calibration tooling
- `cmd/calibrate/gen_analysis.py`: Parses op benchmark output, fits linear/quadratic models, computes calibrated OpCPU constants with R^2 validation and worst-case verification
- `cmd/calibrate/gen_alloc_table.py`: Fits power-law model to allocation benchmarks, generates `allocGasTable` lookup table
- `cmd/calibrate/plot_fits.py`: Generates multi-panel matplotlib visualizations of parameterized op cost fits
- `cmd/calibrate/op_gas_formulas.md`: Documents gas formula for every parameterized op

Makefile targets
- `make calibrate.gas`: full end-to-end run (op benchmarks + alloc benchmarks + analysis + plots)
- `make calibrate.alloc`: allocation benchmarks + table generation
- `make calibrate.analysis`: run analysis on existing benchmark data
- `make calibrate.plot`: generate fit visualizations

Test plan
- `go test ./gnovm/pkg/gnolang/ -run TestFiles -test.short`: all gas golden tests pass
- `go test ./gno.land/pkg/integration/ -run TestTestdata`: all integration tests pass
- `go test ./gno.land/pkg/sdk/vm/ -run TestAddPkg`: gas_test.go passes

🤖 Generated with Claude Code