# Comparative Artifact Performance Evaluation for UPLC programs
A framework for measuring and comparing UPLC programs generated by different Cardano smart contract compilers.
- Overview
- Quick Start
- Live Performance Reports
- Available benchmark scenarios
- Usage (CLI)
- Creating a Submission
- Metrics Explained
- Project Structure
- Resources
- Version and Tooling Requirements
- Development
- Documentation (ADRs)
- Contributing
- License
- Acknowledgments
## Overview

UPLC-CAPE provides a structured, reproducible way for authors and users of Cardano UPLC compilers to:
- Benchmark compiler UPLC output against standardized scenarios
- Compare results across compilers and versions
- Track optimization progress over time
- Share results with the community
Key properties:
- Consistent benchmarks and metrics (CPU units, memory units, script size, term size)
- Reproducible results with versioned scenarios and metadata
- Automation-ready structure for future tooling
## Quick Start

Prerequisites:

- Nix with flakes enabled
- Git

```bash
# Clone and enter repository
git clone https://github.com/IntersectMBO/UPLC-CAPE.git
cd UPLC-CAPE

# Enter development environment
nix develop

# Or, if using direnv (recommended)
direnv allow

# Verify CLI
scripts/cape.sh --help

# Or use the cape shim if available in PATH
cape --help
```

```bash
# List available benchmarks
cape benchmark list

# View a specific benchmark
cape benchmark fibonacci
cape benchmark two_party_escrow

# Generate JSON statistics for all benchmarks
cape benchmark stats

# Create a submission for your compiler
cape submission new fibonacci MyCompiler 1.0.0 myhandle
cape submission new two_party_escrow MyCompiler 1.0.0 myhandle
```

## Live Performance Reports

Latest benchmark reports: UPLC-CAPE Reports
Pull requests that modify submission data automatically get isolated preview sites for review:
- Preview URL pattern: `https://intersectmbo.github.io/UPLC-CAPE/pr-<number>/`
- Example: PR #42 → `https://intersectmbo.github.io/UPLC-CAPE/pr-42/`
- Trigger conditions: previews are only generated when `.uplc` or `metadata.json` files change in the `submissions/` directory
- Automatic updates: the preview refreshes on every push to the PR branch
- Automatic cleanup: the preview is removed when the PR is closed or merged
- Comment notification: a sticky comment appears on the PR with the direct preview link

Note: PRs that only modify documentation, README files, or code outside `submissions/` will not trigger preview generation.
For implementation details, see ADR: PR Preview Deployment.
## Available benchmark scenarios

| Benchmark | Type | Description | Status |
|---|---|---|---|
| Fibonacci | Synthetic | Recursive algorithm performance | Ready |
| Fibonacci (Naive Recursion) | Synthetic | Prescribed naive recursive algorithm for compiler optimization comparison | Ready |
| Factorial | Synthetic | Recursive algorithm performance | Ready |
| Factorial (Naive Recursion) | Synthetic | Prescribed naive recursive algorithm for compiler optimization comparison | Ready |
| Two-Party Escrow | Real-world | Smart contract escrow validator | Ready |
| Streaming Payments | Real-world | Payment channel implementation | Planned |
| Simple DAO Voting | Real-world | Governance mechanism | Planned |
| Time-locked Staking | Real-world | Staking protocol | Planned |
## Usage (CLI)

For the full and up-to-date command reference, see USAGE.md.
```bash
# Benchmarks
cape benchmark list              # List all benchmarks
cape benchmark <name>            # Show benchmark details
cape benchmark stats             # Generate JSON statistics for all benchmarks
cape benchmark new <name>        # Create a new benchmark from template

# Submissions
cape submission list             # List all submissions
cape submission list <name>      # List submissions for a benchmark
cape submission new <benchmark> <compiler> <version> <handle>
cape submission verify           # Verify correctness and validate schemas
cape submission measure          # Measure UPLC performance
cape submission aggregate        # Generate CSV performance report
cape submission report <name>    # Generate HTML report for a benchmark
cape submission report --all     # Generate HTML reports for all benchmarks
```

The `cape benchmark stats` command generates comprehensive JSON data for all benchmarks:

```bash
# Output JSON statistics to console
cape benchmark stats

# Save to file
cape benchmark stats > stats.json

# Use with jq for filtering
cape benchmark stats | jq '.benchmarks[] | select(.submission_count > 0)'
```

The output includes formatted metrics, best value indicators, and submission metadata, making it ideal for generating custom reports or integrating with external tools.
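A hypothetical follow-up query, assuming each benchmark entry exposes a `name` field alongside the `submission_count` field used above:

```bash
# Print "benchmark: submission count" pairs (field names are assumptions)
cape benchmark stats | jq -r '.benchmarks[] | "\(.name): \(.submission_count)"'
```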
## Creating a Submission

1. Choose a benchmark

   ```bash
   cape benchmark list
   cape benchmark fibonacci
   ```

2. Create the submission structure

   ```bash
   cape submission new fibonacci MyCompiler 1.0.0 myhandle
   # → submissions/fibonacci/MyCompiler_1.0.0_myhandle/
   ```

3. Add your UPLC program

   - Replace the placeholder UPLC with your fully applied program (no parameters).
   - Path: `submissions/fibonacci/MyCompiler_1.0.0_myhandle/fibonacci.uplc`
   - The program should compute the scenario's required result deterministically within budget.
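   For format intuition only: a fully applied program takes no arguments and evaluates directly to its result. The sketch below hand-writes a constant program returning fibonacci(25) = 75025; a real submission must be your compiler's output implementing the scenario, not a precomputed constant:

   ```bash
   # Illustrative only: shows the fully-applied UPLC shape, not a valid entry
   cat > submissions/fibonacci/MyCompiler_1.0.0_myhandle/fibonacci.uplc <<'EOF'
   (program 1.1.0 (con integer 75025))
   EOF
   ```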
4. Provide metadata

   Create `metadata.json` according to `submissions/TEMPLATE/metadata.schema.json` (see also `metadata-template.json`):

   ```json
   {
     "compiler": {
       "name": "MyCompiler",
       "version": "1.0.0",
       "commit_hash": "a1b2c3d4e5f6789012345678901234567890abcd"
     },
     "compilation_config": {
       "target": "uplc",
       "optimization_level": "O2",
       "flags": ["--inline-functions", "--optimize-recursion"]
     },
     "contributors": [
       {
         "name": "myhandle",
         "organization": "MyOrganization",
         "contact": "myhandle@example.com"
       }
     ],
     "submission": {
       "date": "2025-01-15T00:00:00Z",
       "source_available": true,
       "source_repository": "https://github.com/myorg/mycompiler-submissions",
       "source_commit_hash": "9876543210fedcba9876543210fedcba98765432",
       "implementation_notes": "Optimized recursive implementation using memoization. See source/ directory for full code and build instructions."
     }
   }
   ```

   For reproducibility, include:

   - `compiler.commit_hash`: the exact compiler version used
   - `submission.source_repository` and `submission.source_commit_hash`: a link to the source code at the exact commit
5. Verify and measure

   Use the unified verification command to ensure your submission is correct and schema-compliant, then measure performance.

   - Verify correctness and JSON schemas (all submissions or a path):

     ```bash
     cape submission verify submissions/fibonacci/MyCompiler_1.0.0_myhandle
     # or, verify everything
     cape submission verify --all
     ```

   - Measure and write metrics.json automatically:

     - Measure all .uplc files under a path (e.g., your submission directory):

       ```bash
       cape submission measure submissions/fibonacci/MyCompiler_1.0.0_myhandle
       # or, from inside the submission directory
       cape submission measure .
       ```

     - Measure every submission under `submissions/`:

       ```bash
       cape submission measure --all
       ```
   What verification does:

   - Evaluates your UPLC program; if it reduces to `BuiltinUnit`, correctness passes
   - Otherwise, runs the comprehensive test suite defined in `scenarios/{benchmark}/cape-tests.json`
   - Validates your `metrics.json` and `metadata.json` against schemas
   What measure does automatically:

   - Measures CPU units, memory units, script size, and term size for your `.uplc` file(s)
   - Generates or updates `metrics.json` with scenario, measurements, evaluator, and timestamp
   - Keeps your existing `notes` and `version` if present; otherwise fills sensible defaults
   - Works for a single file, a directory, or all submissions with `--all`
   - Produces output that validates against `submissions/TEMPLATE/metrics.schema.json`
   Aggregation strategies: the `measure` tool runs multiple test cases per program and provides several aggregation methods for CPU and memory metrics:

   - `maximum`: peak resource usage across all test cases (useful for identifying worst-case performance)
   - `sum`: total computational work across all test cases (useful for overall efficiency comparison)
   - `minimum`: best-case resource usage (useful for identifying optimal performance)
   - `median`: typical resource usage (useful for understanding normal performance)
   - `sum_positive`: total resources for successful test cases only (valid execution cost)
   - `sum_negative`: total resources for failed test cases only (error-handling cost)

   Higher-level tooling can extract the most relevant aggregation for specific analysis needs; a small worked example follows.
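   As a hedged illustration (not tool output), take three hypothetical test cases costing 100, 250, and 175 CPU units, where only the 250-unit case fails. The jq program below reproduces each aggregation from those values:

   ```bash
   # Worked example of the aggregation methods over made-up per-test costs
   echo '[{"cpu":100,"ok":true},{"cpu":250,"ok":false},{"cpu":175,"ok":true}]' |
     jq 'map(.cpu) as $c
         | ($c | sort) as $s
         | ($s | length / 2 | floor) as $m
         | {maximum: ($c | max),
            sum: ($c | add),
            minimum: ($c | min),
            median: $s[$m],
            sum_positive: ([.[] | select(.ok) | .cpu] | add),
            sum_negative: ([.[] | select(.ok | not) | .cpu] | add)}'
   # => {"maximum":250,"sum":525,"minimum":100,"median":175,
   #     "sum_positive":275,"sum_negative":250}
   ```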
   Resulting file example:

   ```json
   {
     "scenario": "fibonacci",
     "version": "1.0.0",
     "measurements": {
       "cpu_units": {
         "maximum": 185916,
         "sum": 185916,
         "minimum": 185916,
         "median": 185916,
         "sum_positive": 185916,
         "sum_negative": 0
       },
       "memory_units": {
         "maximum": 592,
         "sum": 592,
         "minimum": 592,
         "median": 592,
         "sum_positive": 592,
         "sum_negative": 0
       },
       "script_size_bytes": 1234,
       "term_size": 45
     },
     "evaluations": [
       {
         "name": "fibonacci_25_computation",
         "description": "Pre-applied fibonacci(25) should return 75025",
         "cpu_units": 185916,
         "memory_units": 592,
         "execution_result": "success"
       }
     ],
     "execution_environment": {
       "evaluator": "plutus-core-executable-1.52.0.0"
     },
     "timestamp": "2025-01-15T00:00:00Z",
     "notes": "Optional notes."
   }
   ```
6. Document

   - Add notes to the README.md inside your submission folder (implementation choices, optimizations, caveats).
## Metrics Explained

UPLC-CAPE collects both raw measurements (CPU, memory, script size, term size) and derived metrics (fees, budget utilization, capacity).
Quick Reference:
| Metric | Description | Type |
|---|---|---|
| CPU Units | Computational cost (CEK machine steps) | Raw measurement |
| Memory Units | Memory consumption (CEK machine memory) | Raw measurement |
| Script Size | Serialized UPLC size (bytes) | Raw measurement |
| Term Size | AST complexity (node count) | Raw measurement |
| Execution Fee | Runtime cost in lovelace | Derived (Conway) |
| Reference Script Fee | Storage cost in lovelace (tiered) | Derived (Conway) |
| Total Fee | Combined execution + storage cost | Derived (Conway) |
| Budget Utilization | % of tx/block budgets consumed | Derived (Conway) |
| Capacity (tx/block) | Max script executions per tx/block | Derived (Conway) |
For comprehensive metrics documentation, see doc/metrics.md.
This includes detailed formulas, protocol parameters, aggregation strategies, and interpretation guidelines.
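As a hedged worked example of the derived execution fee (the authoritative formulas and protocol parameters live in doc/metrics.md), using the long-standing mainnet prices of 0.0000721 lovelace per CPU unit and 0.0577 lovelace per memory unit:

```bash
# Execution fee for the fibonacci example above (prices are assumptions;
# always take current protocol parameters from doc/metrics.md or the chain)
cpu=185916; mem=592
echo "$cpu * 0.0000721 + $mem * 0.0577" | bc
# => 47.5629436 lovelace, before the transaction's base and size fees
```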
## Project Structure

```
UPLC-CAPE/
├── scenarios/                  # Benchmark specifications
│   ├── TEMPLATE/               # Template for new scenarios
│   ├── fibonacci.md
│   ├── factorial.md
│   └── two_party_escrow.md
├── submissions/                # Compiler submissions (per scenario)
│   ├── TEMPLATE/               # Templates and schemas
│   │   ├── metadata.schema.json
│   │   ├── metadata-template.json
│   │   ├── metrics.schema.json
│   │   └── metrics-template.json
│   ├── fibonacci/
│   │   └── MyCompiler_1.0.0_handle/
│   └── two_party_escrow/
│       └── MyCompiler_1.0.0_handle/
├── scripts/                    # Project CLI tooling
│   ├── cape.sh                 # Main CLI
│   └── cape-subcommands/       # Command implementations
├── lib/                        # Haskell library code (validators, fixtures, utilities)
├── measure-app/                # UPLC program measurement tool
├── plinth-submissions-app/     # Plinth submission generator
├── test/                       # Test suites
├── report/                     # Generated HTML reports and assets
├── doc/                        # Documentation
│   ├── domain-model.md
│   └── adr/
└── README.md
```
## Version and Tooling Requirements

- Development environment: Nix shell (`nix develop`) with optional direnv (`direnv allow`).
- GHC: 9.6.7 (provided in the Nix shell).
- Plutus Core target: 1.1.0.
  - Use `plcVersion110` (for Haskell/PlutusTx code).
- Package baselines (CHaP):
  - plutus-core >= 1.45.0.0
  - plutus-tx >= 1.45.0.0
  - plutus-ledger-api >= 1.45.0.0
  - plutus-tx-plugin >= 1.45.0.0
## Development

Enter the environment:

```bash
nix develop
# or
direnv allow
```

Common tools:

- `cape ...` (project CLI)
- `cabal build` (builds all Haskell components: library, executables, tests)
- `treefmt` (formats all files, including UPLC)
- `fourmolu` (Haskell formatting)
- `pretty-uplc` (UPLC pretty-printing)
- `adr` (Architecture Decision Records)
- `mmdc -i file.mmd` (diagram generation, if available)

UPLC files can be pretty-printed for improved readability:

```bash
# Format a single UPLC file in place
pretty-uplc submissions/fibonacci/MyCompiler_1.0.0_handle/fibonacci.uplc

# Format all UPLC files (and other files) via treefmt
treefmt
```

The `treefmt` command automatically formats all supported file types, including UPLC files (`.uplc`). The pretty-printing uses the `plutus` executable from the Plutus repository and is available in the Nix development shell.
## Documentation (ADRs)

ADRs document important design decisions (managed with Log4brains).

Helpful commands:

```bash
adr new "Decision Title"
adr preview
adr build
adr help
```

## Contributing

We welcome contributions from compiler authors, benchmark designers, and researchers.
- Add a new benchmark:

  ```bash
  cape benchmark new my-new-benchmark
  # edit scenarios/my-new-benchmark.md
  ```

- Add a submission:

  ```bash
  cape submission new existing-benchmark MyCompiler 1.0.0 myhandle
  # fill in the .uplc and .json files, then open a PR
  ```
Please read CONTRIBUTING.md before opening a PR.
## License

Licensed under the Apache License 2.0. See LICENSE.
## Acknowledgments

- Plutus Core team for infrastructure and reference implementations
- Compiler authors and community contributors
