
UPLC-CAPE

Comparative Artifact Performance Evaluation for UPLC programs

UPLC-CAPE Logo




A framework for measuring and comparing UPLC programs generated by different Cardano smart contract compilers.





Overview

UPLC-CAPE provides a structured, reproducible way for Cardano UPLC compiler authors and users to:

  • Benchmark compiler UPLC output against standardized scenarios
  • Compare results across compilers and versions
  • Track optimization progress over time
  • Share results with the community

Key properties:

  • Consistent benchmarks and metrics (CPU units, memory units, script size, term size)
  • Reproducible results with versioned scenarios and metadata
  • Automation-ready structure for future tooling

Quick Start

Prerequisites

  • Nix with flakes enabled
  • Git

Setup

# Clone and enter repository
git clone https://github.com/IntersectMBO/UPLC-CAPE.git
cd UPLC-CAPE

# Enter development environment
nix develop
# Or, if using direnv (recommended)
direnv allow

# Verify CLI
scripts/cape.sh --help
# Or use the cape shim if available in PATH
cape --help

Your first benchmark

# List available benchmarks
cape benchmark list

# View a specific benchmark
cape benchmark fibonacci
cape benchmark two_party_escrow

# Generate JSON statistics for all benchmarks
cape benchmark stats

# Create a submission for your compiler
cape submission new fibonacci MyCompiler 1.0.0 myhandle
cape submission new two_party_escrow MyCompiler 1.0.0 myhandle

Live Performance Reports

Latest benchmark reports: UPLC-CAPE Reports

PR Preview Sites

Pull requests that modify submission data automatically get isolated preview sites for review:

  • Preview URL pattern: https://intersectmbo.github.io/UPLC-CAPE/pr-<number>/
  • Example: PR #42 → https://intersectmbo.github.io/UPLC-CAPE/pr-42/
  • Trigger conditions: Previews only generate when .uplc or metadata.json files change in the submissions/ directory
  • Automatic updates: Preview refreshes on every push to the PR branch
  • Automatic cleanup: Preview is removed when the PR is closed or merged
  • Comment notification: A sticky comment appears on the PR with the direct preview link

Note: PRs that only modify documentation, README files, or code outside submissions/ will not trigger preview generation.

For implementation details, see ADR: PR Preview Deployment.


Available benchmark scenarios

| Benchmark | Type | Description | Status |
|---|---|---|---|
| Fibonacci | Synthetic | Recursive algorithm performance | Ready |
| Fibonacci (Naive Recursion) | Synthetic | Prescribed naive recursive algorithm for compiler optimization comparison | Ready |
| Factorial | Synthetic | Recursive algorithm performance | Ready |
| Factorial (Naive Recursion) | Synthetic | Prescribed naive recursive algorithm for compiler optimization comparison | Ready |
| Two-Party Escrow | Real-world | Smart contract escrow validator | Ready |
| Streaming Payments | Real-world | Payment channel implementation | Planned |
| Simple DAO Voting | Real-world | Governance mechanism | Planned |
| Time-locked Staking | Real-world | Staking protocol | Planned |

Usage (CLI)

For the full and up-to-date command reference, see USAGE.md.

Core commands

# Benchmarks
cape benchmark list              # List all benchmarks
cape benchmark <name>            # Show benchmark details
cape benchmark stats             # Generate JSON statistics for all benchmarks
cape benchmark new <name>        # Create a new benchmark from template

# Submissions
cape submission list             # List all submissions
cape submission list <name>      # List submissions for a benchmark
cape submission new <benchmark> <compiler> <version> <handle>
cape submission verify           # Verify correctness and validate schemas
cape submission measure          # Measure UPLC performance
cape submission aggregate        # Generate CSV performance report
cape submission report <name>    # Generate HTML report for a benchmark
cape submission report --all     # Generate HTML reports for all benchmarks

JSON Statistics

The cape benchmark stats command generates comprehensive JSON data for all benchmarks:

# Output JSON statistics to console
cape benchmark stats

# Save to file
cape benchmark stats > stats.json

# Use with jq for filtering
cape benchmark stats | jq '.benchmarks[] | select(.submission_count > 0)'

The output includes formatted metrics, best value indicators, and submission metadata, making it ideal for generating custom reports or integrating with external tools.
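For example, a quick per-benchmark summary can be pulled out of the stats JSON with jq. The .benchmarks[] and .submission_count fields appear in the filter above; the .name field used here is an assumption about the output shape, so inspect the first element of your own output before relying on it:

# Hypothetical one-line summary per benchmark (confirm the real field names first:
#   cape benchmark stats | jq '.benchmarks[0]')
cape benchmark stats \
  | jq -r '.benchmarks[] | "\(.name): \(.submission_count) submission(s)"'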


Creating a Submission

  1. Choose a benchmark

    cape benchmark list
    cape benchmark fibonacci
  2. Create submission structure

    cape submission new fibonacci MyCompiler 1.0.0 myhandle
    # → submissions/fibonacci/MyCompiler_1.0.0_myhandle/
  3. Add your UPLC program

    • Replace the placeholder UPLC with your fully-applied program (no parameters); a format-only sketch follows this list.
    • Path:
      • submissions/fibonacci/MyCompiler_1.0.0_myhandle/fibonacci.uplc
    • The program should compute the scenario's required result deterministically within budget.
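    For orientation only, a fully-applied program in UPLC's textual syntax has the shape sketched below; the constant shown is a hypothetical placeholder, not a valid submission, and a real file contains your compiler's actual output:

      # Hypothetical illustration of the textual UPLC shape only; a real submission
      # is your compiler's fully-applied output, not a hand-written constant.
      echo '(program 1.1.0 (con integer 75025))' \
        > submissions/fibonacci/MyCompiler_1.0.0_myhandle/fibonacci.uplc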
  4. Provide metadata

    Create metadata.json according to submissions/TEMPLATE/metadata.schema.json (see also metadata-template.json).

    {
      "compiler": {
        "name": "MyCompiler",
        "version": "1.0.0",
        "commit_hash": "a1b2c3d4e5f6789012345678901234567890abcd"
      },
      "compilation_config": {
        "target": "uplc",
        "optimization_level": "O2",
        "flags": ["--inline-functions", "--optimize-recursion"]
      },
      "contributors": [
        {
          "name": "myhandle",
          "organization": "MyOrganization",
          "contact": "myhandle@example.com"
        }
      ],
      "submission": {
        "date": "2025-01-15T00:00:00Z",
        "source_available": true,
        "source_repository": "https://github.com/myorg/mycompiler-submissions",
        "source_commit_hash": "9876543210fedcba9876543210fedcba98765432",
        "implementation_notes": "Optimized recursive implementation using memoization. See source/ directory for full code and build instructions."
      }
    }

    For reproducibility, include:

    • compiler.commit_hash: Exact compiler version used
    • submission.source_repository and submission.source_commit_hash: Link to source code with exact commit
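    A minimal sketch of how these hashes can be captured (the local repository paths below are hypothetical):

      # Exact commit of the compiler you built with (path is an example)
      git -C ~/src/mycompiler rev-parse HEAD
      # Exact commit of the submission source repository
      git -C ~/src/mycompiler-submissions rev-parse HEAD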
  5. Verify and measure

    Use the unified verification command to ensure your submission is correct and schema-compliant, then measure performance.

    • Verify correctness and JSON schemas (all submissions or a path):

      cape submission verify submissions/fibonacci/MyCompiler_1.0.0_myhandle
      # or, verify everything
      cape submission verify --all
    • Measure and write metrics.json automatically:

      • Measure all .uplc files under a path (e.g., your submission directory):

        cape submission measure submissions/fibonacci/MyCompiler_1.0.0_myhandle
        # or, from inside the submission directory
        cape submission measure .
      • Measure every submission under submissions/:

        cape submission measure --all
    • What verification does:

      • Evaluates your UPLC program; if it reduces to BuiltinUnit, correctness passes
      • Otherwise, runs the comprehensive test suite defined in scenarios/{benchmark}/cape-tests.json
      • Validates your metrics.json and metadata.json against schemas
    • What measure does automatically:

      • Measures CPU units, memory units, script size, and term size for your .uplc file(s)
      • Generates or updates a metrics.json with scenario, measurements, evaluator, and timestamp
      • Keeps your existing notes and version if present; otherwise fills sensible defaults
      • Works for a single file, a directory, or all submissions with --all
      • Produces output that validates against submissions/TEMPLATE/metrics.schema.json
    • Aggregation Strategies: The measure tool now runs multiple test cases per program and provides several aggregation methods for CPU and memory metrics:

      • maximum: Peak resource usage across all test cases (useful for identifying worst-case performance)
      • sum: Total computational work across all test cases (useful for overall efficiency comparison)
      • minimum: Best-case resource usage (useful for identifying optimal performance)
      • median: Typical resource usage (useful for understanding normal performance)
      • sum_positive: Total resources for successful test cases only (valid execution cost)
      • sum_negative: Total resources for failed test cases only (error handling cost)

      Higher-level tooling can extract the most relevant aggregation for specific analysis needs.

    • Resulting file example:

      {
        "scenario": "fibonacci",
        "version": "1.0.0",
        "measurements": {
          "cpu_units": {
            "maximum": 185916,
            "sum": 185916,
            "minimum": 185916,
            "median": 185916,
            "sum_positive": 185916,
            "sum_negative": 0
          },
          "memory_units": {
            "maximum": 592,
            "sum": 592,
            "minimum": 592,
            "median": 592,
            "sum_positive": 592,
            "sum_negative": 0
          },
          "script_size_bytes": 1234,
          "term_size": 45
        },
        "evaluations": [
          {
            "name": "fibonacci_25_computation",
            "description": "Pre-applied fibonacci(25) should return 75025",
            "cpu_units": 185916,
            "memory_units": 592,
            "execution_result": "success"
          }
        ],
        "execution_environment": {
          "evaluator": "plutus-core-executable-1.52.0.0"
        },
        "timestamp": "2025-01-15T00:00:00Z",
        "notes": "Optional notes."
      }
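      Given a metrics.json of this shape, downstream tooling can extract whichever aggregation fits its analysis, for example:

        # Worst-case CPU cost of this submission (field names as in the example above)
        jq '.measurements.cpu_units.maximum' metrics.json

        # Total memory across successful test cases only
        jq '.measurements.memory_units.sum_positive' metrics.json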
  6. Document

    • Add notes to README.md inside your submission folder (implementation choices, optimizations, caveats).

Metrics Explained

UPLC-CAPE collects both raw measurements (CPU, memory, script size, term size) and derived metrics (fees, budget utilization, capacity).

Quick Reference:

| Metric | Description | Type |
|---|---|---|
| CPU Units | Computational cost (CEK machine steps) | Raw measurement |
| Memory Units | Memory consumption (CEK machine memory) | Raw measurement |
| Script Size | Serialized UPLC size (bytes) | Raw measurement |
| Term Size | AST complexity (node count) | Raw measurement |
| Execution Fee | Runtime cost in lovelace | Derived (Conway) |
| Reference Script Fee | Storage cost in lovelace (tiered) | Derived (Conway) |
| Total Fee | Combined execution + storage cost | Derived (Conway) |
| Budget Utilization | % of tx/block budgets consumed | Derived (Conway) |
| Capacity (tx/block) | Max script executions per tx/block | Derived (Conway) |

📖 For comprehensive metrics documentation, see doc/metrics.md

This includes detailed formulas, protocol parameters, aggregation strategies, and interpretation guidelines.
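As a rough sketch of how an execution fee is derived from the raw measurements, the arithmetic below uses the CPU and memory figures from the metrics example earlier; the two price constants are illustrative placeholders (roughly Conway-era mainnet values at the time of writing), and the authoritative formulas and protocol parameters are documented in doc/metrics.md:

# Execution fee ≈ cpu_units * price_per_cpu_step + memory_units * price_per_memory_unit
# Prices below are illustrative placeholders; consult doc/metrics.md for the real parameters.
cpu=185916
mem=592
echo "$cpu * 0.0000721 + $mem * 0.0577" | bc -l   # result in lovelace (illustrative)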


Project Structure

UPLC-CAPE/
├── scenarios/                    # Benchmark specifications
│   ├── TEMPLATE/                 # Template for new scenarios
│   ├── fibonacci.md
│   ├── factorial.md
│   └── two_party_escrow.md
├── submissions/                  # Compiler submissions (per scenario)
│   ├── TEMPLATE/                 # Templates and schemas
│   │   ├── metadata.schema.json
│   │   ├── metadata-template.json
│   │   ├── metrics.schema.json
│   │   └── metrics-template.json
│   ├── fibonacci/
│   │   └── MyCompiler_1.0.0_handle/
│   └── two_party_escrow/
│       └── MyCompiler_1.0.0_handle/
├── scripts/                      # Project CLI tooling
│   ├── cape.sh                   # Main CLI
│   └── cape-subcommands/         # Command implementations
├── lib/                          # Haskell library code (validators, fixtures, utilities)
├── measure-app/                  # UPLC program measurement tool
├── plinth-submissions-app/       # Plinth submission generator
├── test/                         # Test suites
├── report/                       # Generated HTML reports and assets
├── doc/                          # Documentation
│   ├── domain-model.md
│   └── adr/
└── README.md

Resources


Version and Tooling Requirements

  • Development environment: Nix shell (nix develop) with optional direnv (direnv allow).
  • GHC: 9.6.7 (provided in Nix shell).
  • Plutus Core target: 1.1.0.
    • Use plcVersion110 (for Haskell/PlutusTx code).
  • Package baselines (CHaP):
    • plutus-core >= 1.45.0.0
    • plutus-tx >= 1.45.0.0
    • plutus-ledger-api >= 1.45.0.0
    • plutus-tx-plugin >= 1.45.0.0
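
A quick sanity check inside the development shell (expected versions taken from the list above):

# Inside `nix develop`
ghc --numeric-version    # expected: 9.6.7
cabal --version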

Development

Enter environment:

nix develop
# or
direnv allow

Common tools:

  • cape … (project CLI)
  • cabal build (builds all Haskell components: library, executables, tests)
  • treefmt (format all files, including UPLC)
  • fourmolu (Haskell formatting)
  • pretty-uplc (UPLC pretty-printing)
  • adr (Architecture Decision Records)
  • mmdc -i file.mmd (diagram generation, if available)

UPLC Formatting

UPLC files can be pretty-printed for improved readability:

# Format a single UPLC file in place
pretty-uplc submissions/fibonacci/MyCompiler_1.0.0_handle/fibonacci.uplc

# Format all UPLC files (and other files) via treefmt
treefmt

The treefmt command automatically formats all file types including UPLC files (.uplc). The pretty-printing uses the plutus executable from the Plutus repository and is available in the nix development shell.


Documentation (ADRs)

ADRs document important design decisions (managed with Log4brains).

Helpful commands:

adr new "Decision Title"
adr preview
adr build
adr help

Contributing

We welcome contributions from compiler authors, benchmark designers, and researchers.

  • Add a new benchmark:

    cape benchmark new my-new-benchmark
    # edit scenarios/my-new-benchmark.md
  • Add a submission:

    cape submission new existing-benchmark MyCompiler 1.0.0 myhandle
    # fill uplc and json files, then open a PR

Please read CONTRIBUTING.md before opening a PR.


License

Licensed under the Apache License 2.0. See LICENSE.


Acknowledgments

  • Plutus Core team for infrastructure and reference implementations
  • Compiler authors and community contributors
