Pawl is an exploded view of the macOS Seatbelt sandbox built with search. It operates by mechanically exercising the operating system, pulling the same policy in two directions across multiple lanes: forward, from profile records through policy bytes to runtime denial; and reverse, from compiled blobs back to inputs. Along the way it records where the lanes align and where they diverge.
Seatbelt policy is enforced through a confounded, censored stack: multiple layers contribute to outcomes. Closing the loop from structure → meaning → runtime behavior is hard even when you have good tools and strong priors. Furthermore, the reward for concerted effort is a fragile, host-bound understanding that can be undone with the next OS update. Rebuilding that understanding involves a new reverse engineering effort.
Pawl is different on purpose: a unitary, inspectable account with enough internal wiring that it can notice when it stops being true. Instead of refitting reverse-engineering tools to each macOS update and then exploring Seatbelt afresh, we instrument policy from the inside out. We find out what is different about a new version of macOS the same way we learn this one: each step in pawl is instrumented and diffed against system outputs. We converge on system features by keeping those outputs in productive doubt while testing new configurations.
The IR has two layers. The evidence layer (pawl/structure/) extracts byte-faithful, hash-stable structure from compiled profiles—header, op-table, node stream, literal and regex pools—with explicitly marked unknown regions. The policy layer interprets evidence into operations, predicates, and control flow as a typed Policy DAG (pawl/contract/), retaining provenance links (origin_offsets, origin_spans) back to underlying bytes. Both the forward lane (SBPL → compile → decode) and the reverse lane (bytes → parse → emit) produce the same Policy type, making structural comparison a type-checked operation rather than a string diff.
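As a rough illustration of why a shared Policy type makes comparison structural rather than textual, here is a minimal sketch. The `Node` and `Policy` classes, their fields other than `origin_offsets`, and the hashing scheme are assumptions made for this example, not the definitions in pawl/contract/. Provenance is carried on each node but excluded from equality and from the hash, so the forward and reverse lanes can agree even when they point at different bytes.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """Hypothetical policy node; the real contract types may differ."""
    kind: str                       # e.g. "allow", "deny", "literal"
    payload: str                    # operation name or filter argument
    children: tuple[Node, ...] = ()
    # Provenance back to blob bytes; deliberately ignored by comparisons.
    origin_offsets: tuple[int, ...] = field(default=(), compare=False)

    def structural_hash(self) -> str:
        child_hashes = tuple(c.structural_hash() for c in self.children)
        material = repr((self.kind, self.payload, child_hashes))
        return hashlib.sha256(material.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Policy:
    """Hypothetical container: a default decision plus ordered rules."""
    default: str
    rules: tuple[Node, ...]

    def structural_hash(self) -> str:
        material = repr((self.default, tuple(r.structural_hash() for r in self.rules)))
        return hashlib.sha256(material.encode("utf-8")).hexdigest()


def structurally_equal(a: Policy, b: Policy) -> bool:
    """Compare two policies by shape and content, ignoring provenance."""
    return a.structural_hash() == b.structural_hash()
```

Two lanes that produce the same rule from different byte offsets compare equal under this scheme, while any change to an operation, filter, or rule order changes the hash.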
The five-point harness (integration/ir/profile/) validates the IR through roundtrip: a circuit that closes only when multiple independent paths agree, making failures localizable instead of mysterious.

        1  ◄══ normalized eq ══►  3
      Source                  Reversed
        │                  ↗      │
        │        reverse          │ compile
        ▼              ╱          ▼
        2 ───────────╯            4
     Source ◄═ structural eq ═► Roundtrip
      blob                       blob
        │                         │
        └────────────┬────────────┘
                     │
                     ▼
                     5
               PolicyWitness

Points 1–4 form a square of artifacts: source SBPL compiles to source blob, reverse renders back to SBPL, which recompiles to the roundtrip blob. The horizontal edges are cross-checks—normalized surface comparison above, structural hash comparison below. Point 5 is the runtime oracle via PolicyWitness (sandbox_check), which validates behavior without trusting any of the static comparisons. When the circuit disagrees, the location of the disagreement tells you which transformation is at fault.
| Level | What it catches |
|---|---|
| Normalized equality | Normalization drift, parser bugs |
| Structural hash | Op-table divergence, node graph differences |
| PolicyWitness | Host disagrees despite internal consensus |
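
The table can be read as a small decision procedure. The sketch below is one possible encoding, assuming each of the three checks reduces to a boolean; the `HarnessResult` type, the `localize` function, and the wording used when the runtime and static checks fail together are hypothetical and not taken from integration/ir/profile/.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HarnessResult:
    """Hypothetical summary of one harness run."""
    normalized_eq: bool    # point 1 vs point 3 (source vs reversed SBPL)
    structural_eq: bool    # point 2 vs point 4 (source vs roundtrip blob)
    witness_agrees: bool   # point 5 (runtime oracle)


def localize(result: HarnessResult) -> list[str]:
    """Return the suspects implied by whichever checks failed."""
    suspects = []
    if not result.normalized_eq:
        suspects.append("normalization drift or parser bug")
    if not result.structural_eq:
        suspects.append("op-table divergence or node graph difference")
    if not result.witness_agrees:
        if suspects:
            # Assumption: this wording is ours, the table has no such row.
            suspects.append("runtime disagreement alongside static failures")
        else:
            suspects.append("host disagrees despite internal consensus")
    return suspects
```

An empty list means the circuit closed. A single entry points at one transformation, which is the localization property the harness is meant to provide.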
The top edge of this square—normalized equality between source and reversed SBPL—depends on the comparison surface being fair. The compiler transforms source in ways that cannot be perfectly inverted: it adds implicit rules, restructures predicates, shares nodes across operations, substitutes parameters, and flattens imports. Without accounting for these, every disagreement is ambiguous between a reverser bug and a known compiler transformation. The normalization layer (pawl/normalize/) handles this programmatically rather than maintaining a growing list of exceptions. A baseline artifact tracks ~54 profile-type baseline allows (deny-default profiles) and ~5 baseline denies (allow-default profiles); the reverse lane suppresses these, guarded by a dedicated test suite. Two permanent sidecars handle entitlement blocks the compiler silently drops and cross-operation predicate misattribution caused by compiler node-sharing.
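
To make the idea of a fair comparison surface concrete, the following sketch strips known baseline rules from the reversed output before comparing it with the source. It is only an illustration: the function names are hypothetical, whitespace collapsing stands in for the real normalization catalog, and the rule strings in the usage note are invented examples rather than entries from the actual baseline artifact.

```python
def normalize_rule(rule: str) -> str:
    """Collapse whitespace so formatting differences do not count as drift."""
    return " ".join(rule.split())


def suppress_baseline(rules: list[str], baseline: set[str]) -> list[str]:
    """Drop rules the compiler is known to add implicitly, keeping order."""
    known = {normalize_rule(b) for b in baseline}
    return [r for r in map(normalize_rule, rules) if r not in known]


def normalized_equal(source: list[str], reversed_rules: list[str],
                     baseline: set[str]) -> bool:
    """Compare source and reversed rules on the suppressed, normalized surface."""
    return [normalize_rule(r) for r in source] == suppress_baseline(reversed_rules, baseline)
```

For example, with a hypothetical baseline of `{"(allow sysctl-read)"}`, a reversed profile containing that rule plus `(allow file-read* (literal "/tmp/a"))` compares equal to a source that only states the file-read rule. A remaining mismatch is then more likely a reverser bug than a compiler transformation.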
We localize errors in our regex pipeline through a stricter harness (integration/ir/profile/regex/). There, DFA equivalence takes the place of the structural hash, backed by a runtime file-read oracle that is independent of sandbox_check.
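
DFA equivalence can be checked with a textbook breadth-first walk over the product automaton, which also yields a shortest distinguishing input when the two automata differ. The sketch below shows that general technique only; the `DFA` class and `distinguishing_input` function are hypothetical and say nothing about how the regex harness represents or compares automata.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class DFA:
    """Hypothetical DFA; missing transitions lead to an implicit dead state."""
    start: int
    accepting: frozenset
    transitions: dict  # maps (state, symbol) -> next state

    def step(self, state: Optional[int], symbol: str) -> Optional[int]:
        if state is None:  # already dead
            return None
        return self.transitions.get((state, symbol))


def distinguishing_input(a: DFA, b: DFA, alphabet: str) -> Optional[str]:
    """Return a shortest string accepted by exactly one DFA, or None if equivalent."""
    start = (a.start, b.start)
    seen = {start}
    queue = deque([(start, "")])
    while queue:
        (p, q), path = queue.popleft()
        if (p in a.accepting) != (q in b.accepting):
            return path
        for symbol in alphabet:
            nxt = (a.step(p, symbol), b.step(q, symbol))
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + symbol))
    return None
```

Returning the counterexample rather than a bare boolean keeps the failure localizable: the string can be replayed against a file-read style oracle to see which side matches the host.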
integration/tests/ holds the test suite (make test). integration/ir/ houses the five-point harnesses for regex and profile validation. CARTON (integration/carton/) is a frozen, reviewable bundle of host-scoped mappings and contracts—when inputs change, refresh it with build, diff, check. Evidence under integration/evidence/ is curated output; consume it, never hand-edit. Fixtures under integration/fixtures/ are synthetic specimens that isolate single structural or semantic knobs; the d-ring corridor (integration/fixtures/pawl/d-ring/) guides the reverse/compile/trace loops inside a narrow, instrumented track.
Architecture
- pawl/README.md: workbench overview and CLI
- pawl/PAWL.md: modules, boundaries, and independence contract
- pawl/contract/README.md: shared types, two-renderer boundary, import conventions
- pawl/normalize/README.md: canonical comparison surface and normalization catalog
- pawl/casefile/CASEFILE.md: casefile schema and receipts
Harnesses
- integration/ir/ir_schema.md: IR specification (evidence + policy layers)
- integration/FIVE_POINT_HARNESS.md: five-point harness spec (profile + regex + diagnostics)
Runtime
- runtime/PolicyWitness.md: runtime oracle and probe execution
- runtime/specimen/probe_plan_contract.py: probe plan contract checks
- orchestration/probe_plan_builder.py: plan generation from casefiles
Fixtures and evidence
- integration/fixtures/pawl/d-ring/: synthetic fixture families for compiler behavior
- integration/carton/README.md: frozen mappings and reviewable contracts
Supporting
- tools/README.md: Ghidra/Frida/doctor surfaces and host-bound CLIs
The project has many moving parts: static artifacts, decoders, mappings, harnesses, probes, and their surrounding macOS environment all interact. It is a living, interdependent mesh, reaching into the agents who work on it as intimately as it does the sandbox.
Your part in this tangle may be to glue some of these parts together or to unstick them, depending. Trace inputs and outputs across the boundary you are touching, and update the generator or contract that defines the interface rather than patching artifacts directly. Leave a clear, rerunnable path from source to output so the wiring stays explicit for the next change. That is how to proceed generally; in particular, proceed first by reading CONTRIBUTING.md and AGENTS.md.