Anvil

Labeling queue and governance toolkit for human-in-the-loop workflows. Anvil provides GenServer-based queues for fast, in-memory work plus a Postgres/Oban pipeline for production-grade exports, telemetry, and retention.

Installation

Add Anvil to your dependencies:

{:anvil_ex, "~> 0.1.1"}
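
For context, a minimal sketch of where that tuple goes in a standard Mix project. The module name `MyApp.MixProject`, the app name, and the version are placeholders, not part of Anvil.

```elixir
# mix.exs (excerpt). MyApp.MixProject and :my_app are placeholder names.
defmodule MyApp.MixProject do
  use Mix.Project

  def project do
    [
      app: :my_app,
      version: "0.1.0",
      deps: deps()
    ]
  end

  # The dependency tuple from this README goes in the deps list.
  defp deps do
    [
      {:anvil_ex, "~> 0.1.1"}
    ]
  end
end
```

Then run `mix deps.get` to fetch it.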

Highlights

Schema-driven validation with typed fields (:text, :select, :multiselect, :range, :number, :boolean, :date, :datetime)
Pluggable assignment policies: round-robin, random, weighted expertise, redundancy (k labels per sample), or custom policy modules
Storage adapters for ETS (default) and Postgres (Anvil.Storage.Postgres) with Ecto schemas for queues, assignments, labels, schema versions, and audit logs
Agreement metrics (Cohen, Fleiss, Krippendorff) with telemetry and background recomputation
PII-aware exports with redaction, pseudonyms, manifests, and reproducibility verification
Background jobs via Oban for timeouts, agreement recompute, and retention sweeps
Optional Forge sample bridge and simple ACL helpers for queue membership

Quickstart (in-memory queue)

# 1) Define a schema
schema =
  Anvil.Schema.new(
    name: "sentiment",
    fields: [
      %Anvil.Schema.Field{
        name: "sentiment",
        type: :select,
        required: true,
        options: ["positive", "negative", "neutral"]
      }
    ]
  )

# 2) Start a queue (ETS storage by default)
{:ok, queue} =
  Anvil.create_queue(
    queue_id: "sentiment_queue",
    schema: schema,
    labels_per_sample: 2,
    policy: :round_robin
  )

# 3) Load work and labelers
Anvil.add_samples(queue, [%{id: "s1", text: "Great product!"}])
Anvil.add_labelers(queue, ["alice", "bob"])

# 4) Pull and start an assignment
{:ok, assignment} = Anvil.get_next_assignment(queue, "alice")
{:ok, assignment} = Anvil.Queue.start_assignment(queue, assignment.id)

# 5) Submit a label (validated against the schema)
{:ok, label} =
  Anvil.submit_label(queue, assignment.id, %{"sentiment" => "positive"})

# 6) Fetch labels and compute agreement
labels = Anvil.Queue.get_labels(queue)
{:ok, score} = Anvil.Agreement.compute(labels)

Assignment policies

:round_robin – walks samples in order
:random – picks a random sample from the available set
:expertise – weights assignment by labeler expertise (options: :expertise_scores, :min_expertise, and optional :difficulty_field)
:redundancy – prioritizes under-labeled samples; default when labels_per_sample > 1
Custom module – pass a module or {module, config} that implements Anvil.Queue.Policy
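
To make the redundancy idea concrete, here is a standalone sketch of "prioritize under-labeled samples" in plain Elixir. It is not Anvil's implementation and does not implement the Anvil.Queue.Policy behaviour (its callbacks are not documented here); the module name `RedundancySketch` and the data shapes are assumptions chosen for illustration.

```elixir
defmodule RedundancySketch do
  @moduledoc """
  Hypothetical illustration of redundancy-style selection.
  Not part of Anvil and not an Anvil.Queue.Policy implementation.
  """

  # samples: list of maps with an :id key
  # labels:  list of {sample_id, labeler_id} tuples already submitted
  # k:       target number of labels per sample
  #
  # Returns {:ok, sample} for the least-labeled sample that still needs
  # labels and that this labeler has not labeled yet, or :none.
  def next_sample(samples, labels, labeler_id, k) do
    counts = Enum.frequencies_by(labels, fn {sample_id, _labeler} -> sample_id end)

    # Samples this labeler has already labeled (the pin filters other labelers out).
    already_labeled =
      for {sample_id, ^labeler_id} <- labels, into: MapSet.new(), do: sample_id

    samples
    |> Enum.reject(fn s -> MapSet.member?(already_labeled, s.id) end)
    |> Enum.filter(fn s -> Map.get(counts, s.id, 0) < k end)
    |> Enum.min_by(fn s -> Map.get(counts, s.id, 0) end, fn -> nil end)
    |> case do
      nil -> :none
      sample -> {:ok, sample}
    end
  end
end
```

For example, with samples `s1` and `s2`, one existing label on `s1` by "alice", and `k = 2`, a request for "bob" returns `s2` because it has fewer labels. Ties are broken by list order.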

Storage backends

ETS (default): zero-dependency, in-memory storage for tests and ephemeral queues.
Postgres: use Anvil.Storage.Postgres (defaults to Anvil.Repo). Requires the host app to provide the database schema matching the Ecto modules under Anvil.Schema.*; migrations are not bundled in this repo.

{:ok, queue} =
  Anvil.create_queue(
    queue_id: "prod_queue",
    schema: schema,
    storage: Anvil.Storage.Postgres
  )

Agreement metrics

Anvil.Agreement.compute/2 auto-selects Cohen's or Fleiss' kappa based on rater count, or accepts an explicit metric: :cohen | :fleiss | :krippendorff option.
Helpers: compute_for_field/3, compute_all_dimensions/3, and summary/3 for per-field rollups.
Telemetry emits low-agreement events when scores drop below 0.6.
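
As a worked illustration of what a two-rater agreement score measures, the sketch below computes Cohen's kappa from two lists of ratings in plain Elixir. It is independent of Anvil.Agreement (whose internals are not shown here); the module name `KappaSketch` and the list-based input shape are assumptions.

```elixir
defmodule KappaSketch do
  # Cohen's kappa for two raters who labeled the same items in the same order.
  # kappa = (observed - expected) / (1 - expected), where `observed` is the
  # fraction of items both raters agree on and `expected` is the agreement
  # expected by chance from each rater's category frequencies.
  def cohen(ratings_a, ratings_b)
      when length(ratings_a) == length(ratings_b) and ratings_a != [] do
    n = length(ratings_a)

    observed =
      Enum.zip(ratings_a, ratings_b)
      |> Enum.count(fn {a, b} -> a == b end)
      |> Kernel./(n)

    freq_a = Enum.frequencies(ratings_a)
    freq_b = Enum.frequencies(ratings_b)

    expected =
      freq_a
      |> Enum.map(fn {category, count_a} ->
        count_a / n * (Map.get(freq_b, category, 0) / n)
      end)
      |> Enum.sum()

    # Kappa is undefined when chance agreement is 1 (both raters always use
    # the same single category); this sketch returns 1.0 in that case.
    if expected == 1.0, do: 1.0, else: (observed - expected) / (1 - expected)
  end
end

# Raters agree on 3 of 4 items: observed = 0.75, expected = 0.5, kappa = 0.5
KappaSketch.cohen(["pos", "pos", "neg", "neg"], ["pos", "neg", "neg", "neg"])
```

A score of 0.5 like this one falls under the 0.6 threshold mentioned above.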

Exporting labels

Manifested export (ADR-005): deterministic ordering, SHA256 manifest, optional PII redaction and pseudonyms. Requires Postgres data with schema versions.

{:ok, %{manifest: manifest, output_path: path}} =
  Anvil.Export.to_format(:csv, queue_id, %{
    schema_version_id: schema_version_id,
    output_path: "/tmp/labels.csv",
    redaction_mode: :automatic
  })

{:ok, :reproducible} = Anvil.Export.verify_reproducibility(manifest)
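
The idea behind deterministic ordering plus a SHA256 manifest can be sketched in a few lines: if rows are always sorted the same way before hashing, the same data yields the same digest regardless of insertion order. This is a simplified illustration, not the Anvil.Export implementation; `ManifestSketch`, the row shape, and the manifest keys are assumptions.

```elixir
defmodule ManifestSketch do
  # rows: list of maps with :sample_id, :labeler_id, and :value keys
  # (a hypothetical shape chosen for this example).
  def build(rows) do
    lines =
      rows
      # Sort on a stable key so output does not depend on insertion order.
      |> Enum.sort_by(fn r -> {r.sample_id, r.labeler_id} end)
      # Naive CSV line; a real exporter would also escape commas and quotes.
      |> Enum.map(fn r -> Enum.join([r.sample_id, r.labeler_id, r.value], ",") end)

    content = Enum.join(lines, "\n")

    sha256 =
      :crypto.hash(:sha256, content)
      |> Base.encode16(case: :lower)

    %{row_count: length(lines), sha256: sha256}
  end
end

rows = [
  %{sample_id: "s2", labeler_id: "bob", value: "negative"},
  %{sample_id: "s1", labeler_id: "alice", value: "positive"}
]

# Same rows in a different order produce an identical manifest.
ManifestSketch.build(rows) == ManifestSketch.build(Enum.reverse(rows))
```

Re-running the export and comparing digests is the essence of a reproducibility check.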

Legacy export: works with ETS queues; serializes current in-memory labels.
```
:ok = Anvil.Export.export(queue, format: :csv, path: "labels.csv")
```

HTTP API (for clients like Ingot)

Anvil ships a Plug/Cowboy server for /v1 IR endpoints. Enable it with:

# config/config.exs or config/runtime.exs
config :anvil, :api_server, enabled: true, port: 4101

In test, the server is disabled by default. Start your Anvil app (or release) to expose the API for HTTP clients; Ingot's default adapter expects this.

PII, retention, and governance

PII metadata on fields (pii, retention_days, redaction_policy) drives redaction and retention.
Anvil.PII.Redactor supports strip/truncate/hash/regex policies and payload redaction modes (:none, :automatic, :aggressive).
Anvil.PII.Retention and the Anvil.Workers.RetentionSweep Oban job enforce retention windows and optional soft/hard deletion.
Labeler pseudonyms are available via Anvil.PII.Pseudonym.
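
As a rough illustration of what strip, truncate, hash, and regex redaction policies typically do to a field value, here is a standalone sketch. It does not use or mirror the Anvil.PII.Redactor API; the module name `RedactionSketch` and the policy tuple shapes are assumptions made for this example.

```elixir
defmodule RedactionSketch do
  # :strip drops the value entirely.
  def redact(_value, :strip), do: nil

  # {:truncate, max} keeps only the first `max` characters.
  def redact(value, {:truncate, max}) when is_binary(value) do
    String.slice(value, 0, max)
  end

  # :hash replaces the value with a lowercase hex SHA256 digest.
  def redact(value, :hash) when is_binary(value) do
    :crypto.hash(:sha256, value) |> Base.encode16(case: :lower)
  end

  # {:regex, pattern, replacement} rewrites every match of `pattern`.
  def redact(value, {:regex, pattern, replacement}) when is_binary(value) do
    Regex.replace(pattern, value, replacement)
  end
end

RedactionSketch.redact("alice@example.com", {:truncate, 5})
# => "alice"

RedactionSketch.redact(
  "Contact alice@example.com now",
  {:regex, ~r/[\w.+-]+@[\w.-]+/, "[email]"}
)
# => "Contact [email] now"
```

An unsalted hash of a low-entropy value (such as an email address) can be reversed by guessing, so a production redactor would usually add a secret key or salt.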

Background jobs and telemetry

Oban cron (see config/config.exs): timeout sweeps, agreement recompute, retention sweeps.
Telemetry events cover queue creation, assignment dispatch/completion, validation errors, exports, agreement, and storage queries.

Forge integration

Anvil.ForgeBridge fetches samples via pluggable backends (Direct, HTTP, Cached, Mock) with Cachex-based caching.

Authentication and access control

Anvil.Auth.ACL provides queue membership checks (:labeler, :reviewer, :owner) plus helpers for granting/revoking access.
Signed URL and OIDC helpers are available under Anvil.Auth.

Development

mix test
mix docs

License

MIT License - see LICENSE for details.

Acknowledgments

Built by the North Shore AI team for the machine learning community.
