LabelingIR


Shared intermediate representation (IR) structs for the North Shore labeling stack. LabelingIR provides typed, JSON-serializable data structures used across Forge, Anvil, Ingot, and external clients for human-in-the-loop ML workflows.

Highlights

  • Typed structs with enforced keys and sensible defaults
  • JSON-serializable via Jason for API transport and storage
  • Multi-tenant support with tenant_id and optional namespace on all entities
  • Lineage tracking via optional lineage_ref for provenance (see the sketch below)
  • Zero dependencies beyond Jason
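
The tenancy and lineage fields slot in alongside the usual ones. A minimal sketch, assuming namespace and lineage_ref take plain strings (the values are illustrative):

alias LabelingIR.Sample

sample = %Sample{
  id: "sample_042",
  tenant_id: "acme_corp",
  namespace: "support",            # optional subdivision within the tenant
  lineage_ref: "forge_run_0042",   # provenance pointer; value format is illustrative
  payload: %{"text" => "Shipping was slow but support was helpful."},
  created_at: DateTime.utc_now()
}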

Installation

Add to your mix.exs:

def deps do
  [
    {:labeling_ir, "~> 0.2.0"}
  ]
end

Or from GitHub:

def deps do
  [
    {:labeling_ir, git: "https://github.com/North-Shore-AI/labeling_ir.git"}
  ]
end

Structs

Core Entities

Struct Purpose
LabelingIR.Sample UI-friendly sample representation with payload and artifacts
LabelingIR.Dataset Versioned dataset with slices for labeling/eval workloads
LabelingIR.Schema Declarative label schema definition
LabelingIR.Schema.Field Individual field in a schema (:text, :select, :scale, etc.)

Labeling Workflow

Struct Purpose
LabelingIR.Assignment Unit of labeling work binding a sample to a schema
LabelingIR.Label Human-provided label values with timing and metadata
LabelingIR.EvalRun Evaluation run (human or model) over a dataset slice

Artifacts

Struct Purpose
LabelingIR.Artifact Artifact attached to a sample (image, JSON, text, etc.)
LabelingIR.ArtifactRef Lightweight reference to an artifact by ID

Usage

Creating a Sample

alias LabelingIR.{Sample, Artifact}

sample = %Sample{
  id: "sample_001",
  tenant_id: "acme_corp",
  pipeline_id: "sentiment_v2",
  payload: %{"text" => "Great product, highly recommend!"},
  artifacts: [
    %Artifact{
      id: "img_001",
      url: "https://storage.example.com/screenshot.png",
      filename: "screenshot.png",
      artifact_type: :image,
      mime: "image/png"
    }
  ],
  metadata: %{"source" => "support_tickets"},
  created_at: DateTime.utc_now()
}
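
Where embedding the full Artifact is unnecessary, LabelingIR.ArtifactRef is the lightweight pointer listed in the structs table. A minimal sketch; the field name used for the referenced ID is an assumption based on the struct's description:

alias LabelingIR.ArtifactRef

# Point at the artifact created above by ID instead of embedding it.
# The :id field name is assumed here, not confirmed by the docs above.
ref = %ArtifactRef{id: "img_001"}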

Defining a Schema

alias LabelingIR.{Schema, Schema.Field}

schema = %Schema{
  id: "sentiment_schema_v1",
  tenant_id: "acme_corp",
  fields: [
    %Field{
      name: "sentiment",
      type: :select,
      required: true,
      options: ["positive", "negative", "neutral"]
    },
    %Field{
      name: "confidence",
      type: :scale,
      required: true,
      min: 1,
      max: 5,
      help: "How confident are you in this label?"
    },
    %Field{
      name: "notes",
      type: :text,
      required: false
    }
  ]
}

Creating an Assignment

alias LabelingIR.Assignment

assignment = %Assignment{
  id: "assign_001",
  queue_id: "sentiment_queue",
  tenant_id: "acme_corp",
  sample: sample,
  schema: schema,
  existing_labels: [],
  expires_at: DateTime.add(DateTime.utc_now(), 3600, :second)
}

Submitting a Label

alias LabelingIR.Label

label = %Label{
  id: "label_001",
  assignment_id: "assign_001",
  sample_id: "sample_001",
  queue_id: "sentiment_queue",
  tenant_id: "acme_corp",
  user_id: "labeler_alice",
  values: %{
    "sentiment" => "positive",
    "confidence" => 4,
    "notes" => "Clear positive sentiment"
  },
  time_spent_ms: 12500,
  created_at: DateTime.utc_now()
}

Datasets and Eval Runs

alias LabelingIR.{Dataset, EvalRun}

dataset = %Dataset{
  id: "sentiment_dataset_v1",
  tenant_id: "acme_corp",
  version: "1.0.0",
  slices: [
    %{name: "train", sample_ids: ["s1", "s2", "s3"], filter: %{}},
    %{name: "test", sample_ids: ["s4", "s5"], filter: %{}}
  ],
  created_at: DateTime.utc_now()
}

eval_run = %EvalRun{
  id: "eval_001",
  tenant_id: "acme_corp",
  dataset_id: "sentiment_dataset_v1",
  slice: "test",
  run_type: :model,
  model_ref: "gpt-4-turbo",
  metrics: %{
    "accuracy" => 0.92,
    "f1" => 0.89,
    "cohens_kappa" => 0.85
  },
  created_at: DateTime.utc_now()
}

JSON Serialization

All structs derive Jason.Encoder for seamless JSON serialization:

Jason.encode!(sample)
# => {"id":"sample_001","tenant_id":"acme_corp",...}

Field Types

The Schema.Field struct supports the following types:

Type Description
:text Free-form text input
:boolean True/false toggle
:select Single selection from options list
:multiselect Multiple selections from options list
:scale Numeric scale with min/max bounds

Custom types can be used as atoms (e.g., :date, :datetime, :number).
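
For example, a date field can be declared with a custom atom; how it is rendered and validated is left to the consuming application. A sketch using only fields shown earlier:

alias LabelingIR.Schema.Field

# :date is not one of the built-in types above; the consuming UI
# decides how to present and validate it.
%Field{
  name: "review_date",
  type: :date,
  required: false,
  help: "When was the review written?"
}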

Architecture

LabelingIR serves as the shared contract between:

  • Forge - Sample ingestion and preprocessing pipelines
  • Anvil - Labeling queue management and human workflows
  • Ingot - UI frontend for labelers and reviewers
  • External clients - API consumers and integrations
┌─────────┐     ┌─────────┐     ┌─────────┐
│  Forge  │────▶│  Anvil  │────▶│  Ingot  │
└─────────┘     └─────────┘     └─────────┘
     │               │               │
     └───────────────┼───────────────┘
                     │
              ┌──────┴──────┐
              │ LabelingIR  │
              │  (structs)  │
              └─────────────┘

Development

# Install dependencies
mix deps.get

# Run tests
mix test

# Generate docs
mix docs

# Type checking
mix dialyzer

License

MIT License - see LICENSE for details.

Acknowledgments

Built by the North Shore AI team for the machine learning community.
