blackroad-ab-testing

Statistically rigorous A/B testing framework for BlackRoad OS — consistent hashing, chi-squared significance, SQLite persistence.

Features

Consistent hashing — users always land in the same variant across sessions
Chi-squared significance testing — know when your results are statistically meaningful
Multi-variant support — A/B/C/D experiments with arbitrary traffic splits
SQLite persistence — zero-dependency storage, no external services required
CLI + Python API — use from the terminal or import as a library
Duplicate-safe — idempotent assignment and conversion recording

Installation

git clone https://github.com/BlackRoad-OS/blackroad-ab-testing.git
cd blackroad-ab-testing
pip install -r requirements.txt

CLI Usage

Create an experiment

python -m src.ab_testing --db myexp.db create button-color \
  --desc "Test CTA button color" \
  --variants "control:50,treatment:50" \
  --metric conversion

Start / stop an experiment

python -m src.ab_testing --db myexp.db start button-color
python -m src.ab_testing --db myexp.db stop button-color

Assign a user to a variant

python -m src.ab_testing --db myexp.db assign button-color user-123
# User 'user-123' assigned to variant: treatment

Record a conversion

python -m src.ab_testing --db myexp.db convert button-color user-123
python -m src.ab_testing --db myexp.db convert button-color user-123 --value 49.99

View stats and significance

python -m src.ab_testing --db myexp.db stats button-color --confidence 0.95

{
  "variant_stats": {
    "control":   { "assignments": 512, "conversions": 51,  "conversion_rate_pct": 9.96 },
    "treatment": { "assignments": 488, "conversions": 122, "conversion_rate_pct": 25.0 }
  },
  "significance": {
    "chi2_stat": 54.21,
    "p_value": 0.0001,
    "is_significant": true,
    "recommended_winner": "treatment",
    "reason": "Chi2=54.2100 > critical=3.841 at 95% confidence (p≈0.0001)"
  }
}

Get winning variant

python -m src.ab_testing --db myexp.db winner button-color
# Winner: treatment

Export full results

python -m src.ab_testing --db myexp.db export button-color > results.json

List all experiments

python -m src.ab_testing --db myexp.db list

Python API

from src.ab_testing import ABTestingFramework, Experiment, Variant

fw = ABTestingFramework(db_path="myexp.db")

# Create experiment
exp = Experiment(
    name="checkout-flow",
    variants=[
        Variant("control",   traffic_pct=50.0, description="Current flow"),
        Variant("treatment", traffic_pct=50.0, description="Simplified flow"),
    ],
    description="Test simplified checkout",
    hypothesis="Simplified flow increases conversions by 15%",
    metric="purchase",
)
fw.create_experiment(exp)
fw.start_experiment("checkout-flow")

# Assign users
variant = fw.assign_variant(exp, user_id="user-abc")  # deterministic

# Record conversions
fw.record_conversion("checkout-flow", user_id="user-abc", value=79.99)

# Check significance
sig = fw.calculate_significance(exp, confidence=0.95)
print(sig.is_significant, sig.recommended_winner, sig.reason)

# Get winner (also persists to DB)
winner = fw.get_winning_variant("checkout-flow")

# Export
results = fw.export_results("checkout-flow")

Statistical Methodology

Chi-Squared Test of Independence

The framework tests whether the difference in conversion rates between variants is statistically significant using the chi-squared (χ²) test of independence.

Given an experiment with variants V₁, V₂, …, Vₙ, a 2×n contingency table is built:

	Converted	Not converted
Control	c₁	n₁ − c₁
Treatment	c₂	n₂ − c₂

The chi-squared statistic is:

χ² = Σ (observed − expected)² / expected

where expected frequencies are computed from row and column marginals.

The result is compared against the critical value for (confidence, degrees_of_freedom = n_variants − 1).

Confidence levels supported: 90%, 95%, 99%, 99.9%

Consistent Hashing

User-to-variant assignment uses SHA-256 hashing:

hash_val = SHA256(f"{experiment_name}:{user_id}")
bucket   = (hash_val % 10_000) / 100.0   # → [0.00, 99.99]

Variants are assigned by cumulative traffic percentage buckets. This guarantees:

The same user always sees the same variant
Traffic splits are uniform across the hash space
Changing experiment name produces independent assignments

SQLite Schema

-- Experiment definitions
CREATE TABLE experiments (
    name         TEXT PRIMARY KEY,
    description  TEXT,
    hypothesis   TEXT,
    metric       TEXT,
    status       TEXT,   -- draft | running | stopped
    variants_json TEXT,
    created_at   TEXT,
    started_at   TEXT,
    ended_at     TEXT,
    winner       TEXT
);

-- User → variant assignments (one per user per experiment)
CREATE TABLE assignments (
    id              TEXT PRIMARY KEY,
    experiment_name TEXT,
    user_id         TEXT,
    variant_name    TEXT,
    assigned_at     TEXT,
    UNIQUE(experiment_name, user_id)
);

-- Conversion events (one per user per experiment)
CREATE TABLE conversions (
    id              TEXT PRIMARY KEY,
    experiment_name TEXT,
    user_id         TEXT,
    variant_name    TEXT,
    converted_at    TEXT,
    value           REAL,
    metadata        TEXT
);

Running Tests

pytest tests/ -v --tb=short

Name		Name	Last commit message	Last commit date
Latest commit History 17 Commits
.github/workflows		.github/workflows
src		src
tests		tests
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

blackroad-ab-testing

Features

Installation

CLI Usage

Create an experiment

Start / stop an experiment

Assign a user to a variant

Record a conversion

View stats and significance

Get winning variant

Export full results

List all experiments

Python API

Statistical Methodology

Chi-Squared Test of Independence

Consistent Hashing

SQLite Schema

Running Tests

License

About

Uh oh!

Releases

Sponsor this project

Uh oh!

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Uh oh!

License

BlackRoad-OS/blackroad-ab-testing

Folders and files

Latest commit

History

Repository files navigation

blackroad-ab-testing

Features

Installation

CLI Usage

Create an experiment

Start / stop an experiment

Assign a user to a variant

Record a conversion

View stats and significance

Get winning variant

Export full results

List all experiments

Python API

Statistical Methodology

Chi-Squared Test of Independence

Consistent Hashing

SQLite Schema

Running Tests

License

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Sponsor this project

Uh oh!

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages