SAGED: A Holistic Bias-Benchmarking Pipeline for Language Models with Customisable Fairness Calibration
Authors: Xin Guan, Nathaniel Demchak, Saloni Gupta, Ze Wang, Ediz Ertekin Jr., Adriano Koshiyama, Emre Kazim, Zekun Wu
Conference: COLING 2025 Main Conference
DOI: https://doi.org/10.48550/arXiv.2409.11149
SAGED(-Bias) is the first holistic benchmarking pipeline for detecting bias in large language models. It addresses limitations of existing benchmarks, such as narrow scope, contamination, and the lack of fairness calibration. The SAGED pipeline consists of the following five core stages:
- Scraping Materials: Collects and processes benchmark data from various sources.
- Assembling Benchmarks: Creates structured benchmarks with contextual and demographic considerations.
- Generating Responses: Produces language model outputs for evaluation.
- Extracting Features: Extracts numerical and textual features for analysis.
- Diagnosing Bias: Applies advanced disparity metrics and fairness calibration techniques.
SAGED evaluates max disparity (e.g., impact ratio) and bias concentration (e.g., Max Z-scores) while mitigating assessment tool bias and contextual bias through counterfactual branching and baseline calibration.
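To make these two summaries concrete, the toy sketch below computes an impact ratio (smallest group mean divided by largest group mean) and per-group Z-scores over sentiment scores using pandas and NumPy. The column names and data are illustrative only, and SAGED additionally calibrates scores against a baseline before diagnosis, which this sketch omits.

```python
import numpy as np
import pandas as pd

# Toy feature table: one sentiment score per generated response,
# tagged with the demographic group of the prompt (illustrative data only).
df = pd.DataFrame({
    "group": ["A", "A", "B", "B", "C", "C"],
    "sentiment": [0.62, 0.58, 0.41, 0.45, 0.55, 0.57],
})

group_means = df.groupby("group")["sentiment"].mean()

# Max disparity as an impact ratio: smallest group mean over largest group mean.
impact_ratio = group_means.min() / group_means.max()

# Bias concentration: Z-score of each group mean against the spread of group means.
z_scores = (group_means - group_means.mean()) / group_means.std(ddof=0)
max_abs_z = z_scores.abs().max()

print(f"group means:\n{group_means}")
print(f"impact ratio: {impact_ratio:.3f}")
print(f"max |Z|: {max_abs_z:.3f}")
```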
```bash
# Clone the repository
git clone https://github.com/holistic-ai/SAGED-Bias.git
cd SAGED-Bias

# Install Hatch (if not already installed)
pip install hatch

# Create a virtual environment and install the project
hatch env create
hatch run install

# Run the test suite with coverage
hatch run pytest tests --cache-clear --cov=saged --cov-report=term
```
SAGED allows users to define custom prompts and tailor bias-benchmarking metrics, making it adaptable to different contexts and evaluation requirements.
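As a concrete illustration of that customisation, the sketch below builds counterfactual prompts from a user-defined template (the same context with only the group term swapped) and defines a custom disparity metric as a plain callable. The `template`, `build_prompts`, and `max_gap` names are illustrative and not part of SAGED's API; how such pieces are wired into the pipeline is configured through the SAGED modules described below.

```python
from typing import Dict, List

# Illustrative custom prompt template (not SAGED's API): the same context is reused
# for every group, with only the group term swapped (counterfactual branching in spirit).
template = "Write a short reference letter for a {group} software engineer."

def build_prompts(groups: List[str]) -> Dict[str, str]:
    """Return one prompt per group, identical except for the group term."""
    return {group: template.format(group=group) for group in groups}

def max_gap(group_scores: Dict[str, float]) -> float:
    """Illustrative custom metric: largest gap between any two group mean scores."""
    values = list(group_scores.values())
    return max(values) - min(values)

print(build_prompts(["British", "French", "German"]))
print(max_gap({"British": 0.61, "French": 0.55, "German": 0.58}))
```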
- Scraping (`_scrape.py`): Collect data using tools such as the Wikipedia API, BeautifulSoup, and custom scraping methods.
- Assembling (`_assembler.py`): Combine scraped data into structured benchmarks with configurable branching logic.
- Generating (`_generator.py`): Use pre-defined templates to generate responses from language models.
- Extracting (`_extractor.py`): Extract key features such as sentiment, toxicity, and stereotypes using classifiers and embeddings.
- Diagnosing (`_diagnoser.py`): Apply statistical techniques to detect disparities and summarize results.
- Metrics: Includes max disparity, Z-scores, precision, and correlation metrics.
- Pipeline (`_pipeline.py`): Automate the entire benchmarking process by integrating scraping, assembling, generation, feature extraction, and diagnosis (a hedged end-to-end sketch follows this list).
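Putting the modules together, a run could look roughly like the sketch below. The `Pipeline` class and module names come from the list above; the import path, constructor arguments, and `run` method are assumptions made for illustration, so check `_pipeline.py` for the actual interface.

```python
# Hypothetical end-to-end run; argument and method names are assumed, not SAGED's exact API.
from saged import Pipeline  # package name inferred from the coverage target (--cov=saged)

config = {
    "domain": "nationality",                    # topic the benchmark is built around (assumed key)
    "groups": ["British", "French", "German"],  # demographic groups to compare (assumed key)
    "feature": "sentiment",                     # feature extracted from responses (assumed key)
    "generation_model": "gpt-4o-mini",          # placeholder model identifier (assumed key)
}

pipeline = Pipeline(config)  # assumed constructor
report = pipeline.run()      # assumed to chain scraping, assembling, generation, extraction, diagnosis
print(report)
```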
- Scraping Materials: Use the `KeywordFinder`, `SourceFinder`, or `Scraper` classes from `_scrape.py` to collect benchmark data.
- Assembling Prompts: Use the `PromptAssembler` class in `_assembler.py` to split sentences and create custom prompts.
- Generating Responses: Use the `ResponseGenerator` class in `_generator.py` to generate outputs from language models.
- Extracting Features: Apply the `FeatureExtractor` class in `_extractor.py` for sentiment, toxicity, and stereotype analysis.
- Grouping and Analysis: Use the `DisparityDiagnoser` class in `_diagnoser.py` to calculate group statistics and compare disparities.
- Visualization: Leverage the Plotly integration for interactive visualizations (see the sketch after this list).
- Orchestration: The `Pipeline` class in `_pipeline.py` integrates all stages into a seamless workflow.
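Because the guide ends with grouping, analysis, and Plotly visualization, here is a small grounded sketch of that last stretch done directly with pandas and Plotly Express on toy data. It mirrors the kind of group-level summary `DisparityDiagnoser` produces, but it does not call SAGED's own classes, whose exact method names may differ.

```python
import pandas as pd
import plotly.express as px

# Toy extracted-feature table (illustrative only): one sentiment score per response, per group.
df = pd.DataFrame({
    "group": ["A"] * 3 + ["B"] * 3 + ["baseline"] * 3,
    "sentiment": [0.62, 0.58, 0.60, 0.41, 0.45, 0.43, 0.52, 0.55, 0.50],
})

# Group-level statistics, analogous to what a disparity-diagnosis step summarises.
summary = df.groupby("group")["sentiment"].agg(["mean", "std"]).reset_index()

# Interactive bar chart of group means with error bars.
fig = px.bar(summary, x="group", y="mean", error_y="std",
             title="Mean sentiment per group (toy data)")
fig.show()
```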
If you use SAGED in your work, please cite the following paper:
```bibtex
@inproceedings{guan2025saged,
  title={SAGED: A Holistic Bias-Benchmarking Pipeline for Language Models with Customisable Fairness Calibration},
  author={Xin Guan and Nathaniel Demchak and Saloni Gupta and Ze Wang and Ediz Ertekin Jr. and Adriano Koshiyama and Emre Kazim and Zekun Wu},
  booktitle={COLING 2025 Main Conference},
  year={2025},
  doi={10.48550/arXiv.2409.11149}
}
```
SAGED-Bias is released under the MIT License.