Don't just measure accuracy. Understand failure.
etsi-failprint is a diagnostic tool designed to answer the question: "Why is my model failing?" It automatically isolates failure patterns across Tabular, NLP, and Computer Vision workflows, generating human-readable reports that pinpoint the root cause of errors.
Standard metrics (Accuracy, F1) tell you how often you fail. Failprint tells you why.
- Multi-Modal Native: Seamlessly analyze failures in structured DataFrames, raw Text, or Image datasets using a unified API.
- Automated Segmentation: Automatically discovers weak spots (e.g., "Model fails 80% of the time when Income < 50k").
- Lazy & Lightweight: Heavy dependencies (Torch, SpaCy, Transformers) are lazy-loaded. If you only analyze tabular data, you never pay the memory cost of deep learning libraries.
- Robust: Built with graceful degradation. If an optional dependency is missing or incompatible, Failprint adapts instead of crashing.
- Smart Segmentation: Identifies feature ranges or categories where error rates are statistically anomalous.
- Semantic Clustering:
  - NLP: Groups failed texts by semantic meaning using Sentence Transformers.
  - CV: Clusters failed images using ResNet embeddings to find visual patterns.
- Meta-Feature Extraction:
  - Text: Analyzes failures by length, sentiment, subjectivity, and NER entities.
  - Vision: Analyzes failures by brightness, contrast, aspect ratio, and dimensions.
- Counterfactuals: Suggests minimal changes to input data that would flip a failure to a success.
- Actionable Reporting: Outputs detailed Markdown reports with visual insights directly to your workspace.
- Python (3.8 or later)
Install from PyPI:

pip install etsi-failprint

Or install from source:

git clone https://github.com/etsi-failprint/etsi-failprint.git
cd etsi-failprint
pip install -e .

Identify which features are driving your model's mistakes.
import pandas as pd
from etsi.failprint import analyze
# Load your data
df = pd.read_csv("loan_predictions.csv")
X = df.drop("target", axis=1)
y_true = df["target"]
y_pred = pd.Series([0, 1, 0, ...]) # Your model's predictions
# Run analysis
report = analyze(
    X, y_true, y_pred,
    cluster=True,       # Cluster similar failures?
    output="markdown"   # Generate 'failprint_report.md'
)
print(report)

Output Insight: "Segment Age < 25 contributes to 40% of all failures."
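To make a segment insight like the one above concrete, here is a minimal sketch of how a per-segment failure rate and a segment's share of all failures could be computed. It is not Failprint's implementation: `segment_failure_stats`, the `segment_of` callback, and the sample rows are hypothetical names and data chosen for illustration, and the sketch uses only the standard library.

```python
from collections import defaultdict

def segment_failure_stats(rows, y_true, y_pred, segment_of):
    """Return {segment: (failure_rate, share_of_all_failures)}.

    Hypothetical helper for illustration; not part of the etsi-failprint API.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for row, truth, pred in zip(rows, y_true, y_pred):
        segment = segment_of(row)
        totals[segment] += 1
        if truth != pred:
            failures[segment] += 1

    all_failures = sum(failures.values())
    stats = {}
    for segment, count in totals.items():
        rate = failures[segment] / count
        share = failures[segment] / all_failures if all_failures else 0.0
        stats[segment] = (rate, share)
    return stats

# Made-up sample data: three of the five predictions are wrong.
rows = [{"Age": 22}, {"Age": 24}, {"Age": 31}, {"Age": 45}, {"Age": 52}]
y_true = [1, 0, 1, 0, 1]
y_pred = [0, 1, 1, 0, 0]

stats = segment_failure_stats(
    rows, y_true, y_pred,
    segment_of=lambda r: "Age < 25" if r["Age"] < 25 else "Age >= 25",
)
for segment, (rate, share) in stats.items():
    print(f"{segment}: failure rate {rate:.0%}, share of all failures {share:.0%}")
```

The failure rate says how unreliable the model is inside a segment, while the share says how much of the overall error that segment explains. A small segment can have a high rate but a low share.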
Failprint isn't just for spreadsheets. It understands unstructured data too.
Lazy-loads spacy and sentence-transformers to find semantic and structural failure patterns.
from etsi.failprint import analyze_nlp
texts = [
    "I love this product!",
    "Terrible service, very slow.",
    "Product is okay but arrived late."
]
y_true = [1, 0, 0]  # Sentiment labels
y_pred = [1, 1, 0]  # Model predictions (Error on index 1)
report = analyze_nlp(texts, y_true, y_pred)

Output Insight: "Failures are highly correlated with Sentiment Polarity < -0.5 and Word Count < 5."
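The lazy loading and graceful degradation mentioned for the NLP analyzer can be pictured with a small sketch. This is one common pattern, assumed here for illustration rather than taken from Failprint's source: `optional_import` and `describe_texts` are hypothetical helpers, and the heavy library is imported inside the function that needs it, so it costs nothing until that function is called.

```python
import importlib

def optional_import(module_name):
    """Import a module only when called; return None if it cannot be imported."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None

def describe_texts(texts, backend="spacy"):
    """Always compute cheap features; report whether the heavy backend is usable.

    Hypothetical helper for illustration; not part of the etsi-failprint API.
    """
    result = {"word_counts": [len(t.split()) for t in texts]}
    # The heavy library is imported here, at call time, not at module import time.
    result["backend_available"] = optional_import(backend) is not None
    return result

# A deliberately missing module name shows the fallback path.
print(describe_texts(
    ["Terrible service, very slow."],
    backend="module_that_does_not_exist_12345",
))
```

Catching only `ImportError` covers a missing package. A library that is installed but incompatible may raise other exceptions, which a real implementation would have to decide how to handle.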
Lazy-loads torch and torchvision to find visual failure clusters (e.g., "Dark images" or "Blurry dogs").
from etsi.failprint import analyze_cv
images = ["img1.jpg", "img2.jpg", "img3.jpg"]
y_true = [0, 1, 0]
y_pred = [0, 0, 0]
analyze_cv(images, y_true, y_pred)

Output Insight: "Cluster 0 (Dark Images) accounts for 60% of false negatives."
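A label such as "Dark Images" presumably comes from image meta-features like brightness and contrast. The sketch below shows one simple way such features could be computed. It rests on assumptions: brightness is taken as mean pixel intensity and contrast as the standard deviation of intensities, the input is a plain 2D list of grayscale values (0 to 255), and `image_meta_features` is a hypothetical helper. Failprint's actual definitions may differ.

```python
from statistics import mean, pstdev

def image_meta_features(pixels):
    """Compute simple meta-features from a 2D list of grayscale values (0-255).

    Hypothetical helper for illustration; not part of the etsi-failprint API.
    """
    flat = [value for row in pixels for value in row]
    height, width = len(pixels), len(pixels[0])
    return {
        "brightness": mean(flat),        # assumed: mean intensity
        "contrast": pstdev(flat),        # assumed: population std dev of intensity
        "aspect_ratio": width / height,
        "dimensions": (width, height),
    }

# Two tiny made-up 3x2 "images".
dark = [[10, 20, 30], [20, 10, 30]]
bright = [[200, 220, 240], [210, 230, 250]]

for name, image in [("dark", dark), ("bright", bright)]:
    print(name, image_meta_features(image))
```

Comparing these values between failed and correctly classified images is one way a pattern like "failures concentrate in low-brightness images" could surface.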
Go beyond diagnostics. Ask "What should have happened?" This mode suggests the minimal change required to fix a prediction.
from etsi.failprint import analyze
# Run in counterfactual mode
analyze(
    X, y_true, y_pred,
    output="counterfactuals"
)

Example Output:
Original Input: {'Age': 22, 'Income': 35000, 'Education': 'High School'}
Suggested Change: Education to 'Bachelor's'
Prediction: Success (Counterfactual)
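The idea of a minimal change can be illustrated with a brute-force search over single-feature edits. This is a sketch under stated assumptions, not the algorithm Failprint uses: `find_counterfactual`, `toy_predict`, and the candidate values are hypothetical, and the toy rule merely stands in for a trained model.

```python
def find_counterfactual(row, predict, candidates, target):
    """Return the first single-feature change (feature, value) that makes
    predict(row) equal target, or None if no candidate works.

    Hypothetical helper for illustration; not part of the etsi-failprint API.
    """
    for feature, values in candidates.items():
        for value in values:
            if value == row[feature]:
                continue  # not a change
            changed = {**row, feature: value}
            if predict(changed) == target:
                return feature, value
    return None

def toy_predict(row):
    # Made-up decision rule standing in for a real model.
    return 1 if row["Education"] == "Bachelor's" or row["Income"] >= 60000 else 0

row = {"Age": 22, "Income": 35000, "Education": "High School"}
candidates = {
    "Education": ["High School", "Bachelor's"],
    "Income": [35000, 60000],
}

change = find_counterfactual(row, toy_predict, candidates, target=1)
print(change)
```

Restricting the search to one feature at a time keeps the suggestion easy to read. Changing several features at once would require a cost function to decide which combination counts as smallest.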
Pull requests are welcome!
Please refer to CONTRIBUTING.md and CODE_OF_CONDUCT.md before submitting a Pull Request.
Connect with the etsi.ai team and other contributors on our Discord.
This project is distributed under the BSD-2-Clause License. See the LICENSE file for details.
Built with ❤️ by etsi.ai