LLM Anomaly Explainer

" A hybrid ML (Isolation Forest) + GenAI system that detects sensor anomalies and generates human-readable root cause narratives

Why this project

Anomaly detection systems in industrial operations typically stop at flagging an event. Engineers then spend significant time manually interpreting sensor readings to determine the root cause and next action, which is a slow, expertise-dependent process.

This project explores whether an LLM layer on top of a standard ML detector (Isolation Forest) can close that gap: automatically generating structured root-cause narratives and recommending immediate actions from raw anomaly context, in plain language.


Approach

The pipeline has 3 stages:

  1. Detection: Isolation Forest trained on normalized multi-sensor readings from the NASA CMAPSS jet engine dataset. Rolling statistics are computed per engine unit to capture degradation trends, not just point values.
  2. Context extraction: for each flagged anomaly, the system extracts the top deviating sensors relative to that unit's early-cycle baseline, expressed as % deviation from the baseline.
  3. LLM narration: a prompt is built from the anomaly context and passed to a Mistral/Llama model, which returns three JSON fields: likely_cause, risk_level, and recommended_action.
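The detection and context-extraction stages above can be sketched as follows. This is a minimal illustration, not the repo's actual code: the column names (`unit`, `cycle`, `s1`…), window size, and helper names are assumptions based on the CMAPSS file layout.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

def add_rolling_features(df, sensors, window=10):
    """Rolling mean/std per engine unit, to capture degradation trends."""
    out = df.copy()
    for s in sensors:
        g = out.groupby("unit")[s]
        out[f"{s}_rmean"] = g.transform(lambda x: x.rolling(window, min_periods=1).mean())
        out[f"{s}_rstd"] = g.transform(lambda x: x.rolling(window, min_periods=1).std().fillna(0.0))
    return out

def detect_anomalies(df, feature_cols, contamination=0.05):
    """Stage 1: unsupervised detection with a continuous anomaly score."""
    model = IsolationForest(contamination=contamination, random_state=42)
    out = df.copy()
    out["anomaly"] = model.fit_predict(out[feature_cols]) == -1  # -1 means anomaly
    out["score"] = model.decision_function(out[feature_cols])
    return out

def top_deviations(row, baseline, sensors, k=3):
    """Stage 2: % deviation of each sensor vs the unit's early-cycle baseline."""
    devs = {s: 100.0 * (row[s] - baseline[s]) / (abs(baseline[s]) + 1e-9) for s in sensors}
    return sorted(devs.items(), key=lambda kv: abs(kv[1]), reverse=True)[:k]
```

The early-cycle baseline would typically be the per-sensor mean over each unit's first N cycles, before degradation sets in.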

Stack

| Layer | Tool | Reason |
| --- | --- | --- |
| Anomaly detection | Isolation Forest (scikit-learn) | Unsupervised; no labelled anomaly data required |
| Feature engineering | Rolling mean and std per unit | Captures degradation trends, not just point values |
| LLM | Mistral-small | Low cost and reliable structured output |
| UI | Streamlit | Fast iteration; deployable to Hugging Face Spaces |
| Visualization | Plotly | Interactive time series with anomaly overlay |

Results

  • Isolation Forest at contamination=0.05 flags ~1,000 anomaly events
  • LLM narrative generation averages ~1.2s per explanation (Mistral free tier)
  • JSON parse success rate: ~96% on first attempt; regex fallback handles the remainder
  • Risk level distribution across flagged events: ~60% Medium, ~25% High, ~15% Low
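The ~96% first-attempt parse rate with a regex fallback suggests a parse-then-recover strategy along these lines. This is a hypothetical sketch, not the repo's implementation; the field names follow the schema described above, and the regex is an assumption.

```python
import json
import re

REQUIRED_FIELDS = ("likely_cause", "risk_level", "recommended_action")

def parse_llm_json(text):
    """Try strict json.loads first; fall back to the first {...} block in the reply."""
    try:
        obj = json.loads(text)
    except json.JSONDecodeError:
        match = re.search(r"\{.*\}", text, re.DOTALL)  # LLM often wraps JSON in prose
        if not match:
            return None
        try:
            obj = json.loads(match.group(0))
        except json.JSONDecodeError:
            return None
    # Reject replies missing any required field
    return obj if all(k in obj for k in REQUIRED_FIELDS) else None
```

Returning `None` on failure lets the caller retry the LLM call or fall back to a generic explanation instead of crashing the UI.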

Quickstart

```bash
# 1. Clone and set up the environment
git clone https://github.com/Pinghsuanlin/llm-anomaly-explainer
cd llm-anomaly-explainer

conda create -n anomaly-explainer python=3.11 -y
conda activate anomaly-explainer
pip install -r requirements.txt

# 2. Add your API key (Mistral free tier — console.mistral.ai)
cp .env.example .env
# Edit .env: MISTRAL_API_KEY=your_key_here

# Optional: run fully local, no API key needed.
# Install Ollama from ollama.com, then:
ollama pull mistral
# Set LLM_BACKEND=ollama in your .env

# 3. Download the dataset
# Place train_FD001.txt in data/raw/
# Dataset: https://data.nasa.gov/dataset/cmapss-jet-engine-simulated-data

# 4. Run the app
streamlit run app.py
```

Design decisions & trade-offs

  1. Why Isolation Forest over LSTM? CMAPSS has no labelled anomaly ground truth, making supervised approaches hard to validate. Isolation Forest is fast, unsupervised, and produces a continuous anomaly score. An LSTM would add signal on the degradation trajectory, but it requires more engineering and compute and can be slow to train.
  2. Why not send raw sensor values to the LLM? Raw normalized floats (e.g. s11: 0.743) are not meaningful to an LLM without domain context. Baseline-relative deviations (s11: +34% vs normal operating range) give the model something it can reason about linguistically.
  3. Why structured JSON output? Enforcing a schema makes the LLM output programmatically usable: the Streamlit UI can render the risk level as a colour-coded badge instead of parsing free text, and the system's behaviour becomes testable and predictable.
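Decisions 2 and 3 above combine naturally in the prompt: deviations are phrased in plain language and the JSON schema is requested explicitly. A hypothetical sketch (the wording, function name, and field names are illustrative, not taken from the repo):

```python
def build_prompt(unit, cycle, deviations):
    """deviations: list of (sensor_name, pct_deviation) pairs from context extraction."""
    # Phrase each deviation the way an engineer would read it
    lines = [f"- {name}: {pct:+.0f}% vs normal operating range" for name, pct in deviations]
    return (
        f"Engine unit {unit}, cycle {cycle} was flagged as anomalous.\n"
        "Top deviating sensors:\n" + "\n".join(lines) + "\n\n"
        "Respond with JSON only, using exactly these fields:\n"
        '{"likely_cause": "...", "risk_level": "Low|Medium|High", "recommended_action": "..."}'
    )
```

Constraining `risk_level` to three values in the prompt is what makes the colour-coded badge in the UI a simple dictionary lookup.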

Limitations

  1. Isolation Forest has no temporal memory; it treats each cycle as independent. A production system would use sequence-aware detection (e.g. an LSTM autoencoder or Prophet) for forecasting.
  2. LLM explanations are plausible, not verified. They reflect the model's training on engineering text, not ground truth fault labels.
  3. CMAPSS FD001 uses a single operating condition; real-world sensor data is noisier and multi-regime.
