LLM Log Analyzer

A log analyzer for CI/CD pipelines (Jenkins, GitLab CI, GitHub Actions) that leverages Large Language Models to automatically detect and explain the root cause of failures, suggesting possible fixes.

This project originated from the difficulty of diagnosing issues in Unity build logs, which are notoriously verbose and hard to navigate — especially for less experienced developers. While optimized for Unity by default, the analyzer is fully configurable and can be adapted to any kind of build or runtime logs.

Feel free to contribute with new presets or refinements.

Features

Efficient Processing: Handles very large logs (>50MB) using streaming techniques
Flexible AI Models: Support for multiple LLM models (Claude, Gemini, OpenAI models)
Structured Output: Provides both human-friendly summaries and machine-readable JSON
CI-Ready: Designed for seamless integration with Jenkins, GitLab CI, and other platforms
Configurable: Customizable error patterns and prompts for different log types
Smart Filtering: Extracts only relevant error information to optimize LLM results
Unity-Optimized: Includes Unity-specific and generic patterns by default

Installation

Using pip (Recommended)

pip install llm-log-analyzer

From GitHub (Development)

pip install git+https://github.com/renatoaf/llm-log-analyzer
# or pip3 install git+ssh://git@github.com/renatoaf/llm-log-analyzer

Cloning the repo

Clone this repository:

git clone https://github.com/renatoaf/llm-log-analyzer
cd llm-log-analyzer

Install dependencies:

pip install -r requirements.txt

Setup

Set up API keys for your preferred LLM provider:

For Gemini:

export GOOGLE_API_KEY="your_gemini_api_key"

For OpenAI:

export OPENAI_API_KEY="your_openai_api_key"

For Claude:

export ANTHROPIC_API_KEY="your_anthropic_api_key"

or

export AWS_BEARER_TOKEN_BEDROCK="your_aws_bearer_token"
export AWS_REGION="your_aws_region" # defaults to us-east-1

For using Claude through AWS Bedrock.

Note: The analyzer will default to Gemini. You can get a Gemini API key from Google AI Studio.

Usage

The usages refer to the package being installed via pip. If you cloned the repo, replace llm-log-analyzer by python3 analyze_log.py (inside src/llm_log_analyzer).

Basic Usage

# Check available LLM providers and their status
llm-log-analyzer --list-providers

# Basic analysis (uses generic preset by default)
llm-log-analyzer application.log

# Use Unity preset for Unity build logs  
llm-log-analyzer build.log -P unity

# Specify LLM provider with presets
llm-log-analyzer server.log -P generic --provider openai
llm-log-analyzer build.log -P unity --provider gemini
llm-log-analyzer build.log -P unity --provider claude

# Use custom API key with presets
llm-log-analyzer system.log -P generic --provider openai --api-key your_key
llm-log-analyzer build.log -P unity --provider claude --api-key your_key

# Advanced options with presets
llm-log-analyzer app.log -P generic --tail-lines 5000 --output-dir results/ --debug

# Manual file specification
llm-log-analyzer server.log --patterns-file patterns/generic.txt --prompt-file prompts/generic.txt

Command Line Options

Run:

llm-log-analyzer -h

For detailed options and usage examples.

Analysis Presets

The analyzer now supports presets that combine pattern and prompt files for common use cases. This makes it easier to switch between different types of log analysis.

Available Presets

Preset	Description	Pattern File	Prompt File
`generic`	General purpose analysis for any application logs	`patterns/generic.txt`	`prompts/generic.txt`
`unity`	Unity game engine build logs with game-specific patterns	`patterns/unity.txt`	`prompts/unity.txt`
`android`	Native Android builds analysis	`patterns/android.txt`	`prompts/android.txt`
`ios`	Native iOS builds analysis	`patterns/ios.txt`	`prompts/ios.txt`
`flutter`	Flutter cross-platform builds analysis	`patterns/flutter.txt`	`prompts/flutter.txt`

Using Presets

# Use preset (recommended approach)
llm-log-analyzer build.log -P unity
llm-log-analyzer server.log -P generic
llm-log-analyzer gradle-build.log -P android
llm-log-analyzer xcode-build.log -P ios
llm-log-analyzer flutter-build.log -P flutter

# Override individual files while using a preset
llm-log-analyzer build.log -P unity --prompt "Focus on memory leaks in Unity builds"
llm-log-analyzer build.log -P generic --patterns-file my-custom-patterns.txt

# Manual file specification
llm-log-analyzer build.log --patterns-file patterns/unity.txt --prompt-file prompts/unity.txt

Default Behavior: If no preset is specified, the analyzer defaults to the generic preset, making it suitable for any type of log analysis out of the box. Beware that this wasn't optimized yet, so results may feel generic.

Output Files

The analyzer generates two output files:

analysis.json - Machine-readable structured analysis:

{
  "root_cause": "Script compilation failed due to a missing definition for 'Armadillo' in the 'CharacterEnum' enum. This suggests a new character was added without updating the enum accordingly.",
  "relevant_lines": [
    "Assets/Scripts/Gameplay/Controllers/Player/CharacterSoundControllerFactory.cs(148,66): error CS0117: 'CharacterEnum' does not contain a definition for 'Armadillo'",
    "Error building Player because scripts have compile errors in the editor"
  ],
  "confidence": 0.95,
  "action_suggestion": "Add a definition for the 'Armadillo' character to the 'CharacterEnum' enum in the CharacterSoundControllerFactory.cs file. Ensure the enum is updated whenever new characters are introduced.",
  "human_summary": "The Unity build failed due to a script compilation error. The error message indicates that `CharacterEnum` is missing a definition for 'Armadillo'.  This likely happened because a new character, 'Armadillo', was added without updating the corresponding enum. To fix this, add a new entry for 'Armadillo' to the `CharacterEnum` enum in the `CharacterSoundControllerFactory.cs` file."
}

analysis.md - Human-friendly Markdown summary for Slack/MR comments:

# Build Analysis Report

## Summary
The Unity build failed due to a script compilation error. The error message indicates that `CharacterEnum` does not contain a definition for `Armadillo`.  This likely means a new character was added without updating the enum. To fix this, add `Armadillo` to the `CharacterEnum` or correct the reference in `CharacterSoundControllerFactory.cs` if a different character was intended.

## Analysis Configuration

**Provider:** GeminiProvider

**Models Used:** 
- Chunk Analysis: gemini-2.5-flash
- Final Aggregation: gemini-2.5-pro

**Processing Settings:**
- Max Chunks: 20
- Max Workers: 4

## Analysis Details

**Root Cause:** Script compilation error due to missing definition for 'Armadillo' in 'CharacterEnum'

**Confidence:** 🟢 95.0%

## Action Items
Add the 'Armadillo' definition to the 'CharacterEnum' or correct the reference in 'CharacterSoundControllerFactory.cs' if it's intended to be a different character.

## Relevant Log Lines
Assets/Scripts/Gameplay/Controllers/Player/CharacterSoundControllerFactory.cs(148,66): error CS0117: 'CharacterEnum' does not contain a definition for 'Armadillo'

Error building Player because scripts have compile errors in the editor

Exit Codes

0: No significant failures detected in log
1: Failures found and analyzed
2: Analysis failed (e.g., file not found, API errors)

LLM Provider Comparison

The analyzer supports multiple LLM providers to fit different needs and budgets:

Claude (Anthropic)

Cost: Pay-per-use pricing, very expensive
Speed: Fast, but more expensive per request
Accuracy: Excellent, especially for complex log analysis and suggestions for solution
Setup: Requires paid Anthropic account

Best for: Complex logs, when you need solution accuracy

GPT (OpenAI)

Cost: Pay-per-use pricing with competitive rates
Speed: Fast with good throughput
Accuracy: Good for most log analysis tasks
Setup: Requires OpenAI API key from OpenAI Platform

Best for: Balanced cost and performance

Gemini (Google)

Cost: Free tier with generous limits
Speed: Very fast (especially Gemini 2.5 Flash)
Accuracy: Good for most log analysis tasks
Setup: Get API key from Google AI Studio

Best for: CI environments, frequent analysis

Gemini works exceptionally well for Unity builds use case. Given its strong performance and competitive pricing, it is currently the recommended option for running this analyzer in CI environments.

CI Integration

Jenkins Pipeline

pipeline {
    agent any
    
    environment {
        GOOGLE_API_KEY = credentials('google-api-key')
        OPENAI_API_KEY = credentials('openai-api-key') # Alternative
        ANTHROPIC_API_KEY = credentials('anthropic-api-key') # Alternative
    }
    
    stages {
        stage('Application Build') {
            steps {
                sh 'make build > build.log 2>&1'
            }
            post {
                failure {
                    // Only run analyzer when build fails
                    echo "Build failed, analyzing logs..."
                    sh '''
                        llm-log-analyzer -P unity --output-dir ./analysis/ build.log
                        
                        # Check if analysis found failures
                        if [ $? -eq 1 ]; then
                            echo "Build failures detected and analyzed"
                        fi
                    '''
                    
                    // Archive analysis results
                    archiveArtifacts artifacts: 'analysis/*', allowEmptyArchive: true
                    
                    // Publish results to Slack/Teams
                    script {
                        if (fileExists('analysis/analysis.md')) {
                            def analysis = readFile('analysis/analysis.md')
                            slackSend(
                                channel: '#builds',
                                color: 'danger',
                                message: "🚨 Build Failed - Analysis Results:\n```${analysis}```"
                            )
                        }
                    }
                }
                always {
                    // Archive build log for reference
                    archiveArtifacts artifacts: 'build.log', allowEmptyArchive: true
                    archiveArtifacts artifacts: 'analysis.md', allowEmptyArchive: true
                }
            }
        }
    }
}

GitLab CI

# .gitlab-ci.yml
variables:
  GOOGLE_API_KEY: $GOOGLE_API_KEY
  OPENAI_API_KEY: $OPENAI_API_KEY # Alternative
  ANTHROPIC_API_KEY: $ANTHROPIC_API_KEY # Alternative

stages:
  - build
  - analyze

build_app:
  stage: build
  script:
    - make build > build.log 2>&1
  artifacts:
    when: always
    paths:
      - build.log
    expire_in: 1 day

log_analysis:
  stage: analyze
  dependencies:
    - build_app
  script:
    ... install llm-log-analyzer ...
    - llm-log-analyzer build.log --output-dir analysis/ -P unity
    - |
      if [ -f analysis/analysis.md ]; then
        echo "## 🚨 Build Failed - Log Analysis Results" >> analysis_comment.md
        cat analysis/analysis.md >> analysis_comment.md
        
        # Optional: Post to merge request if available
        if [ -n "$CI_MERGE_REQUEST_IID" ]; then
          curl -X POST \
            -H "PRIVATE-TOKEN: $GITLAB_TOKEN" \
            -F "body=@analysis_comment.md" \
            "$CI_API_V4_URL/projects/$CI_PROJECT_ID/merge_requests/$CI_MERGE_REQUEST_IID/notes"
        fi
      fi
  artifacts:
    when: always
    paths:
      - analysis/
    expire_in: 1 week
  when: on_failure

Customization

The analyzer is highly configurable and can be adapted for different log types, environments, and specific project needs.

Configuration for Different Log Types

Generic Application Logs (Default)

By default, the analyzer uses the generic preset suitable for any application logs:

llm-log-analyzer app.log  # Uses generic preset by default
llm-log-analyzer app.log -P generic  # Explicitly specify generic preset

Unity Build Logs

For Unity build logs, use the Unity preset with game-specific patterns and prompts:

llm-log-analyzer build.log -P unity  # Uses Unity-specific patterns and prompts

Custom Log Types

Create your own patterns and prompts for specific applications:

Create custom patterns file (custom_patterns.txt):

# Custom error patterns for MyApp
MyApp ERROR:
FATAL:
Application crashed
Database connection failed

Create custom prompt file (custom_prompt.txt):

You are an iOS specialist investigating errors in a Xcode build log...

Use custom configuration:

# Start with a preset and override specific parts
llm-log-analyzer build.log -P generic --prompt "You are a DevOps expert analyzing Docker container logs. Look for ..."

llm-log-analyzer xcode.log -P generic --prompt "You are an iOS Developer investigating a crash..."

# Use completely custom files
llm-log-analyzer myapp.log --patterns-file custom_patterns.txt --prompt-file custom_prompt.txt

# Mix preset patterns with custom prompt
llm-log-analyzer build.log -P unity --prompt "Focus specifically on IL2CPP compilation issues..."

Additional Context for Known Issues

You can provide additional context via a file to help the analyzer understand known issues and solutions specific to your project or CI environment. This is particularly useful for:

Documenting recurring build failures and their solutions
Providing project-specific troubleshooting information
Adding context about infrastructure-specific issues
Including known workarounds for common problems

Creating a Context File:

Create a text file (e.g., context.txt) with your known issues and solutions, example:

# Unity Build Context
Known Issues:
- Unity 2021.3.5f1 has memory issues on Jenkins agents with <8GB RAM
- Android builds fail on Gradle 7.x due to dependency conflicts with com.google.firebase
- Network timeouts during package resolution in corporate networks

Common Solutions:
- Memory issues: Use agents with more memory or reduce build parallelism with --max-workers 2
- Gradle conflicts: Use Gradle 6.8 or update Firebase to v31.0.0+
- Unity license activation fails on new runners: requires manual activation or check license server connectivity

Build Environment Notes:
- Using GitLab CI with Docker containers
- Jenkins agents have variable memory (4-16GB)
- Corporate firewall may block some package repositories
...

Usage:

# Basic usage with context file
llm-log-analyzer build.log -P unity --additional-context-file context.txt

# Different context files for different environments
llm-log-analyzer unity-build.log -P unity --additional-context-file contexts/unity-context.txt
llm-log-analyzer server.log --additional-context-file contexts/server-context.txt

# Combine with other options
llm-log-analyzer build.log --additional-context-file team-knowledge.txt --verbose --debug --output-dir results/

Best Practices for Context Files:

Keep context files versioned with your project
Update context as you discover new issues and solutions
Use clear headers and bullet points for readability
Include environment-specific details (OS, versions, infrastructure)
Document both problems AND their solutions

Model Selection

Each provider uses optimized models for different tasks:

OpenAI:

Chunk analysis: gpt-4o-mini
Final aggregation: gpt-4o-mini

Gemini:

Chunk analysis: gemini-2.5-flash
Final aggregation: gemini-2.5-pro

Claude: (directly via Anthropic API or AWS Bedrock)

Chunk analysis: claude-3-5-haiku-20241022
Final aggregation: claude-3-5-sonnet-20241022

You can customize these by using the parameters --aggregation-model and --chunk-model.

Troubleshooting

Common Issues

API Rate Limits

The analyzer includes automatic retry with exponential backoff
Consider using other models for chunk analysis to reduce costs
Adjust chunk-size to reduce API calls
Limit the number of chunks with max-chunks
Fine tune your filters

Memory Issues with Large Logs

The analyzer uses streaming processing to avoid loading entire logs
Reduce tail-lines if needed
Fine tune your filters

Poor Analysis Quality

Ensure your filtered log contains actual error information
Fine tune your error patterns to reduce noise
Try increasing tail-lines to capture more context
Run with debug option and check that error keywords are being detected properly by analyzing the filtered_log.txt
Run with debug option and analyze the final output prompts

Network/Connectivity

Set up proper proxy configuration if behind corporate firewall
Verify API keys are correctly set
Check if your network allows HTTPS connections to Anthropic/Gemini API

JSON Parsing Errors (Gemini)

Error: Invalid control character at: line X column Y
Cause: Gemini sometimes includes unescaped control characters in JSON responses
Solution: The analyzer now automatically sanitizes JSON responses
Workaround: Use a different provider if Gemini consistently fails: --list-providers

Debug Mode

Enable debug mode (-d) and/or verbose logging (-v) to diagnose issues:

llm-log-analyzer -P unity Editor.log --debug

This will output debug files (filtered logs, prompt and model responses).

You can additionally run in verbose mode to show API request/response information and details about execution.

llm-log-analyzer -P unity Editor.log --verbose

Optimization

Aggressive Filtering – Irrelevant noise is stripped out automatically, keeping only error-related lines and context. This minimizes token usage while preserving diagnostic value.
Chunked Processing – If the filtered log is still too large, it is split into smaller chunks. Each chunk is analyzed individually, and results are aggregated into a single root cause report.
Tail Limit – Analysis prioritizes the most recent section of the log, where failures are most likely to occur, reducing cost and improving accuracy.

Contributing

Fork the repository
Create a feature branch
Submit a pull request

License

MIT License - see LICENSE file for details.

Made with ❤️ for Unity developers tired of manually debugging build failures.

Name		Name	Last commit message	Last commit date
Latest commit History 27 Commits
.github/workflows		.github/workflows
scripts		scripts
src/llm_log_analyzer		src/llm_log_analyzer
.cursorignore		.cursorignore
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

LLM Log Analyzer

Features

Installation

Using pip (Recommended)

From GitHub (Development)

Cloning the repo

Setup

Usage

Basic Usage

Command Line Options

Analysis Presets

Available Presets

Using Presets

Output Files

Exit Codes

LLM Provider Comparison

Claude (Anthropic)

GPT (OpenAI)

Gemini (Google)

CI Integration

Jenkins Pipeline

GitLab CI

Customization

Configuration for Different Log Types

Generic Application Logs (Default)

Unity Build Logs

Custom Log Types

Additional Context for Known Issues

Model Selection

Troubleshooting

Common Issues

Debug Mode

Optimization

Contributing

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 10

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages