IBM/AssetOpsBench

AssetOpsBench: Benchmarking AI Agents for Industrial Asset Operations & Maintenance

AssetOps MultiAgentBench OpenAI Llama Mistral Granite

📄 Paper | 🤗 Hugging Face | 📢 Blog


📑 Table of Contents

  1. Announcements
  2. Introduction
  3. Datasets
  4. AI Agents
  5. Multi-Agent Frameworks
  6. System Diagram
  7. Leaderboards
  8. Docker Setup
  9. Talks & Events
  10. External Resources
  11. Contributors

📣 Announcements

  • 2025-06-01: AssetOpsBench v1.0 released with 140+ industrial scenarios.
  • 2025-09-01: CODS Competition launched.
  • Upcoming Events: Tutorial at AAAI 2026 – Agents for Industry 4.0 Applications.
  • Stay tuned for new tracks, competitions, and community events.

🏗️ Introduction

AssetOpsBench is a unified framework for developing, orchestrating, and evaluating domain-specific AI agents in industrial asset operations and maintenance.

It provides:

  • 4 domain-specific agents
  • 2 multi-agent orchestration frameworks

Designed for maintenance engineers, reliability specialists, and facility planners, it allows reproducible evaluation of multi-step workflows in simulated industrial environments.


📂 Datasets: 140+ Scenarios

AssetOpsBench scenarios span multiple domains:

Domain | Example Task
IoT    | "List all sensors of Chiller 6 in MAIN site"
FMSR   | "Identify failure modes detected by Chiller 6 Supply Temperature"
TSFM   | "Forecast 'Chiller 9 Condenser Water Flow' for the week of 2020-04-27"
WO     | "Generate a work order for Chiller 6 anomaly detection"

Some tasks focus on a single domain; others are multi-step, end-to-end workflows.
Explore all scenarios here.


🤖 AI Agents

Domain-Specific Agents

  • IoT Agent: get_sites, get_history, get_assets, get_sensors
  • FMSR Agent: get_sensors, get_failure_modes, get_failure_sensor_mapping
  • TSFM Agent: forecasting, timeseries_anomaly_detection
  • WO Agent: generate_work_order
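
Note that get_sensors appears under both the IoT and FMSR agents. The sketch below is one way to represent the agent-to-tool listing and invert it, so that you can see which agents expose a given tool. The tool names are copied from the list above; the dictionary layout and the agents_by_tool helper are illustrative assumptions, not the repository's actual data structures.

```python
from collections import defaultdict

# Tool names copied from the agent list above. The dict layout is an
# illustrative assumption, not how the repository stores this information.
AGENT_TOOLS = {
    "IoT": ["get_sites", "get_history", "get_assets", "get_sensors"],
    "FMSR": ["get_sensors", "get_failure_modes", "get_failure_sensor_mapping"],
    "TSFM": ["forecasting", "timeseries_anomaly_detection"],
    "WO": ["generate_work_order"],
}


def agents_by_tool(agent_tools):
    """Invert an agent -> tools mapping into a tool -> agents mapping."""
    index = defaultdict(list)
    for agent, tools in agent_tools.items():
        for tool in tools:
            index[tool].append(agent)
    return dict(index)


TOOL_INDEX = agents_by_tool(AGENT_TOOLS)
print(TOOL_INDEX["get_sensors"])  # ['IoT', 'FMSR']
```

An orchestrator that routes by tool name alone would need a tie-breaking rule for shared names such as get_sensors.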

Multi-Agent Frameworks

  • MetaAgent: ReAct-based single-agent-as-tool orchestration
  • AgentHive: plan-and-execute sequential workflow
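
The difference between the two orchestration styles can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the MetaAgent or AgentHive implementation: the agents are hypothetical stubs returning canned text, and toy_policy is a rule-based stand-in for the LLM that would normally choose the next action. In the ReAct-style loop each step is chosen after seeing earlier observations; in the plan-and-execute version the full step list is fixed before anything runs.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stub agents: each takes a text request and returns canned text.
Agent = Callable[[str], str]
Step = Tuple[str, str]  # (agent name, request)

AGENTS: Dict[str, Agent] = {
    "IoT": lambda request: "sensors: (stubbed list)",
    "FMSR": lambda request: "failure modes: (stubbed list)",
}


def react_loop(
    choose_next: Callable[[List[str]], Optional[Step]],
    agents: Dict[str, Agent],
    max_steps: int = 5,
) -> List[str]:
    """ReAct-style: pick one action at a time from the observations so far."""
    observations: List[str] = []
    for _ in range(max_steps):
        step = choose_next(observations)
        if step is None:  # the policy decides the task is finished
            break
        name, request = step
        observations.append(agents[name](request))
    return observations


def plan_and_execute(plan: List[Step], agents: Dict[str, Agent]) -> List[str]:
    """Plan-and-execute: the whole plan is fixed up front, then run in order."""
    return [agents[name](request) for name, request in plan]


def toy_policy(observations: List[str]) -> Optional[Step]:
    """Rule-based stand-in for an LLM choosing the next action."""
    if not observations:
        return ("IoT", "List all sensors of Chiller 6 in MAIN site")
    if len(observations) == 1:
        # The next request depends on what the previous step returned.
        return ("FMSR", "Identify failure modes for " + observations[0])
    return None


react_result = react_loop(toy_policy, AGENTS)
plan_result = plan_and_execute(
    [
        ("IoT", "List all sensors of Chiller 6 in MAIN site"),
        ("FMSR", "Identify failure modes for Chiller 6"),
    ],
    AGENTS,
)
```

With these stubs both calls return the same two observations; the approaches diverge once an observation changes what the next step should be.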

🖼️ System Diagram

Visual overview of the AssetOpsBench workflow:

[Figure: System Diagram]


🏆 Leaderboards

  • Evaluated with 7 Large Language Models
  • Trajectories scored using LLM Judge (Llama-4-Maverick-17B)
  • 6-dimensional criteria measure reasoning, execution, and data handling
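
As a rough illustration of how per-trajectory judge verdicts could be rolled up into leaderboard numbers, the sketch below computes a pass rate per criterion. It assumes the judge returns a pass/fail verdict for each criterion, which the text above does not state, and the criterion names are placeholders rather than the benchmark's actual six dimensions.

```python
from typing import Dict, List


def pass_rates(judgments: List[Dict[str, bool]]) -> Dict[str, float]:
    """Return the fraction of trajectories that pass each criterion."""
    if not judgments:
        return {}
    criteria = list(judgments[0].keys())
    total = len(judgments)
    return {c: sum(j[c] for j in judgments) / total for c in criteria}


# Placeholder criterion names: the real six dimensions are not listed here.
CRITERIA = [f"criterion_{i}" for i in range(1, 7)]

# Two hypothetical trajectories: the second fails only criterion_3.
judgments = [
    {c: True for c in CRITERIA},
    {c: (c != "criterion_3") for c in CRITERIA},
]

rates = pass_rates(judgments)
print(rates["criterion_3"])  # 0.5
```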

Example: MetaAgent leaderboard

[Figure: MetaAgent leaderboard]


🐳 Run AssetOpsBench in Docker

  • Pre-built Docker Images: assetopsbench-basic (minimal) & assetopsbench-extra (full)
  • Conda environment: assetopsbench
  • Full setup guide

To build and start the benchmark containers:

    cd /path/to/AssetOpsBench
    chmod +x benchmark/entrypoint.sh
    docker-compose -f benchmark/docker-compose.yml build
    docker-compose -f benchmark/docker-compose.yml up

🎤 Talks & Events

  • Workshops: Participate in GenAIBench-26 at AAAI 2026, focusing on multi-agent AI workflows.
  • Webinars & Seminars: Learn best practices for industrial task automation with AI agents.
  • Competitions: Benchmark your agents on real-world industrial scenarios using AssetOpsBench.

🔗 External Resources

