AssetOpsBench: Benchmarking AI Agents for Industrial Asset Operations & Maintenance

📑 Table of Contents

Announcements
Introduction
Datasets
AI Agents
Multi-Agent Frameworks
System Diagram
Leaderboards
Docker Setup
Talks & Events
External Resources
Contributors

📣 Announcements

2025-06-01: AssetOpsBench v1.0 released with 140+ industrial scenarios.
2025-09-01: CODS Competition launched.
Upcoming Events: Tutorial at AAAI 2026 – Agents for Industry 4.0 Applications.
Stay tuned for new tracks, competitions, and community events.

🏗️ Introduction

AssetOpsBench is a unified framework for developing, orchestrating, and evaluating domain-specific AI agents in industrial asset operations and maintenance.

It provides:

4 domain-specific agents
2 multi-agent orchestration frameworks

Designed for maintenance engineers, reliability specialists, and facility planners, it allows reproducible evaluation of multi-step workflows in simulated industrial environments.

📂 Datasets: 140+ Scenarios

AssetOpsBench scenarios span multiple domains:

Domain	Example Task
IoT	"List all sensors of Chiller 6 in MAIN site"
FMSR	"Identify failure modes detected by Chiller 6 Supply Temperature"
TSFM	"Forecast 'Chiller 9 Condenser Water Flow' for the week of 2020-04-27"
WO	"Generate a work order for Chiller 6 anomaly detection"

Some tasks focus on a single domain; others are multi-step, end-to-end workflows.
Explore all scenarios here.
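
As a rough illustration of how scenarios like the ones above could be organized by domain, here is a minimal sketch. The `Scenario` class and `group_by_domain` helper are hypothetical names chosen for this example, not part of the AssetOpsBench codebase, and the actual scenario file format may differ.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical container for one benchmark task (not the real schema)."""
    domain: str  # one of "IoT", "FMSR", "TSFM", "WO"
    task: str    # natural-language task given to the agent


# The four example tasks from the table above.
SCENARIOS = [
    Scenario("IoT", "List all sensors of Chiller 6 in MAIN site"),
    Scenario("FMSR", "Identify failure modes detected by Chiller 6 Supply Temperature"),
    Scenario("TSFM", "Forecast 'Chiller 9 Condenser Water Flow' for the week of 2020-04-27"),
    Scenario("WO", "Generate a work order for Chiller 6 anomaly detection"),
]


def group_by_domain(scenarios):
    """Return a dict mapping each domain to the list of its task strings."""
    grouped = defaultdict(list)
    for scenario in scenarios:
        grouped[scenario.domain].append(scenario.task)
    return dict(grouped)


if __name__ == "__main__":
    for domain, tasks in group_by_domain(SCENARIOS).items():
        print(domain, len(tasks))
```

Grouping by domain this way makes it easy to run or report on single-domain tasks separately from multi-step workflows.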

🤖 AI Agents

Domain-Specific Agents

IoT Agent: get_sites, get_history, get_assets, get_sensors
FMSR Agent: get_sensors, get_failure_modes, get_failure_sensor_mapping
TSFM Agent: forecasting, timeseries_anomaly_detection
WO Agent: generate_work_order
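
The agent-to-tool listing above can be captured as a simple lookup table. The sketch below is an assumption about how one might represent it: the `AGENT_TOOLS` dictionary and `agents_with_tool` helper are illustrative names, and only the agent and tool names themselves come from the list above.

```python
# Tool names per agent, copied from the list above.
AGENT_TOOLS = {
    "IoT": ["get_sites", "get_history", "get_assets", "get_sensors"],
    "FMSR": ["get_sensors", "get_failure_modes", "get_failure_sensor_mapping"],
    "TSFM": ["forecasting", "timeseries_anomaly_detection"],
    "WO": ["generate_work_order"],
}


def agents_with_tool(tool_name):
    """Return the agents (in listing order) that expose the given tool."""
    return [agent for agent, tools in AGENT_TOOLS.items() if tool_name in tools]


if __name__ == "__main__":
    # get_sensors is listed for both the IoT and FMSR agents.
    print(agents_with_tool("get_sensors"))
```

Note that `get_sensors` appears under two agents, so an orchestrator cannot route by tool name alone and needs the agent context as well.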

Multi-Agent Frameworks

MetaAgent: ReAct-based single-agent-as-tool orchestration
AgentHive: plan-and-execute sequential workflow
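
To make the difference between the two orchestration styles concrete, here is a toy sketch. It is not the MetaAgent or AgentHive implementation: the stub agents, `plan_and_execute`, `react_loop`, and `toy_policy` are hypothetical, and a scripted policy stands in for the LLM that would normally produce the plan or choose the next action.

```python
def make_stub(name):
    """Build a stand-in agent that just echoes the query it was given."""
    def agent(query):
        return f"[{name}] handled: {query}"
    return agent


# Stub agents in place of the real domain agents.
AGENTS = {name: make_stub(name) for name in ("IoT", "FMSR", "TSFM", "WO")}


def plan_and_execute(plan):
    """Plan-and-execute: the full plan exists up front and runs step by step."""
    results = []
    for agent_name, query in plan:
        results.append(AGENTS[agent_name](query))
    return results


def react_loop(choose_action, max_steps=5):
    """ReAct-style loop: each action is chosen after seeing prior observations.

    Each agent is treated as a tool the loop can call.
    """
    observations = []
    for _ in range(max_steps):
        action = choose_action(observations)
        if action is None:  # the policy signals that it is finished
            break
        agent_name, query = action
        observations.append(AGENTS[agent_name](query))
    return observations


SCRIPT = [
    ("IoT", "list sensors of Chiller 6"),
    ("FMSR", "map sensors to failure modes"),
]


def toy_policy(observations):
    """Scripted stand-in for an LLM: pick the next step, or None when done."""
    step = len(observations)
    return SCRIPT[step] if step < len(SCRIPT) else None


if __name__ == "__main__":
    print(plan_and_execute(SCRIPT))
    print(react_loop(toy_policy))
```

With a scripted policy both styles produce the same calls. The practical difference is that the ReAct-style loop can change course after any observation, while the plan-and-execute run follows the plan it was handed.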

🖼️ System Diagram

Visual overview of AssetOpsBench workflow:

🏆 Leaderboards

Evaluated with 7 Large Language Models
Trajectories scored using LLM Judge (Llama-4-Maverick-17B)
6-dimensional criteria measure reasoning, execution, and data handling
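
One way per-trajectory judge verdicts could be rolled up into leaderboard numbers is a per-criterion pass rate. The sketch below assumes the judge returns a true/false verdict for each criterion, which is an assumption rather than something stated here. The criterion names in the example are placeholders, not the benchmark's actual six dimensions, and `pass_rates` is a hypothetical helper.

```python
def pass_rates(verdicts):
    """Compute the fraction of trajectories passing each criterion.

    `verdicts` is a list of dicts, one per trajectory, mapping a criterion
    name to True/False. All dicts are assumed to share the same keys.
    """
    if not verdicts:
        return {}
    criteria = list(verdicts[0].keys())
    total = len(verdicts)
    return {c: sum(1 for v in verdicts if v[c]) / total for c in criteria}


if __name__ == "__main__":
    # Placeholder criterion names, for illustration only.
    example = [
        {"criterion_a": True, "criterion_b": False},
        {"criterion_a": True, "criterion_b": True},
    ]
    print(pass_rates(example))
```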

Example: MetaAgent leaderboard

🐳 Run AssetOpsBench in Docker

Pre-built Docker Images: assetopsbench-basic (minimal) & assetopsbench-extra (full)
Conda environment: assetopsbench
Full setup guide

cd /path/to/AssetOpsBench
chmod +x benchmark/entrypoint.sh
docker-compose -f benchmark/docker-compose.yml build
docker-compose -f benchmark/docker-compose.yml up

🎤 Talks & Events

Workshops: Participate in GenAIBench-26 at AAAI 2026, focusing on multi-agent AI workflows.
Webinars & Seminars: Learn best practices for industrial task automation with AI agents.
Competitions: Benchmark your agents on real-world industrial scenarios using AssetOpsBench.

🔗 External Resources

📄 Paper: AssetOpsBench: Benchmarking AI Agents for Industrial Asset Operations
🤗 HuggingFace: Scenario & Model Hub
📢 Blog: Insights, Tutorials, and Updates
🎥 Recorded Talks: Link coming soon.
