Storage Environment Dashboard

Built by a storage engineer to provide a single view across fragmented enterprise platforms.

A unified, read-only dashboard for monitoring enterprise storage environments across multiple platforms.

This dashboard addresses a common operational gap: storage health, alerts, capacity, and replication status are often fragmented across vendor tools, requiring engineers to check multiple systems daily.


Demo

Storage Dashboard Demo


At a Glance

  • Platforms: Pure Storage, Dell PowerScale (Isilon), Brocade SAN
  • Focus: Health, alerts, capacity, replication visibility
  • Mode: Mock data (demo-ready) with planned live integrations
  • Architecture: FastAPI backend + React (Vite) frontend
  • Deployment: Docker Compose (single-command startup)
  • Scope: Read-only, internal-facing, no automation actions

Why This Exists

In real environments, storage engineers often rely on multiple vendor interfaces to understand overall system health. This dashboard provides a single, normalized view to quickly identify issues and trends without switching between platforms.

Status

This project is an actively developed MVP focused on real-world storage monitoring workflows.

Implemented:

  • Frontend dashboard with global overview, alerts, and drill-down views
  • FastAPI backend with polling scheduler and normalized data model
  • Mock-mode collectors simulating Pure, Isilon, and Brocade environments
  • Docker Compose deployment for consistent local setup

Planned:

  • Live API integrations with real storage systems
  • Persistent storage for historical trends
  • Enhanced alert correlation and enrichment

Why I Built This

As a storage and infrastructure engineer, I regularly have to check multiple systems to answer a simple question: is everything healthy?

That typically means logging into separate vendor tools for Pure Storage, Dell PowerScale (Isilon), and SAN fabrics, each with its own interface and data model.

I built this project to consolidate that workflow into a single, read-only view focused on fast situational awareness and operational clarity.

Quick Start

Run the dashboard locally in mock mode using Docker Compose (no infrastructure required).

Docker Compose (recommended)

git clone https://github.com/adam-hardy1/Storage-Dashboard.git
cd Storage-Dashboard
cp .env.example .env
docker compose up --build

Technical Decisions

This project was designed to balance simplicity, realism, and operational relevance, with decisions guided by real-world storage and infrastructure workflows.

  • FastAPI (Backend)
    Chosen for its performance, simplicity, and native support for RESTful APIs. FastAPI provides automatic validation, clear type-driven development, and built-in API documentation, making it well-suited for aggregating and normalizing data from multiple storage platforms.

  • React + Vite (Frontend)
    React enables a modular, component-driven UI for displaying system health, alerts, and capacity data. Vite was selected for its fast development experience and minimal configuration, allowing rapid iteration without unnecessary overhead.

  • Mock Mode (Demo-Friendly Design)
    A mock data layer was implemented to allow the application to run without access to real infrastructure. This ensures the project is fully demoable in any environment while maintaining a clear path to live integrations with platforms such as Pure Storage and Dell PowerScale (Isilon).

  • Docker Compose (Deployment)
    Docker Compose provides a simple, reproducible way to run the full stack (frontend + backend) with a single command. This approach mirrors real-world containerized deployments while keeping setup lightweight and accessible for local development and evaluation.

Overview

Storage Dashboard aggregates health, capacity, alerting, and replication data from multiple platforms into a unified, read-only view.

The application is designed for fast situational awareness. It emphasizes clear health signals, near-real-time alerts (bounded by the polling interval), and trend visibility. It is also operationally safe: the dashboard is read-only and exposes no configuration or destructive actions.

Data is collected through periodic polling of vendor APIs, normalized into a common model, and cached for resilience. The frontend consumes this data via a lightweight API, ensuring responsive performance without direct dependency on live storage systems.

Key design principles:

  • Fast glanceability so you can answer the most important questions without drill-down
  • Normalized health model so different platforms speak the same language
  • Operationally honest so stale or unavailable data is clearly visible
  • Cached, resilient polling so the dashboard tolerates API hiccups and shows last successful refresh time
  • Same code for local development and internal deployment: mock mode provides realistic simulated data, and a single environment variable switches to live mode
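
As a rough illustration of the normalized health model, the sketch below maps vendor-specific status strings onto one shared scale. The Health enum, the normalize_status helper, and every vendor status string here are hypothetical placeholders; the project's actual models and the values returned by the real APIs may differ.

```python
from enum import Enum


class Health(str, Enum):
    """One shared vocabulary for every platform."""
    OK = "ok"
    WARNING = "warning"
    CRITICAL = "critical"
    UNKNOWN = "unknown"


# Hypothetical vendor status strings, for illustration only.
VENDOR_STATUS_MAP = {
    "pure": {"healthy": Health.OK, "degraded": Health.WARNING, "failed": Health.CRITICAL},
    "isilon": {"ok": Health.OK, "attn": Health.WARNING, "down": Health.CRITICAL},
    "brocade": {"healthy": Health.OK, "marginal": Health.WARNING, "down": Health.CRITICAL},
}


def normalize_status(platform: str, raw: str) -> Health:
    """Translate a vendor status string; anything unrecognized becomes UNKNOWN."""
    return VENDOR_STATUS_MAP.get(platform, {}).get(raw.strip().lower(), Health.UNKNOWN)
```

Falling back to UNKNOWN rather than OK is one way to stay operationally honest: an unmapped status shows up on the dashboard instead of being silently treated as healthy.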

Core Capabilities

  • Unified multi-platform visibility across Pure Storage, Dell PowerScale (Isilon), and Brocade SAN
  • Normalized health model enabling consistent status across different vendor systems
  • Near-real-time monitoring through periodic polling and centralized aggregation
  • Alert awareness with severity prioritization and lifecycle tracking
  • Trend visibility for capacity, alert activity, and replication state
  • Drill-down analysis from global view to individual system detail
  • Site-aware organization grouping systems by data center and environment
  • Containerized deployment for consistent local and internal environments

Screenshots

The screenshots below were captured in mock mode and show the core dashboard views:

| View | Description |
|------|-------------|
| Dashboard Overview | Global summary: cards with critical banner and site-grouped systems |
| Alerts Table | Sortable, filterable alert table with severity, duration, and trend indicators |
| Pure Detail View | Pure Storage drill-down view with trend sparklines, hardware health, and replication |
| Brocade View | Brocade FC switch cards showing port health, ISL status, and utilization |

Feature Breakdown

Global Situational Awareness

  • Critical attention banner that highlights when any system is in a critical state, with click-to-filter behavior
  • Group summary cards aggregated by platform and environment tier (Pure Prod, Dell PowerScale (Isilon) Prod, Brocade / SAN Fabric, etc.)
  • Site and data center grouping with systems organized by site (DC1, DC2, DR) and then by platform within each site
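
The group summary cards described above could be computed along these lines. This is a minimal sketch that assumes each system is a plain dict with platform, environment, and status keys, and that a group takes the worst status of its members; the field names, the severity ranking, and the summarize_groups helper are illustrative, not taken from the project.

```python
from collections import defaultdict

# Assumed severity ordering; the project's real model may differ.
SEVERITY_RANK = {"ok": 0, "warning": 1, "critical": 2}


def summarize_groups(systems):
    """Group systems by (platform, environment) and report the worst status per group."""
    groups = defaultdict(list)
    for system in systems:
        groups[(system["platform"], system["environment"])].append(system)

    summary = {}
    for key, members in groups.items():
        worst = max(members, key=lambda s: SEVERITY_RANK[s["status"]])["status"]
        summary[key] = {
            "count": len(members),
            "status": worst,
            # Names that would feed a critical attention banner.
            "critical": [m["name"] for m in members if m["status"] == "critical"],
        }
    return summary


# Illustrative input only.
systems = [
    {"name": "pure-prod-01", "platform": "pure", "environment": "prod", "status": "ok"},
    {"name": "pure-prod-02", "platform": "pure", "environment": "prod", "status": "critical"},
    {"name": "pure-dev-01", "platform": "pure", "environment": "dev", "status": "warning"},
]
```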

Alert Management

  • Structured alert table with columns: severity, platform, system, site/env, summary, component, duration, source time, first seen
  • Sortable by severity, platform, system, duration, or timestamp (default: critical first)
  • Filterable by platform, severity, site, and environment
  • Duration tracking shows how long each alert has been active, color-coded (green < 10m, yellow 10m to 1h, red > 1h)
  • Trend approximation with worsening, stable, or resolving indicators per alert
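
The default sort order and the duration color thresholds above can be sketched as follows. The real logic presumably lives in the React frontend; this Python version only illustrates the rules. The severity names, the first_seen field, and the tie-break (longest-running first within a severity) are assumptions.

```python
from datetime import datetime, timedelta, timezone

# Assumed severity names; lower rank sorts first.
SEVERITY_ORDER = {"critical": 0, "warning": 1, "info": 2}


def duration_color(active_for: timedelta) -> str:
    """Bucket an alert's age: green < 10m, yellow 10m to 1h, red > 1h."""
    if active_for < timedelta(minutes=10):
        return "green"
    if active_for <= timedelta(hours=1):
        return "yellow"
    return "red"


def sort_alerts(alerts):
    """Critical first; within a severity, the oldest (longest-running) alert first."""
    return sorted(
        alerts,
        key=lambda a: (SEVERITY_ORDER.get(a["severity"], 99), a["first_seen"]),
    )
```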

Trend Tracking

  • Rolling history for capacity, alert count, and replication lag per system (configurable window, default ~24h)
  • Inline trend indicators on group cards showing delta with units (e.g., +2.1%, -3 alerts)
  • Sparkline charts in drill-down views with Y-axis labels and time window badges
  • Group-level trends computed by averaging across member systems
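
A rolling history like the one described above maps naturally onto a bounded deque. The sketch below is one possible shape, not the project's actual TrendStore: the method names, the (system, metric) key, and the choice to define a delta as newest minus oldest retained point are all assumptions.

```python
from collections import deque
from statistics import mean


class TrendStore:
    """Fixed-size rolling history per (system, metric); the oldest points drop off automatically."""

    def __init__(self, max_points: int = 1440):  # 1440 points is about 24h at one per minute
        self.max_points = max_points
        self._series = {}

    def record(self, system: str, metric: str, value: float) -> None:
        key = (system, metric)
        self._series.setdefault(key, deque(maxlen=self.max_points)).append(value)

    def delta(self, system: str, metric: str):
        """Change between the oldest and newest retained points, or None with fewer than two."""
        points = self._series.get((system, metric), ())
        if len(points) < 2:
            return None
        return points[-1] - points[0]

    def group_delta(self, systems, metric: str):
        """Average the per-system deltas, skipping systems without enough history."""
        deltas = [d for s in systems if (d := self.delta(s, metric)) is not None]
        return mean(deltas) if deltas else None
```

Because the deque is bounded, memory use stays constant no matter how long the process runs, which suits an in-memory design with no external database.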

Drill-Down Detail Views

  • Click any system card to open a dedicated detail page
  • Breadcrumb navigation: Dashboard > Platform > system-name
  • Full breakdown: system info, trend charts, capacity detail, component health grid, replication cards, current alerts, and alert history

Staleness Visualization

  • Color-coded freshness everywhere: green (fresh), yellow (aging > 2m), red (stale > 5m)
  • Explicit data age and collector state labels on every card footer
  • Staleness bar in drill-down views showing last success time and data age
  • Stale systems highlighted with colored borders
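
The freshness thresholds above reduce to a small classification function. This is a sketch under stated assumptions: the freshness helper and its label strings are hypothetical, and treating a system that has never been polled successfully as stale is a design choice made here, not something confirmed by the project.

```python
from datetime import datetime, timedelta, timezone

AGING_AFTER = timedelta(minutes=2)
STALE_AFTER = timedelta(minutes=5)


def freshness(last_success, now=None):
    """Return 'fresh', 'aging', or 'stale' for a collector's last successful poll."""
    if last_success is None:
        return "stale"  # never polled successfully: surface it rather than hide it
    now = now or datetime.now(timezone.utc)
    age = now - last_success
    if age > STALE_AFTER:
        return "stale"
    if age > AGING_AFTER:
        return "aging"
    return "fresh"
```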

Authentication

  • Optional session-based auth via environment variables
  • Login page with session cookies (12h expiry)
  • 401 responses auto-redirect to login on session expiry
  • Completely transparent when disabled (no login screen, no overhead)
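
For readers curious how a 12-hour session expiry might work, here is a minimal in-memory sketch. The SessionStore class and its method names are hypothetical, and the project's actual auth module may store sessions differently. The injectable clock exists only to make the expiry easy to test.

```python
import secrets
import time

SESSION_TTL_SECONDS = 12 * 60 * 60  # 12h expiry, as described above


class SessionStore:
    """In-memory session tokens with an absolute expiry time."""

    def __init__(self, ttl: int = SESSION_TTL_SECONDS, clock=time.time):
        self._ttl = ttl
        self._clock = clock
        self._expires = {}

    def create(self) -> str:
        """Issue a random token that a login handler could set as a cookie value."""
        token = secrets.token_urlsafe(32)
        self._expires[token] = self._clock() + self._ttl
        return token

    def is_valid(self, token) -> bool:
        expiry = self._expires.get(token)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            del self._expires[token]  # drop expired sessions lazily
            return False
        return True
```

A middleware built on something like this would return 401 when is_valid is false, which is the response the frontend turns into a redirect to the login page.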

Supported Platforms

Pure Storage FlashArray

  • API: REST v2.x with OAuth2 token exchange
  • Collected: array health, controller status, drive health, FC port state, capacity/space, data reduction ratio, snapshots, ActiveCluster and ActiveDR replication
  • Mock data: 4 simulated arrays (2 prod, 1 dev, 1 DR) with 24 drive bays, 8 FC ports, ActiveCluster links

Dell PowerScale / Isilon

  • API: OneFS Platform API (PAPI) with Basic Auth on port 8080
  • Collected: cluster health, per-node status with disk counts, storage pool capacity, SyncIQ replication policies and reports, events/alerts
  • Mock data: 4 simulated clusters (3 prod, 1 dev) with multi-node configurations and multi-policy SyncIQ

Brocade Fibre Channel Switches

  • API: FOS REST API (v9+) with session auth
  • Collected: switch health, per-port status with error counters, ISL trunk state, PSU/fan health, port utilization
  • Mock data: 5 simulated switches (4 prod across 2 fabrics, 1 dev) with 24 to 48 FC ports, ISL links, and hardware components

Architecture

The diagram below shows the high-level data flow from collectors to the UI:

┌──────────────────────────────────────────────────────┐
│                React Frontend (Vite)                  │
│                                                      │
│  useRouter ─── useDashboard ─── useSystemDetail      │
│       │              │                │               │
│  ┌────▼──────────────▼────────────────▼──────────┐   │
│  │ Dashboard View        │  Detail View          │   │
│  │ - CriticalBanner      │  - Breadcrumbs        │   │
│  │ - HealthOverview       │  - StalenessBar       │   │
│  │ - AlertFeed (table)    │  - TrendCharts        │   │
│  │ - Site-grouped cards   │  - ComponentGrid      │   │
│  │ - CollectorStatus      │  - AlertHistory       │   │
│  └───────────────────────┴───────────────────────┘   │
└────────────────────────┬─────────────────────────────┘
                         │  /api/*
┌────────────────────────▼─────────────────────────────┐
│                  FastAPI Backend                       │
│                                                       │
│  Auth Middleware ── Router (/api) ── Dashboard Cache   │
│                                         │             │
│                         ┌───────────────▼──────────┐  │
│                         │     TrendStore           │  │
│                         │  deque per system/metric  │  │
│                         │  alert_history tracking   │  │
│                         └───────────────┬──────────┘  │
│                                         │             │
│  Scheduler (asyncio) ─── poll every 60s ─┘            │
│       │                                               │
│  ┌────▼──────────────────────────────────────────┐   │
│  │  Collectors (BaseCollector ABC)                │   │
│  │    ├── PureCollector      → FlashArray API     │   │
│  │    ├── IsilonCollector    → OneFS PAPI         │   │
│  │    ├── BrocadeCollector   → FOS REST API       │   │
│  │    ├── MockPureCollector  (4 arrays)           │   │
│  │    ├── MockIsilonCollector (4 clusters)        │   │
│  │    └── MockBrocadeCollector (5 switches)       │   │
│  └───────────────────────────────────────────────┘   │
│                                                       │
│  Normalized Health Model                              │
│    StorageSystem → SystemHealth → HardwareComponent   │
│    Alert, CapacityInfo, ReplicationLink                │
│    GroupSummary, TrendData, SystemDetail               │
└───────────────────────────────────────────────────────┘

Data flow: Collectors poll vendor APIs on a configurable interval (default 60s). Each collector normalizes the response into the common StorageSystem model. The cache stores current state, tracks alert first-seen times, records trend data points, and computes group aggregations. The frontend polls /api/summary every 30s and never blocks on a live storage API call.
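
The data flow above can be sketched with an abstract collector, a cache that keeps the last good result, and an asyncio polling loop. The class and function names mirror the diagram, but the bodies are an illustration under assumptions (for example, that collect() returns a list of normalized records); they are not the project's actual code.

```python
import asyncio
from abc import ABC, abstractmethod


class BaseCollector(ABC):
    name: str

    @abstractmethod
    async def collect(self) -> list:
        """Return normalized system records for this platform."""


class DashboardCache:
    """Keeps the last good result per collector so a failed poll never blanks the UI."""

    def __init__(self):
        self.systems = {}
        self.errors = {}

    def update(self, name, systems):
        self.systems[name] = systems
        self.errors.pop(name, None)

    def mark_failed(self, name, exc):
        self.errors[name] = str(exc)  # previously cached data stays in place


async def poll_once(collectors, cache):
    """Run all collectors concurrently; one failure does not affect the others."""
    results = await asyncio.gather(
        *(c.collect() for c in collectors), return_exceptions=True
    )
    for collector, result in zip(collectors, results):
        if isinstance(result, Exception):
            cache.mark_failed(collector.name, result)
        else:
            cache.update(collector.name, result)


async def run_scheduler(collectors, cache, interval=60.0):
    """Poll forever at a fixed interval (60s by default)."""
    while True:
        await poll_once(collectors, cache)
        await asyncio.sleep(interval)
```

Because API handlers would read only from the cache, a slow or unreachable array delays the next refresh but never a page load.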

Running Locally (Detailed Setup)

Choose one of the following options depending on how you want to run the application.

Option 1: Docker Compose

If the repo is already cloned, start the full stack with Docker Compose. The backend's API docs and endpoints are available once the containers are up:

cp .env.example .env
docker compose up --build

Option 2: Run Directly

Backend:

cd backend
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
STORAGE_DASHBOARD_MODE=mock uvicorn app.main:app --reload

Frontend:

cd frontend
npm install
npm run dev

The frontend dev server proxies /api/* requests to the backend automatically.

Mock Mode vs Live Mode

| | Mock Mode | Live Mode |
|---|-----------|-----------|
| Env var | STORAGE_DASHBOARD_MODE=mock | STORAGE_DASHBOARD_MODE=live |
| What it does | Generates realistic simulated data from 13 mock systems | Connects to real storage APIs |
| Requires | Nothing (works out of the box) | Network access + API credentials |
| Use case | Local development, UI testing, demos | Production deployment |

In mock mode, the dashboard simulates 4 Pure arrays, 4 Dell PowerScale (Isilon) clusters, and 5 Brocade switches across 3 sites (DC1, DC2, DC3) with randomized health, alerts, capacity, and replication data. Some systems will randomly enter warning or critical states to exercise all UI paths.

Connecting to Real Infrastructure

Pure Storage FlashArray

  1. Generate an API token: FlashArray GUI > Settings > API Tokens
  2. Add to .env:
STORAGE_DASHBOARD_PURE_ARRAYS=[{"name":"pure-prod-01","host":"10.0.0.1","api_token":"your-token","site":"DC1","environment":"prod"}]

Dell PowerScale (Isilon)

  1. Create a read-only user with privileges: ISI_PRIV_LOGIN_PAPI, ISI_PRIV_STATISTICS, ISI_PRIV_SYNCIQ, ISI_PRIV_QUOTA, ISI_PRIV_EVENT
  2. Add to .env:
STORAGE_DASHBOARD_ISILON_CLUSTERS=[{"name":"isilon-prod-01","host":"10.0.0.10","port":8080,"username":"monitor","password":"secret","site":"DC1","environment":"prod"}]

Brocade FC Switches

  1. Ensure FOS REST API is enabled (FOS 9.x+)
  2. Create a read-only user or use an existing monitor account
  3. Add to .env:
STORAGE_DASHBOARD_BROCADE_SWITCHES=[{"name":"brocade-dc1-fab-a","host":"10.0.0.20","username":"admin","password":"secret","site":"DC1","environment":"prod"}]
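
Each of the variables above holds a JSON array on a single line, so a typo can be hard to spot. The sketch below shows one way to parse and validate the Pure entry format using only the standard library. The PureArrayConfig dataclass and parse_pure_arrays helper are hypothetical; the project itself loads settings through Pydantic and may validate differently.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PureArrayConfig:
    name: str
    host: str
    api_token: str
    site: str
    environment: str


def parse_pure_arrays(raw: str):
    """Parse the JSON array from STORAGE_DASHBOARD_PURE_ARRAYS into typed configs."""
    entries = json.loads(raw or "[]")
    if not isinstance(entries, list):
        raise ValueError("expected a JSON array of array configs")
    configs = []
    for i, entry in enumerate(entries):
        try:
            configs.append(PureArrayConfig(**entry))
        except TypeError as exc:  # missing key, unexpected key, or non-object entry
            raise ValueError(f"entry {i}: {exc}") from exc
    return configs
```

Failing at startup with the index of the bad entry is usually easier to debug than a collector that silently never appears on the dashboard.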

Authentication (Optional)

To enable login protection:

STORAGE_DASHBOARD_AUTH_USERNAME=admin
STORAGE_DASHBOARD_AUTH_PASSWORD=your-secure-password

Leave both empty to disable authentication entirely.

Configuration Reference

All settings are controlled via environment variables (prefix STORAGE_DASHBOARD_):

| Variable | Default | Description |
|----------|---------|-------------|
| MODE | mock | mock for simulated data, live for real APIs |
| POLL_INTERVAL | 60 | Seconds between polling cycles |
| STALE_THRESHOLD | 300 | Seconds before data is considered stale |
| CORS_ORIGINS | localhost:5173,localhost:3000 | Allowed frontend origins |
| TREND_MAX_POINTS | 1440 | Max trend data points per metric (~24h at 1/min) |
| AUTH_USERNAME | (empty) | Set with password to enable auth |
| AUTH_PASSWORD | (empty) | Set with username to enable auth |
| PURE_ARRAYS | [] | JSON array of Pure FlashArray configs |
| ISILON_CLUSTERS | [] | JSON array of PowerScale/Isilon configs |
| BROCADE_SWITCHES | [] | JSON array of Brocade switch configs |
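
To make the prefix and defaults concrete, here is a standard-library sketch of how the scalar settings could be read. The project uses Pydantic settings for this, so the Settings dataclass and load_settings function below are stand-ins, and the comma-separated format for CORS_ORIGINS is an assumption based on how the default is shown in the table.

```python
import os
from dataclasses import dataclass, field

PREFIX = "STORAGE_DASHBOARD_"


@dataclass
class Settings:
    mode: str = "mock"
    poll_interval: int = 60
    stale_threshold: int = 300
    trend_max_points: int = 1440
    cors_origins: list = field(default_factory=lambda: ["localhost:5173", "localhost:3000"])
    auth_username: str = ""
    auth_password: str = ""

    @property
    def auth_enabled(self) -> bool:
        # Auth turns on only when both username and password are set.
        return bool(self.auth_username and self.auth_password)


def load_settings(env=None) -> Settings:
    """Read prefixed variables from env, falling back to the documented defaults."""
    env = os.environ if env is None else env
    defaults = Settings()

    def get(key, default):
        return env.get(PREFIX + key, default)

    mode = get("MODE", defaults.mode)
    if mode not in ("mock", "live"):
        raise ValueError(f"MODE must be 'mock' or 'live', got {mode!r}")
    origins = get("CORS_ORIGINS", ",".join(defaults.cors_origins))
    return Settings(
        mode=mode,
        poll_interval=int(get("POLL_INTERVAL", defaults.poll_interval)),
        stale_threshold=int(get("STALE_THRESHOLD", defaults.stale_threshold)),
        trend_max_points=int(get("TREND_MAX_POINTS", defaults.trend_max_points)),
        cors_origins=[o.strip() for o in origins.split(",") if o.strip()],
        auth_username=get("AUTH_USERNAME", ""),
        auth_password=get("AUTH_PASSWORD", ""),
    )
```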

API Endpoints

Dashboard

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/summary | Full dashboard payload (systems, groups, collector statuses) |
| GET | /api/groups | Aggregated group summaries by platform + environment |
| GET | /api/systems | All systems. Filters: ?platform=, ?site=, ?environment= |
| GET | /api/systems/{name} | Single system current state |
| GET | /api/systems/{name}/detail | Full drill-down: system + trends + alert history |
| GET | /api/systems/{name}/trends | Trend time-series data for a system |
| GET | /api/alerts | All alerts. Filters: ?severity=, ?platform=, ?site=, ?environment=, ?active_only= |
| GET | /api/health | Dashboard self-health check |
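
When scripting against these endpoints, building the query string programmatically avoids quoting mistakes. The helpers below are a small sketch: the base URL assumes the backend is reachable on uvicorn's default port 8000, which may not match a Docker Compose setup, and the function names are illustrative.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8000"  # assumption: uvicorn's default port


def alerts_url(base=BASE_URL, **filters):
    """Build an /api/alerts URL, dropping filters that are unset."""
    params = {k: v for k, v in filters.items() if v is not None}
    # Encode booleans as lowercase strings, the form most HTTP APIs accept.
    params = {k: str(v).lower() if isinstance(v, bool) else v for k, v in params.items()}
    query = urlencode(params)
    return f"{base}/api/alerts" + (f"?{query}" if query else "")


def system_detail_url(name, base=BASE_URL):
    """Build the drill-down URL; the system name is percent-encoded for safety."""
    return f"{base}/api/systems/{quote(name, safe='')}/detail"
```

For example, alerts_url(severity="critical", site="DC1", active_only=True) yields a URL that asks only for active critical alerts at one site.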

Authentication

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/auth/status | Check if auth is required and session validity |
| POST | /api/auth/login | Authenticate with {"username", "password"} |
| POST | /api/auth/logout | Clear session |

Project Structure

storage-dashboard/
├── backend/
│   ├── app/
│   │   ├── main.py                  # FastAPI app, auth middleware, lifespan
│   │   ├── config.py                # Pydantic settings from env vars
│   │   ├── cache.py                 # In-memory cache, trend store, group aggregation
│   │   ├── scheduler.py             # Async polling scheduler
│   │   ├── models/
│   │   │   └── health.py            # Normalized data models (20+ types)
│   │   ├── collectors/
│   │   │   ├── base.py              # Abstract BaseCollector interface
│   │   │   ├── pure.py              # Pure Storage FlashArray (REST v2.x)
│   │   │   ├── isilon.py            # PowerScale / Isilon (OneFS PAPI)
│   │   │   ├── brocade.py           # Brocade FC switches (FOS REST)
│   │   │   ├── mock_pure.py         # 4 simulated Pure arrays
│   │   │   ├── mock_isilon.py       # 4 simulated Isilon clusters
│   │   │   └── mock_brocade.py      # 5 simulated Brocade switches
│   │   └── routers/
│   │       ├── dashboard.py         # Dashboard REST API routes
│   │       └── auth.py              # Session-based authentication
│   ├── Dockerfile
│   └── requirements.txt
├── frontend/
│   ├── src/
│   │   ├── App.jsx                  # Root: routing, auth gate, layout
│   │   ├── components/
│   │   │   ├── Header.jsx           # Sticky header with refresh + auth
│   │   │   ├── CriticalBanner.jsx   # Red attention banner for critical systems
│   │   │   ├── HealthOverview.jsx   # Group summary cards with trends
│   │   │   ├── AlertFeed.jsx        # Sortable/filterable alert table
│   │   │   ├── SystemCard.jsx       # Per-system card (clickable)
│   │   │   ├── SystemDetailView.jsx # Drill-down: trends, health, replication
│   │   │   ├── CollectorStatus.jsx  # Collector health table
│   │   │   ├── Sparkline.jsx        # SVG sparkline + trend indicator
│   │   │   └── LoginPage.jsx        # Authentication form
│   │   ├── hooks/
│   │   │   ├── useDashboard.js      # Polls /api/summary every 30s
│   │   │   ├── useSystemDetail.js   # Polls /api/systems/{name}/detail
│   │   │   ├── useRouter.js         # Hash-based client-side routing
│   │   │   └── useAuth.js           # Auth state management
│   │   ├── utils/
│   │   │   └── format.js            # Formatting: bytes, duration, time, trends
│   │   └── styles/
│   │       └── dashboard.css        # Full application styles (dark theme)
│   ├── Dockerfile
│   ├── nginx.conf                   # Production nginx with API proxy
│   └── vite.config.js               # Dev server with API proxy
├── docker-compose.yml
├── .env.example
└── .gitignore

Tech Stack

| Layer | Technology | Why |
|-------|------------|-----|
| Backend | Python 3.12 + FastAPI | Async-native, auto-generated API docs, lightweight |
| Frontend | React 18 + Vite | Fast builds, simple SPA, no heavy framework needed |
| Styling | Custom CSS (dark theme) | No CSS framework dependency, full control |
| Charts | Inline SVG sparklines | Zero chart library overhead, renders in milliseconds |
| Caching | In-memory (dict + deque) | No external database needed for Phase 1 |
| Polling | asyncio tasks | Runs inside the FastAPI process, no Celery/Redis |
| Deployment | Docker Compose | Single command, works identically locally and at work |

Future Enhancements (Path to Production)

The following outlines how this project could evolve into a production-grade monitoring platform:

  • Persistent storage using SQLite or PostgreSQL to retain trend history across restarts
  • Notification integration with Slack, Teams, or PagerDuty for critical state changes
  • Multi-site topology view showing replication relationships between sites
  • Capacity forecasting using trend data to estimate days until full
  • Additional platforms such as NetApp ONTAP, Dell PowerStore, and Cisco MDS
  • Role-based access control (RBAC) for team-level views
  • Dark/light theme toggle based on user preference
  • Export functionality for alerts and capacity data in CSV or JSON format
  • Audit log to track user access and activity for compliance purposes
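
As a sense of what the capacity forecasting item could involve, the sketch below does the simplest possible estimate: a straight-line extrapolation from the first to the last point of a used-percentage history. This feature is not implemented in the project; the days_until_full function and its parameters are hypothetical, and a real forecast would likely want a proper regression over a longer window.

```python
def days_until_full(used_pct_history, interval_minutes=1.0):
    """Linear extrapolation to 100% used; None if history is too short or usage is not growing."""
    if len(used_pct_history) < 2:
        return None
    growth = used_pct_history[-1] - used_pct_history[0]
    if growth <= 0:
        return None
    # Time covered by the history, in days.
    elapsed_days = (len(used_pct_history) - 1) * interval_minutes / (60 * 24)
    rate_per_day = growth / elapsed_days
    return (100.0 - used_pct_history[-1]) / rate_per_day
```

With one point per minute, a history that grows from 70% to 72% over 24 hours gives a rate of 2 points per day, so the remaining 28 points would last about 14 days.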

License

This project is provided for portfolio and demonstration purposes only.
See the LICENSE file for full details.
