A Ceph cluster analysis tool that helps with clusters where ceph -s reports a HEALTH_WARN status.
The ceph-doctor tool provides a monitor subcommand that displays a real-time view of the cluster's state, showing how Ceph is working towards resolving the HEALTH_WARN condition.
After smartd reported bad blocks on two disks, we decided to replace them. We ran:
ceph osd out osd.43
ceph osd out osd.46
Now the doctor shows how data is moving away from OSDs 43 and 46 onto many other disks.
During a scrub, Ceph detected an inconsistent placement group. Unfortunately, Ceph does not treat this as urgent: it does not repair the inconsistency immediately but keeps working on the other scrubbing jobs.
Since we want to see ceph working on a fix, we stop the ongoing scrubbing.
ceph osd set noscrub
ceph osd set nodeep-scrub
Now Ceph starts the repair, and we can turn scrubbing back on. (Maybe some Ceph guru can tell us how to get this behavior automatically.)
ceph osd unset noscrub
ceph osd unset nodeep-scrub
Pre-compiled binaries for common platforms (Linux AMD64 and ARM64) are available on the GitHub Releases page. Download the latest release for your architecture, make it executable, and move it to a directory in your $PATH.
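The manual install steps can be sketched as a pair of small shell helpers. This is only an illustration: the function names, the amd64/arm64 labels, and the /usr/local/bin default are assumptions, and the actual release file names on the Releases page may differ.

```shell
#!/bin/sh
# Hypothetical helper: map `uname -m` output to an architecture label.
# The labels amd64/arm64 are assumptions, not confirmed release names.
release_arch() {
    case "$1" in
        x86_64|amd64)  echo amd64 ;;
        aarch64|arm64) echo arm64 ;;
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: copy a downloaded binary into a directory on $PATH
# and mark it executable in one step.
install_binary() {
    src=$1                          # path to the downloaded file
    dest_dir=${2:-/usr/local/bin}   # assumed default; any $PATH directory works
    install -m 0755 "$src" "$dest_dir/ceph-doctor"
}
```

For example, release_arch "$(uname -m)" prints the label for the current machine, and install_binary ./downloaded-file "$HOME/.local/bin" places the binary in a user-writable directory. Writing to /usr/local/bin usually requires root.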
If you want to build from source, or if there isn't a binary for your platform, you can compile ceph-doctor yourself. You'll need a recent Rust toolchain.
cargo build --release
Or install directly:
cargo install --path .
Monitor your local Ceph cluster:
ceph-doctor monitor
The monitor tool calls ceph pg dump --format json-pretty at a configurable interval (default: 5 seconds) to observe cluster activity and recovery progress.
- --interval <SECONDS>: Set the refresh interval in seconds (default: 5)
- --prefix-command <COMMAND>: Command prefix for remote execution
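To make the interplay of the interval and the command prefix concrete, here is a rough shell sketch of a polling loop. It is not the tool's implementation: poll_pg_dump is a hypothetical helper, and splitting the prefix on whitespace is an assumption about how the prefix is combined with the ceph command.

```shell
#!/bin/sh
# Hypothetical helper: run `ceph pg dump` behind an optional prefix.
poll_pg_dump() {
    interval=$1   # seconds between polls (the tool's default is 5)
    prefix=$2     # e.g. "ssh ceph-host sudo"; empty for a local cluster
    count=$3      # number of polls; the real monitor runs until you quit
    i=0
    while [ "$i" -lt "$count" ]; do
        # $prefix is left unquoted on purpose so that it splits into words
        # and ends up in front of the ceph command (assumed behavior).
        $prefix ceph pg dump --format json-pretty
        i=$((i + 1))
        # Sleep only between polls, not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

Calling poll_pg_dump 5 "ssh ceph-host sudo" 3 would run ssh ceph-host sudo ceph pg dump --format json-pretty three times, five seconds apart. An empty prefix runs the command locally.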
For remote Ceph clusters, use the --prefix-command option:
# SSH to remote host with sudo
ceph-doctor monitor --prefix-command "ssh ceph-host sudo"
# SSH with custom user and sudo
ceph-doctor monitor --prefix-command "ssh user@ceph-host sudo"
# Kubernetes pod execution
ceph-doctor monitor --prefix-command "kubectl exec ceph-pod --"
# Docker container execution
ceph-doctor monitor --prefix-command "docker exec ceph-container"
The monitor displays:
- Recovery Progress: Shows active recovery operations with rates and ETAs
- Placement Group States: Summary of PG states across the cluster
- OSD Data Movement: Tracks data movement between OSDs
- Inconsistent PGs: Highlights placement groups requiring attention
- Real-time Updates: Responsive terminal interface with resize support
- Rust toolchain
- Access to a Ceph cluster (local or remote)
- Terminal with TTY support (required for the interactive interface)
- q, Ctrl+C, or Esc: Quit the application
- Terminal resize is automatically handled
Built with:
- clap for command-line parsing
- serde for JSON handling
- tokio for async operations
- ratatui for terminal UI
- crossterm for terminal backend
- anyhow for error handling
- chrono for time operations
The tool parses Ceph's JSON output to provide organized, real-time monitoring of cluster health and recovery progress.
- cargo ci-check: Run all CI checks locally (formatting, linting, tests)
- cargo fmt: Auto-format code
- cargo fmt-check: Check code formatting
- cargo clippy-check: Run clippy lints
- cargo test-all: Run all tests
This project enforces code quality through:
- Automated formatting with rustfmt
- Linting with clippy (configured in Cargo.toml)
- Testing with unit tests
- CI/CD via GitHub Actions
Before submitting changes, run:
cargo ci-check
This runs the same checks as the GitHub Actions CI pipeline.
Run the full test suite:
cargo test-all
Test the monitor interface with a live cluster:
cargo run -- monitor
The project supports cross-compilation for Linux AMD64 and ARM64.
# Install cross tool
cargo install cross
# Build for ARM64
cross build --release --target aarch64-unknown-linux-gnu
# Build for AMD64 (native)
cargo build --release --target x86_64-unknown-linux-gnu
The release builds are optimized with:
- Link Time Optimization (LTO)
- Symbol stripping
- Single codegen unit
- Panic=abort for smaller binaries
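In Cargo terms, the optimizations listed above would typically correspond to a release profile like the following. This is a sketch using standard Cargo profile keys. The exact values in this project's Cargo.toml are an assumption and may differ.

```toml
# Cargo.toml (sketch; values assumed from the list above)
[profile.release]
lto = true          # link time optimization across crates
strip = true        # strip symbols from the final binary
codegen-units = 1   # single codegen unit
panic = "abort"     # abort on panic instead of unwinding; smaller binaries
```

These settings trade longer compile times for a smaller and usually faster binary. With panic = "abort", a panic terminates the process immediately without unwinding.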
- Fork the repository
- Create a feature branch
- Make your changes
- Run cargo ci-check to ensure code quality
- Submit a pull request
All pull requests must pass CI checks including formatting, linting, and testing.


