The Maintenance module provides a centralized orchestration layer for all database maintenance operations. It allows operators to define named maintenance schedules with cron-based execution, maintenance window enforcement, task sequencing with halt-on-failure semantics, and aggregated per-module health reporting.
| Interface / File | Role |
|---|---|
| `include/maintenance/database_maintenance_orchestrator.h` | Primary public API |
| `include/maintenance/maintenance_task.h` | Task types, job struct, job state enum |
| `include/maintenance/maintenance_schedule.h` | Schedule entry with JSON serialization |
| `include/maintenance/maintenance_health_report.h` | Health report aggregation |
| `src/maintenance/database_maintenance_orchestrator.cpp` | Implementation |
| `src/maintenance/maintenance_registry.cpp` | Default schedule bundles |
`DatabaseMaintenanceOrchestrator` is the central coordinator for all maintenance scheduling and execution.
```cpp
#include "maintenance/database_maintenance_orchestrator.h"

// Construction (via dependency injection)
auto orchestrator = DatabaseMaintenanceOrchestrator(
    scheduler,          // TaskScheduler*
    index_maintenance,  // std::shared_ptr<IndexMaintenanceManager>
    audit_logger        // std::shared_ptr<utils::AuditLogger>
);
orchestrator.start();

// Create a schedule
MaintenanceScheduleEntry schedule;
schedule.id = "nightly-index-rebuild";
schedule.name = "Nightly Index Rebuild";
schedule.cron_expression = "0 2 * * *";  // 2:00 AM daily
schedule.window_start_hour = 1;
schedule.window_end_hour = 5;
schedule.tasks = { MaintenanceTaskType::INDEX_REBUILD, MaintenanceTaskType::STATISTICS_UPDATE };
schedule.halt_on_task_failure = true;
schedule.enabled = true;
auto result = orchestrator.createSchedule(schedule);

// List recent jobs
auto jobs = orchestrator.listJobs(50);

// Get aggregated health report
MaintenanceHealthReport health = orchestrator.getHealthReport();
```

`MaintenanceRegistry` provides pre-built schedule bundles for common maintenance patterns:
```cpp
#include "maintenance/maintenance_registry.h"

// Get default schedule bundles
auto daily_schedules = MaintenanceRegistry::getDailySchedules();
auto weekly_schedules = MaintenanceRegistry::getWeeklySchedules();
auto monthly_schedules = MaintenanceRegistry::getMonthlySchedules();
```

In Scope:
- Schedule CRUD (create, read, update, patch, delete, enable, disable)
- Cron-based execution via `TaskScheduler`
- Maintenance window enforcement (UTC hour range)
- Sequential task execution with halt-on-failure
- Per-module health probe registry and aggregation
- Job lifecycle management (PENDING → RUNNING → SUCCEEDED/FAILED/CANCELLED/SKIPPED)
- 24-hour job retention with automatic pruning
- Audit logging and Prometheus-compatible metrics
Out of Scope:
- Schedule persistence (planned v1.1.0 — currently in-memory only)
- Explicit DAG task dependencies (planned v1.2.0 — currently total order)
- Distributed maintenance coordination (planned v2.0.0)
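The window enforcement and halt-on-failure sequencing listed in scope can be sketched as follows. This is a minimal illustration, not the orchestrator's implementation: `inMaintenanceWindow`, `Step`, `StepState`, and `runSequentially` are hypothetical names, and the sketch assumes that the end hour is exclusive, that a window whose end is not after its start wraps past midnight, and that tasks following a halting failure are marked as skipped.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: true if `hour_utc` lies in [start_hour, end_hour).
// Assumption: a window with end_hour <= start_hour wraps past midnight
// (e.g. 22 -> 4), so start_hour == end_hour covers the whole day.
inline bool inMaintenanceWindow(int hour_utc, int start_hour, int end_hour) {
    assert(hour_utc >= 0 && hour_utc < 24);
    if (start_hour < end_hour) {
        return hour_utc >= start_hour && hour_utc < end_hour;
    }
    return hour_utc >= start_hour || hour_utc < end_hour;
}

// Hypothetical per-task outcome used only by this sketch.
enum class StepState { SUCCEEDED, FAILED, SKIPPED };

// Hypothetical task wrapper: `run` returns true on success.
struct Step {
    std::string name;
    std::function<bool()> run;
};

// Runs the steps in their given (total) order. With halt_on_task_failure
// set, the first failure stops execution and the remaining steps are
// reported as SKIPPED; otherwise every step is attempted.
inline std::vector<StepState> runSequentially(const std::vector<Step>& steps,
                                              bool halt_on_task_failure) {
    std::vector<StepState> states;
    states.reserve(steps.size());
    bool halted = false;
    for (const auto& step : steps) {
        if (halted) {
            states.push_back(StepState::SKIPPED);
            continue;
        }
        const bool ok = step.run();
        states.push_back(ok ? StepState::SUCCEEDED : StepState::FAILED);
        if (!ok && halt_on_task_failure) {
            halted = true;
        }
    }
    return states;
}
```

For the `nightly-index-rebuild` schedule shown earlier (window 1 to 5, cron at 02:00), `inMaintenanceWindow(2, 1, 5)` holds, so the job would run under these assumptions; a trigger at 05:00 would fall outside the window.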
Supported `MaintenanceTaskType` values:

```
INDEX_REBUILD        INDEX_OPTIMIZE          INDEX_CONSISTENCY_CHECK
STORAGE_COMPACTION   WAL_ARCHIVING           BACKUP_VERIFICATION
METRICS_COLLECTION   LOG_ROTATION            CACHE_WARM
DEAD_LETTER_DRAIN    REPLICA_VALIDATION      MVCC_CLEANUP
SCHEMA_VALIDATION    RETENTION_ENFORCEMENT   STATISTICS_UPDATE
SECURITY_SCAN        AUDIT_LOG_FLUSH         BLOOM_FILTER_REBUILD
CUSTOM
```
11 endpoints under `/api/v1/maintenance/`:
- `POST /schedules`: create schedule
- `GET /schedules`: list all
- `GET /schedules/{id}`: get by ID
- `PUT /schedules/{id}`: replace
- `PATCH /schedules/{id}`: partial update
- `DELETE /schedules/{id}`: delete
- `POST /schedules/{id}/enable`: enable
- `POST /schedules/{id}/disable`: disable
- `GET /jobs`: list recent jobs (last 24 hours)
- `GET /jobs/{id}`: get job details
- `GET /health`: aggregated health report

RBAC: `maintenance:read` · `maintenance:write` · `maintenance:admin`
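`GET /jobs` only returns jobs from the last 24 hours, which matches the 24-hour retention with automatic pruning listed in scope. One way such pruning could look is sketched below. `JobRecord` and `pruneExpiredJobs` are hypothetical stand-ins, not the module's actual job struct or API, and the choice to measure age from a `finished_at` timestamp is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

using Clock = std::chrono::system_clock;

// Hypothetical stand-in for the module's job struct.
struct JobRecord {
    std::string id;
    Clock::time_point finished_at;  // assumed basis for retention
};

// Removes every job that finished before `now - retention` and returns
// how many records were dropped. Relative order of the survivors is kept.
inline std::size_t pruneExpiredJobs(
        std::vector<JobRecord>& jobs,
        Clock::time_point now,
        std::chrono::hours retention = std::chrono::hours(24)) {
    assert(retention.count() >= 0);
    const auto cutoff = now - retention;
    const auto first_expired = std::remove_if(
        jobs.begin(), jobs.end(),
        [cutoff](const JobRecord& job) { return job.finished_at < cutoff; });
    const auto removed =
        static_cast<std::size_t>(std::distance(first_expired, jobs.end()));
    jobs.erase(first_expired, jobs.end());
    return removed;
}
```

Passing `now` explicitly rather than reading the clock inside the function keeps the pruning deterministic in unit tests.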
Modules can register health probes to contribute to the aggregated health report:
```cpp
orchestrator.registerHealthProbe("my_module", []() -> ModuleHealthSignal {
    ModuleHealthSignal signal;
    signal.module_name = "my_module";
    signal.status = ModuleHealthStatus::HEALTHY;
    signal.message = "All systems nominal";
    return signal;
});
```

40+ unit tests in `tests/test_maintenance_orchestrator.cpp` covering:
- Schedule CRUD and validation
- JSON round-trips (`toJson()` / `fromJson()` / `applyPatch()`)
- Maintenance window enforcement and SKIPPED state
- Job lifecycle (SUCCEEDED, FAILED, CANCELLED)
- `halt_on_task_failure` cascading behaviour
- Health probe registration and aggregation
- Metrics collection
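The health probe registration and aggregation covered by these tests could work roughly as in the sketch below. It is an illustration under stated assumptions rather than the module's code: the types here are simplified stand-ins that reuse the names `ModuleHealthSignal`, `ModuleHealthStatus`, and `registerHealthProbe` from the example above, while `HealthProbeRegistry`, `aggregate`, the `DEGRADED` and `UNHEALTHY` values, and the "worst status wins" rule are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Only HEALTHY appears in the source; DEGRADED and UNHEALTHY are assumed
// here, ordered so that a larger value means a worse state.
enum class ModuleHealthStatus { HEALTHY = 0, DEGRADED = 1, UNHEALTHY = 2 };

// Simplified stand-in for the module's signal type.
struct ModuleHealthSignal {
    std::string module_name;
    ModuleHealthStatus status = ModuleHealthStatus::HEALTHY;
    std::string message;
};

// Hypothetical registry: one probe per module name, later registrations
// replace earlier ones.
class HealthProbeRegistry {
public:
    using Probe = std::function<ModuleHealthSignal()>;

    void registerHealthProbe(const std::string& module_name, Probe probe) {
        assert(probe);  // an empty std::function cannot be invoked
        probes_[module_name] = std::move(probe);
    }

    // Invokes every probe, optionally collects the individual signals,
    // and returns the worst status seen (HEALTHY when no probe exists).
    ModuleHealthStatus aggregate(
            std::vector<ModuleHealthSignal>* signals = nullptr) const {
        ModuleHealthStatus overall = ModuleHealthStatus::HEALTHY;
        for (const auto& entry : probes_) {
            ModuleHealthSignal signal = entry.second();
            overall = std::max(overall, signal.status);
            if (signals != nullptr) {
                signals->push_back(std::move(signal));
            }
        }
        return overall;
    }

private:
    std::map<std::string, Probe> probes_;
};
```

Taking the maximum over an ordered enum keeps the aggregation rule to a single line; a real report would likely also carry per-module messages and timestamps.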
The following peer-reviewed sources form the scientific foundation of the Maintenance module.
- Chaudhuri, S., & Weikum, G. (2000). Rethinking Database System Architecture: Towards a Self-Tuning RISC-Style Database System. Proceedings of the 26th International Conference on Very Large Data Bases (VLDB), 1–10. URL: https://dl.acm.org/doi/10.5555/645926.671577

  Introduces the concept of self-tuning database components that monitor and adapt internal parameters at runtime. Directly motivates the `MaintenanceOrchestrator` adaptive scheduling model and the health-probe feedback loop in `health_probe.cpp`.

- Agrawal, S., Chaudhuri, S., Kollar, L., Marathe, A., Narasayya, V., & Syamala, M. (2004). Database Tuning Advisor for Microsoft SQL Server 2005. Proceedings of the 30th International Conference on Very Large Data Bases (VLDB), 1110–1121. URL: https://dl.acm.org/doi/10.5555/1316689.1316803

  Describes automated index/statistics recommendation. Informs the `REINDEX_HNSW` and `REBUILD_SECONDARY_INDEXES` task types and the `halt_on_task_failure` cascading strategy.

- Liu, C. L., & Layland, J. W. (1973). Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment. Journal of the ACM, 20(1), 46–61. DOI: 10.1145/321738.321743

  Rate-Monotonic Scheduling (RMS) theory for periodic task sets. Informs the maintenance-window priority model (`CRITICAL > HIGH > MEDIUM > LOW`) and the `max_concurrent_tasks` admission-control bound in `TaskScheduler`.

- Silberschatz, A., Galvin, P. B., & Gagne, G. (2018). Operating System Concepts (10th ed.). Wiley. ISBN: 978-1-119-32091-3.

  Chapter 5 (CPU Scheduling) motivates the multi-level feedback queue used for maintenance job priorities and the preemptive scheduling of `CRITICAL` tasks.