diff --git a/.assistant/status.md b/.assistant/status.md index a42f88a..c225e1a 100644 --- a/.assistant/status.md +++ b/.assistant/status.md @@ -36,6 +36,8 @@ --- ## Recent Progress +- **P-033 UI:** Added Backfill tab to run one-off backfill via `/api/backfill/run` (ephemeral container, run-once). +- **P-033 kickoff:** Added one-off backfill flow backend — new `/api/backfill/run` endpoint launches ephemeral backfill container (no main config mutation); loader supports `BACKFILL_RUN_ONCE` and idles after run to avoid restart loops. - **Startup Control:** Added `AUTO_START`/`START_SIGNAL_FILE` gating so webui deployments keep the generator idle until Start is triggered; Control UI start/restart now writes the start signal; loader waits for the flag; tests added. - **P-032 Bug Fix:** Added `format_cdt()` to send Matomo `cdt` in UTC; tests confirm CET→UTC conversion. - **P-032 Complete:** End-to-end backfill (validation, UI, loader caps/seed/RPS, docs, pytest coverage); merged `develop`→`main` via PR #11. diff --git a/.assistant/task_log.md b/.assistant/task_log.md index e7f2f4f..2bff4e5 100644 --- a/.assistant/task_log.md +++ b/.assistant/task_log.md @@ -272,3 +272,38 @@ - args: none - result: Tests passed (20 passed). - artifacts: none + +- tool: apply_patch (control-ui/app.py; control-ui/container_manager.py; control-ui/models.py; matomo-load-baked/loader.py) +- args: Added one-off backfill support: `/api/backfill/run` launches an ephemeral backfill container using current env, validates input, and sets `BACKFILL_RUN_ONCE`; loader now idles after a one-shot run to avoid restart loops. +- result: Backfill can be triggered as a separate flow without mutating the main container config; loader respects `BACKFILL_RUN_ONCE`. +- artifacts: none + +- tool: shell (python3 -m pytest matomo-load-baked/tests/test_backfill.py) +- args: none +- result: Tests passed (5 passed). 
+- artifacts: none + +- tool: apply_patch (control-ui/static/index.html; control-ui/static/js/api.js; control-ui/static/js/app.js; control-ui/static/js/backfill.js) +- args: Added Backfill tab with one-off backfill form, wired to new `/api/backfill/run` endpoint via API helper and App controller; displays run results (container/id/message). +- result: Users can trigger one-off backfill runs from the UI without altering primary config. +- artifacts: control-ui/static/js/backfill.js + +- tool: apply_patch (control-ui/app.py; control-ui/container_manager.py; control-ui/models.py; control-ui/static/js/api.js; control-ui/static/js/backfill.js; control-ui/static/index.html) +- args: Added backfill status/cleanup endpoints and UI: list labeled backfill runs, cleanup exited jobs, render runs table, and enforce frontend date window validation (<=180d, no future end, mode exclusivity). +- result: Backfill tab now shows run history and supports cleanup; backend lists/cleans backfill containers; client blocks invalid windows before calling the API. +- artifacts: control-ui/static/js/backfill.js + +- tool: apply_patch (control-ui/app.py; control-ui/static/js/api.js; control-ui/static/js/backfill.js; control-ui/static/index.html; control-ui/models.py) +- args: Persist last backfill payload/result to disk, expose `/api/backfill/last`, and surface last-run info in the Backfill tab alongside status/history. +- result: UI now loads the most recent backfill record on tab activation; backend saves/serves last run metadata. +- artifacts: control-ui/static/js/backfill.js + +- tool: apply_patch (control-ui/app.py; control-ui/container_manager.py; control-ui/models.py; control-ui/static/js/api.js; control-ui/static/js/backfill.js; control-ui/static/index.html) +- args: Added backfill cancel endpoint and UI action to stop running backfill containers; runs table now shows cancel buttons for running jobs. 
+- result: Users can stop a running backfill from the UI; backend stops labeled backfill containers safely. +- artifacts: control-ui/static/js/backfill.js + +- tool: apply_patch (control-ui/static/index.html; control-ui/static/js/backfill.js) +- args: Added “Load last payload” button and auto-fill support using the last saved backfill payload from `/api/backfill/last`. +- result: Users can quickly rerun or tweak the previous backfill configuration without retyping. +- artifacts: control-ui/static/js/backfill.js diff --git a/WEB_UI_GUIDE.md b/WEB_UI_GUIDE.md index 9038415..2f1140d 100644 --- a/WEB_UI_GUIDE.md +++ b/WEB_UI_GUIDE.md @@ -179,6 +179,24 @@ The Web UI consists of 5 main tabs: ``` Guardrails: window must end on/before today; start <= end; max 180 days; caps must be consistent (total ≥ per-day); warnings on very high per-day caps and RPS. +### Backfill Tab (One-off runs) + +**Purpose:** Run a historical replay as an ephemeral job without changing the main config. + +**Features:** +- Separate form from the main Config tab; calls `/api/backfill/run` to spawn a one-off container. +- Mode guardrails: absolute (start/end) or relative (days back + duration), not both; frontend enforces no future end and ≤180-day windows. +- Run controls: caps per day/total, RPS limit, seed, run name, “run once and idle” toggle. +- History & control: table of backfill runs with cancel for running jobs and cleanup for exited jobs. +- Last payload/result: loads the most recent backfill payload; “Load last payload” button pre-fills the form. + +**Usage:** +1) Open Backfill tab. Pick absolute or relative window; fill caps/throttle/seed as needed. + - Matomo credentials: the backfill uses the current loadgen environment (MATOMO_URL/SITE_ID/MATOMO_TOKEN_AUTH) from the main container. Set the token in the Config tab (Matomo Token Auth) before running backfill; the Backfill tab does not ask for it separately. +2) Click **Run Backfill** (validation runs client-side; server validates too). 
+3) Monitor the runs table; use **Cancel** to stop a running job; use **Cleanup exited** to remove finished jobs. +4) Use **Load last payload** to quickly rerun or tweak the previous backfill. + **Field Reference:** | Field | Description | Default | Range | diff --git a/control-ui/app.py b/control-ui/app.py index 4142cd5..696d2bf 100644 --- a/control-ui/app.py +++ b/control-ui/app.py @@ -30,6 +30,12 @@ RestartResponse, LogsResponse, ApplyConfigResponse, + BackfillRunRequest, + BackfillRunResponse, + BackfillStatusResponse, + BackfillCleanupResponse, + BackfillLastResponse, + BackfillCancelResponse, URLContentRequest, PresetListResponse, PresetDetail, @@ -61,6 +67,7 @@ container_manager = ContainerManager(docker_client) config_database = Database(os.getenv("CONFIG_DB_PATH")) FUNNEL_CONFIG_PATH = Path(os.getenv("FUNNEL_CONFIG_PATH", "/app/data/funnels.json")) +BACKFILL_HISTORY_PATH = Path(os.getenv("BACKFILL_HISTORY_PATH", "/app/data/backfill_history.json")) @asynccontextmanager @@ -224,6 +231,153 @@ async def get_status(request: Request, authenticated: bool = Depends(verify_api_ ) +@app.post("/api/backfill/run", response_model=BackfillRunResponse) +@limiter.limit("10/minute") +async def run_backfill( + request: Request, + backfill_request: BackfillRunRequest, + authenticated: bool = Depends(verify_api_key), +): + """ + Launch a one-off backfill run in an ephemeral container without mutating the main loadgen config. 
+ """ + if not docker_client.is_connected(): + raise HTTPException(status_code=503, detail="Docker daemon not connected") + + current_env = container_manager.get_current_env() + if current_env is None: + return BackfillRunResponse(success=False, message="Primary container not found", error="container_not_found") + + env = current_env.copy() + env.update({ + "BACKFILL_ENABLED": "true", + "BACKFILL_RUN_ONCE": "true" if backfill_request.BACKFILL_RUN_ONCE else "false", + "AUTO_START": "true", + }) + + for field in [ + "BACKFILL_START_DATE", + "BACKFILL_END_DATE", + "BACKFILL_DAYS_BACK", + "BACKFILL_DURATION_DAYS", + "BACKFILL_MAX_VISITS_PER_DAY", + "BACKFILL_MAX_VISITS_TOTAL", + "BACKFILL_RPS_LIMIT", + "BACKFILL_SEED", + ]: + value = getattr(backfill_request, field) + if value is not None: + env[field] = str(value) + + validator = ConfigValidator() + try: + validator.validate_config(env) + except Exception as e: + return BackfillRunResponse(success=False, message="Validation failed", error=str(e)) + + result = container_manager.spawn_backfill_job(env_vars=env, name=backfill_request.name) + if result.get("success"): + # Persist last run payload/result for UI reference + try: + BACKFILL_HISTORY_PATH.parent.mkdir(parents=True, exist_ok=True) + history = { + "payload": env, + "result": result, + "timestamp": datetime.utcnow().isoformat() + "Z", + } + BACKFILL_HISTORY_PATH.write_text(json.dumps(history, indent=2), encoding="utf-8") + except Exception as e: + logger.warning(f"Failed to write backfill history: {e}") + return BackfillRunResponse( + success=True, + message="Backfill job started", + container_name=result.get("container_name"), + container_id=result.get("container_id"), + ) + return BackfillRunResponse( + success=False, + message="Failed to start backfill job", + error=result.get("error"), + ) + + +@app.get("/api/backfill/status", response_model=BackfillStatusResponse) +@limiter.limit("30/minute") +async def backfill_status(request: Request, authenticated: bool = 
Depends(verify_api_key)): + """Return list of backfill runs (ephemeral containers).""" + if not docker_client.is_connected(): + raise HTTPException(status_code=503, detail="Docker daemon not connected") + + runs = container_manager.list_backfill_runs() + formatted = [] + for r in runs: + formatted.append({ + "container_name": r.get("name"), + "container_id": r.get("id"), + "state": r.get("status"), + "started_at": r.get("started_at"), + "finished_at": r.get("finished_at"), + "exit_code": r.get("exit_code"), + "error": None, + }) + + return BackfillStatusResponse(success=True, message="ok", runs=formatted) + + +@app.post("/api/backfill/cleanup", response_model=BackfillCleanupResponse) +@limiter.limit("10/minute") +async def backfill_cleanup(request: Request, authenticated: bool = Depends(verify_api_key)): + """Remove exited backfill containers.""" + if not docker_client.is_connected(): + raise HTTPException(status_code=503, detail="Docker daemon not connected") + + result = container_manager.cleanup_backfill_runs() + success = len(result.get("errors", [])) == 0 + message = "Cleanup complete" if success else "Cleanup completed with errors" + return BackfillCleanupResponse( + success=success, + message=message, + removed=result.get("removed", []), + errors=result.get("errors", []), + ) + + +@app.get("/api/backfill/last", response_model=BackfillLastResponse) +@limiter.limit("30/minute") +async def backfill_last(request: Request, authenticated: bool = Depends(verify_api_key)): + """Return last backfill payload/result if available.""" + if not BACKFILL_HISTORY_PATH.exists(): + return BackfillLastResponse(success=True, message="No backfill history", payload=None, result=None, timestamp=None) + try: + data = json.loads(BACKFILL_HISTORY_PATH.read_text(encoding="utf-8")) + return BackfillLastResponse( + success=True, + message="ok", + payload=data.get("payload"), + result=data.get("result"), + timestamp=data.get("timestamp"), + ) + except Exception as e: + return 
BackfillLastResponse(success=False, message="Failed to read backfill history", payload=None, result=None, timestamp=None) + + +@app.post("/api/backfill/cancel", response_model=BackfillCancelResponse) +@limiter.limit("10/minute") +async def backfill_cancel( + request: Request, + container_name: str = Body(..., embed=True), + authenticated: bool = Depends(verify_api_key), +): + """Stop a running backfill container.""" + if not docker_client.is_connected(): + raise HTTPException(status_code=503, detail="Docker daemon not connected") + + result = container_manager.cancel_backfill(container_name) + if result.get("success"): + return BackfillCancelResponse(success=True, message="Backfill container stopped") + return BackfillCancelResponse(success=False, message="Failed to stop backfill container", error=result.get("error")) + + @app.post("/api/start", response_model=StartResponse) @limiter.limit("10/minute") async def start_container( diff --git a/control-ui/container_manager.py b/control-ui/container_manager.py index afdcbe2..328fc32 100644 --- a/control-ui/container_manager.py +++ b/control-ui/container_manager.py @@ -4,6 +4,7 @@ Provides high-level operations for managing the load generator container. 
""" import os +import time from typing import Dict, Any, Optional from datetime import datetime, timezone from docker_client import DockerClient @@ -15,6 +16,8 @@ class ContainerManager: def __init__(self, docker_client: DockerClient): self.docker = docker_client self.start_signal_file = os.environ.get("START_SIGNAL_FILE", "/app/data/loadgen.start") + self.backfill_container_prefix = os.environ.get("BACKFILL_CONTAINER_PREFIX", "matomo-loadgen-backfill") + self.backfill_label_key = "backfill-job" def parse_env_list(self, env_list: list) -> Dict[str, str]: """ @@ -52,6 +55,14 @@ def mask_sensitive_values(self, env_dict: Dict[str, str]) -> Dict[str, str]: masked[key] = '***MASKED***' return masked + + def get_current_env(self) -> Optional[Dict[str, str]]: + """Return current container env as a dict.""" + info = self.docker.get_container_info() + if not info: + return None + env_list = info.get("config", {}).get("env", []) + return self.parse_env_list(env_list) def calculate_uptime(self, started_at: Optional[str]) -> Optional[str]: """ @@ -103,6 +114,110 @@ def send_start_signal(self) -> bool: except Exception as e: print(f"Error writing start signal: {e}") return False + + def spawn_backfill_job(self, env_vars: Dict[str, str], name: Optional[str] = None) -> Dict[str, Any]: + """ + Launch a one-off backfill container using the current container as a template. + Does not mutate the primary matomo-loadgen container. 
+ """ + try: + container = self.docker.get_container() + if not container: + return {"success": False, "error": "Primary container not found", "container_name": None, "container_id": None} + + # Extract template info + attrs = container.attrs + config = attrs.get("Config", {}) + host_config = attrs.get("HostConfig", {}) + image = config.get("Image") + volumes = host_config.get("Binds", []) + network_mode = host_config.get("NetworkMode", "bridge") + + # Prepare env (disable restart loops and force backfill run) + env = self.parse_env_list(config.get("Env", [])) + env.update(env_vars) + env.setdefault("BACKFILL_ENABLED", "true") + env.setdefault("BACKFILL_RUN_ONCE", "true") + env.setdefault("AUTO_START", "true") + env.setdefault("LOG_LEVEL", "INFO") + + env_list = [f"{k}={v}" for k, v in env.items()] + + job_name = name or f"{self.backfill_container_prefix}-{int(time.time())}" + new_container = self.docker.client.containers.run( + image=image, + name=job_name, + environment=env_list, + volumes=volumes, + network_mode=network_mode, + restart_policy={"Name": "no"}, + labels={self.backfill_label_key: "true"}, + detach=True, + ) + + return { + "success": True, + "error": None, + "container_name": new_container.name, + "container_id": new_container.short_id, + } + except Exception as e: + return {"success": False, "error": str(e), "container_name": None, "container_id": None} + + def list_backfill_runs(self) -> list: + """List backfill containers (running or exited) that carry the backfill label.""" + runs = [] + try: + containers = self.docker.client.containers.list(all=True, filters={"label": f"{self.backfill_label_key}=true"}) + for c in containers: + c.reload() + state = c.attrs.get("State", {}) + runs.append({ + "name": c.name, + "id": c.short_id, + "status": c.status, + "started_at": state.get("StartedAt"), + "finished_at": state.get("FinishedAt"), + "exit_code": state.get("ExitCode"), + }) + except Exception as e: + print(f"Error listing backfill runs: {e}") + return runs + + def 
cleanup_backfill_runs(self) -> Dict[str, Any]: + """Remove stopped backfill containers.""" + removed = [] + errors = [] + try: + containers = self.docker.client.containers.list(all=True, filters={"label": f"{self.backfill_label_key}=true"}) + for c in containers: + c.reload() + if c.status not in ("exited", "created", "dead"): + continue + try: + c.remove(force=True) + removed.append(c.name) + except Exception as e: + errors.append(f"{c.name}: {e}") + except Exception as e: + errors.append(str(e)) + return {"removed": removed, "errors": errors} + + def cancel_backfill(self, container_name: str) -> Dict[str, Any]: + """Stop and remove a backfill container by name.""" + try: + container = self.docker.client.containers.get(container_name) + container.reload() + if container.labels.get(self.backfill_label_key) != "true": + return {"success": False, "error": "Not a backfill container"} + if container.status not in ("running", "paused", "created"): + return {"success": False, "error": f"Container is {container.status}, not running"} + container.stop(timeout=10) + # Remove after stop to prevent any restart attempts by external agents + container.remove(force=True) + return {"success": True, "error": None} + except Exception as e: + return {"success": False, "error": str(e)} def get_status(self) -> Dict[str, Any]: """ @@ -343,6 +458,10 @@ def update_and_restart(self, env_vars: Dict[str, str]) -> Dict[str, Any]: # Merge new env vars with existing ones (prioritize new) existing_env = {e.split('=', 1)[0]: e.split('=', 1)[1] for e in config.get('Env', []) if '=' in e} existing_env.update(env_vars) + # Ensure start signal path follows container env if present + if "START_SIGNAL_FILE" in existing_env: + self.start_signal_file = existing_env["START_SIGNAL_FILE"] + new_env = [f"{k}={v}" for k, v in existing_env.items()] # Store container settings @@ -351,6 +470,8 @@ def update_and_restart(self, env_vars: Dict[str, str]) -> Dict[str, Any]: volumes = host_config.get('Binds', []) 
network_mode = host_config.get('NetworkMode', 'bridge') restart_policy = host_config.get('RestartPolicy', {}) + log_config = host_config.get('LogConfig', {}) + labels = config.get('Labels', {}) # Stop and remove the old container was_running = current_state == "running" @@ -368,6 +489,8 @@ def update_and_restart(self, env_vars: Dict[str, str]) -> Dict[str, Any]: volumes=volumes, network_mode=network_mode, restart_policy=restart_policy, + log_config=log_config, + labels=labels, detach=True ) diff --git a/control-ui/models.py b/control-ui/models.py index 8209a12..eda0cb4 100644 --- a/control-ui/models.py +++ b/control-ui/models.py @@ -119,6 +119,66 @@ class ApplyConfigResponse(BaseModel): error: Optional[str] = None +class BackfillRunRequest(BaseModel): + """Request body for one-off backfill runs""" + BACKFILL_START_DATE: Optional[str] = None + BACKFILL_END_DATE: Optional[str] = None + BACKFILL_DAYS_BACK: Optional[str] = None + BACKFILL_DURATION_DAYS: Optional[str] = None + BACKFILL_MAX_VISITS_PER_DAY: Optional[int] = None + BACKFILL_MAX_VISITS_TOTAL: Optional[int] = None + BACKFILL_RPS_LIMIT: Optional[float] = None + BACKFILL_SEED: Optional[int] = None + BACKFILL_RUN_ONCE: bool = Field(default=True, description="Idle after run to avoid restart loops") + name: Optional[str] = Field(default=None, description="Optional name for the ephemeral backfill container") + + +class BackfillRunResponse(BaseModel): + """Response for POST /api/backfill/run""" + success: bool + message: str + container_name: Optional[str] = None + container_id: Optional[str] = None + error: Optional[str] = None + + +class BackfillStatusItem(BaseModel): + container_name: str + container_id: str + state: str + started_at: Optional[str] = None + finished_at: Optional[str] = None + exit_code: Optional[int] = None + error: Optional[str] = None + + +class BackfillStatusResponse(BaseModel): + success: bool + message: str + runs: list[BackfillStatusItem] + + +class BackfillCleanupResponse(BaseModel): + 
success: bool + message: str + removed: list[str] = [] + errors: list[str] = [] + + +class BackfillLastResponse(BaseModel): + success: bool + message: str + payload: Optional[Dict[str, Any]] = None + result: Optional[Dict[str, Any]] = None + timestamp: Optional[str] = None + + +class BackfillCancelResponse(BaseModel): + success: bool + message: str + error: Optional[str] = None + + class URLContentRequest(BaseModel): """Request for URL validation/upload""" content: str = Field(..., description="URL file content (one URL per line)") diff --git a/control-ui/static/index.html b/control-ui/static/index.html index 0dc457e..0c9271b 100644 --- a/control-ui/static/index.html +++ b/control-ui/static/index.html @@ -61,6 +61,9 @@
Replay historical visits over a bounded window. Provide either absolute dates or a relative window.
- -When enabled, visits are replayed across the selected date window instead of realtime.
-Per-day cap (default 2,000; max 10,000)
-Global cap; 0 to disable
-Start date (YYYY-MM-DD)
-End date (cannot be in future)
-Days back (1=yesterday)
-Duration in days
-Use either absolute dates OR relative window, not both.
-Throttle requests/sec during backfill (optional)
-Stable runs (per-day offset applied)
-
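Reviewer note: the guide and task log in this change describe the window guardrails (absolute or relative mode but not both, end date not in the future, window of at most 180 days). The following is a minimal Python sketch of those rules, not the project's actual validator. The function name `validate_backfill_window`, the constant `MAX_WINDOW_DAYS`, and the way a relative window is resolved (start = today minus `days_back`, end = start plus `duration_days` minus one) are assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import Optional

MAX_WINDOW_DAYS = 180  # limit quoted in the guide; the constant name is hypothetical


def validate_backfill_window(
    start: Optional[date] = None,
    end: Optional[date] = None,
    days_back: Optional[int] = None,
    duration_days: Optional[int] = None,
    today: Optional[date] = None,
) -> list[str]:
    """Return a list of problems; an empty list means the window is acceptable."""
    today = today or date.today()
    absolute = start is not None or end is not None
    relative = days_back is not None or duration_days is not None

    # Mode exclusivity: exactly one of the two modes must be used.
    if absolute and relative:
        return ["Use either absolute dates or a relative window, not both."]
    if not absolute and not relative:
        return ["Provide either absolute dates or a relative window."]

    if absolute:
        if start is None or end is None:
            return ["Absolute mode needs both a start and an end date."]
    else:
        if days_back is None or duration_days is None:
            return ["Relative mode needs both days back and a duration."]
        if days_back < 1 or duration_days < 1:
            return ["Days back and duration must be at least 1."]
        # Assumed interpretation: days_back=1 means the window starts yesterday.
        start = today - timedelta(days=days_back)
        end = start + timedelta(days=duration_days - 1)

    errors = []
    if start > end:
        errors.append("Start date must be on or before the end date.")
    if end > today:
        errors.append("End date cannot be in the future.")
    if (end - start).days + 1 > MAX_WINDOW_DAYS:
        errors.append(f"Window cannot exceed {MAX_WINDOW_DAYS} days.")
    return errors
```

Returning a list of messages instead of raising on the first failure lets a caller show every problem with the window at once, which matches the "client blocks invalid windows before calling the API" behaviour described in the task log.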