WineBot Internal API

WineBot includes an internal HTTP API to facilitate programmatic control from within the container or (if ports are mapped) from the host.

Base URL: http://localhost:8000

Security

Authentication

To secure the API, set the API_TOKEN environment variable. All requests must then include the token in the header:

  • Header: X-API-Key: <your-token>

If API_TOKEN is not set, the API is open (not recommended for shared environments).
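As an illustration of the header rule above, here is a minimal Python sketch that builds a request and attaches X-API-Key only when a token is supplied. The helper name build_request is hypothetical and not part of WineBot; the base URL and header name are taken from this document.

```python
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8000"  # base URL stated in this document


def build_request(path: str, token: Optional[str] = None,
                  method: str = "GET") -> urllib.request.Request:
    """Build a request for `path` (hypothetical helper, not a WineBot API).

    The X-API-Key header is added only when a token is given, which mirrors
    the "open API when API_TOKEN is unset" behavior described above.
    """
    headers = {}
    if token:
        headers["X-API-Key"] = token
    return urllib.request.Request(BASE_URL + path, headers=headers, method=method)


req = build_request("/health", token="mysecret")
print(req.get_method(), req.full_url)
```

To actually send the request, pass the object to urllib.request.urlopen(req) while the API server is running.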

Path Safety

Endpoints that accept file paths (/apps/run) only accept paths under specific directories, which prevents path traversal attacks. Absolute paths must start with one of the allowed prefixes listed below. Bare filenames (no path separators) are also allowed and are assumed to be in the Wine search path.

Allowed prefixes:

  • /apps
  • /wineprefix
  • /tmp
  • /artifacts
  • /opt/winebot
  • /usr/bin
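The two documented rules can be sketched in Python as follows. This is not WineBot's actual validation code: the function names are hypothetical, the use of posixpath.normpath to collapse ".." segments is an assumption, and the sketch says nothing about other path forms (for example Windows-style paths) that the text above does not cover.

```python
import posixpath

# Allowed prefixes copied from the list above.
ALLOWED_PREFIXES = ("/apps", "/wineprefix", "/tmp", "/artifacts",
                    "/opt/winebot", "/usr/bin")


def is_bare_filename(path: str) -> bool:
    """True for a name with no path separators (hypothetical helper)."""
    if path in ("", ".", ".."):
        return False
    return "/" not in path and "\\" not in path


def is_allowed_absolute(path: str) -> bool:
    """True if an absolute POSIX path stays under an allowed prefix.

    Assumption: ".." segments are collapsed before the prefix check, so
    "/apps/../etc/passwd" is rejected. Symlinks are not resolved here.
    """
    if not path.startswith("/"):
        return False
    normalized = posixpath.normpath(path)
    return any(normalized == prefix or normalized.startswith(prefix + "/")
               for prefix in ALLOWED_PREFIXES)


print(is_allowed_absolute("/apps/demo/app.exe"))
print(is_allowed_absolute("/apps/../etc/passwd"))
print(is_bare_filename("notepad.exe"))
```

Comparing against prefix + "/" rather than the bare prefix keeps a sibling directory such as /appsevil from passing as /apps.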

Unified CLI

Use scripts/winebotctl for a single CLI entrypoint to the API.

Examples:

  • scripts/winebotctl health
  • scripts/winebotctl sessions list
  • scripts/winebotctl recording start --session-root /artifacts/sessions
  • scripts/winebotctl api POST /sessions/suspend --json '{"shutdown_wine":true}'

Idempotent mode is supported (see --idempotent / --no-idempotent), so repeated invocations can safely reuse the same response when desired.

Versioning and Compatibility

WineBot publishes explicit API and artifact/event schema versions.

  • HTTP responses include:
    • X-WineBot-API-Version
    • X-WineBot-Build-Version
    • X-WineBot-Artifact-Schema-Version
    • X-WineBot-Event-Schema-Version
  • GET /version returns the same values as JSON fields.
  • session.json, segment_*.json, and JSONL event streams include schema_version.
  • Readers default missing schema_version to 1.0 for backward compatibility with older artifacts.
  • Invalid configuration values fail closed at startup with explicit validation errors.
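A reader that follows the backward-compatibility rule above might look like the sketch below. It assumes schema_version is a top-level field of each JSON record and normalizes it to a string; the helper names and the "event" field in the sample data are hypothetical.

```python
import json

DEFAULT_SCHEMA_VERSION = "1.0"  # default for older artifacts, per the list above


def schema_version_of(record: dict) -> str:
    """Return the record's schema_version, defaulting to 1.0 when missing."""
    return str(record.get("schema_version", DEFAULT_SCHEMA_VERSION))


def read_jsonl_versions(text: str) -> list:
    """Collect the schema version of every non-empty line in a JSONL stream."""
    versions = []
    for line in text.splitlines():
        if line.strip():
            versions.append(schema_version_of(json.loads(line)))
    return versions


# Hypothetical sample: the first event predates schema versioning.
sample = '{"event": "a"}\n{"event": "b", "schema_version": "1.1"}\n'
print(read_jsonl_versions(sample))
```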

Endpoints

Quick Reference

Method  Path                                      Purpose
GET     /health                                   High-level health summary
GET     /health/system                            System stats
GET     /health/x11                               X11 status
GET     /health/windows                           X11 window list
GET     /health/wine                              Wine prefix details
GET     /health/tools                             Tool availability
GET     /health/storage                           Storage stats
GET     /health/recording                         Recorder status
GET     /health/invariants                        Runtime invariant validation report
GET     /lifecycle/status                         Lifecycle status for core components
GET     /lifecycle/events                         Recent lifecycle events
POST    /lifecycle/shutdown                       Gracefully stop the container
POST    /openbox/reconfigure                      Reload Openbox config
POST    /openbox/restart                          Restart Openbox
GET     /sessions                                 List session directories
POST    /sessions/suspend                         Suspend a session (keep container alive)
POST    /sessions/resume                          Resume a session directory
GET     /sessions/{session_id}/control            Current control state for active session
POST    /sessions/{session_id}/control/challenge  Issue one-time grant challenge token
POST    /sessions/{session_id}/control/grant      Grant agent control (requires challenge)
POST    /sessions/{session_id}/control/renew      Renew agent control lease for active session
POST    /sessions/{session_id}/control/mode       Set session control mode
GET     /control/mode                             Instance/session/effective control modes
POST    /control/mode                             Set instance control mode
GET     /ui                                       noVNC + API dashboard UI
GET     /windows                                  List visible windows
POST    /windows/focus                            Focus a window
POST    /input/mouse/click                        Click at coordinates
POST    /apps/run                                 Run a Windows app
GET     /screenshot                               Capture screenshot (metadata sidecar + header)
POST    /recording/start                          Start recording session
POST    /recording/pause                          Pause recording session
POST    /recording/resume                         Resume recording session
POST    /recording/stop                           Stop recording session
GET     /recording/perf/summary                   Summarize recording performance metrics
POST    /run/ahk                                  Run AutoHotkey script
POST    /run/autoit                               Run AutoIt script
POST    /run/python                               Run Windows Python
POST    /inspect/window                           WinSpy-style inspection

Health & State

GET /health

High-level health summary (X11, Wine prefix, tools, storage).

  • Response: {"status": "ok", "x11": "connected", "wineprefix": "ready", "tools_ok": true, "security_warning": "...", ...}
  • Security Warning: Reports if VNC is exposed on a public IP or running without a password.

GET /health/system

System stats (uptime, load average, CPU count, memory).

GET /health/x11

X11/display status and active window.

GET /health/windows

Window list and active window details.

GET /health/wine

Wine prefix status and wine --version.

GET /health/tools

Presence/paths of key tools (winedbg, gdb, ffmpeg, etc).

GET /health/storage

Disk space and writeability for /wineprefix, /artifacts, and /tmp.

GET /health/recording

Recorder status and current session info (if any).

GET /health/invariants

Runtime invariant report for lifecycle/control/config constraints.

  • Response: {"ok": true|false, "violations": [{"code":"...", "detail":"..."}]}

GET /lifecycle/status

Status for core WineBot components (Xvfb, Openbox, VNC/noVNC, recorder, etc).

  • Response: includes session_id, session_dir, user_dir, processes, and lifecycle_log.

GET /lifecycle/events

Return recent lifecycle events.

  • Parameters:
    • limit (optional): Max events to return (default: 100).
  • Response: {"events":[ ... ]}

POST /lifecycle/shutdown

Gracefully stop the recorder and UI components, shut down Wine, and terminate the container process.

  • Parameters:
    • delay (optional): Seconds to wait before terminating (default: 0.5).
    • wine_shutdown (optional): Whether to run wineboot --shutdown and wineserver -k before exiting (default: true).
    • power_off (optional): Immediately terminate the container (unsafe; skips graceful shutdown).
  • Response (graceful): {"status":"shutting_down","delay_seconds":0.5,"results":{"recorder":"ok","wine":{...},"components":{...}}}
  • Response (power_off): {"status":"powering_off","delay_seconds":0.5}
  • Idempotency: Repeated requests during an in-progress shutdown return {"status":"already_shutting_down","mode":"graceful|power_off"}.

POST /openbox/reconfigure

Reload the Openbox configuration.

  • Response: {"status":"ok","action":"reconfigure","result":{...}}

POST /openbox/restart

Restart the Openbox window manager.

  • Response: {"status":"ok","action":"restart","result":{...}}

GET /sessions

List session directories in the session root.

  • Parameters:
    • root (optional): Override session root (default: /artifacts/sessions).
    • limit (optional): Max sessions to return (default: 100).
  • Response: {"root":"...","sessions":[...]}

POST /sessions/suspend

Suspend a session without terminating the container.

  • Body (JSON):
    • session_id or session_dir (optional): Target session (default: current).
    • session_root (optional): Session root when using session_id.
    • shutdown_wine (optional): Stop Wine services (default: true).
    • stop_recording (optional): Stop active recording (default: true).
  • Transactional behavior: If pre-suspend steps fail (e.g., stop recording or Wine shutdown), suspend is aborted with 500 and session state is not mutated.
  • Response: {"status":"suspended|completed","session_dir":"...","session_id":"...","session_mode":"persistent|oneshot","wine_shutdown":{...}}

POST /sessions/resume

Resume an existing session directory.

  • Body (JSON):
    • session_id or session_dir (required).
    • session_root (optional): Session root when using session_id.
    • restart_wine (optional): Restart Wine services (default: true).
    • stop_recording (optional): Stop active recording before switching (default: true).

POST /sessions/{session_id}/control/renew

Renew an existing agent-control lease.

  • Requires session_id to be the currently active session.
  • Returns 409 if the target session is not active.

Dashboard

GET /ui

Serve the built‑in dashboard (noVNC + API controls). If API_TOKEN is set, enter it in the UI to authenticate API requests.

  • Includes lifecycle controls (graceful shutdown and power off), component status badges, and an activity log console.

GET /windows

List currently visible windows.

  • Response:
    {
      "windows": [
        {"id": "0x123456", "title": "Untitled - Notepad"},
        ...
      ]
    }

Vision

GET /screenshot

Capture a screenshot via /automation/screenshot.sh.

  • Parameters:
    • output_dir (optional): Override output directory. If omitted, the session screenshots directory is used.
  • Response: PNG image file.
    • Headers: X-Screenshot-Path (saved path inside container)
    • Default storage: If no session exists, /tmp is used.

POST /recording/start

Start a recording session.

  • Body (optional):
    {
      "session_label": "smoke",
      "session_root": "/artifacts/sessions",
      "display": ":99",
      "resolution": "1920x1080",
      "fps": 30,
      "new_session": false
    }
  • Response: {"status":"started","session_id":"...","session_dir":"...","segment":1,"output_file":"...","events_file":"..."}
  • Notes: If a session already exists and new_session is false, each start creates a new numbered segment in the same session directory.

Recording Action Contract

All recording action endpoints (/recording/start, /recording/pause, /recording/resume, /recording/stop) return a shared envelope:

{
  "action": "start|pause|resume|stop",
  "status": "legacy_status_string",
  "result": "converged|accepted",
  "converged": true,
  "recording_timeline_id": "timeline-...",
  "session_dir": "/artifacts/sessions/session-...",
  "operation_id": "uuid-when-available",
  "warning": "optional warning text"
}
  • status is preserved for backward compatibility with existing scripts/UI.
  • result and converged are the canonical state-convergence signals:
    • result=converged and converged=true: target state reached before response.
    • result=accepted and converged=false: command accepted, convergence still in progress.
  • recording_timeline_id is stable for the full recording lifecycle and is written to session.json, segment_*.json, and recording_artifacts_manifest.json.
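A client can branch on the canonical signals rather than on the legacy status string. The sketch below is one way to do that, using only the result and converged fields described above; the function names are hypothetical.

```python
def is_converged(envelope: dict) -> bool:
    """True when the target state was reached before the response."""
    return envelope.get("result") == "converged" and envelope.get("converged") is True


def needs_polling(envelope: dict) -> bool:
    """True when the command was accepted but convergence is still in progress."""
    return envelope.get("result") == "accepted" and envelope.get("converged") is False


# Shapes follow the shared envelope above; values are illustrative.
paused = {"action": "pause", "status": "already_paused",
          "result": "converged", "converged": True}
stopping = {"action": "stop", "status": "stop_requested",
            "result": "accepted", "converged": False}

print(is_converged(paused), needs_polling(paused))
print(is_converged(stopping), needs_polling(stopping))
```

Checking both fields together means an inconsistent envelope (for example result=converged with converged=false) matches neither helper, so the caller can treat it as an error.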

Recording Transition Guarantees

Endpoint                Typical status values                               Convergence guarantee
POST /recording/start   started, resumed, already_recording                 Converged on response
POST /recording/pause   paused, already_paused, idle                        Converged on response
POST /recording/resume  resumed, already_recording, idle, resume_requested  resume_requested means accepted, not yet converged
POST /recording/stop    stopped, already_stopped, stop_requested            stop_requested means accepted, not yet converged

Recording Idempotency Examples

Example: repeated pause (safe no-op):

curl -s -X POST http://localhost:8000/recording/pause -H 'X-API-Key: TOKEN'
curl -s -X POST http://localhost:8000/recording/pause -H 'X-API-Key: TOKEN'

Second call may return:

{"action":"pause","status":"already_paused","result":"converged","converged":true}

Example: stop accepted but still finalizing:

curl -s -X POST http://localhost:8000/recording/stop -H 'X-API-Key: TOKEN'

Possible response:

{
  "action":"stop",
  "status":"stop_requested",
  "result":"accepted",
  "converged":false,
  "warning":"Recorder stop requested; finalization still in progress."
}

Client behavior: treat as success-accepted, then poll GET /health/recording until state=idle.
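The polling step could be sketched as below. It assumes the GET /health/recording response is a JSON object with a top-level state field, as implied by "state=idle" above; the function names, the timeout and interval defaults, and the non-idle state names used in the example are all hypothetical.

```python
import json
import time
import urllib.request
from typing import Callable, Optional


def fetch_recording_health(base_url: str = "http://localhost:8000",
                           token: Optional[str] = None) -> dict:
    """GET /health/recording and decode the JSON body."""
    headers = {"X-API-Key": token} if token else {}
    req = urllib.request.Request(base_url + "/health/recording", headers=headers)
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)


def wait_until_idle(fetch: Callable[[], dict], timeout_s: float = 30.0,
                    interval_s: float = 0.5) -> bool:
    """Poll `fetch` until it reports state == "idle" or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if fetch().get("state") == "idle":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Offline demonstration with a fake fetcher (state names are made up).
fake_states = iter([{"state": "recording"}, {"state": "stopping"}, {"state": "idle"}])
print(wait_until_idle(lambda: next(fake_states), timeout_s=5, interval_s=0))
```

Against a live server, the call would be wait_until_idle(lambda: fetch_recording_health(token="TOKEN")). Passing the fetcher as a parameter keeps the loop testable without a running container.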

POST /recording/pause

Pause the active recording session.

  • Response: Contract envelope above. Typical status: paused or already_paused.

POST /recording/resume

Resume the active recording session.

  • Response: Contract envelope above. Typical status: resumed, already_recording, resume_requested.

POST /recording/stop

Stop the active recording session.

  • Response: Contract envelope above. Typical status: stopped, already_stopped, stop_requested.

GET /recording/perf/summary

Summarize metrics from logs/perf_metrics.jsonl for a session.

  • Parameters (optional):
    • session_id
    • session_dir
    • session_root (used with session_id)
  • Response: includes metrics keyed by metric name with percentile stats (count, min_ms, mean_ms, p50_ms, p90_ms, p95_ms, p99_ms, max_ms).
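One possible use of the summary is to flag metrics whose p95 latency exceeds a budget. The sketch assumes the response carries a top-level "metrics" object mapping metric names to the stats listed above; that field name, the metric names in the sample, and the helper itself are assumptions rather than documented behavior.

```python
def metrics_over_budget(summary: dict, p95_budget_ms: float) -> list:
    """Return the sorted names of metrics whose p95_ms exceeds the budget."""
    metrics = summary.get("metrics", {})
    return sorted(name for name, stats in metrics.items()
                  if stats.get("p95_ms", 0.0) > p95_budget_ms)


# Hypothetical summary payload with made-up metric names and values.
summary = {
    "metrics": {
        "segment_finalize": {"count": 4, "p50_ms": 80.0, "p95_ms": 240.0},
        "event_write": {"count": 900, "p50_ms": 1.2, "p95_ms": 3.5},
    }
}
print(metrics_over_budget(summary, p95_budget_ms=100.0))
```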

Control

POST /windows/focus

Focus a specific window.

  • Body: {"window_id": "0x123456"}
  • Response: {"status": "focused", "id": "..."}

POST /input/mouse/click

Click at specific coordinates.

  • Body:
    {
      "x": 100, 
      "y": 200, 
      "button": 1, 
      "window_title": "Notepad", 
      "relative": true
    }
  • Response: {"status": "clicked", "trace_id": "...", ...}
  • Notes: Supports relative clicking if window_title is provided and relative is true. Clicks are validated against the current SCREEN resolution if not relative.
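A client can pre-check absolute coordinates before sending a click. The sketch below assumes the screen size is a string such as "1920x1080" (optionally followed by a depth, as in "1920x1080x24") and that valid coordinates run from 0 up to, but not including, the width and height. The server's exact validation rule is not documented here, so treat this as an approximation.

```python
def parse_resolution(screen: str) -> tuple:
    """Parse "WIDTHxHEIGHT" or "WIDTHxHEIGHTxDEPTH" into (width, height)."""
    width, height = screen.lower().split("x")[:2]
    return int(width), int(height)


def click_in_bounds(x: int, y: int, screen: str = "1920x1080") -> bool:
    """True if an absolute (non-relative) click falls inside the screen."""
    width, height = parse_resolution(screen)
    return 0 <= x < width and 0 <= y < height


print(click_in_bounds(100, 200))
print(click_in_bounds(1920, 0))
```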

GET /input/trace/status

Get input trace status for the active session.

  • Response: {"running": true, "pid": 123, "state": "running", "log_path": "...", ...}

POST /input/trace/start

Start input tracing (mouse motion, clicks, keypresses).

  • Body (optional):
    {
      "include_raw": false,
      "motion_sample_ms": 0
    }
  • Response: {"status": "started", "pid": 123, "log_path": "..."}

POST /input/trace/stop

Stop input tracing.

  • Response: {"status": "stopped", "session_dir": "..."}

GET /input/trace/x11core/status

Get X11 core input trace status (xinput test).

  • Response: {"running": true, "pid": 123, "state": "running", "log_path": "...", ...}

POST /input/trace/x11core/start

Start X11 core input tracing.

  • Body (optional):
    {
      "motion_sample_ms": 10
    }
  • Response: {"status": "started", "pid": 123, "log_path": "..."}

POST /input/trace/x11core/stop

Stop X11 core input tracing.

  • Response: {"status": "stopped", "session_dir": "..."}

GET /input/trace/client/status

Get client (noVNC) input trace status.

  • Response: {"enabled": true, "log_path": "...", ...}

POST /input/trace/client/start

Enable client (noVNC) input trace collection.

  • Response: {"status":"enabled","log_path":"..."}

POST /input/trace/client/stop

Disable client (noVNC) input trace collection.

  • Response: {"status":"disabled","session_dir":"..."}

POST /input/client/event

Ingest a client (noVNC UI) input event.

  • Body: arbitrary JSON fields for the event.
  • Response: {"status":"ok"} (or {"status":"ignored"} when disabled).

GET /input/trace/windows/status

Get Windows-side input trace status.

  • Response: {"running": true, "pid": 123, "state": "running", "backend": "hook", "log_path": "...", ...}

POST /input/trace/windows/start

Start Windows-side input tracing.

  • Body (optional):
    {
      "motion_sample_ms": 10,
      "debug_keys": ["vk41", "LButton"],
      "debug_keys_csv": "vk41,LButton",
      "debug_sample_ms": 200,
      "backend": "auto"
    }
  • Notes: backend may be auto, ahk, or hook. hook uses the low-level Windows hook observer; ahk uses AutoHotkey.
  • Response: {"status":"started","pid":123,"log_path":"...","backend":"hook"}

POST /input/trace/windows/stop

Stop Windows-side input tracing.

  • Response: {"status":"stopped","session_dir":"..."}

GET /input/trace/network/status

Get network input trace status (VNC proxy).

  • Response: {"running": true, "pid": 123, "state": "enabled", "log_path": "...", ...}

POST /input/trace/network/start

Enable network input trace logging (proxy must be running).

  • Response: {"status":"enabled","session_dir":"..."}

POST /input/trace/network/stop

Disable network input trace logging (proxy must be running).

  • Response: {"status":"disabled","session_dir":"..."}

GET /input/events

Return recent input trace events.

  • Query params: limit (default 200), since_epoch_ms (optional), source (client, x11_core, windows, network, or default X11 trace file)
  • Response: {"events":[...], "log_path":"..."}
  • Notes: Each event includes origin (user/agent/unknown) and tool when known.
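The query parameters above can be assembled with the standard library. This is a small sketch: the helper name events_url is hypothetical, and the default limit of 200 mirrors the documented default.

```python
from typing import Optional
from urllib.parse import urlencode


def events_url(base_url: str = "http://localhost:8000", limit: int = 200,
               since_epoch_ms: Optional[int] = None,
               source: Optional[str] = None) -> str:
    """Build a GET /input/events URL, omitting optional params that are unset."""
    params = {"limit": limit}
    if since_epoch_ms is not None:
        params["since_epoch_ms"] = since_epoch_ms
    if source is not None:
        params["source"] = source
    return f"{base_url}/input/events?{urlencode(params)}"


print(events_url(limit=50, source="windows"))
```

Leaving source unset omits the parameter entirely, which corresponds to the default X11 trace file described above.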

POST /apps/run

Run a Windows application.

  • Body:
    {
      "path": "C:/Program Files/App/App.exe",
      "args": "-debug",
      "detach": true
    }
  • Response (Finished):
    {
      "status": "finished",
      "stdout": "...",
      "stderr": "..."
    }
  • Response (Failed):
    {
      "status": "failed",
      "exit_code": 1,
      "stdout": "...",
      "stderr": "..."
    }
  • Response (Detached): {"status":"detached","pid":...}
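The three response shapes above can be handled in one place. The sketch below returns stdout for a finished run, the pid for a detached run, and raises on failure; the exception class and function name are hypothetical, and any other status is treated as unexpected.

```python
class AppRunError(RuntimeError):
    """Raised when /apps/run reports status "failed" (hypothetical class)."""


def handle_run_response(response: dict):
    """Map an /apps/run response to stdout, a pid, or an exception."""
    status = response.get("status")
    if status == "finished":
        return response.get("stdout", "")
    if status == "detached":
        return response.get("pid")
    if status == "failed":
        raise AppRunError(
            f"exit code {response.get('exit_code')}: {response.get('stderr', '')}"
        )
    raise ValueError(f"unexpected status: {status!r}")


print(handle_run_response({"status": "finished", "stdout": "ok", "stderr": ""}))
```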

Automation

POST /run/ahk

Run an AutoHotkey script.

  • Body:
    {
      "script": "MsgBox, Hello from API"
    }
  • Response: {"status": "ok", "stdout": "..."}

POST /run/autoit

Run an AutoIt v3 script.

  • Body:
    {
      "script": "MsgBox(0, 'Title', 'Hello from API')"
    }
  • Response: {"status": "ok", "stdout": "..."}

POST /run/python

Run a Python script using the embedded Windows Python environment (winpy).

  • Body: {"script": "import sys; print(sys.version)"}
  • Response: {"status": "ok", "stdout": "..."}

POST /inspect/window

Inspect a Windows window and its controls (WinSpy-style) via AutoIt.

  • Body (inspect by title or handle):
    {
      "title": "Untitled - Notepad",
      "text": "",
      "handle": "",
      "include_controls": true,
      "max_controls": 200
    }
  • Body (list windows):
    {
      "list_only": true,
      "include_empty": false
    }
  • Response: {"status": "ok", "details": {}} (current implementation returns a placeholder payload).

Usage

To enable the API server, set ENABLE_API=1 when starting the container. For security, also set API_TOKEN.

ENABLE_API=1 API_TOKEN=mysecret docker compose up

You can then interact with it via curl or any HTTP client, either from inside the container or from the host through the mapped port.

curl -H "X-API-Key: mysecret" http://localhost:8000/health

Auth-Protected Host Testing

If you expose port 8000 (see compose/docker-compose.yml) and set API_TOKEN, you can test from the host:

API_TOKEN=mysecret docker compose -f compose/docker-compose.yml --profile headless up -d
API_TOKEN=mysecret ./scripts/winebotctl health
API_TOKEN=mysecret ./scripts/winebotctl health system