SimpleOpenSoftware
diff --git a/CLAUDE.md b/CLAUDE.md
Lines changed: 41 additions & 122 deletions
diff --git a/backends/advanced/cleanup.sh b/backends/advanced/cleanup.sh
Lines changed: 15 additions & 0 deletions
diff --git a/tests/.gitignore b/tests/.gitignore
Lines changed: 10 additions & 0 deletions
diff --git a/tests/Makefile b/tests/Makefile
Lines changed: 88 additions & 5 deletions
@@ -86,130 +86,54 @@ cp .env.template .env  # Configure environment variables
 sudo rm -rf backends/advanced/data/
 ```
 
-### Testing Infrastructure
+### Running Tests
 
-#### Local Test Scripts
-The project includes simplified test scripts that mirror CI workflows:
+#### Quick Commands
+All test operations are managed through a simple Makefile interface:
 
-```bash
-# Run all tests from project root
-./run-test.sh [advanced-backend|speaker-recognition|all]
-
-# Advanced backend tests only
-./run-test.sh advanced-backend
-
-# Speaker recognition tests only
-./run-test.sh speaker-recognition
-
-# Run all test suites (default)
-./run-test.sh all
-```
-
-#### Advanced Backend Integration Tests
-
-**Three Test Execution Modes:**
-
-1. **No-API Tests** (Fast, No External Dependencies)
 ```bash
 cd tests
 
-# Run tests without API keys (excludes requires-api-keys tag)
-./run-no-api-tests.sh
+# Full test workflow (recommended)
+make test              # Start containers + run all tests
 
-# ~70% of test suite
-# Uses mock services config
-# No DEEPGRAM_API_KEY or OPENAI_API_KEY required
-# Fast feedback (~10-15 minutes)
-```
+# Or step by step
+make start             # Start test containers (with health checks)
+make all               # Run all test suites
+make stop              # Stop containers (preserves volumes)
 
-2. **Full Tests with API Keys** (Comprehensive)
-```bash
-cd tests
+# Run specific test suites
+make endpoints         # API endpoint tests (~40 tests, fast)
+make integration       # End-to-end workflows (~15 tests, slower)
+make infra             # Infrastructure resilience (~5 tests)
 
-# Requires .env file with DEEPGRAM_API_KEY and OPENAI_API_KEY
-cp setup/.env.test.template setup/.env.test  # Configure API keys
+# Quick iteration (reuse existing containers)
+make test-quick        # Run tests without restarting containers
+```
 
-# Run full integration test suite (100% of tests)
-./run-robot-tests.sh
+#### Container Management
+Containers are managed through the targets below; cleanup saves logs before removing anything:
 
-# Leave test containers running for debugging
-CLEANUP_CONTAINERS=false ./run-robot-tests.sh
+```bash
+make start             # Start test containers
+make stop              # Stop containers (keep volumes)
+make restart           # Restart without rebuild
+make rebuild           # Rebuild images + restart (for code changes)
+make containers-clean  # SAVES LOGS → removes everything
+make status            # Show container health
+make logs SERVICE=<name>  # View specific service logs
 ```
 
-3. **API-Only Tests** (Optional)
-```bash
-cd tests
+**Log Preservation:** All cleanup operations save container logs to `tests/logs/YYYY-MM-DD_HH-MM-SS/`
 
-# Run only tests that require API keys
-./run-api-tests.sh
+#### Test Environment
 
-# ~30% of test suite
-# Only E2E tests with transcription/memory extraction
-```
+Test services use isolated ports and database:
+- **Ports:** Backend (8001), MongoDB (27018), Redis (6380), Qdrant (6337/6338)
+- **Database:** `test_db` (separate from production)
+- **Credentials:** `test-admin@example.com` / `test-admin-password-123`
 
-#### Test Separation by API Requirements
-
-Tests are separated into two categories:
-
-- **No API Keys Required** (~70%): Endpoint tests, infrastructure tests, basic integration
-  - Uses `configs/mock-services.yml`
-  - Runs on all PRs by default
-  - Fast CI feedback
-
-- **API Keys Required** (~30%): Full E2E tests with transcription and memory extraction
-  - Uses `configs/deepgram-openai.yml`
-  - Tagged with `requires-api-keys`
-  - Runs on dev/main branches or when PR labeled with `test-with-api-keys`
-
-#### Test Configuration Flags
-- **CLEANUP_CONTAINERS** (default: false): Automatically stop and remove test containers after test completion
-  - Set to `true` for cleanup: `CLEANUP_CONTAINERS=true ./run-robot-tests.sh`
-- **CONFIG_FILE**: Choose test configuration
-  - `configs/mock-services.yml` - No API keys (default for run-no-api-tests.sh)
-  - `configs/deepgram-openai.yml` - With API keys (default for run-robot-tests.sh)
-  - `configs/parakeet-ollama.yml` - Fully local (no external APIs)
-
-#### Test Environment Variables
-Tests use isolated test environment with overridden credentials:
-- **Test Database**: `test_db` (MongoDB on port 27018, separate from production)
-- **Test Ports**: Backend (8001), Qdrant (6337/6338), WebUI (3001)
-- **Test Credentials**:
-  - `AUTH_SECRET_KEY`: test-jwt-signing-key-for-integration-tests
-  - `ADMIN_EMAIL`: test-admin@example.com
-  - `ADMIN_PASSWORD`: test-admin-password-123
-- **API Keys**: Loaded from `.env` file (DEEPGRAM_API_KEY, OPENAI_API_KEY)
-- **Test Settings**: `DISABLE_SPEAKER_RECOGNITION=true` to prevent segment duplication
-
-#### Test Script Features
-- **Environment Compatibility**: Works with both local .env files and CI environment variables
-- **Isolated Test Environment**: Separate ports and database prevent conflicts with running services
-- **Automatic Cleanup**: Configurable via CLEANUP_CONTAINERS flag (default: false for faster re-runs)
-- **Colored Output**: Clear progress indicators and error reporting
-- **Timeout Protection**: 30-minute timeout for test execution
-- **Fresh Testing**: Clean database and containers for each test run
-- **API Key Separation**: Ability to run tests with or without external API dependencies
-
-#### GitHub Workflows
-
-**Three workflows handle test execution:**
-
-1. **`robot-tests.yml`** - PR Tests (No API Keys)
-   - Triggers: All pull requests
-   - Execution: Excludes `requires-api-keys` tests (~70% of suite)
-   - No secrets required
-   - Fast feedback for contributors
-
-2. **`full-tests-with-api.yml`** - Dev/Main Tests (Full Suite)
-   - Triggers: Push to dev/main branches
-   - Execution: All tests including API-dependent (~100% of suite)
-   - Requires: DEEPGRAM_API_KEY, OPENAI_API_KEY
-   - Comprehensive validation before deployment
-
-3. **`pr-tests-with-api.yml`** - Label-Triggered PR Tests
-   - Triggers: PR with `test-with-api-keys` label
-   - Execution: Full test suite before merge
-   - Requires: DEEPGRAM_API_KEY, OPENAI_API_KEY
-   - Useful for testing API integration changes
+**For complete test documentation, see `tests/README.md`**
 
 ### Mobile App Development
 ```bash
@@ -571,12 +495,11 @@ tailscale ip -4
 - **Docker**: Primary deployment method with docker-compose
 
 ### Testing Strategy
-- **Local Test Scripts**: Simplified scripts (`./run-test.sh`) mirror CI workflows for local development
-- **End-to-End Integration**: Robot Framework tests (`tests/integration/integration_test.robot`) validate complete audio processing pipeline
-- **Speaker Recognition Tests**: `test_speaker_service_integration.py` validates speaker identification
+- **Makefile-Based**: All test operations through simple `make` commands (`make test`, `make start`, `make stop`)
+- **Log Preservation**: Container logs always saved before cleanup (never lose debugging info)
+- **End-to-End Integration**: Robot Framework validates complete audio processing pipeline
 - **Environment Flexibility**: Tests work with both local .env files and CI environment variables
-- **Automated Cleanup**: Test containers are automatically removed after execution
-- **CI/CD Integration**: GitHub Actions use the same local test scripts for consistency
+- **CI/CD Integration**: Same test logic locally and in GitHub Actions
 
 ### Code Style
 - **Python**: Black formatter with 100-character line length, isort for imports
@@ -603,14 +526,10 @@ The system includes comprehensive health checks:
 - Memory debug system for transcript processing monitoring
 
 ### Integration Test Infrastructure
-- **Unified Test Scripts**: Local `./run-test.sh` scripts mirror GitHub Actions workflows
-- **Test Environment**: `docker-compose-test.yml` provides isolated services on separate ports
-- **Test Database**: Uses `test_db` database with isolated collections
-- **Service Ports**: Backend (8001), MongoDB (27018), Qdrant (6335/6336), WebUI (5174)
-- **Test Credentials**: Auto-generated `.env.test` files with secure test configurations
-- **Ground Truth**: Expected transcript established via `scripts/test_deepgram_direct.py`
-- **AI Validation**: OpenAI-powered transcript similarity comparison
-- **Test Audio**: 4-minute glass blowing tutorial (`extras/test-audios/DIY*mono*.wav`)
+- **Makefile Interface**: Simple `make` commands for all operations (see `tests/README.md`)
+- **Test Environment**: `docker-compose-test.yml` with isolated services on separate ports
+- **Test Database**: Uses `test_db` database (separate from production)
+- **Log Preservation**: All cleanup operations save logs to `tests/logs/` automatically
 - **CI Compatibility**: Same test logic runs locally and in GitHub Actions
 
 ### Cursor Rule Integration
 
@@ -0,0 +1,15 @@
+#!/bin/bash
+# Wrapper script for cleanup_state.py
+# Usage: ./cleanup.sh --backup --export-audio
+#
+# This script runs the cleanup_state.py script inside the chronicle-backend container
+# to handle data ownership and permissions correctly.
+#
+# Examples:
+#   ./cleanup.sh --dry-run              # Preview what would be deleted
+#   ./cleanup.sh --backup               # Cleanup with metadata backup
+#   ./cleanup.sh --backup --export-audio  # Full backup including audio
+#   ./cleanup.sh --backup --force       # Skip confirmation prompts
+
+cd "$(dirname "$0")" || exit 1
+docker compose exec chronicle-backend uv run python src/scripts/cleanup_state.py "$@"
@@ -0,0 +1,10 @@
+# Test output files
+output.xml
+log.html
+report.html
+results/
+results-no-api/
+
+# Saved container logs (automatically generated)
+logs/*
+!logs/.gitkeep
@@ -1,31 +1,54 @@
 # Chronicle Test Makefile
-# Shortcuts for running tests
+# Shortcuts for running tests and managing test containers
 
-.PHONY: help all clean
+.PHONY: help all endpoints integration infra clean \
+        containers-start containers-stop containers-restart containers-rebuild \
+        containers-clean containers-status containers-logs \
+        start stop restart rebuild status logs \
+        test test-quick clean-all
 
 # Default output directory
 OUTPUTDIR ?= results
 TEST_DIR = endpoints integration infrastructure
+SERVICE ?= chronicle-backend-test
 
 help:
 	@echo "Chronicle Test Targets:"
 	@echo ""
+	@echo "Quick Commands:"
+	@echo "  make test        - Start containers + run all tests"
+	@echo "  make test-quick  - Run tests on existing containers"
+	@echo "  make start       - Start test containers"
+	@echo "  make stop        - Stop containers (keep volumes)"
+	@echo "  make status      - Show container status"
+	@echo ""
 	@echo "Running Tests:"
 	@echo "  make all         - Run all tests"
 	@echo "  make endpoints   - Run only endpoint tests"
 	@echo "  make integration - Run only integration tests"
 	@echo "  make infra       - Run only infrastructure tests"
 	@echo ""
+	@echo "Container Management:"
+	@echo "  make containers-start    - Start test containers"
+	@echo "  make containers-stop     - Stop containers (keep volumes)"
+	@echo "  make containers-restart  - Restart containers"
+	@echo "  make containers-rebuild  - Rebuild + restart containers"
+	@echo "  make containers-clean    - Save logs + remove everything"
+	@echo "  make containers-status   - Show container health"
+	@echo "  make containers-logs     - View service logs (use SERVICE=name)"
+	@echo ""
 	@echo "Utilities:"
 	@echo "  make clean       - Remove test output files"
+	@echo "  make clean-all   - Clean results + containers (saves logs)"
 	@echo ""
 	@echo "Environment Variables:"
 	@echo "  OUTPUTDIR        - Output directory (default: results)"
+	@echo "  SERVICE          - Service name for logs (default: chronicle-backend-test)"
 	@echo ""
 	@echo "Examples:"
-	@echo "  make all                     # Full test suite"
-	@echo "  make endpoints               # Only endpoint tests"
-	@echo "  make all OUTPUTDIR=/tmp/out  # Custom output dir"
+	@echo "  make test                              # Full workflow"
+	@echo "  make endpoints                         # Only endpoint tests"
+	@echo "  make containers-logs SERVICE=workers-test  # View worker logs"
 
 # Run all tests
 # Creates a persistent fixture conversation that won't be deleted between suites
@@ -34,6 +57,7 @@ all:
 	CREATE_FIXTURE=true uv run --with-requirements test-requirements.txt robot --outputdir $(OUTPUTDIR) \
 		--name "All Tests" \
 		--console verbose \
+		--loglevel INFO:INFO \
 		$(TEST_DIR)
 
 # Run only endpoint tests
@@ -42,6 +66,7 @@ endpoints:
 	uv run --with-requirements test-requirements.txt robot --outputdir $(OUTPUTDIR) \
 		--name "Endpoint Tests" \
 		--console verbose \
+		--loglevel INFO:INFO \
 		endpoints
 
 # Run only integration tests
@@ -50,6 +75,7 @@ integration:
 	CREATE_FIXTURE=true uv run --with-requirements test-requirements.txt robot --outputdir $(OUTPUTDIR) \
 		--name "Integration Tests" \
 		--console verbose \
+		--loglevel INFO:INFO \
 		integration
 
 # Run only infrastructure tests
@@ -58,6 +84,7 @@ infra:
 	uv run --with-requirements test-requirements.txt robot --outputdir $(OUTPUTDIR) \
 		--name "Infrastructure Tests" \
 		--console verbose \
+		--loglevel INFO:INFO \
 		infrastructure
 
 # Clean up test output files
@@ -66,3 +93,59 @@ clean:
 	rm -f output.xml log.html report.html
 	rm -rf $(OUTPUTDIR)
 	@echo "Clean complete!"
+
+# ============================================================================
+# Container Management Targets
+# ============================================================================
+
+# Start test containers
+containers-start:
+	@./bin/start-containers.sh
+
+# Stop test containers (preserve volumes)
+containers-stop:
+	@./bin/stop-containers.sh
+
+# Restart test containers
+containers-restart:
+	@./bin/restart-containers.sh
+
+# Rebuild test containers
+containers-rebuild:
+	@./bin/rebuild-containers.sh
+
+# Clean test containers (ALWAYS saves logs first!)
+containers-clean:
+	@./bin/clean-containers.sh
+
+# Show container status
+containers-status:
+	@./bin/status-containers.sh
+
+# View container logs
+containers-logs:
+	@./bin/logs-containers.sh $(SERVICE)
+
+# ============================================================================
+# Convenient Aliases
+# ============================================================================
+
+start: containers-start
+stop: containers-stop
+restart: containers-restart
+rebuild: containers-rebuild
+status: containers-status
+logs: containers-logs
+
+# ============================================================================
+# Combined Workflows
+# ============================================================================
+
+# Full workflow: start containers + run all tests
+test: containers-start all
+
+# Quick workflow: run tests on existing containers
+test-quick: all
+
+# Complete cleanup: test results + containers (saves logs)
+clean-all: clean containers-clean
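
Note on log preservation: the `bin/clean-containers.sh` script that `containers-clean` calls is not part of this diff, so its behavior can only be inferred from the documentation above (logs saved to `tests/logs/YYYY-MM-DD_HH-MM-SS/` before anything is removed). The following is a minimal sketch of that idea, not the project's actual script. The function names `log_dir_for` and `save_logs` are hypothetical, and the log-producing command is passed in as arguments so the sketch makes no assumption about how the real script collects logs.

```shell
#!/usr/bin/env bash
# Hypothetical sketch only: this is NOT the contents of bin/clean-containers.sh.

# Build a timestamped directory path in the YYYY-MM-DD_HH-MM-SS layout.
log_dir_for() {
    local base="$1"
    printf '%s/%s\n' "$base" "$(date +%Y-%m-%d_%H-%M-%S)"
}

# Save the output of a log-producing command before any teardown runs.
# Usage: save_logs <base-dir> <service-name> <command> [args...]
# Prints the directory the log file was written to.
save_logs() {
    local base="$1" service="$2"
    shift 2
    local dir
    dir="$(log_dir_for "$base")" || return 1
    mkdir -p "$dir" || return 1
    # Capture stdout and stderr; a failing log command should not block cleanup.
    "$@" > "$dir/$service.log" 2>&1 || true
    printf '%s\n' "$dir"
}
```

A call could look like `save_logs tests/logs chronicle-backend-test docker compose -f docker-compose-test.yml logs --no-color chronicle-backend-test`, run before the containers are removed. Whether the real script works per service or dumps all logs into a single file is not visible in this diff.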