rpothin · Jan 28, 2026 · Jan 26, 2026
diff --git a/JOURNAL.md b/JOURNAL.md
@@ -1226,3 +1226,117 @@ The Ralph realignment is working! The CLI now:
 - `npm run typecheck`
 - `npm test`
 - `npm run build`
+
+## 2026-01-28 - CRITICAL FIX: Multi-task plan processing loop
+
+### Problem
+The `ghcralph run --file PLAN.md` command only processed ONE task per invocation, then exited. After successfully completing the first task in a plan file, the CLI would terminate instead of continuing to the remaining tasks. This was a critical bug that broke the core functionality of the CLI.
+
+**Root Cause**: The `run` command in `src/commands/run.ts` had no outer loop, so it never advanced to the remaining pending tasks once the first task completed.
+
+### Fix
+Implemented Option A from the remediation plan (`plans/LOOP_MAJOR_BUG_REMEDIATION_PLAN.md`):
+
+1. **Core Multi-Task Loop** (`src/commands/run.ts`):
+   - Added outer `while (currentTask)` loop that processes ALL pending tasks
+   - Creates **fresh AI agent instance** for each task (Ralph pattern core principle)
+   - Added task-level retry loop with configurable `maxRetriesPerTask` (default: 2)
+   - Prints final summary with total tasks processed/completed/failed
+
+2. **New CLI Flag**:
+   - Added `--pause-between-tasks` flag for strict Ralph mode (human review after each task)
+
+3. **New Configuration Options** (`src/core/config-schema.ts`):
+   - `maxRetriesPerTask: number` (default: 2) - retries per task before marking failed
+   - `autoPush: boolean` (default: false) - auto-push after each task completion
+
+4. **New CheckpointManager Methods** (`src/core/checkpoint-manager.ts`):
+   - `createTaskCheckpoint()` - commits after successful task completion
+   - `createFailureCheckpoint()` - commits after failed task attempt (preserves state for post-mortem)
+
+5. **New GitBranchManager Methods** (`src/core/git-branch-manager.ts`):
+   - `pushToRemote()` - pushes current branch to remote
+   - `hasRemote()` - checks if a remote exists
+
+6. **New ProgressTracker Methods** (`src/core/progress-tracker.ts`):
+   - `loadPreviousTaskResults()` - loads previous task results for context injection
+   - `appendTaskResult()` - appends task result to progress file for tracking
+
+7. **New PlanManager Interface Method** (`src/core/plan-manager.ts`):
+   - `reload?()` - optional method to reload plan from source (already implemented in LocalMarkdownPlan)
+
+8. **Prompt Engineering for Honesty** (`src/core/context-builder.ts`):
+   - Added `HONESTY_GUIDANCE` section to prompt template
+   - Encourages agents to be honest about failures
+   - Asks agents to document blockers instead of claiming false completion
+
+9. **New STUCK Action** (`src/core/response-parser.ts`, `src/core/action-executor.ts`):
+   - Added `[ACTION:STUCK]` action type for graceful failure signaling
+   - Agents can report: attempted actions, blockers, and suggestions
+   - STUCK triggers a retry with a fresh agent, which can read the failed attempt from the progress file
+
+10. **Utility Function** (`src/utils/shell.ts`):
+    - Added `waitForKeypress()` for `--pause-between-tasks` mode
+
+### Files Modified
+- `src/commands/run.ts` - Core fix with multi-task loop
+- `src/core/config-schema.ts` - New config options
+- `src/core/checkpoint-manager.ts` - Task-level checkpoints
+- `src/core/git-branch-manager.ts` - Push to remote
+- `src/core/progress-tracker.ts` - Multi-task progress tracking
+- `src/core/plan-manager.ts` - Optional reload method
+- `src/core/context-builder.ts` - Honesty guidance in prompt
+- `src/core/response-parser.ts` - STUCK action type
+- `src/core/action-executor.ts` - STUCK action handling
+- `src/utils/shell.ts` - waitForKeypress utility
+- `src/core/config-schema.test.ts` - Updated test for new config keys
+
+### Validation
+- `npm run typecheck` ✅
+- `npm test` ✅ (285 tests passing)
+- `npm run build` ✅
+
+## 2026-01-28 - Model Compatibility Improvements
+
+### Context
+Following the multi-task loop fix, reviewed `MODEL_COMPAT_TEST_PLAN.md` for model compatibility concerns. Findings:
+1. The `ghcralph init` command had a hardcoded list of 5 models
+2. GitHub Copilot CLI actually offers 14+ models
+3. The SDK provides `client.listModels()` API for dynamic model discovery
+4. No tests existed to validate parsing across different model output styles
+
+### Changes
+
+1. **Dynamic Model Listing** (`src/integrations/copilot-agent.ts`):
+   - Added `listAvailableModels()` instance method - fetches models from existing client
+   - Added static `fetchAvailableModels()` method - creates temporary client to fetch models
+   - Re-exported `ModelInfo` type from SDK for consumers
+
+2. **Dynamic Model Selection in Init** (`src/commands/init.ts`):
+   - Added `fetchModelOptions()` helper that calls `CopilotAgent.fetchAvailableModels()`
+   - Updated model selection prompt to use dynamically fetched models
+   - Falls back to hardcoded list if SDK fetch fails
+   - Maintains "Custom (enter manually)" option
+
+3. **Model Compatibility Tests** (`src/core/model-compatibility.test.ts`):
+   - Created parameterized test suite for response parsing across model variations
+   - Tests CREATE, EDIT, EXECUTE, COMPLETE, and STUCK action parsing
+   - Documents current parser behavior with different formatting styles
+   - Tests edge cases: Windows line endings, mixed case action types, malformed blocks
+
+4. **Updated CopilotAgent Tests** (`src/integrations/copilot-agent.test.ts`):
+   - Added `mockListModels` for SDK mock
+   - Added tests for `listAvailableModels()` and `fetchAvailableModels()`
+   - Tests error handling when SDK fetch fails
+
+### Files Modified
+- `src/integrations/copilot-agent.ts` - listAvailableModels methods
+- `src/integrations/index.ts` - Export ModelInfo type
+- `src/commands/init.ts` - Dynamic model fetching
+- `src/core/model-compatibility.test.ts` - New parameterized tests
+- `src/integrations/copilot-agent.test.ts` - listModels tests
+
+### Validation
+- `npm run typecheck` ✅
+- `npm test` ✅ (305 tests passing)
+- `npm run build` ✅
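
The outer loop described in the "Multi-task plan processing loop" journal entry above can be pictured roughly as follows. This is a minimal sketch, not the code in `src/commands/run.ts`: `Task`, `Outcome`, `RunTask`, `Summary`, and `runAllTasks` are hypothetical names, and it assumes `maxRetriesPerTask` counts retries after the first attempt (so the default of 2 would allow up to three attempts per task).

```typescript
// Hypothetical types for illustration; the real interfaces in the repo may differ.
interface Task { id: string; title: string }
type Outcome = "complete" | "stuck" | "error";
// Each call is expected to build a brand-new agent, so no context leaks between attempts.
type RunTask = (task: Task, attempt: number) => Promise<Outcome>;

interface Summary { processed: number; completed: number; failed: number }

async function runAllTasks(
  pending: Task[],
  runWithFreshAgent: RunTask,
  maxRetriesPerTask = 2, // assumption: retries allowed after the first attempt
): Promise<Summary> {
  const summary: Summary = { processed: 0, completed: 0, failed: 0 };
  const queue = [...pending];
  let currentTask = queue.shift();

  // Outer loop: keep going until no pending task is left.
  while (currentTask) {
    const task = currentTask;
    let done = false;

    // Task-level retry loop: attempt 0 is the first try, then up to maxRetriesPerTask retries.
    for (let attempt = 0; attempt <= maxRetriesPerTask && !done; attempt++) {
      const outcome = await runWithFreshAgent(task, attempt);
      done = outcome === "complete";
    }

    summary.processed++;
    if (done) summary.completed++;
    else summary.failed++;

    currentTask = queue.shift();
  }
  return summary;
}
```

Passing the per-attempt runner in as a callback keeps the "fresh agent per task" rule in one place: the loop itself never holds an agent instance, so nothing can carry over between attempts.
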
diff --git a/README.md b/README.md
@@ -11,6 +11,7 @@ Run **autonomous, checkpointed coding loops** with GitHub Copilot—designed to
 
 - 🌿 **Branch isolation**: works on a `ghcralph/*` branch (never modifies `main`/`master` directly)
 - 💾 **Automatic checkpoints**: commits after each iteration for easy rollback
+- 🔄 **Multi-task processing**: processes ALL tasks in plan files automatically
 - 🛡️ **Guardrails**: iteration limits, token budgets, timeouts, circuit breaker on repeated failures
 - 📋 **Flexible plan sources**: GitHub Issues or local Markdown task lists
 - 💻 **Cross-platform**: Windows, macOS, Linux
@@ -76,14 +77,15 @@ This approach prioritizes **safety** (automatic checkpoints, git isolation) and
 
 ## Key Features
 
-- 🔄 **Autonomous Loop**: Repeatedly invokes AI agent until task completion
+- 🔄 **Multi-Task Loop**: Processes ALL tasks in a plan file automatically with fresh AI agent per task
 - 📋 **Flexible Plan Sources**: GitHub Issues or local Markdown task lists
 - 🛡️ **Safety First**: Git branch isolation, file deletion safeguards
-- 💾 **Automatic Checkpoints**: Git commits after each iteration for easy rollback
+- 💾 **Automatic Checkpoints**: Git commits after each task completion for easy rollback
 - 📊 **Progress Tracking**: Real-time status, token usage, and session logs
-- ⚡ **Guardrails**: Iteration limits, token budgets, timeout controls
+- ⚡ **Guardrails**: Iteration limits, token budgets, timeout controls, task-level retries
 - 🔧 **Highly Configurable**: Customize behavior via CLI, env vars, or config files
 - 💻 **Cross-Platform**: Works on Windows, macOS, and Linux
+- 🤖 **Dynamic Model Discovery**: Fetches available models from Copilot SDK
 
 ## Commands
 
@@ -157,6 +159,9 @@ ghcralph run --github
 # Control iterations, tokens, and model via configuration
 # (set maxIterations / maxTokens / defaultModel in .ghcralph/config.json)
 
+# Pause between tasks for human review (strict Ralph mode)
+ghcralph run --file PLAN.md --pause-between-tasks
+
 # Specify context files
 ghcralph run --task "Fix tests" --context "src/**/*.test.ts"
 
@@ -184,19 +189,21 @@ GitHub Copilot Ralph uses a hierarchical configuration system:
 
 ### Configuration Options
 
-| Option          | Default     | Description                                           |
-| --------------- | ----------- | ----------------------------------------------------- |
-| `planSource`    | `local`     | Plan source: `github` or `local`                      |
-| `maxIterations` | `10`        | Maximum loop iterations                               |
-| `maxTokens`     | `100000`    | Token budget                                          |
-| `defaultModel`  | `gpt-4.1`   | Copilot model to use                                  |
-| `autoCommit`    | `true`      | Auto-commit after iterations                          |
-| `branchPrefix`  | `ghcralph/` | Prefix for GitHub Copilot Ralph branches              |
-| `githubRepo`    | -           | GitHub repository (owner/repo) for GitHub plan source |
-| `githubLabel`   | -           | Default GitHub issue label filter for GitHub plan      |
-| `githubMilestone` | -         | Default GitHub issue milestone filter for GitHub plan  |
-| `githubAssignee` | -          | Default GitHub issue assignee filter for GitHub plan   |
-| `localPlanFile` | -           | Path to local plan file                               |
+| Option             | Default     | Description                                           |
+| ------------------ | ----------- | ----------------------------------------------------- |
+| `planSource`       | `local`     | Plan source: `github` or `local`                      |
+| `maxIterations`    | `10`        | Maximum loop iterations per task                      |
+| `maxTokens`        | `100000`    | Token budget per task                                 |
+| `defaultModel`     | `gpt-4.1`   | Copilot model (`init` lists models fetched from SDK)  |
+| `autoCommit`       | `true`      | Auto-commit after iterations                          |
+| `branchPrefix`     | `ghcralph/` | Prefix for GitHub Copilot Ralph branches              |
+| `maxRetriesPerTask`| `2`         | Retries per task before marking as failed             |
+| `autoPush`         | `false`     | Auto-push to remote after each task completion        |
+| `githubRepo`       | -           | GitHub repository (owner/repo) for GitHub plan source |
+| `githubLabel`      | -           | Default GitHub issue label filter for GitHub plan     |
+| `githubMilestone`  | -           | Default GitHub issue milestone filter for GitHub plan |
+| `githubAssignee`   | -           | Default GitHub issue assignee filter for GitHub plan  |
+| `localPlanFile`    | -           | Path to local plan file                               |
 
 ### Environment Variables
 
@@ -208,6 +215,8 @@ export GHCRALPH_MAX_TOKENS=50000
 export GHCRALPH_DEFAULT_MODEL=gpt-4.1
 export GHCRALPH_AUTO_COMMIT=true
 export GHCRALPH_BRANCH_PREFIX=ghcralph/
+export GHCRALPH_MAX_RETRIES_PER_TASK=3
+export GHCRALPH_AUTO_PUSH=true
 export GHCRALPH_PLAN_SOURCE=local
 export GHCRALPH_GITHUB_REPO=owner/repo
 export GHCRALPH_GITHUB_LABEL=ralph-ready
@@ -225,6 +234,8 @@ export GHCRALPH_GITHUB_ASSIGNEE=octocat
   "defaultModel": "gpt-4.1",
   "autoCommit": true,
   "branchPrefix": "ghcralph/",
+  "maxRetriesPerTask": 2,
+  "autoPush": false,
   "githubRepo": "owner/repo",
   "githubLabel": "ralph-ready",
   "githubMilestone": "v1.0",
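
The two new options can be set in the config file or through `GHCRALPH_MAX_RETRIES_PER_TASK` and `GHCRALPH_AUTO_PUSH`. A minimal sketch of how such values might be resolved is below. It assumes environment variables override the config file, which overrides the defaults; the README calls the system hierarchical, but this excerpt does not show the actual order. `TaskLoopConfig` and `resolveTaskLoopConfig` are hypothetical names, not identifiers from `src/core/config-schema.ts`.

```typescript
// Hypothetical shape covering only the two options added in this change.
interface TaskLoopConfig { maxRetriesPerTask: number; autoPush: boolean }

// Defaults as documented in the README table.
const DEFAULTS: TaskLoopConfig = { maxRetriesPerTask: 2, autoPush: false };

function resolveTaskLoopConfig(
  fileConfig: Partial<TaskLoopConfig>,
  env: Record<string, string | undefined>,
): TaskLoopConfig {
  // Assumed precedence: defaults < config file < environment variables.
  const resolved: TaskLoopConfig = { ...DEFAULTS, ...fileConfig };

  const retries = env["GHCRALPH_MAX_RETRIES_PER_TASK"];
  // Only accept plain non-negative integers; anything else leaves the earlier value in place.
  if (retries !== undefined && /^\d+$/.test(retries)) {
    resolved.maxRetriesPerTask = Number(retries);
  }

  const push = env["GHCRALPH_AUTO_PUSH"];
  if (push !== undefined) {
    resolved.autoPush = push.toLowerCase() === "true";
  }

  return resolved;
}
```

In a Node process the second argument would typically be `process.env`; taking it as a parameter keeps the function easy to test.
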

diff --git a/docs/architecture.md b/docs/architecture.md
@@ -457,6 +457,8 @@ graph LR
 | **Context accumulation**       | Model drifts with long context           | Conversation history accumulates        | ✅ FIXED |
 | **Complex prompt template**    | Meta-info confuses weaker models         | Iteration/token counts in prompt        | ✅ FIXED |
 | **Model sensitivity**          | Weaker models perform poorly             | Prompt relies on implicit understanding | ✅ FIXED |
+| **Single task per run**        | Only first task processed, then exits    | No outer loop for multi-task iteration  | ✅ FIXED v0.1.2 |
+| **Hardcoded model list**       | Init shows outdated model options        | Model list not fetched from SDK         | ✅ FIXED v0.1.2 |
 
 ### Current vs Expected Flow
 
@@ -538,20 +540,38 @@ graph LR
 The action executor component has been implemented in `src/core/action-executor.ts`:
 
 **Supported Actions:**
-| Action     | Description        | Example                                         |
-| ---------- | ------------------ | ----------------------------------------------- |
-| `CREATE`   | Create a new file  | `[ACTION:CREATE] path: file.txt`                |
-| `EDIT`     | Edit existing file | `[ACTION:EDIT] path: file.txt [OLD]...[NEW]...` |
-| `DELETE`   | Delete a file      | `[ACTION:DELETE] path: file.txt`                |
-| `EXECUTE`  | Run shell command  | `[ACTION:EXECUTE] command: npm test`            |
-| `COMPLETE` | Mark task done     | `[ACTION:COMPLETE] reason: Tests pass`          |
+| Action     | Description                | Example                                         |
+| ---------- | -------------------------- | ----------------------------------------------- |
+| `CREATE`   | Create a new file          | `[ACTION:CREATE] path: file.txt`                |
+| `EDIT`     | Edit existing file         | `[ACTION:EDIT] path: file.txt [OLD]...[NEW]...` |
+| `DELETE`   | Delete a file              | `[ACTION:DELETE] path: file.txt`                |
+| `EXECUTE`  | Run shell command          | `[ACTION:EXECUTE] command: npm test`            |
+| `COMPLETE` | Mark task done             | `[ACTION:COMPLETE] reason: Tests pass`          |
+| `STUCK`    | Signal agent is blocked    | `[ACTION:STUCK] attempted:... blocker:...`      |
 
 **Safety Features:**
 - Path validation (prevents escaping working directory)
 - File safeguard integration (protects baseline files from deletion)
 - Command timeout (30 seconds default)
 - Dry run mode for testing
 
+### 2.1.1 STUCK Action ✅ NEW in v0.1.2
+
+The STUCK action allows the AI agent to signal when it cannot complete a task:
+
+```
+[ACTION:STUCK]
+attempted: What the agent tried to do
+blocker: What is preventing completion
+suggestion: Optional suggestion for next steps
+```
+
+**Behavior:**
+- STUCK triggers a task retry with a fresh AI agent
+- The progress file documents the failed attempt for context
+- After `maxRetriesPerTask` (default: 2) STUCKs, the task is marked failed
+- Discourages false completion claims by giving the agent an honest way to report failure
+
 ### 2.2 Verification Hooks ✅ IMPLEMENTED
 
 The verification hooks component has been implemented in `src/core/verification-hooks.ts`:
@@ -898,16 +918,19 @@ graph TB
 The current architecture successfully:
 - ✅ Authenticates with GitHub Copilot
 - ✅ Manages iteration loops with limits and guards
+- ✅ **Processes ALL tasks in plan files** (multi-task loop)
+- ✅ Creates **fresh AI agent per task** (Ralph pattern core)
 - ✅ Builds context-rich prompts
 - ✅ Sends/receives from Copilot SDK
 - ✅ Tracks progress and tokens
-
-The current architecture lacks:
-- ❌ Structured output format specification
-- ❌ Response parsing for file operations
-- ❌ Action execution (file create/edit/delete)
-- ❌ Command execution for verification
-- ❌ Feedback loop to inform AI of results
-- ❌ Clear task completion detection
-
-To work reliably with models like gpt-4.1, the CLI needs to move from a "chat wrapper" to a true "agent executor" that defines explicit action formats, parses responses, executes actions, and provides feedback.
+- ✅ Parses structured ACTION responses
+- ✅ Executes file and shell actions
+- ✅ Supports graceful failure with STUCK action
+- ✅ Dynamic model discovery from SDK
+
+The CLI has evolved from a "chat wrapper" to a true "agent executor" that:
+1. Defines explicit action formats (CREATE, EDIT, DELETE, EXECUTE, COMPLETE, STUCK)
+2. Parses AI responses for structured actions
+3. Executes file and shell actions
+4. Provides feedback to inform subsequent iterations
+5. Processes multiple tasks with task-level retries and checkpoints
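
To make the STUCK format from section 2.1.1 concrete, here is a minimal sketch of how such a block could be extracted from a model response. It is not the parser in `src/core/response-parser.ts`: `StuckAction` and `parseStuckAction` are hypothetical names, and the choices to match the action tag and field names case-insensitively, to require both `attempted` and `blocker`, and to treat `suggestion` as optional are assumptions based on the format shown above.

```typescript
// Hypothetical result type mirroring the three documented fields.
interface StuckAction { attempted: string; blocker: string; suggestion?: string }

const FIELD_NAMES = new Set(["attempted", "blocker", "suggestion"]);

function parseStuckAction(response: string): StuckAction | null {
  // Normalize Windows line endings so matching can be done line by line.
  const lines = response.replace(/\r\n/g, "\n").split("\n");
  const start = lines.findIndex((l) => l.trim().toUpperCase() === "[ACTION:STUCK]");
  if (start === -1) return null;

  const fields = new Map<string, string>();
  for (const line of lines.slice(start + 1)) {
    // Stop when another action block begins.
    if (line.trim().toUpperCase().startsWith("[ACTION:")) break;
    const colon = line.indexOf(":");
    if (colon === -1) continue;
    // Split on the first colon only, so values may themselves contain colons.
    const key = line.slice(0, colon).trim().toLowerCase();
    if (FIELD_NAMES.has(key)) fields.set(key, line.slice(colon + 1).trim());
  }

  const attempted = fields.get("attempted");
  const blocker = fields.get("blocker");
  if (!attempted || !blocker) return null; // treat an incomplete block as malformed

  const suggestion = fields.get("suggestion");
  return suggestion ? { attempted, blocker, suggestion } : { attempted, blocker };
}
```

Returning `null` for an incomplete block lets the caller decide whether to treat the response as a plain failure or to retry, rather than acting on a half-filled report.
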
diff --git a/docs/cookbook.md b/docs/cookbook.md
@@ -92,6 +92,32 @@ ghcralph run --task "Implement user authentication with JWT" \
 - [ ] Add integration tests
 ```
 
+### Multi-Task Processing
+
+When you run `ghcralph run --file PLAN.md`, Ralph will:
+
+1. **Process ALL tasks** in the plan file automatically
+2. **Create a fresh AI agent** for each task (prevents context pollution)
+3. **Retry failed tasks** up to `maxRetriesPerTask` times (default: 2)
+4. **Commit after each task** with `createTaskCheckpoint()`
+5. **Print a final summary** showing tasks processed/completed/failed
+
+```bash
+# Process all tasks in a plan file
+ghcralph run --file TODO.md
+
+# Pause between tasks for human review (strict Ralph mode)
+ghcralph run --file TODO.md --pause-between-tasks
+```
+
+**Configuration:**
+```json
+{
+  "maxRetriesPerTask": 2,
+  "autoPush": false
+}
+```
+
 ---
 
 ## Pattern: Refactoring Session
@@ -246,6 +272,28 @@ ghcralph rollback --list
 ghcralph rollback --iterations 1
 ```
 
+### Task marked as STUCK
+
+If a task is marked as STUCK (the agent signaled that it cannot complete the task):
+
+```bash
+# Check the progress file for details on what was attempted
+cat .ghcralph/progress.md
+
+# The agent will retry with fresh context up to maxRetriesPerTask times
+# If all retries fail, review the blocker and consider:
+# 1. Breaking the task into smaller pieces
+# 2. Providing more context with --context
+# 3. Resolving the blocker manually and re-running
+```
+
+**Configure retry behavior:**
+```json
+{
+  "maxRetriesPerTask": 3
+}
+```
+
 ### Token budget exhausted
 
 ```bash

diff --git a/package-lock.json b/package-lock.json
diff --git a/package.json b/package.json
@@ -1,6 +1,6 @@
 {
   "name": "ghcralph",
-  "version": "0.1.1",
+  "version": "0.1.2",
   "description": "GitHub Copilot Ralph - A cross-platform CLI for running autonomous agentic coding loops using the Ralph Wiggum pattern with GitHub Copilot",
   "main": "dist/index.js",
   "types": "dist/index.d.ts",