From e902e8b5113debaf88d202115644b7f25b0512f0 Mon Sep 17 00:00:00 2001 From: harrymunro Date: Mon, 26 Jan 2026 09:48:12 -0400 Subject: [PATCH 01/15] feat: US-001 - Create Security Documentation Add comprehensive security guide covering: - Mandatory safeguards for autonomous agent runs - Pre-flight security checklist with dangerous env vars - Emergency stop procedures - Docker sandboxing example --- docs/SECURITY.md | 117 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 117 insertions(+) create mode 100644 docs/SECURITY.md diff --git a/docs/SECURITY.md b/docs/SECURITY.md new file mode 100644 index 00000000..2694d23f --- /dev/null +++ b/docs/SECURITY.md @@ -0,0 +1,117 @@ +# Security Guide + +Ralph Wiggum runs as an autonomous agent with significant system access. This document outlines security best practices to prevent credential exposure and unauthorized actions. + +## Mandatory Safeguards + +Before running Ralph in any environment, ensure these safeguards are in place: + +1. **Never expose production credentials** - Ralph should not have access to production databases, cloud accounts, or API keys +2. **Use isolated environments** - Run Ralph in sandboxed containers, VMs, or development environments only +3. **Limit file system access** - Restrict Ralph to the project directory when possible +4. **Review generated code** - Always review commits before merging to protected branches +5. 
**Monitor token usage** - Set budget limits to prevent runaway API costs + +## Pre-Flight Security Checklist + +Run through this checklist before starting any Ralph session: + +- [ ] **Environment Variables Cleared** - Ensure dangerous environment variables are not set: + - `AWS_ACCESS_KEY_ID` - AWS credentials could allow cloud resource access + - `AWS_SECRET_ACCESS_KEY` - AWS credentials could allow cloud resource access + - `DATABASE_URL` - Database connection strings could expose production data + - `OPENAI_API_KEY` - Could incur costs on your account + - `ANTHROPIC_API_KEY` - Could incur costs on your account + - `GITHUB_TOKEN` - Could push to repositories or access private repos + - `NPM_TOKEN` - Could publish packages + - `DOCKER_PASSWORD` - Could push images + +- [ ] **Running in Sandbox** - Confirm you're in a sandboxed environment +- [ ] **Git Remote Verified** - Ensure pushes go to the correct repository +- [ ] **Branch Protection** - Confirm main/master has branch protection enabled +- [ ] **Budget Set** - API cost limits configured + +## Emergency Stop + +If Ralph begins behaving unexpectedly, use these methods to stop execution: + +### Immediate Stop +```bash +# Kill the ralph.sh process +pkill -f ralph.sh + +# Or find the PID and kill that process specifically +ps aux | grep ralph.sh +kill -9 <PID> +``` + +### Graceful Stop +```bash +# Create a stop file (if ralph.sh is configured to check for it) +touch .ralph-stop + +# Or simply Ctrl+C in the terminal running ralph.sh +``` + +### Post-Emergency Checklist +1. Review git log for any unexpected commits +2. Check git diff for uncommitted changes +3. Review any files created or modified +4. Check cloud console for any unexpected resources +5. 
Rotate any credentials that may have been exposed + +## Docker Sandboxing + +Running Ralph in Docker provides isolation from your host system: + +```dockerfile +# Dockerfile.ralph +FROM node:20-slim + +# Install required tools +RUN apt-get update && apt-get install -y \ + git \ + jq \ + curl \ + && rm -rf /var/lib/apt/lists/* + +# Create non-root user +RUN useradd -m -s /bin/bash ralph +USER ralph +WORKDIR /home/ralph/workspace + +# Copy the project into the image (use .dockerignore to exclude secrets and unneeded files) +COPY --chown=ralph:ralph . . + +# Don't include any credentials in the image +# Pass API keys at runtime only +``` + +```bash +# Build the sandbox +docker build -f Dockerfile.ralph -t ralph-sandbox . + +# Run with minimal permissions +docker run -it --rm \ + --network=none \ + --read-only \ + --tmpfs /tmp \ + -v "$(pwd)":/home/ralph/workspace \ + -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \ + ralph-sandbox \ + ./ralph.sh +``` + +### Docker Security Options +- `--network=none` - Prevents all network access, including calls to the model API; remove it or replace it with a restricted network when Ralph needs to reach the API or fetch dependencies +- `--read-only` - Makes the container filesystem read-only (the mounted project directory stays writable) +- `--tmpfs /tmp` - Provides a writable temp directory +- Mount only the project directory, not your entire home folder + +## Additional Recommendations + +1. **Use separate API keys** - Create dedicated API keys for Ralph with lower rate limits +2. **Enable audit logging** - Log all commands Ralph executes for review +3. **Set up alerts** - Configure cost alerts in your cloud provider dashboards +4. **Regular credential rotation** - Rotate any credentials that have been in the environment +5. 
**Review before merge** - Never auto-merge Ralph's PRs without human review From 4e71a2387bc8e05fbcc850bf1ed0d87079c52de8 Mon Sep 17 00:00:00 2001 From: harrymunro Date: Mon, 26 Jan 2026 09:50:14 -0400 Subject: [PATCH 02/15] feat: US-002 - Create Monitoring Documentation --- docs/MONITORING.md | 169 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 169 insertions(+) create mode 100644 docs/MONITORING.md diff --git a/docs/MONITORING.md b/docs/MONITORING.md new file mode 100644 index 00000000..8c5a8fb7 --- /dev/null +++ b/docs/MONITORING.md @@ -0,0 +1,169 @@ +# Monitoring Guide + +This guide helps operators monitor Ralph Wiggum during autonomous runs and know when to intervene. + +## Red Flags: When to Intervene + +Watch for these patterns that indicate Ralph needs human intervention: + +### 1. Repeated Failures on Same Story +```bash +# Check progress.txt for repeated story attempts +grep -c "US-00X" progress.txt +``` +If the same story ID appears more than 3 times, Ralph is likely stuck. + +### 2. Typecheck/Lint Loops +```bash +# Watch for repeated error patterns +tail -f progress.txt | grep -E "(typecheck|lint|error)" +``` +Repeated cycles of "fixing" the same error indicates a fundamental misunderstanding. + +### 3. File Thrashing +```bash +# Check git for excessive changes to same file +git log --oneline --follow -20 -- path/to/file.ts +``` +Multiple commits to the same file in quick succession suggests trial-and-error debugging. + +### 4. Scope Creep +```bash +# Check for unexpected file changes +git diff --stat HEAD~5 +``` +If Ralph is modifying files unrelated to the current story, it may have lost focus. + +### 5. Silent Failures +```bash +# Check if progress is being made +ls -la progress.txt +cat prd.json | jq '.userStories[] | select(.passes == true) | .id' +``` +If progress.txt hasn't been updated but Ralph is still running, something may be wrong. + +### 6. 
Credential Warnings +```bash +# Monitor for any credential-related output +grep -i -E "(password|secret|key|token|credential)" progress.txt +``` +Any mention of credentials in logs requires immediate review. + +### 7. Network Activity +```bash +# Check for unexpected network calls (if using network monitoring) +lsof -i -P | grep ralph +``` +Unexpected network activity could indicate Ralph is accessing external services. + +## Monitoring Commands + +Use these commands to monitor Ralph in real-time: + +### Real-Time Progress +```bash +# Follow progress updates +tail -f progress.txt + +# Watch for story completions +watch -n 5 'cat prd.json | jq ".userStories[] | select(.passes == true) | .id"' +``` + +### Story Status Dashboard +```bash +# Show all story statuses +cat prd.json | jq -r '.userStories[] | "\(.id): \(.title) - passes: \(.passes)"' + +# Count completed vs total +echo "Completed: $(cat prd.json | jq '[.userStories[] | select(.passes == true)] | length')/$(cat prd.json | jq '.userStories | length')" +``` + +### Git Activity +```bash +# Watch for new commits +watch -n 10 'git log --oneline -10' + +# Check uncommitted changes +git status --short + +# View recent diffs +git diff HEAD~1 --stat +``` + +### Resource Usage +```bash +# Monitor CPU/memory usage +top -l 1 | grep -E "(ralph|claude|amp)" + +# Check disk usage in project +du -sh . +``` + +## When to Stop and Regenerate Plan + +Stop Ralph and regenerate the plan when: + +1. **Same error appears 3+ times** - The current approach isn't working +2. **Story takes more than 5 iterations** - Requirements may be unclear or impossible +3. **Multiple stories fail in sequence** - There may be a fundamental issue with the plan +4. **Unexpected side effects** - Ralph is breaking previously working features +5. **Tests start failing** - Regression indicates architectural problems +6. **Budget threshold reached** - Cost is exceeding the value of the feature + +### How to Stop and Reassess + +```bash +# 1. 
Stop Ralph gracefully +touch .ralph-stop # only works if ralph.sh is configured to check for this file +# OR +# press Ctrl+C in the terminal running ralph.sh + +# 2. Review current state +git log --oneline -10 +git diff +tail -50 progress.txt + +# 3. Check which stories are problematic +jq '.userStories[] | select(.passes == false) | {id, title, notes}' prd.json + +# 4. Consider if PRD needs revision +# - Are acceptance criteria clear and achievable? +# - Are there missing dependencies between stories? +# - Is the scope realistic? +``` + +## Intervention Checklist + +Before intervening, run through this checklist: + +- [ ] **Is Ralph actually stuck?** - Wait at least 2 minutes for complex operations +- [ ] **Check the logs** - Review progress.txt for context on what Ralph is attempting +- [ ] **Review recent commits** - Understand what changes have been made +- [ ] **Check story notes** - Ralph may have added notes explaining difficulties +- [ ] **Verify acceptance criteria** - Ensure they are actually achievable +- [ ] **Check for external dependencies** - Does the story require services Ralph can't access? +- [ ] **Review error messages** - Are there clear errors indicating the problem? +- [ ] **Consider partial progress** - Can you help Ralph past a specific blocker? + +### Post-Intervention Actions + +After intervening: + +1. **Document the intervention** - Add a note to progress.txt explaining what you did +2. **Update story notes** - Add context to prd.json if helpful +3. **Consider PRD changes** - Split complex stories or clarify criteria if needed +4. **Restart cleanly** - Ensure Ralph has a clear starting point +5. 
**Monitor closely** - Watch the first few iterations after intervention + +## Alert Thresholds + +Configure these alerts for autonomous monitoring: + +| Metric | Warning | Critical | +|--------|---------|----------| +| Same story iterations | 3 | 5 | +| Time on single story | 15 min | 30 min | +| Consecutive failures | 2 | 3 | +| Files changed per commit | 10 | 20 | +| API cost per story | $1 | $5 | +| Total run cost | $10 | $25 | From 43eadc5c3f60a1af68371b4d09dc1dc2f3abfcb0 Mon Sep 17 00:00:00 2001 From: harrymunro Date: Mon, 26 Jan 2026 09:51:49 -0400 Subject: [PATCH 03/15] feat: US-003 - Create Cost Tracking Documentation --- docs/COST_TRACKING.md | 151 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 151 insertions(+) create mode 100644 docs/COST_TRACKING.md diff --git a/docs/COST_TRACKING.md b/docs/COST_TRACKING.md new file mode 100644 index 00000000..65904ecb --- /dev/null +++ b/docs/COST_TRACKING.md @@ -0,0 +1,151 @@ +# Cost Tracking Guide + +Running autonomous agents like Ralph can incur significant API costs. This guide helps you set budgets, track usage, and prevent runaway costs. + +## Budget Recommendations + +Set these budget limits before starting any autonomous session: + +| Feature Size | Estimated Stories | Recommended Budget | Max Iterations | +|--------------|-------------------|-------------------|----------------| +| Small | 1-3 stories | $5-10 | 10 | +| Medium | 4-8 stories | $15-30 | 25 | +| Large | 9-15 stories | $40-75 | 50 | +| XL | 16+ stories | $100+ | 100 | + +**Note:** These are estimates. Actual costs depend on story complexity, codebase size, and retry frequency. 
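
The tier lookup in the budget table above can be sketched as a small shell helper. This is a minimal illustration rather than part of `ralph.sh`: the function name `budget_tier` is hypothetical, and the thresholds are copied directly from the table.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of ralph.sh): map a story count to the
# budget tier and max-iteration cap listed in the table above.
budget_tier() {
  # Accept only a positive integer story count.
  if ! [[ "${1:-}" =~ ^[0-9]+$ ]]; then
    echo "usage: budget_tier <story-count>" >&2
    return 1
  fi
  # Force base 10 so inputs such as "08" are not parsed as octal.
  local stories=$((10#$1))
  if (( stories < 1 )); then
    echo "usage: budget_tier <story-count>" >&2
    return 1
  fi

  if   (( stories <= 3 ));  then echo "Small: budget \$5-10, max iterations 10"
  elif (( stories <= 8 ));  then echo "Medium: budget \$15-30, max iterations 25"
  elif (( stories <= 15 )); then echo "Large: budget \$40-75, max iterations 50"
  else                           echo "XL: budget \$100+, max iterations 100"
  fi
}
```

One possible use, assuming `prd.json` has the `userStories` array used elsewhere in these docs, is `budget_tier "$(jq '.userStories | length' prd.json)"`. The result is a starting estimate only; the note above about complexity and retries still applies.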
+ +## Feature Size to Budget Mapping + +Use this guide to estimate budget before starting: + +### Small Features ($5-10) +- Bug fixes with clear reproduction steps +- Adding a single new field or column +- Documentation updates +- Simple configuration changes +- 1-3 acceptance criteria per story + +### Medium Features ($15-30) +- New API endpoint with tests +- Adding a new UI component +- Integration with existing service +- Refactoring a single module +- 3-5 acceptance criteria per story + +### Large Features ($40-75) +- New feature spanning multiple files +- Database migration with data transformation +- Multi-step workflow implementation +- Cross-cutting concerns (auth, logging) +- 5+ acceptance criteria per story + +## Claude Code Usage Tracking + +Claude Code provides built-in usage tracking. Use these commands to monitor costs: + +### Check Current Usage +```bash +# View usage summary for current session +claude usage + +# View detailed usage breakdown +claude usage --detailed +``` + +### Set Budget Limits +```bash +# Set a budget limit before starting (prevents overspend) +claude config set budget_limit 25.00 + +# Check remaining budget +claude usage --remaining +``` + +### Monitor During Session +```bash +# Watch usage in real-time (run in separate terminal) +watch -n 30 'claude usage' +``` + +### Post-Session Analysis +```bash +# Export usage report +claude usage --export > usage-report-$(date +%Y%m%d).json + +# Parse costs from report +cat usage-report-*.json | jq '.total_cost' +``` + +## Amp Usage Tracking + +Amp (Sourcegraph's AI assistant) tracks usage through Sourcegraph's dashboard: + +### Web Dashboard +1. Navigate to your Sourcegraph instance +2. Go to **Settings** → **Usage & Billing** +3. View Amp usage by time period + +### CLI Tracking +```bash +# Check Amp usage via API (requires auth token) +curl -H "Authorization: token $SRC_ACCESS_TOKEN" \ + https://sourcegraph.com/.api/user/usage | jq '.' 
+ +# Filter for Amp-specific usage +curl -H "Authorization: token $SRC_ACCESS_TOKEN" \ + https://sourcegraph.com/.api/user/usage | jq '.amp' +``` + +### Amp Budget Controls +- Set organization-wide limits in Sourcegraph admin +- Per-user limits available in enterprise plans +- Monitor usage alerts via email or Slack integration + +## Cost Prevention Strategies + +### Before Starting +1. **Estimate scope** - Map features to budget sizes above +2. **Set hard limits** - Configure budget caps that stop execution +3. **Use circuit breakers** - Limit retries per story (see ralph.sh) +4. **Start small** - Run a pilot with 1-2 stories before full batch + +### During Execution +1. **Monitor actively** - Watch `claude usage` during runs +2. **Check progress.txt** - Stories with many retries indicate problems +3. **Stop early** - Kill the session if costs are tracking above budget +4. **Review prd.json** - Check for stories repeatedly failing + +### After Completion +1. **Export usage** - Save detailed reports for analysis +2. **Calculate cost per story** - Total cost / stories completed +3. **Adjust estimates** - Update budget recommendations based on actuals +4. 
**Identify expensive patterns** - Stories with many retries cost more + +## Red Flags: Cost Warning Signs + +Watch for these patterns that indicate escalating costs: + +| Red Flag | Likely Cause | Action | +|----------|--------------|--------| +| Same story retrying 3+ times | Unclear acceptance criteria | Stop and clarify requirements | +| Many small commits | Agent thrashing on solution | Review approach | +| No progress for 5+ iterations | Blocking issue | Stop and investigate | +| Budget 50% spent, < 25% done | Scope underestimated | Re-evaluate or pause | + +## Cost Tracking Checklist + +Before each Ralph session: + +- [ ] Estimated feature size and set budget +- [ ] Configured hard budget limit in Claude/Amp +- [ ] Set max iterations in ralph.sh +- [ ] Have monitoring terminal ready +- [ ] Know how to emergency stop + +After each Ralph session: + +- [ ] Exported usage report +- [ ] Calculated actual vs estimated cost +- [ ] Updated budget estimates if needed +- [ ] Documented expensive stories for future reference From 99f043576311f9907b7b5c3af2c7de242dfd74ae Mon Sep 17 00:00:00 2001 From: harrymunro Date: Mon, 26 Jan 2026 09:54:12 -0400 Subject: [PATCH 04/15] feat: US-004 - Create Planning Skill Add planning skill with 5 mandatory question rounds to force deep requirements exploration before PRD creation. Includes completion gate that requires all rounds to be completed. --- skills/planning/SKILL.md | 343 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 343 insertions(+) create mode 100644 skills/planning/SKILL.md diff --git a/skills/planning/SKILL.md b/skills/planning/SKILL.md new file mode 100644 index 00000000..03e5bdbc --- /dev/null +++ b/skills/planning/SKILL.md @@ -0,0 +1,343 @@ +--- +name: planning +description: "Deep requirements exploration before creating a PRD. Use when starting any new feature to ensure requirements are fully understood. Triggers on: plan this feature, explore requirements, planning session, before I write a prd." 
+--- + +# Planning Skill + +Forces deep requirements exploration through 5 mandatory question rounds before you can create a PRD. This prevents under-specified features and wasted implementation cycles. + +--- + +## The Job + +1. Conduct 5 rounds of questions with the user +2. Document answers in a planning summary +3. Save output to `tasks/planning-[feature].md` +4. Only then can you proceed to PRD creation + +**Important:** You cannot skip rounds. All 5 rounds must be completed before moving to PRD. + +--- + +## Completion Gate + +**You MUST complete all 5 rounds before this skill is considered complete.** + +After each round, explicitly state: +``` +Round [N] complete. [5-N] rounds remaining. +``` + +Do NOT proceed to PRD creation until you have stated: +``` +Round 5 complete. Planning session finished. +``` + +If the user asks to skip rounds or rush to implementation, remind them: +> "Planning requires all 5 rounds. Skipping leads to incomplete requirements and rework. Which question should we tackle next?" + +--- + +## Round 1: Problem Understanding + +**Goal:** Understand WHAT problem we're solving and WHY it matters. + +Ask questions about: +- What problem does this solve? +- Who experiences this problem? +- What happens today without this feature? +- What pain points does this address? +- Why is this important now? + +### Example Questions: + +``` +1. What specific problem are we trying to solve? + A. Users cannot do X at all + B. Users can do X but it's slow/painful + C. Users frequently make mistakes doing X + D. Other: [please specify] + +2. Who experiences this problem most acutely? + A. New users during onboarding + B. Power users doing advanced tasks + C. All users equally + D. Internal team members + +3. What happens today when users encounter this problem? + A. They work around it manually + B. They contact support + C. They abandon the task + D. 
They use a competitor +``` + +After gathering answers, summarize: +``` +## Round 1 Summary: Problem Understanding +- Problem: [concise problem statement] +- Affected users: [who] +- Current workaround: [what they do now] +- Impact: [why it matters] +``` + +**Round 1 complete. 4 rounds remaining.** + +--- + +## Round 2: Scope Definition + +**Goal:** Define the boundaries of what we WILL and WON'T build. + +Ask questions about: +- What is the minimum viable solution? +- What would a full-featured version include? +- What is explicitly out of scope? +- What are the must-haves vs nice-to-haves? +- What adjacent features should we NOT touch? + +### Example Questions: + +``` +1. What is the minimum viable version of this feature? + A. Just the core functionality, no polish + B. Core + basic UI polish + C. Full feature set with advanced options + D. Let me describe: [specify] + +2. Which of these are must-haves vs nice-to-haves? + [List potential features, ask user to categorize] + +3. What should this feature explicitly NOT do? + A. No integration with external services + B. No admin configuration options + C. No mobile-specific features + D. Other: [specify] + +4. Are there adjacent features we should leave alone? + A. Yes: [list them] + B. No, we can modify anything needed +``` + +After gathering answers, summarize: +``` +## Round 2 Summary: Scope Definition +- MVP includes: [list] +- Nice-to-haves (not MVP): [list] +- Explicitly out of scope: [list] +- Do not touch: [list of adjacent features to avoid] +``` + +**Round 2 complete. 3 rounds remaining.** + +--- + +## Round 3: Technical Constraints + +**Goal:** Identify technical limitations, dependencies, and architecture requirements. + +Ask questions about: +- What existing systems does this touch? +- What database changes are needed? +- What API changes are needed? +- Are there performance requirements? +- Are there security considerations? +- What dependencies exist? + +### Example Questions: + +``` +1. 
What existing systems will this feature interact with? + A. Database only + B. Database + existing API endpoints + C. Database + API + external services + D. Let me list: [specify] + +2. Are there performance requirements? + A. Must handle X requests per second + B. Must respond within X milliseconds + C. No specific requirements + D. Other: [specify] + +3. Are there security considerations? + A. Handles sensitive user data + B. Requires authentication checks + C. Needs rate limiting + D. No special security needs + +4. What existing code patterns should we follow? + A. Follow existing patterns in [module] + B. This is a new pattern for the codebase + C. Not sure, needs investigation +``` + +After gathering answers, summarize: +``` +## Round 3 Summary: Technical Constraints +- Systems affected: [list] +- Database changes: [yes/no, what] +- API changes: [yes/no, what] +- Performance requirements: [list] +- Security considerations: [list] +- Patterns to follow: [reference] +``` + +**Round 3 complete. 2 rounds remaining.** + +--- + +## Round 4: Edge Cases + +**Goal:** Identify what could go wrong and how to handle it. + +Ask questions about: +- What happens when X fails? +- What if the user does Y unexpectedly? +- What about empty states? +- What about error states? +- What about concurrent operations? +- What about data migration for existing users? + +### Example Questions: + +``` +1. What should happen when [primary action] fails? + A. Show error message and let user retry + B. Automatically retry X times + C. Fall back to [alternative behavior] + D. Other: [specify] + +2. What about empty states (no data yet)? + A. Show helpful empty state with CTA + B. Show nothing + C. Show sample/demo data + D. Other: [specify] + +3. What about existing users/data? + A. Migration needed for existing data + B. Feature only applies to new data + C. Backfill existing data automatically + D. Let users manually migrate + +4. What if user does something unexpected? 
+ [List specific unexpected behaviors and ask how to handle] +``` + +After gathering answers, summarize: +``` +## Round 4 Summary: Edge Cases +- Error handling: [approach] +- Empty states: [approach] +- Data migration: [approach] +- Unexpected user behavior: [list with handling] +- Concurrent operations: [approach] +``` + +**Round 4 complete. 1 round remaining.** + +--- + +## Round 5: Verification Strategy + +**Goal:** Define how we'll know the feature works correctly. + +Ask questions about: +- How will we test this feature? +- What manual testing is needed? +- What automated tests should exist? +- How do we verify in production? +- What metrics indicate success? +- What could we monitor for issues? + +### Example Questions: + +``` +1. What automated tests should cover this feature? + A. Unit tests for core logic + B. Integration tests for API endpoints + C. E2E tests for user flows + D. All of the above + +2. What manual testing is required? + A. Visual inspection of UI changes + B. Testing edge cases in browser + C. Testing with different user roles + D. List specific scenarios: [specify] + +3. How do we know this feature is successful in production? + A. Users complete [action] X% more often + B. Support tickets about [topic] decrease + C. Feature adoption reaches X% + D. Other metrics: [specify] + +4. What should we monitor for issues? + A. Error rates on new endpoints + B. Performance metrics + C. User feedback/complaints + D. All of the above +``` + +After gathering answers, summarize: +``` +## Round 5 Summary: Verification Strategy +- Automated tests: [list] +- Manual testing: [list] +- Success metrics: [list] +- Monitoring: [list] +``` + +**Round 5 complete. 
Planning session finished.** + +--- + +## Output Format + +After all 5 rounds, compile the summaries into `tasks/planning-[feature].md`: + +```markdown +# Planning Summary: [Feature Name] + +Generated: [Date] +Status: Ready for PRD + +--- + +## Round 1: Problem Understanding +[Summary from Round 1] + +## Round 2: Scope Definition +[Summary from Round 2] + +## Round 3: Technical Constraints +[Summary from Round 3] + +## Round 4: Edge Cases +[Summary from Round 4] + +## Round 5: Verification Strategy +[Summary from Round 5] + +--- + +## Next Steps + +1. Create PRD using `/prd` skill +2. Convert to `prd.json` using `/ralph` skill +3. Run Ralph to implement +``` + +--- + +## Checklist + +Before completing planning: + +- [ ] Round 1 complete (Problem Understanding) +- [ ] Round 2 complete (Scope Definition) +- [ ] Round 3 complete (Technical Constraints) +- [ ] Round 4 complete (Edge Cases) +- [ ] Round 5 complete (Verification Strategy) +- [ ] All summaries documented +- [ ] Saved to `tasks/planning-[feature].md` + +**Do not proceed to PRD until all boxes are checked.** From 5cd7a5b6c2c10139de992ba848fe7fe2a131738a Mon Sep 17 00:00:00 2001 From: harrymunro Date: Mon, 26 Jan 2026 09:56:25 -0400 Subject: [PATCH 05/15] feat: US-005 - Add Security Pre-Flight Check to ralph.sh --- ralph.sh | 48 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 48 insertions(+) diff --git a/ralph.sh b/ralph.sh index baff052a..0dcfcddd 100755 --- a/ralph.sh +++ b/ralph.sh @@ -7,6 +7,7 @@ set -e # Parse arguments TOOL="amp" # Default to amp for backwards compatibility MAX_ITERATIONS=10 +SKIP_SECURITY="${SKIP_SECURITY_CHECK:-false}" while [[ $# -gt 0 ]]; do case $1 in @@ -18,6 +19,10 @@ while [[ $# -gt 0 ]]; do TOOL="${1#*=}" shift ;; + --skip-security-check) + SKIP_SECURITY="true" + shift + ;; *) # Assume it's max_iterations if it's a number if [[ "$1" =~ ^[0-9]+$ ]]; then @@ -33,6 +38,49 @@ if [[ "$TOOL" != "amp" && "$TOOL" != "claude" ]]; then echo "Error: Invalid tool 
'$TOOL'. Must be 'amp' or 'claude'." exit 1 fi + +# Security Pre-Flight Check +if [[ "$SKIP_SECURITY" != "true" ]]; then + echo "" + echo "===============================================================" + echo " Security Pre-Flight Check" + echo "===============================================================" + echo "" + + SECURITY_WARNINGS=() + + if [[ -n "${AWS_ACCESS_KEY_ID:-}" ]]; then + SECURITY_WARNINGS+=("AWS_ACCESS_KEY_ID is set - production credentials may be exposed") + fi + + if [[ -n "${DATABASE_URL:-}" ]]; then + SECURITY_WARNINGS+=("DATABASE_URL is set - database credentials may be exposed") + fi + + if [[ ${#SECURITY_WARNINGS[@]} -gt 0 ]]; then + echo "WARNING: Potential credential exposure detected:" + echo "" + for warning in "${SECURITY_WARNINGS[@]}"; do + echo " - $warning" + done + echo "" + echo "Running an autonomous agent with these credentials set could expose" + echo "them in logs, commit messages, or API calls." + echo "" + echo "See docs/SECURITY.md for sandboxing guidance." + echo "" + read -p "Continue anyway? (y/N) " -n 1 -r + echo "" + if [[ ! $REPLY =~ ^[Yy]$ ]]; then + echo "Aborted. Unset credentials or use --skip-security-check to bypass." + exit 1 + fi + else + echo "No credential exposure risks detected." 
+ fi + echo "" +fi + SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" PRD_FILE="$SCRIPT_DIR/prd.json" PROGRESS_FILE="$SCRIPT_DIR/progress.txt" From ab038eb4a0e2d465d96341f9ef465d65de07ca45 Mon Sep 17 00:00:00 2001 From: harrymunro Date: Mon, 26 Jan 2026 09:58:20 -0400 Subject: [PATCH 06/15] feat: US-006 - Add Circuit Breaker Logic to ralph.sh --- ralph.sh | 94 +++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 93 insertions(+), 1 deletion(-) diff --git a/ralph.sh b/ralph.sh index 0dcfcddd..3eb06990 100755 --- a/ralph.sh +++ b/ralph.sh @@ -7,6 +7,7 @@ set -e # Parse arguments TOOL="amp" # Default to amp for backwards compatibility MAX_ITERATIONS=10 +MAX_ATTEMPTS_PER_STORY="${MAX_ATTEMPTS_PER_STORY:-5}" SKIP_SECURITY="${SKIP_SECURITY_CHECK:-false}" while [[ $# -gt 0 ]]; do @@ -127,7 +128,62 @@ if [ ! -f "$PROGRESS_FILE" ]; then echo "---" >> "$PROGRESS_FILE" fi -echo "Starting Ralph - Tool: $TOOL - Max iterations: $MAX_ITERATIONS" +# Circuit breaker: track attempts per story +ATTEMPTS_FILE="$SCRIPT_DIR/.story-attempts" +LAST_STORY_FILE="$SCRIPT_DIR/.last-story" + +# Initialize attempts tracking +if [ ! 
-f "$ATTEMPTS_FILE" ]; then + echo "{}" > "$ATTEMPTS_FILE" +fi + +# Function to get current story being worked on +get_current_story() { + if [ -f "$PRD_FILE" ]; then + jq -r '.userStories[] | select(.passes == false) | .id' "$PRD_FILE" 2>/dev/null | head -1 + fi +} + +# Function to get attempts for a story +get_story_attempts() { + local story_id="$1" + jq -r --arg id "$story_id" '.[$id] // 0' "$ATTEMPTS_FILE" 2>/dev/null || echo "0" +} + +# Function to increment attempts for a story +increment_story_attempts() { + local story_id="$1" + local current=$(get_story_attempts "$story_id") + local new_count=$((current + 1)) + jq --arg id "$story_id" --argjson count "$new_count" '.[$id] = $count' "$ATTEMPTS_FILE" > "$ATTEMPTS_FILE.tmp" && mv "$ATTEMPTS_FILE.tmp" "$ATTEMPTS_FILE" + echo "$new_count" +} + +# Function to mark story as skipped due to max attempts +mark_story_skipped() { + local story_id="$1" + local max_attempts="$2" + local note="Skipped: exceeded $max_attempts attempts without passing" + jq --arg id "$story_id" --arg note "$note" ' + .userStories = [.userStories[] | if .id == $id then .notes = $note else . 
end] + ' "$PRD_FILE" > "$PRD_FILE.tmp" && mv "$PRD_FILE.tmp" "$PRD_FILE" + echo "Circuit breaker: Marked story $story_id as skipped after $max_attempts attempts" +} + +# Function to check and apply circuit breaker +check_circuit_breaker() { + local story_id="$1" + local attempts=$(get_story_attempts "$story_id") + + if [ "$attempts" -ge "$MAX_ATTEMPTS_PER_STORY" ]; then + echo "Circuit breaker: Story $story_id has reached max attempts ($attempts/$MAX_ATTEMPTS_PER_STORY)" + mark_story_skipped "$story_id" "$MAX_ATTEMPTS_PER_STORY" + return 0 # true - circuit breaker tripped + fi + return 1 # false - circuit breaker not tripped +} + +echo "Starting Ralph - Tool: $TOOL - Max iterations: $MAX_ITERATIONS - Max attempts per story: $MAX_ATTEMPTS_PER_STORY" for i in $(seq 1 $MAX_ITERATIONS); do echo "" @@ -135,6 +191,42 @@ for i in $(seq 1 $MAX_ITERATIONS); do echo " Ralph Iteration $i of $MAX_ITERATIONS ($TOOL)" echo "===============================================================" + # Get current story and check circuit breaker + CURRENT_STORY=$(get_current_story) + + if [ -n "$CURRENT_STORY" ]; then + # Check if this is the same story as last iteration (consecutive failure detection) + LAST_STORY="" + if [ -f "$LAST_STORY_FILE" ]; then + LAST_STORY=$(cat "$LAST_STORY_FILE" 2>/dev/null || echo "") + fi + + if [ "$CURRENT_STORY" == "$LAST_STORY" ]; then + echo "Consecutive attempt on story: $CURRENT_STORY" + ATTEMPTS=$(increment_story_attempts "$CURRENT_STORY") + echo "Attempts on $CURRENT_STORY: $ATTEMPTS/$MAX_ATTEMPTS_PER_STORY" + + # Check circuit breaker + if check_circuit_breaker "$CURRENT_STORY"; then + echo "Skipping to next story..." 
+ echo "$CURRENT_STORY" > "$LAST_STORY_FILE" + sleep 1 + continue + fi + else + # New story, record first attempt + if [ -n "$CURRENT_STORY" ]; then + ATTEMPTS=$(increment_story_attempts "$CURRENT_STORY") + echo "Starting story: $CURRENT_STORY (attempt $ATTEMPTS/$MAX_ATTEMPTS_PER_STORY)" + fi + fi + + # Record current story for next iteration + echo "$CURRENT_STORY" > "$LAST_STORY_FILE" + else + echo "No incomplete stories found" + fi + # Run the selected tool with the ralph prompt if [[ "$TOOL" == "amp" ]]; then OUTPUT=$(cat "$SCRIPT_DIR/prompt.md" | amp --dangerously-allow-all 2>&1 | tee /dev/stderr) || true From d0d385557a60b18fe6bd17cc75edbd3b9ae0fb65 Mon Sep 17 00:00:00 2001 From: harrymunro Date: Mon, 26 Jan 2026 09:59:55 -0400 Subject: [PATCH 07/15] feat: US-007 - Add Completion Verification to ralph.sh --- ralph.sh | 24 +++++++++++++++++++++--- 1 file changed, 21 insertions(+), 3 deletions(-) diff --git a/ralph.sh b/ralph.sh index 3eb06990..6bec1924 100755 --- a/ralph.sh +++ b/ralph.sh @@ -238,9 +238,27 @@ for i in $(seq 1 $MAX_ITERATIONS); do # Check for completion signal if echo "$OUTPUT" | grep -q "COMPLETE"; then echo "" - echo "Ralph completed all tasks!" - echo "Completed at iteration $i of $MAX_ITERATIONS" - exit 0 + echo "COMPLETE signal received. Verifying all stories pass..." + + # Verify all stories actually have passes:true + INCOMPLETE_STORIES=$(jq -r '.userStories[] | select(.passes == false) | .id' "$PRD_FILE" 2>/dev/null || echo "") + + if [ -z "$INCOMPLETE_STORIES" ]; then + echo "Verification passed: All stories have passes:true" + echo "" + echo "Ralph completed all tasks!" + echo "Completed at iteration $i of $MAX_ITERATIONS" + exit 0 + else + echo "" + echo "WARNING: COMPLETE claimed but verification failed!" + echo "The following stories still have passes:false:" + echo "$INCOMPLETE_STORIES" | while read -r story_id; do + echo " - $story_id" + done + echo "" + echo "Continuing iteration to fix incomplete stories..." 
+ fi fi echo "Iteration $i complete. Continuing..." From 748902d6a996b08d24fa05f5c43166da7b7f9413 Mon Sep 17 00:00:00 2001 From: harrymunro Date: Mon, 26 Jan 2026 10:01:57 -0400 Subject: [PATCH 08/15] feat: US-008 - Add Backpressure Section to CLAUDE.md --- CLAUDE.md | 48 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 48 insertions(+) diff --git a/CLAUDE.md b/CLAUDE.md index f95bb927..8eff944e 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -77,6 +77,54 @@ Only update CLAUDE.md if you have **genuinely reusable knowledge** that would he - Keep changes focused and minimal - Follow existing code patterns +## Mandatory Quality Gates (Backpressure) + +Quality gates are **mandatory blockers**, not suggestions. You MUST NOT mark a story as complete until ALL gates pass. + +### Required Gates + +Before marking ANY story as `passes: true`, you MUST verify: + +1. **Typecheck MUST pass** - Run `npm run build` (or project equivalent) with zero errors +2. **Lint MUST pass** - Run `npm run lint` (or project equivalent) with zero errors +3. **Tests MUST pass** - Run `npm test` (or project equivalent) with zero failures + +If ANY gate fails, the story is NOT complete. Period. + +### Forbidden Shortcuts + +Never use these to bypass quality gates: + +| Forbidden | Why | +|-----------|-----| +| `@ts-ignore` | Hides type errors instead of fixing them | +| `@ts-expect-error` | Same as above - masks real problems | +| `eslint-disable` | Suppresses lint rules without fixing violations | +| `eslint-disable-next-line` | Same as above - circumvents quality checks | +| `// @ts-nocheck` | Disables type checking for entire file | +| `any` type | Defeats the purpose of TypeScript | + +If you find yourself reaching for these, STOP. Fix the actual issue. + +### 3-Attempt Limit + +If you cannot make a story pass quality gates after 3 attempts: + +1. **STOP** - Do not continue iterating on the same approach +2. **Document** - Add detailed notes about what's failing and why +3.
**Skip** - Move to the next story and let a human investigate +4. **Never** - Do not use forbidden shortcuts to force a pass + +This prevents infinite loops on fundamentally blocked stories. + +### Backpressure Mindset + +Think of quality gates as physical barriers, not speed bumps: +- A speed bump slows you down but lets you pass +- A barrier stops you completely until you have the right key + +You cannot "push through" a failing gate. You must fix it or stop. + ## Browser Testing (If Available) For any story that changes UI, verify it works in the browser if you have browser testing tools configured (e.g., via MCP): From 018995c2daaa9e9bc8b51d9f49eb57c43f341989 Mon Sep 17 00:00:00 2001 From: harrymunro Date: Mon, 26 Jan 2026 10:04:00 -0400 Subject: [PATCH 09/15] feat: US-009 - Add Verification Section to CLAUDE.md --- CLAUDE.md | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 66 insertions(+) diff --git a/CLAUDE.md b/CLAUDE.md index 8eff944e..9c848190 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -125,6 +125,72 @@ Think of quality gates as physical barriers, not speed bumps: You cannot "push through" a failing gate. You must fix it or stop. +## Verification Before Completion + +Before claiming ANY story is complete, you MUST verify your work systematically. Do not trust your memory or assumptions—run the checks. + +### Verification Checklist + +Before marking a story as `passes: true`, complete this checklist: + +``` +## Verification Checklist for [Story ID] + +### 1. Acceptance Criteria Check +- [ ] Criterion 1: [How verified - command/file check/grep] +- [ ] Criterion 2: [How verified] +- [ ] Criterion 3: [How verified] +... (one checkbox per criterion) + +### 2. Quality Gates +- [ ] Typecheck passes: `npm run build` (or equivalent) +- [ ] Lint passes: `npm run lint` (or equivalent) +- [ ] Tests pass: `npm test` (or equivalent) + +### 3. 
Regression Check +- [ ] Full test suite passes (not just new tests) +- [ ] No unrelated failures introduced + +### 4. Final Verification +- [ ] Re-read each acceptance criterion one more time +- [ ] Confirmed each criterion is met with evidence +``` + +### How to Verify Each Criterion + +For each acceptance criterion, you must have **evidence**, not just belief: + +| Criterion Type | Verification Method | +|----------------|---------------------| +| "File X exists" | `ls -la path/to/X` or Read tool | +| "Contains section Y" | `grep -n "Y" file` or Read tool | +| "Command succeeds" | Run the command, check exit code | +| "Output contains Z" | Run command, pipe to grep | +| "Valid JSON" | `jq . file.json` succeeds | + +### Before Outputting COMPLETE + +When you believe ALL stories are done and you're about to output `COMPLETE`: + +1. **Re-verify the current story** - Run all quality gates one more time +2. **Check prd.json** - Confirm all stories show `passes: true` +3. **Run full verification** - `jq '.userStories[] | select(.passes == false) | .id' prd.json` should return nothing +4. **Only then** output the COMPLETE signal + +If ANY verification fails at this stage, do NOT output COMPLETE. Fix the issue first. + +### Evidence Over Assertion + +Never claim something works without proving it: + +| Bad (Assertion) | Good (Evidence) | +|-----------------|-----------------| +| "I added the section" | "Verified with `grep -n 'Section Name' file` - found at line 42" | +| "Tests pass" | "Ran `npm test` - 47 tests passed, 0 failed" | +| "File is valid JSON" | "Ran `jq . file.json` - parsed successfully" | + +Run the command. See the output. Report the evidence. 
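The `jq` check in step 3 of "Before Outputting COMPLETE" can be made stricter. The following is a minimal sketch, assuming the `prd.json` layout used above (a top-level `userStories` array whose items carry `id` and `passes`). The function name `verify_all_stories_pass` is hypothetical and is not part of ralph.sh. Unlike a check written as `jq ... 2>/dev/null || echo ""`, which yields an empty list (and so reads as success) when `jq` itself fails, this version reports a `jq` failure as an error and treats a missing `passes` field as incomplete.

```bash
#!/usr/bin/env bash
# Sketch only: verify_all_stories_pass is a hypothetical helper, not part of ralph.sh.
# Assumes prd.json has a top-level "userStories" array whose items carry "id" and "passes".
verify_all_stories_pass() {
  local prd_file="${1:-prd.json}"
  local incomplete

  # Treat a jq failure (missing file, invalid JSON, no userStories array)
  # as a verification error rather than as "nothing incomplete".
  if ! incomplete=$(jq -r '.userStories[] | select(.passes != true) | .id' "$prd_file" 2>/dev/null); then
    echo "Verification error: jq could not process $prd_file" >&2
    return 2
  fi

  # Any story whose passes value is not exactly true (false, null, missing) blocks completion.
  if [ -n "$incomplete" ]; then
    echo "Stories without passes:true:" >&2
    echo "$incomplete" | sed 's/^/  - /' >&2
    return 1
  fi

  echo "All stories have passes:true"
}
```

A caller could gate the COMPLETE signal on `verify_all_stories_pass prd.json`, using exit code 1 for "stories remain" and 2 for "the PRD could not be checked at all".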
+ ## Browser Testing (If Available) For any story that changes UI, verify it works in the browser if you have browser testing tools configured (e.g., via MCP): From 604147b7d275f0cc9239d31c4d8b4381f13ec90b Mon Sep 17 00:00:00 2001 From: harrymunro Date: Mon, 26 Jan 2026 10:06:11 -0400 Subject: [PATCH 10/15] feat: US-010 - Add Backpressure Section to prompt.md --- prompt.md | 48 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 48 insertions(+) diff --git a/prompt.md b/prompt.md index cdebe901..f3a034f1 100644 --- a/prompt.md +++ b/prompt.md @@ -80,6 +80,54 @@ Only update AGENTS.md if you have **genuinely reusable knowledge** that would he - Keep changes focused and minimal - Follow existing code patterns +## Mandatory Quality Gates (Backpressure) + +Quality gates are **mandatory blockers**, not suggestions. You MUST NOT mark a story as complete until ALL gates pass. + +### Required Gates + +Before marking ANY story as `passes: true`, you MUST verify: + +1. **Typecheck MUST pass** - Run `npm run build` (or project equivalent) with zero errors +2. **Lint MUST pass** - Run `npm run lint` (or project equivalent) with zero errors +3. **Tests MUST pass** - Run `npm test` (or project equivalent) with zero failures + +If ANY gate fails, the story is NOT complete. Period. + +### Forbidden Shortcuts + +Never use these to bypass quality gates: + +| Forbidden | Why | +|-----------|-----| +| `@ts-ignore` | Hides type errors instead of fixing them | +| `@ts-expect-error` | Same as above - masks real problems | +| `eslint-disable` | Suppresses lint rules without fixing violations | +| `eslint-disable-next-line` | Same as above - circumvents quality checks | +| `// @ts-nocheck` | Disables type checking for entire file | +| `any` type | Defeats the purpose of TypeScript | + +If you find yourself reaching for these, STOP. Fix the actual issue. + +### 3-Attempt Limit + +If you cannot make a story pass quality gates after 3 attempts: + +1.
**STOP** - Do not continue iterating on the same approach +2. **Document** - Add detailed notes about what's failing and why +3. **Skip** - Move to the next story and let a human investigate +4. **Never** - Do not use forbidden shortcuts to force a pass + +This prevents infinite loops on fundamentally blocked stories. + +### Backpressure Mindset + +Think of quality gates as physical barriers, not speed bumps: +- A speed bump slows you down but lets you pass +- A barrier stops you completely until you have the right key + +You cannot "push through" a failing gate. You must fix it or stop. + ## Browser Testing (Required for Frontend Stories) For any story that changes UI, you MUST verify it works in the browser: From 03fee657b41f083eb7f35a90db36f1d07ad351f3 Mon Sep 17 00:00:00 2001 From: harrymunro Date: Mon, 26 Jan 2026 10:08:09 -0400 Subject: [PATCH 11/15] feat: US-011 - Add Verification Section to prompt.md --- prompt.md | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 66 insertions(+) diff --git a/prompt.md b/prompt.md index f3a034f1..b7ad678b 100644 --- a/prompt.md +++ b/prompt.md @@ -128,6 +128,72 @@ Think of quality gates as physical barriers, not speed bumps: You cannot "push through" a failing gate. You must fix it or stop. +## Verification Before Completion + +Before claiming ANY story is complete, you MUST verify your work systematically. Do not trust your memory or assumptions—run the checks. + +### Verification Checklist + +Before marking a story as `passes: true`, complete this checklist: + +``` +## Verification Checklist for [Story ID] + +### 1. Acceptance Criteria Check +- [ ] Criterion 1: [How verified - command/file check/grep] +- [ ] Criterion 2: [How verified] +- [ ] Criterion 3: [How verified] +... (one checkbox per criterion) + +### 2. Quality Gates +- [ ] Typecheck passes: `npm run build` (or equivalent) +- [ ] Lint passes: `npm run lint` (or equivalent) +- [ ] Tests pass: `npm test` (or equivalent) + +### 3. 
Regression Check +- [ ] Full test suite passes (not just new tests) +- [ ] No unrelated failures introduced + +### 4. Final Verification +- [ ] Re-read each acceptance criterion one more time +- [ ] Confirmed each criterion is met with evidence +``` + +### How to Verify Each Criterion + +For each acceptance criterion, you must have **evidence**, not just belief: + +| Criterion Type | Verification Method | +|----------------|---------------------| +| "File X exists" | `ls -la path/to/X` or Read tool | +| "Contains section Y" | `grep -n "Y" file` or Read tool | +| "Command succeeds" | Run the command, check exit code | +| "Output contains Z" | Run command, pipe to grep | +| "Valid JSON" | `jq . file.json` succeeds | + +### Before Outputting COMPLETE + +When you believe ALL stories are done and you're about to output `COMPLETE`: + +1. **Re-verify the current story** - Run all quality gates one more time +2. **Check prd.json** - Confirm all stories show `passes: true` +3. **Run full verification** - `jq '.userStories[] | select(.passes == false) | .id' prd.json` should return nothing +4. **Only then** output the COMPLETE signal + +If ANY verification fails at this stage, do NOT output COMPLETE. Fix the issue first. + +### Evidence Over Assertion + +Never claim something works without proving it: + +| Bad (Assertion) | Good (Evidence) | +|-----------------|-----------------| +| "I added the section" | "Verified with `grep -n 'Section Name' file` - found at line 42" | +| "Tests pass" | "Ran `npm test` - 47 tests passed, 0 failed" | +| "File is valid JSON" | "Ran `jq . file.json` - parsed successfully" | + +Run the command. See the output. Report the evidence. 
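One way to make "evidence over assertion" mechanical is to wrap each quality gate so that its exit code and output are recorded rather than remembered. The following is a minimal sketch; `run_gate`, the `GATE_LOG_DIR` variable, and the `.gate-logs` directory are hypothetical names chosen for illustration and are not part of Ralph.

```bash
#!/usr/bin/env bash
# Sketch only: run_gate is a hypothetical helper, not part of Ralph.
# Runs one quality gate, saves its combined output, and reports the exit code as evidence.
run_gate() {
  local name="$1"
  shift
  local log_dir="${GATE_LOG_DIR:-.gate-logs}"  # assumed location for evidence logs
  local log_file="$log_dir/$name.log"
  local status=0

  mkdir -p "$log_dir"

  # Capture stdout and stderr; keep the exit code without aborting the caller.
  "$@" >"$log_file" 2>&1 || status=$?

  if [ "$status" -eq 0 ]; then
    echo "PASS $name (exit 0, log: $log_file)"
  else
    echo "FAIL $name (exit $status, log: $log_file)"
  fi
  return "$status"
}
```

With the default commands named in the checklist, a chain such as `run_gate typecheck npm run build && run_gate lint npm run lint && run_gate test npm test` stops at the first failing gate and leaves a log file for each gate that ran.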
+ ## Browser Testing (Required for Frontend Stories) For any story that changes UI, you MUST verify it works in the browser: From 2e0c69193721d869cbf605ebb4f46cd33e9afcf8 Mon Sep 17 00:00:00 2001 From: harrymunro Date: Mon, 26 Jan 2026 10:09:55 -0400 Subject: [PATCH 12/15] feat: US-012 - Strengthen Machine-Verifiable Criteria in skills/ralph/SKILL.md --- skills/ralph/SKILL.md | 48 ++++++++++++++++++++++++++++++++++++------- 1 file changed, 41 insertions(+), 7 deletions(-) diff --git a/skills/ralph/SKILL.md b/skills/ralph/SKILL.md index c17043c6..509be804 100644 --- a/skills/ralph/SKILL.md +++ b/skills/ralph/SKILL.md @@ -79,9 +79,21 @@ Stories execute in priority order. Earlier stories must not depend on later ones --- -## Acceptance Criteria: Must Be Verifiable +## Acceptance Criteria: MACHINE-VERIFIABLE Required -Each criterion must be something Ralph can CHECK, not something vague. +**Every criterion must be MACHINE-VERIFIABLE.** If Ralph cannot verify it with a command, file check, or automated test, it is not a valid criterion. + +### Verification Types + +Each criterion should be checkable by one of these methods: + +| Type | How to Verify | Example Criterion | +|------|---------------|-------------------| +| **Command exit code** | Run command, check exit 0 | "Typecheck passes" → `npm run build` | +| **File check** | Check file exists or has content | "File `docs/API.md` exists" → `ls docs/API.md` | +| **Grep/content match** | Search file for pattern | "Contains 'export default'" → `grep -q 'export default' file.ts` | +| **Database query** | Query returns expected result | "User table has email column" → `\d users` shows column | +| **Browser automation** | Dev-browser skill verifies visually | "Button is visible" → navigate and screenshot | ### Good criteria (verifiable): - "Add `status` column to tasks table with default 'pending'" @@ -90,11 +102,33 @@ Each criterion must be something Ralph can CHECK, not something vague. 
- "Typecheck passes" - "Tests pass" -### Bad criteria (vague): -- "Works correctly" -- "User can do X easily" -- "Good UX" -- "Handles edge cases" +### FORBIDDEN Criteria + +**Never use these vague terms** — they cannot be machine-verified: + +| Forbidden Term | Why It Fails | +|----------------|--------------| +| "Works correctly" | What does "correctly" mean? No verification command. | +| "Good UX" | Subjective. Cannot be automated. | +| "Handles edge cases" | Which edge cases? Unspecified = unverifiable. | +| "Is performant" | What threshold? No measurable target. | +| "User-friendly" | Subjective opinion, not a testable state. | +| "Clean code" | Style preference, not machine-checkable. | +| "Properly implemented" | Circular definition, no verification method. | + +### Vague to Specific Conversion + +When you encounter vague requirements, convert them: + +| Vague (FORBIDDEN) | Specific (VERIFIABLE) | +|-------------------|----------------------| +| "Works correctly" | "Returns 200 status code for valid input" | +| "Good UX" | "Form shows inline validation errors within 100ms" | +| "Handles edge cases" | "Returns 400 error when email is empty" | +| "Is performant" | "Query completes in under 100ms for 1000 rows" | +| "User-friendly error messages" | "Error div contains text 'Invalid email format'" | +| "Secure authentication" | "Password is hashed with bcrypt before storage" | +| "Responsive design" | "Component renders at 320px, 768px, and 1024px widths" | ### Always include as final criterion: ``` From 9ca1e9163cd8f946028c4fb12b003976270613f2 Mon Sep 17 00:00:00 2001 From: harrymunro Date: Mon, 26 Jan 2026 10:11:44 -0400 Subject: [PATCH 13/15] feat: US-013 - Add Planning Prerequisite to skills/prd/SKILL.md --- skills/prd/SKILL.md | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/skills/prd/SKILL.md b/skills/prd/SKILL.md index 0e55eb1a..4b61382b 100644 --- a/skills/prd/SKILL.md +++ b/skills/prd/SKILL.md @@ -9,6 +9,18 @@ Create detailed Product 
Requirements Documents that are clear, actionable, and s --- +## Prerequisites + +Before writing a PRD, verify that planning has been completed: + +1. **Check for planning summary:** Look for `tasks/planning-[feature].md` from the `/planning` skill +2. **If no planning exists:** STOP. Tell the user to run `/planning` first to explore requirements +3. **If planning exists:** Reference it to inform the PRD structure + +**Why this matters:** PRDs written without planning often miss edge cases, have vague requirements, or solve the wrong problem. The planning skill forces 5 rounds of questions that surface critical details. + +--- + ## The Job 1. Receive a feature description from the user From 66e6719f172b004dcf870e0c5f2d1acaef764ee7 Mon Sep 17 00:00:00 2001 From: harrymunro Date: Mon, 26 Jan 2026 10:13:18 -0400 Subject: [PATCH 14/15] feat: US-014 - Add verificationCommands to prd.json.example --- prd.json.example | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/prd.json.example b/prd.json.example index fbc40668..ded93d0a 100644 --- a/prd.json.example +++ b/prd.json.example @@ -2,6 +2,11 @@ "project": "MyApp", "branchName": "ralph/task-priority", "description": "Task Priority System - Add priority levels to tasks", + "verificationCommands": { + "typecheck": "npm run build", + "lint": "npm run lint", + "test": "npm test" + }, "userStories": [ { "id": "US-001", From 3f4bd3dd8deb0317a30d50104df86e8667e01b99 Mon Sep 17 00:00:00 2001 From: harrymunro Date: Mon, 26 Jan 2026 10:14:42 -0400 Subject: [PATCH 15/15] feat: US-015 - Add Security Warning to README.md --- README.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/README.md b/README.md index 4db35bc2..73178527 100644 --- a/README.md +++ b/README.md @@ -8,6 +8,16 @@ Based on [Geoffrey Huntley's Ralph pattern](https://ghuntley.com/ralph/). 
[Read my in-depth article on how I use Ralph](https://x.com/ryancarson/status/2008548371712135632) +## Security Warning + +**Ralph runs AI agents autonomously with full access to your codebase.** Before running: + +- **Never expose production credentials** - Ralph could accidentally commit, log, or transmit sensitive values like `AWS_ACCESS_KEY_ID`, `DATABASE_URL`, or API keys +- **Use sandboxing** - Run Ralph in a Docker container, VM, or isolated sandbox environment to limit potential damage +- **Review commits before pushing** - Always review what Ralph committed before pushing to remote + +See [docs/SECURITY.md](docs/SECURITY.md) for complete security guidance, including pre-flight checklists and emergency stop procedures. + ## Prerequisites - One of the following AI coding tools installed and authenticated: