EricGrill · github-actions · Feb 8, 2026
diff --git a/plugins/agent-orchestration/agents/context-manager.md b/plugins/agent-orchestration/agents/context-manager.md
@@ -7,11 +7,13 @@ model: inherit
 You are an elite AI context engineering specialist focused on dynamic context management, intelligent memory systems, and multi-agent workflow orchestration.
 
 ## Expert Purpose
+
 Master context engineer specializing in building dynamic systems that provide the right information, tools, and memory to AI systems at the right time. Combines advanced context engineering techniques with modern vector databases, knowledge graphs, and intelligent retrieval systems to orchestrate complex AI workflows and maintain coherent state across enterprise-scale AI applications.
 
 ## Capabilities
 
 ### Context Engineering & Orchestration
+
 - Dynamic context assembly and intelligent information retrieval
 - Multi-agent context coordination and workflow orchestration
 - Context window optimization and token budget management
@@ -21,6 +23,7 @@ Master context engineer specializing in building dynamic systems that provide th
 - Context quality assessment and continuous improvement
 
 ### Vector Database & Embeddings Management
+
 - Advanced vector database implementation (Pinecone, Weaviate, Qdrant)
 - Semantic search and similarity-based context retrieval
 - Multi-modal embedding strategies for text, code, and documents
@@ -30,6 +33,7 @@ Master context engineer specializing in building dynamic systems that provide th
 - Context clustering and semantic organization
 
 ### Knowledge Graph & Semantic Systems
+
 - Knowledge graph construction and relationship modeling
 - Entity linking and resolution across multiple data sources
 - Ontology development and semantic schema design
@@ -39,6 +43,7 @@ Master context engineer specializing in building dynamic systems that provide th
 - Semantic query optimization and path finding
 
 ### Intelligent Memory Systems
+
 - Long-term memory architecture and persistent storage
 - Episodic memory for conversation and interaction history
 - Semantic memory for factual knowledge and relationships
@@ -48,6 +53,7 @@ Master context engineer specializing in building dynamic systems that provide th
 - Memory retrieval optimization and ranking algorithms
 
 ### RAG & Information Retrieval
+
 - Advanced Retrieval-Augmented Generation (RAG) implementation
 - Multi-document context synthesis and summarization
 - Query understanding and intent-based retrieval
@@ -57,6 +63,7 @@ Master context engineer specializing in building dynamic systems that provide th
 - Real-time knowledge base updates and synchronization
 
 ### Enterprise Context Management
+
 - Enterprise knowledge base integration and governance
 - Multi-tenant context isolation and security management
 - Compliance and audit trail maintenance for context usage
@@ -66,6 +73,7 @@ Master context engineer specializing in building dynamic systems that provide th
 - Context lifecycle management and archival strategies
 
 ### Multi-Agent Workflow Coordination
+
 - Agent-to-agent context handoff and state management
 - Workflow orchestration and task decomposition
 - Context routing and agent-specific context preparation
@@ -75,6 +83,7 @@ Master context engineer specializing in building dynamic systems that provide th
 - Agent capability matching with context requirements
 
 ### Context Quality & Performance
+
 - Context relevance scoring and quality metrics
 - Performance monitoring and latency optimization
 - Context freshness and staleness detection
@@ -84,6 +93,7 @@ Master context engineer specializing in building dynamic systems that provide th
 - Error handling and context recovery mechanisms
 
 ### AI Tool Integration & Context
+
 - Tool-aware context preparation and parameter extraction
 - Dynamic tool selection based on context and requirements
 - Context-driven API integration and data transformation
@@ -93,6 +103,7 @@ Master context engineer specializing in building dynamic systems that provide th
 - Tool output integration and context updating
 
 ### Natural Language Context Processing
+
 - Intent recognition and context requirement analysis
 - Context summarization and key information extraction
 - Multi-turn conversation context management
@@ -102,6 +113,7 @@ Master context engineer specializing in building dynamic systems that provide th
 - Context validation and consistency checking
 
 ## Behavioral Traits
+
 - Systems thinking approach to context architecture and design
 - Data-driven optimization based on performance metrics and user feedback
 - Proactive context management with predictive retrieval strategies
@@ -114,6 +126,7 @@ Master context engineer specializing in building dynamic systems that provide th
 - Innovation-driven exploration of emerging context technologies
 
 ## Knowledge Base
+
 - Modern context engineering patterns and architectural principles
 - Vector database technologies and embedding model capabilities
 - Knowledge graph databases and semantic web technologies
@@ -126,6 +139,7 @@ Master context engineer specializing in building dynamic systems that provide th
 - Emerging AI technologies and their context requirements
 
 ## Response Approach
+
 1. **Analyze context requirements** and identify optimal management strategy
 2. **Design context architecture** with appropriate storage and retrieval systems
 3. **Implement dynamic systems** for intelligent context assembly and distribution
@@ -138,6 +152,7 @@ Master context engineer specializing in building dynamic systems that provide th
 10. **Plan for evolution** with adaptable and extensible context systems
 
 ## Example Interactions
+
 - "Design a context management system for a multi-agent customer support platform"
 - "Optimize RAG performance for enterprise document search with 10M+ documents"
 - "Create a knowledge graph for technical documentation with semantic search"

diff --git a/plugins/agent-orchestration/commands/improve-agent.md b/plugins/agent-orchestration/commands/improve-agent.md
@@ -9,12 +9,14 @@ Systematic improvement of existing agents through performance analysis, prompt e
 Comprehensive analysis of agent performance using context-manager for historical data collection.
 
 ### 1.1 Gather Performance Data
+
 ```
 Use: context-manager
 Command: analyze-agent-performance $ARGUMENTS --days 30
 ```
 
 Collect metrics including:
+
 - Task completion rate (successful vs failed tasks)
 - Response accuracy and factual correctness
 - Tool usage efficiency (correct tools, call frequency)
@@ -25,6 +27,7 @@ Collect metrics including:
 ### 1.2 User Feedback Pattern Analysis
 
 Identify recurring patterns in user interactions:
+
 - **Correction patterns**: Where users consistently modify outputs
 - **Clarification requests**: Common areas of ambiguity
 - **Task abandonment**: Points where users give up
@@ -34,6 +37,7 @@ Identify recurring patterns in user interactions:
 ### 1.3 Failure Mode Classification
 
 Categorize failures by root cause:
+
 - **Instruction misunderstanding**: Role or task confusion
 - **Output format errors**: Structure or formatting issues
 - **Context loss**: Long conversation degradation
@@ -44,6 +48,7 @@ Categorize failures by root cause:
 ### 1.4 Baseline Performance Report
 
 Generate quantitative baseline metrics:
+
 ```
 Performance Baseline:
 - Task Success Rate: [X%]
@@ -61,6 +66,7 @@ Apply advanced prompt optimization techniques using prompt-engineer agent.
 ### 2.1 Chain-of-Thought Enhancement
 
 Implement structured reasoning patterns:
+
 ```
 Use: prompt-engineer
 Technique: chain-of-thought-optimization
@@ -74,13 +80,15 @@ Technique: chain-of-thought-optimization
 ### 2.2 Few-Shot Example Optimization
 
 Curate high-quality examples from successful interactions:
+
 - **Select diverse examples** covering common use cases
 - **Include edge cases** that previously failed
 - **Show both positive and negative examples** with explanations
 - **Order examples** from simple to complex
 - **Annotate examples** with key decision points
 
 Example structure:
+
 ```
 Good Example:
 Input: [User request]
@@ -98,6 +106,7 @@ Correct approach: [Fixed version]
 ### 2.3 Role Definition Refinement
 
 Strengthen agent identity and capabilities:
+
 - **Core purpose**: Clear, single-sentence mission
 - **Expertise domains**: Specific knowledge areas
 - **Behavioral traits**: Personality and interaction style
@@ -108,6 +117,7 @@ Strengthen agent identity and capabilities:
 ### 2.4 Constitutional AI Integration
 
 Implement self-correction mechanisms:
+
 ```
 Constitutional Principles:
 1. Verify factual accuracy before responding
@@ -118,6 +128,7 @@ Constitutional Principles:
 ```
 
 Add critique-and-revise loops:
+
 - Initial response generation
 - Self-critique against principles
 - Automatic revision if issues detected
@@ -126,6 +137,7 @@ Add critique-and-revise loops:
 ### 2.5 Output Format Tuning
 
 Optimize response structure:
+
 - **Structured templates** for common tasks
 - **Dynamic formatting** based on complexity
 - **Progressive disclosure** for detailed information
@@ -140,6 +152,7 @@ Comprehensive testing framework with A/B comparison.
 ### 3.1 Test Suite Development
 
 Create representative test scenarios:
+
 ```
 Test Categories:
 1. Golden path scenarios (common successful cases)
@@ -153,6 +166,7 @@ Test Categories:
 ### 3.2 A/B Testing Framework
 
 Compare original vs improved agent:
+
 ```
 Use: parallel-test-runner
 Config:
@@ -164,6 +178,7 @@ Config:
 ```
 
 Statistical significance testing:
+
 - Minimum sample size: 100 tasks per variant
 - Confidence level: 95% (p < 0.05)
 - Effect size calculation (Cohen's d)
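Note on the significance checklist above: a minimal sketch of how the three visible checks (100 tasks per variant, p < 0.05, Cohen's d) might be computed with only the Python standard library. The function names, the pooled two-proportion z-test for completion rates, and the pooled-standard-deviation form of Cohen's d are assumptions chosen for illustration; the command file does not prescribe a particular test.

```python
import math
from statistics import mean, stdev

MIN_SAMPLES = 100  # minimum tasks per variant, from the checklist
ALPHA = 0.05       # significance threshold for a 95% confidence level


def cohens_d(scores_a, scores_b):
    """Effect size of B relative to A, using the pooled sample std dev."""
    na, nb = len(scores_a), len(scores_b)
    pooled_var = (
        (na - 1) * stdev(scores_a) ** 2 + (nb - 1) * stdev(scores_b) ** 2
    ) / (na + nb - 2)
    return (mean(scores_b) - mean(scores_a)) / math.sqrt(pooled_var)


def two_proportion_p_value(success_a, n_a, success_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # both variants all-success or all-failure: no evidence
    z = (success_b / n_b - success_a / n_a) / se
    # For a standard normal, 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def compare_variants(success_a, n_a, success_b, n_b):
    """Classify an A/B result on task completion counts (hypothetical helper)."""
    if min(n_a, n_b) < MIN_SAMPLES:
        return "insufficient data"
    p = two_proportion_p_value(success_a, n_a, success_b, n_b)
    return "significant" if p < ALPHA else "not significant"
```

For example, `compare_variants(70, 100, 85, 100)` compares a 70% baseline completion rate against an 85% candidate, while `cohens_d` is meant for continuous scores such as the 0-100% correctness score.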
@@ -174,20 +189,23 @@ Statistical significance testing:
 Comprehensive scoring framework:
 
 **Task-Level Metrics:**
+
 - Completion rate (binary success/failure)
 - Correctness score (0-100% accuracy)
 - Efficiency score (steps taken vs optimal)
 - Tool usage appropriateness
 - Response relevance and completeness
 
 **Quality Metrics:**
+
 - Hallucination rate (factual errors per response)
 - Consistency score (alignment with previous responses)
 - Format compliance (matches specified structure)
 - Safety score (constraint adherence)
 - User satisfaction prediction
 
 **Performance Metrics:**
+
 - Response latency (time to first token)
 - Total generation time
 - Token consumption (input + output)
@@ -197,6 +215,7 @@ Comprehensive scoring framework:
 ### 3.4 Human Evaluation Protocol
 
 Structured human review process:
+
 - Blind evaluation (evaluators don't know version)
 - Standardized rubric with clear criteria
 - Multiple evaluators per sample (inter-rater reliability)
@@ -210,6 +229,7 @@ Safe rollout with monitoring and rollback capabilities.
 ### 4.1 Version Management
 
 Systematic versioning strategy:
+
 ```
 Version Format: agent-name-v[MAJOR].[MINOR].[PATCH]
 Example: customer-support-v2.3.1
@@ -220,6 +240,7 @@ PATCH: Bug fixes, minor adjustments
 ```
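Note on the version format above: a small sketch of parsing and comparing tags of the form `agent-name-v[MAJOR].[MINOR].[PATCH]`. The regular expression (lowercase, hyphen-separated agent names), the `AgentVersion` type, and the reset-lower-fields behavior in `bump` are assumptions borrowed from common semantic-versioning practice, not something the command file specifies.

```python
import re
from typing import NamedTuple

# Assumed shape: lowercase hyphenated name, then "-v", then three integers.
VERSION_RE = re.compile(
    r"^(?P<name>[a-z0-9]+(?:-[a-z0-9]+)*)"
    r"-v(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)$"
)


class AgentVersion(NamedTuple):
    name: str
    major: int
    minor: int
    patch: int

    def tag(self) -> str:
        """Render back to the agent-name-vMAJOR.MINOR.PATCH form."""
        return f"{self.name}-v{self.major}.{self.minor}.{self.patch}"


def parse_agent_version(tag: str) -> AgentVersion:
    """Parse a version tag, raising ValueError on anything malformed."""
    match = VERSION_RE.match(tag)
    if match is None:
        raise ValueError(f"not a valid agent version tag: {tag!r}")
    return AgentVersion(
        match["name"], int(match["major"]), int(match["minor"]), int(match["patch"])
    )


def bump(version: AgentVersion, part: str) -> AgentVersion:
    """Increment one field; lower-order fields reset to 0 (assumed convention)."""
    if part == "major":
        return version._replace(major=version.major + 1, minor=0, patch=0)
    if part == "minor":
        return version._replace(minor=version.minor + 1, patch=0)
    if part == "patch":
        return version._replace(patch=version.patch + 1)
    raise ValueError(f"unknown version part: {part!r}")
```

Because `AgentVersion` is a tuple, two versions of the same agent compare in release order with the ordinary `<` operator, which is convenient when choosing a rollback target from the version history.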
 
 Maintain version history:
+
 - Git-based prompt storage
 - Changelog with improvement details
 - Performance metrics per version
@@ -228,6 +249,7 @@ Maintain version history:
 ### 4.2 Staged Rollout
 
 Progressive deployment strategy:
+
 1. **Alpha testing**: Internal team validation (5% traffic)
 2. **Beta testing**: Selected users (20% traffic)
 3. **Canary release**: Gradual increase (20% → 50% → 100%)
@@ -237,6 +259,7 @@ Progressive deployment strategy:
 ### 4.3 Rollback Procedures
 
 Quick recovery mechanism:
+
 ```
 Rollback Triggers:
 - Success rate drops >10% from baseline
@@ -256,6 +279,7 @@ Rollback Process:
 ### 4.4 Continuous Monitoring
 
 Real-time performance tracking:
+
 - Dashboard with key metrics
 - Anomaly detection alerts
 - User feedback collection
@@ -265,6 +289,7 @@ Real-time performance tracking:
 ## Success Criteria
 
 Agent improvement is successful when:
+
 - Task success rate improves by ≥15%
 - User corrections decrease by ≥25%
 - No increase in safety violations
@@ -275,6 +300,7 @@ Agent improvement is successful when:
 ## Post-Deployment Review
 
 After 30 days of production use:
+
 1. Analyze accumulated performance data
 2. Compare against baseline and targets
 3. Identify new improvement opportunities
@@ -284,9 +310,10 @@ After 30 days of production use:
 ## Continuous Improvement Cycle
 
 Establish regular improvement cadence:
+
 - **Weekly**: Monitor metrics and collect feedback
 - **Monthly**: Analyze patterns and plan improvements
 - **Quarterly**: Major version updates with new capabilities
 - **Annually**: Strategic review and architecture updates
 
-Remember: Agent optimization is an iterative process. Each cycle builds upon previous learnings, gradually improving performance while maintaining stability and safety.
+Remember: Agent optimization is an iterative process. Each cycle builds upon previous learnings, gradually improving performance while maintaining stability and safety.
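
Note on the Success Criteria section of `improve-agent.md`: the three criteria visible in this diff (task success rate up by ≥15%, user corrections down by ≥25%, no increase in safety violations) can be checked mechanically. The sketch below is one possible reading, assuming both percentages are relative changes against the baseline; the `Metrics` fields and the function name are hypothetical, and any criteria elided from the hunk are not covered.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    success_rate: float      # fraction of tasks completed, 0.0 to 1.0
    user_corrections: int    # count of user corrections in the period
    safety_violations: int   # count of safety violations in the period


def meets_visible_criteria(baseline: Metrics, candidate: Metrics) -> bool:
    """Check the three criteria shown in the diff, as relative changes."""
    if baseline.success_rate <= 0 or baseline.user_corrections <= 0:
        raise ValueError("baseline must be non-zero to compute relative change")

    success_gain = (
        candidate.success_rate - baseline.success_rate
    ) / baseline.success_rate
    correction_drop = (
        baseline.user_corrections - candidate.user_corrections
    ) / baseline.user_corrections

    return (
        success_gain >= 0.15
        and correction_drop >= 0.25
        and candidate.safety_violations <= baseline.safety_violations
    )
```

If the thresholds are instead meant as absolute percentage-point changes, only the two ratio computations need to change; the document does not say which reading is intended.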