
Conversation


@Mirrowel Mirrowel commented Nov 19, 2025

Overview

This pull request implements the HiveMind Ensemble subsystem, a sophisticated orchestration engine for parallel model execution. It introduces two modes:

  1. Swarm Mode: Multiple parallel calls to the same model with configurable presets for response diversity, critique, and synthesis.
  2. Fusion Mode: Multiple parallel calls to different models with specialized roles and reusable templates.

This PR delivers a complete implementation of Swarm Mode (with preset-based configuration and omit_id discovery), Recursive Mode (autonomous refinement), Fusion Mode (including role templates and weight context), and comprehensive documentation with detailed example configurations.


Key Features

Preset-Based Swarm System:

  • Flexible preset configuration with base_models for discovery control
  • omit_id field to control /v1/models display format (short vs explicit)
  • Array format support: multiple presets in a single file [{}, {}]
  • Runtime flexibility: explicit format works with ANY model/preset combo
  • Discovery vs runtime separation for clean API exposure

Role Templates:

  • Reusable role definitions in ensemble_configs/roles/
  • Supports both single and array formats for multiple roles per file
  • Specialist configs can reference templates with role_template field
  • Template fields can be overridden by specialist config

Comprehensive Example Configurations:

  • preset.example.json: Fully documented swarm preset with all fields explained
  • fusion.example.json: Fusion with inline and template-based specialist examples
  • role.example.json & roles-array.example.json: Role template formats
  • strategy.example.txt: Strategy template with field documentation

Implementation Details

This PR introduces the following key changes:

  • New Ensemble Subsystem (src/rotator_library/ensemble):

    • EnsembleManager (manager.py): Central orchestrator with preset resolution and omit_id support
    • ConfigLoader (config_loader.py): Manages loading configurations with omit_id mapping and role templates
  • Preset-Based Configuration (ensemble_configs):

    • Swarms: Preset files with id, base_models, and omit_id fields. Supports both single and array formats
    • Fusions: Support for both single and array formats
    • Roles: Reusable role templates (single and array formats)
    • Strategies: Customizable arbitration prompt templates
    • Examples: Comprehensive .example files with detailed field documentation
  • Core Integration:

    • Seamless integration into RotatingClient with enhanced is_ensemble() detection
    • Preset resolution: get_preset_for_model() for omit_id mapping
    • Conflict resolution for model naming with numeric suffixes
  • Complete Swarm Mode:

    • Parallel Execution: Concurrent "drone" execution with resilience
    • Streaming Support: Real-time streaming of Arbiter's synthesis
    • Advanced Features: Temperature Jitter, Adversarial Mode, Blind Switch
    • Preset Discovery: Smart /v1/models population based on omit_id and base_models
  • Recursive Mode:

    • Autonomous Decision Protocol: Arbiter decides to critique or synthesize based on consensus
    • Internal Reasoning: [INTERNAL] tags for transparent monologue handling
  • Fusion Mode:

    • Multi-Model Orchestration: Heterogeneous model configurations
    • Role Templates: Reusable specialist role definitions
    • Weight Context: Specialist expertise descriptions for smarter arbitration
    • Array Support: Multiple fusions or roles in a single file
  • Documentation:

    • HiveMind_User_Guide.md: Comprehensive guide with preset-based system explanation
    • HiveMind_API.md: Detailed API reference with updated signatures
    • ensemble_configs/README.md: Complete configuration guide
    • Main README: Updated with HiveMind features and examples
    • Library README: Added HiveMind usage examples
    • DOCUMENTATION.md: Architecture documentation for ensemble subsystem

Technical Highlights

Discovery Logic:

  • Preset WITH base_models + omit_id: true → Shows as {model}[swarm] (clean, short form)
  • Preset WITH base_models + omit_id: false → Shows as {model}-{preset}[swarm] (explicit)
  • Preset WITHOUT base_models → Not shown (invisible but usable)
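The discovery rules above can be sketched as a small pure function. Field names mirror the bullets; the real ConfigLoader/EnsembleManager API may differ:

```python
def discovery_entries(preset):
    """Map one swarm preset to the model IDs exposed via /v1/models.

    Presets without base_models stay hidden (but remain usable at runtime);
    omit_id picks the short vs explicit display form. This is an
    illustrative sketch, not the library's actual code.
    """
    base_models = preset.get("base_models") or []
    if not base_models:
        return []  # invisible but usable
    if preset.get("omit_id", False):
        return [f"{m}[swarm]" for m in base_models]             # short form
    return [f"{m}-{preset['id']}[swarm]" for m in base_models]  # explicit form

preset = {"id": "diverse", "base_models": ["gpt-4o"], "omit_id": True}
print(discovery_entries(preset))   # ['gpt-4o[swarm]']
preset["omit_id"] = False
print(discovery_entries(preset))   # ['gpt-4o-diverse[swarm]']
```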

Runtime Flexibility:

  • Short form model[swarm] → Uses omit_id preset OR default
  • Explicit form model-preset[swarm] → ALWAYS works with ANY model/preset combo
  • base_models controls ONLY discovery, NOT runtime restrictions

Role Template Resolution:

  • Specialists can reference templates: "role_template": "security-expert"
  • Template fields merged with specialist config (specialist overrides template)
  • Supports both single role per file and array of roles
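A minimal sketch of the merge semantics described above, assuming role templates live in a plain dict keyed by template ID (the loader's actual structures may differ):

```python
import copy

def resolve_role(specialist, role_templates):
    """Merge a specialist config over its referenced role template.

    Template fields act as defaults; any field set on the specialist wins.
    The template is deep-copied so merging never mutates the shared registry.
    Structures here are illustrative, mirroring the fields described above.
    """
    template_id = specialist.get("role_template")
    base = copy.deepcopy(role_templates.get(template_id, {})) if template_id else {}
    overrides = {k: v for k, v in specialist.items() if k != "role_template"}
    return {**base, **overrides}

role_templates = {"security-expert": {"role": "security", "weight": 1.2,
                                      "system_prompt": "You audit code for vulnerabilities."}}
specialist = {"role_template": "security-expert", "model": "gpt-4o", "weight": 1.5}
merged = resolve_role(specialist, role_templates)
print(merged["weight"], merged["role"])  # 1.5 security (specialist overrides weight)
```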

Future Work

  • Performance Tuning: Benchmarking and optimization
  • Additional Presets: More domain-specific configurations
  • Enhanced Error Handling: Arbiter fallback strategies
  • Metrics Dashboard: Observability tooling for HiveMind execution

See the full Implementation Plan and Task Checklist for the complete roadmap.

…k list

Add two new documentation files: docs/HiveMind Plan.md (detailed architecture and implementation plan for Swarm/Fusion orchestration) and docs/HiveMind Task.md (phase-by-phase task checklist). Includes terminology, request flow, core components, config schemas, feature specs (jitter, adversarial, blind switch, recursive mode), streaming and usage tracking, integration points, error handling, testing strategy, and implementation phases to guide development and verification.
…ault configs

Introduce a new HiveMind ensemble subsystem to enable Swarm/Fusion orchestration.

- add src/rotator_library/ensemble: EnsembleManager, ConfigLoader, and package init
- add ensemble_configs: default swarm config, fusion example (dev-team), and strategy templates
- integrate EnsembleManager into RotatingClient to initialize the ensemble manager at startup
- implement conflict resolution, config loading, and ensemble detection logic; swarm/fusion execution paths staged as TODOs

This lays the groundwork for parallel model execution and intelligent arbitration without changing existing public APIs.
@Mirrowel Mirrowel added the Agent Monitored Monitored for AI Agent to review PR's and commits label Nov 19, 2025
@Mirrowel Mirrowel self-assigned this Nov 19, 2025
Detect ensemble model identifiers in RotatingClient and delegate handling to self.ensemble_manager.handle_request, enabling HiveMind (swarm/fusion) request routing. Preserve the existing iflow provider workaround (removing stream_options to avoid HTTP 406). Add documentation for ensemble configurations at src/rotator_library/ensemble_configs/README.md.
@mirrobot-agent

I'm starting my review of the HiveMind Ensemble foundation. Diving into the new subsystem and integration points now.

@mirrobot-agent mirrobot-agent bot left a comment

Overall Assessment

This PR lays a solid foundation for the HiveMind subsystem. The directory-based configuration strategy (ensemble_configs) is a great choice for extensibility, allowing users to easily share and drop in new Swarm/Fusion definitions. The integration into `RotatingClient` is clean and unobtrusive.

Architectural Feedback

The separation of concerns between the EnsembleManager and the core client is well-handled. The decision to intercept requests based on the model ID suffix/ID is standard for this type of proxy, though the conflict resolution logic warrants a second look to ensure we don't accidentally make real provider models inaccessible.

Key Suggestions

  • Config Safety: Ensure the default configuration is deep-copied to prevent state leakage between requests (see comment in config_loader.py).
  • Conflict Resolution: Clarify the intended precedence when a provider model name collides with an ensemble ID.

Nitpicks

  • load_provider_models is currently a placeholder; ensure this is tracked in your Phase 2 tasks so conflict detection actually activates.

This review was generated by an AI assistant.

```python
        Merged configuration dictionary
        """
        # Start with default
        config = self.swarm_default.copy() if self.swarm_default else {}
```


self.swarm_default.copy() creates a shallow copy. Since swarm_default contains nested dictionaries (e.g., temperature_jitter), modifying nested fields in the returned configuration could inadvertently mutate the global default configuration for subsequent requests.

Consider using copy.deepcopy() to ensure full isolation:

Suggested change:

```diff
-config = self.swarm_default.copy() if self.swarm_default else {}
+import copy
+config = copy.deepcopy(self.swarm_default) if self.swarm_default else {}
```

Comment on lines +117 to +164
```python
    def resolve_conflicts(self, ensemble_id: str) -> str:
        """
        Resolve naming conflicts by appending numeric suffixes.

        If an ensemble ID conflicts with a real provider model,
        append -1, -2, -3, etc. until unique.

        Args:
            ensemble_id: Original ensemble ID (swarm or fusion)

        Returns:
            Resolved unique ensemble ID
        """
        # Check cache first
        if ensemble_id in self._resolved_names:
            return self._resolved_names[ensemble_id]

        # Load provider models if not cached
        if self._provider_models is None:
            self._load_provider_models()

        # Check for conflict
        if ensemble_id not in self._provider_models:
            # No conflict, use original
            self._resolved_names[ensemble_id] = ensemble_id
            return ensemble_id

        # Conflict detected, find available suffix
        counter = 1
        while True:
            candidate = f"{ensemble_id}-{counter}"
            if candidate not in self._provider_models:
                lib_logger.warning(
                    f"[HiveMind] Naming conflict detected. "
                    f"Renamed '{ensemble_id}' to '{candidate}'"
                )
                self._resolved_names[ensemble_id] = candidate
                return candidate
            counter += 1

            # Safety check (shouldn't happen in practice)
            if counter > 100:
                lib_logger.error(
                    f"[HiveMind] Could not resolve naming conflict for '{ensemble_id}' "
                    f"after 100 attempts"
                )
                return f"{ensemble_id}-{counter}"
```


The current conflict resolution logic renames the incoming ensemble request ID if a conflict is found. This effectively "shadows" the underlying provider model.

For example, if a provider actually offers a model named gpt-4o[swarm], this logic (combined with is_ensemble returning True) ensures the HiveMind subsystem intercepts the request. The user would have no way to bypass HiveMind and access the provider's native gpt-4o[swarm] model directly. Is this the intended precedence?

…rallel execution

Add a private _prepare_drones method to EnsembleManager to build configurations for parallel/swarm execution.
- Creates N identical copies of the original request parameters (default count = 3) and enforces the base_model.
- Deep-copies message payloads to prevent mutation and attaches _drone_index and _total_drones metadata for logging.
- Emits debug logs for each prepared drone.
- Leaves hooks for advanced jitter/adversarial behavior to be implemented in Phase 4.
Introduce async _execute_parallel to run drone requests concurrently using asyncio.gather, assemble tasks via RotatingClient retry logic, aggregate usage metrics (prompt_tokens, completion_tokens, total_tokens and optional fields), log per-drone outcomes, and return successful responses with aggregated usage. Raises if all drones fail and warns when some fail.

Add _format_for_arbiter to normalize and number successful drone responses for the arbiter, extracting OpenAI-style content, skipping empty responses, and producing a single formatted text blob for arbitration. These helpers prepare HiveMind ensemble requests for parallel execution and downstream arbitration.
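The parallel-execution flow described in this commit can be sketched roughly as follows; `call_model` stands in for the client's retrying completion call, and the config and response shapes are illustrative, not the library's actual API:

```python
import asyncio

async def execute_parallel(drone_configs, call_model):
    """Run all drone requests concurrently and tolerate partial failures.

    `call_model` is a placeholder for the client's retrying completion call.
    Returns the successful responses plus aggregated token usage.
    """
    results = await asyncio.gather(
        *(call_model(cfg) for cfg in drone_configs), return_exceptions=True
    )
    successes = []
    usage = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
    for i, res in enumerate(results):
        if isinstance(res, Exception):
            print(f"drone {i} failed: {res}")  # warn but keep going
            continue
        successes.append(res)
        for key in usage:  # aggregate token usage across drones
            usage[key] += res.get("usage", {}).get(key, 0)
    if not successes:
        raise RuntimeError("all drones failed")
    return successes, usage

# Tiny demo with a fake model call: one drone fails, the rest succeed.
async def fake_call(cfg):
    if cfg["fail"]:
        raise RuntimeError("boom")
    return {"content": "ok",
            "usage": {"prompt_tokens": 5, "completion_tokens": 7, "total_tokens": 12}}

succ, usage = asyncio.run(execute_parallel(
    [{"fail": False}, {"fail": True}, {"fail": False}], fake_call))
print(len(succ), usage["total_tokens"])  # 2 24
```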
… execution flow

Add _build_arbiter_prompt to construct system/user messages from strategy templates and original messages, and _call_arbiter to invoke a non-streaming arbiter model via RotatingClient and extract usage. Complete the swarm branch in handle_request to:

- prepare drones and execute them in parallel
- format drone responses for the arbiter
- build and resolve arbiter messages and model (handle "self" -> base model)
- call the arbiter and aggregate token usage (including cached_tokens and reasoning_tokens when present)
- attach aggregated usage back to the arbiter response and return it

Fallbacks and logging added for missing strategy templates and for observability. Fusion mode remains NotImplementedError (Phase 5).

@mirrobot-agent mirrobot-agent bot left a comment

Follow-up Review

I see you've wired up the core Swarm execution logic (Phase 2F). The structure for parallel drone execution and arbitration looks solid!

However, I've identified a critical blocking bug that will cause the new code to crash immediately upon execution, along with a few unaddressed points from the previous review.

Assessment of New Changes:

  • Crash in Parallel Execution: The call to _execute_with_retry passes api_call=None, which is incompatible with the client's retry logic.
  • Configuration Safety: The shallow copy issue in config_loader.py remains unaddressed, posing a risk of state mutation.
  • Model Shadowing: The conflict resolution logic handles internal uniqueness but does not solve the interception issue. Since is_ensemble() returns True before checking if the model exists natively, HiveMind will still hijack requests intended for a provider's native model named foo[swarm]. To fix this, is_ensemble() should return False if the model ID exists in _provider_models.

Overall Status:
Changes Requested. The code requires a fix for the TypeError crash before it can be run or merged.

This review was generated by an AI assistant.

Import litellm and pass its `acompletion` function to `rotating_client._execute_with_retry` when creating drone tasks. Remove the unused `asyncio` import. This fixes passing `None` as the api_call and ensures drones invoke the litellm API correctly.

addresses #8 (comment)
Add async _call_arbiter_streaming to stream arbiter model responses, track per-stream usage, and emit a final _hivemind_usage metadata chunk. Add async _handle_swarm_streaming to execute drones in parallel, format responses for the arbiter, stream the arbiter output, aggregate drone and arbiter token usage, and inject aggregated usage into the final SSE chunk(s). Uses rotating_client._streaming_acompletion_with_retry and preserves existing non-streaming flows.
…credential rotation

Normalize the model identifier up front by calling _resolve_model_id before entering the credential rotation loops and write the canonical value into kwargs["model"]. This eliminates per-iteration model lookups and duplicated log entries.

- remove inline resolution within rotation loops
- propagate the normalized model to litellm kwargs for consistent acquire/release/tracking
- reduces redundant work and log noise, and stabilizes model reporting across retries

@mirrobot-agent mirrobot-agent bot left a comment

Follow-up Review

I've reviewed the incremental changes, focusing on the new streaming implementation in manager.py and the refactoring in client.py.

Assessment of New Changes:

  • Streaming Support: The new _handle_swarm_streaming and _call_arbiter_streaming methods provide a good foundation for Phase 3. The usage aggregation logic looks correct.
  • Model Resolution: Moving _resolve_model_id earlier in client.py improves consistency.

Key Concerns (Unresolved & New):

  1. Mutable Default Configs (Critical): The feedback about using copy.deepcopy in config_loader.py was not addressed. Furthermore, this pattern has propagated to manager.py's streaming handler. This is a significant risk: modifying a shallow copy of the configuration will mutate the global defaults for all subsequent requests.
  2. Shadowing Logic: The conflict resolution strategy in EnsembleManager renames the internal request ID but does not solve the issue of the client intercepting the request first. Users still cannot access a provider model if it conflicts with an ensemble naming convention.

Overall Status:
The streaming logic is on the right track, but the configuration management needs to be fixed to avoid state pollution.

This review was generated by an AI assistant.


```python
        # Check for conflict
        if ensemble_id not in self._provider_models:
            # No conflict, use original
```

I still have concerns about the shadowing logic.

While resolve_conflicts renames the internal ensemble ID (e.g. to model[swarm]-1), the interception in client.py (if self.ensemble_manager.is_ensemble(model)) happens before this resolution.

If a provider has a model named gpt-4o[swarm], is_ensemble will return True, RotatingClient will delegate to EnsembleManager, and the user will never be able to reach the actual provider model. This conflict resolution protects the Manager from crashing, but it effectively "shadows" the provider model from the user's perspective.

…=true

Introduce an `is_streaming` flag from kwargs and include it in the swarm processing log. When `stream` is true, route execution to `_handle_swarm_streaming` (returning an async generator). Otherwise preserve the existing non-streaming flow (prepare drones, execute in parallel, call arbiter, aggregate and attach usage). This enables streaming arbiter/swarm handling while keeping backward-compatible non-streaming behavior.
Add handling for new ensemble config options: `temperature_jitter` and `adversarial_config`. This enables:
- optional temperature jitter per drone (configurable `delta`, default 0.2), applied via random uniform jitter and clamped to [0.0, 2.0]
- optional adversarial mode where the last N drones inject a provided adversarial system prompt
- per-drone metadata (`_is_adversarial`, `_drone_index`, `_total_drones`) and enhanced debug logging for jitter and adversarial injections

These changes allow controlled diversity of drone behavior and targeted adversarial prompts for critical analysis without changing existing call signatures.
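A rough sketch of how jitter and adversarial injection could be applied during drone preparation, using the config field names from the commit message (illustrative only, not the library source):

```python
import copy
import random

def prepare_drones(request, config):
    """Build N drone copies with optional temperature jitter and adversarial
    system prompts. Field names follow the commit description above; this is
    an illustrative sketch, not the actual implementation."""
    count = config.get("count", 3)
    jitter = config.get("temperature_jitter", {})
    adv = config.get("adversarial_config", {})
    adv_count = adv.get("count", 0) if adv.get("enabled") else 0
    drones = []
    for i in range(count):
        drone = copy.deepcopy(request)  # deep copy so drones never share messages
        if jitter.get("enabled"):
            delta = jitter.get("delta", 0.2)
            base = drone.get("temperature", 1.0)
            # random uniform jitter, clamped to [0.0, 2.0] as described above
            drone["temperature"] = max(0.0, min(2.0, base + random.uniform(-delta, delta)))
        is_adversarial = i >= count - adv_count  # last N drones go adversarial
        if is_adversarial:
            drone["messages"] = ([{"role": "system", "content": adv["system_prompt"]}]
                                 + drone["messages"])
        drone.update(_drone_index=i, _total_drones=count, _is_adversarial=is_adversarial)
        drones.append(drone)
    return drones

request = {"messages": [{"role": "user", "content": "Review this design."}],
           "temperature": 0.7}
config = {"count": 3,
          "temperature_jitter": {"enabled": True, "delta": 0.2},
          "adversarial_config": {"enabled": True, "count": 1,
                                 "system_prompt": "Attack the strongest claim."}}
print([d["_is_adversarial"] for d in prepare_drones(request, config)])
# [False, False, True]
```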

@mirrobot-agent mirrobot-agent bot left a comment

Follow-up Review

I've reviewed the incremental changes adding adversarial mode, temperature jitter, and streaming support. The implementation of these new features is generally sound and aligns with the planned architecture.

Assessment of New Changes:

  • Features: The adversarial and jitter logic is implemented correctly within the drone preparation phase.
  • Streaming: The split between streaming and non-streaming handlers (_handle_swarm_streaming) is a logical approach.
  • Unaddressed Feedback: I noticed that the feedback from the previous review regarding model shadowing, configuration safety (deepcopy), and import placement has not been addressed. In fact, the new code introduces additional local imports inside loops.

Overall Status:
The functional logic is good, but the code quality issues (imports, safety) and the architectural shadowing behavior should be resolved before merging to ensure maintainability and correctness.

This review was generated by an AI assistant.

```python
        if self._provider_models is None:
            self._load_provider_models()

        # Check for conflict
```


The shadowing issue previously raised remains unresolved. Because is_ensemble checks the name before any conflict resolution happens, a provider's native model named model[swarm] is effectively unreachable—the EnsembleManager will always intercept it. This logic handles internal collisions but doesn't solve the user-facing availability problem.

…orts

Introduce a blind switch in arbiter formatting that defaults to anonymizing model names (blind_mode=true). When blind mode is disabled, model names are included in response labels. Hoist frequently used imports (litellm, asyncio, random, copy) to module scope and remove redundant local imports to reduce repeated imports and clarify code paths. Add debug logging for anonymization and include blind_mode in formatted output.

@mirrobot-agent mirrobot-agent bot left a comment

Follow-up Review

I've reviewed the latest changes. The cleanup of imports and the addition of blind_mode logic in manager.py look good. However, the critical issues raised in the previous review regarding configuration safety and model shadowing remain unresolved.

Assessment of New Changes:

  • Improvements: The import organization in manager.py is much cleaner. The blind_mode logic is a good addition for the privacy requirements.
  • Persistent Risks:
    • Configuration Mutation: config_loader.py still uses shallow copies. This is a latent bug that could cause request cross-contamination.
    • Model Shadowing: The is_ensemble logic still prioritizes HiveMind interception over native provider models.
    • Conflict Resolution: I noticed that _load_provider_models is currently a no-op, which means the conflict resolution logic (while present) is effectively disabled.

Overall Status:
The PR needs to address the configuration safety issue before merge to prevent runtime bugs. The shadowing and conflict resolution issues should at least be acknowledged or documented if they are deferred to a later phase, though fixing the precedence now is recommended.

This review was generated by an AI assistant.

Comment on lines 62 to 80
```python
    def is_ensemble(self, model_id: str) -> bool:
        """
        Check if a model ID represents an ensemble request.

        Args:
            model_id: Full model ID from user request

        Returns:
            True if this is an ensemble (swarm or fusion), False otherwise
        """
        # Check for fusion ID (exact match)
        if model_id in self.config_loader.fusion_configs:
            return True

        # Check for swarm suffix
        if self._is_swarm_request(model_id):
            return True

        return False
```


The shadowing issue previously raised remains unresolved. is_ensemble returns True based solely on the suffix/ID match, which happens before any conflict resolution logic.

This means if a provider has a native model named gpt-4o[swarm], this method will return True, and the EnsembleManager will intercept the request. The user will be unable to reach the provider's native model.

To fix this, is_ensemble should ideally check if the model ID exists in self._provider_models (once populated) and return False if it's a native model, allowing RotatingClient to handle it normally.
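The suggested precedence fix could look roughly like this, written as a standalone sketch whose parameters stand in for the manager's internal state rather than its real signature:

```python
def is_ensemble(model_id, provider_models, fusion_configs, swarm_suffix="[swarm]"):
    """Sketch of the precedence fix: native provider models always win.

    The parameters are illustrative stand-ins for the manager's internal
    state (self._provider_models, self.config_loader.fusion_configs).
    """
    if model_id in provider_models:
        return False  # exact native match: never shadow the provider model
    if model_id in fusion_configs:
        return True   # fusion IDs match exactly
    return model_id.endswith(swarm_suffix)  # otherwise fall back to suffix detection

providers = {"gpt-4o", "gpt-4o[swarm]"}  # provider genuinely offers foo[swarm]
print(is_ensemble("gpt-4o[swarm]", providers, {}))    # False -> reaches the provider
print(is_ensemble("claude-3[swarm]", providers, {}))  # True  -> HiveMind intercepts
```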

Comment on lines 168 to 187
```python
    def _load_provider_models(self) -> None:
        """
        Load all provider models from RotatingClient.

        This is used for conflict detection.
        """
        try:
            # Get all available models (this might be async in the actual implementation)
            # For now, we'll use a synchronous approach
            # TODO: Handle async model loading properly
            self._provider_models = set()

            # Note: This will be implemented properly when we integrate with RotatingClient
            # For now, just initialize an empty set
            lib_logger.debug("[HiveMind] Provider models cache initialized (empty)")

        except Exception as e:
            lib_logger.error(f"[HiveMind] Failed to load provider models: {e}")
            self._provider_models = set()
```


I notice _provider_models is currently initialized as an empty set and never populated with actual provider data. This renders the resolve_conflicts logic (lines 120-167) effectively non-functional, as it will never detect a collision.

While this is a "Foundation" PR, shipping the conflict resolution logic in a state where it cannot work might be misleading. Consider implementing a basic fetch from rotating_client or adding a clear TODO/warning that conflict detection is currently disabled.

Add a private `_prepare_fusion_models` method in EnsembleManager to build per-specialist model
configs for fusion execution.

- Iterates configured specialists and skips entries missing a model.
- Clones request params and deep-copies messages when present.
- Injects role-specific system prompts as the first message when provided.
- Attaches specialist metadata (`_specialist_index`, `_specialist_role`, `_specialist_weight`, `_total_specialists`) to each model config.
- Emits debug and warning logs for visibility.

This prepares a list of ready-to-run specialist model parameter sets for the fusion pipeline.
@Mirrowel

Focus on reporting new issues. All these will be addressed later - they are not critical.

…ting, streaming, and usage aggregation

Implement full Fusion (Phase 5) flow in EnsembleManager:

- add optional specialist_metadata param to _format_for_arbiter and emit role-aware labels (role or "role (model)" depending on blind mode).
- prepare and execute specialist models in parallel, format their outputs for an arbiter, and build arbiter messages.
- support both streaming and non-streaming arbiter calls; stream yields intermediate chunks and aggregates usage when final chunk arrives.
- aggregate usage metrics across specialists and arbiter (prompt_tokens, completion_tokens, total_tokens, plus optional cached_tokens/reasoning_tokens) and attach aggregated usage to returned/streamed responses.
- add detailed logging for fusion lifecycle and usage totals.
- update fusion config handling to load specialists and pass specialist metadata into formatting/execution.

Also update example fusion config (dev-team.json):

- rename "models" -> "specialists"
- replace "system_prompt_append" with "system_prompt"
- normalize "weight" to numeric values
- set arbiter.blind to false by default with explanatory note

BREAKING CHANGE: Fusion configuration schema changed and may break existing fusion configs and integrations.
- Migration:
  - Rename top-level fusion arrays from "models" to "specialists".
  - Replace per-specialist key "system_prompt_append" with "system_prompt".
  - Convert "weight" values from descriptive strings to numeric weights (e.g., 1.0, 1.2).
  - Review arbiter.blind semantics (default now false); adjust configs if anonymous responses were previously required.
  - If any external callers invoked the old NotImplemented fusion path or relied on the previous _format_for_arbiter behavior, update calls to accommodate specialist_metadata and new labeling.
Update checklist in docs/HiveMind Task.md to reflect implemented work and current priorities.

- Mark Phase 1 items (design, EnsembleManager creation, client integration, ensemble_config.json) as complete
- Mark Phase 2 swarm features and associated tests as complete
- Mark Phase 3 temperature jitter, adversarial mode, and blind switch as complete
- Move Confidence Scoring to Recursive Mode and leave it pending
… routing

- add async `_handle_fusion_streaming` to execute specialists in parallel, format responses for the arbiter, stream arbiter chunks, and aggregate usage (including `cached_tokens` and `reasoning_tokens`)
- route streaming fusion in `handle_request` to the new handler and remove duplicated streaming logic
- aggregate and inject total usage for non-streaming fusion responses and improve logging of fusion usage

@mirrobot-agent mirrobot-agent bot left a comment

Follow-up Review

The Fusion mode implementation has been reviewed and appears robust, correctly following the architectural patterns established in the Swarm implementation.

Assessment of New Changes:

  • Fusion Logic: The implementation of _handle_fusion_streaming and the integration into handle_request aligns perfectly with the design plan.
  • Safety: I noticed _prepare_fusion_models uses copy.deepcopy for the messages list. This is excellent and prevents the reference modification issues flagged in other parts of the codebase.
  • Streaming: The usage aggregation logic in the streaming handler correctly accounts for both specialist and arbiter tokens.

Overall Status:
The new changes look good and are ready to merge.
Note: As discussed, the previously identified issues regarding global config shallow copies and provider model shadowing remain as tracked technical debt to be addressed in a future polish phase.

This review was generated by an AI assistant.

Introduce Phase 6 recursive/autonomous protocol for the arbiter:
- Append recursive instructions to the arbiter system prompt when config.recursive_mode.enabled is true.
- Honor configurable consensus_threshold and include autonomous decision phases (assessment, decision, synthesis).
- Stream handler now filters content wrapped in [INTERNAL]...[/INTERNAL], buffers internal reasoning, and only yields user-facing synthesis to callers.
- Add _log_recursive_markers helper to parse and log consensus scores, conflicts, and critique reasoning; improve related logging and default strategy fallback.

@mirrobot-agent mirrobot-agent bot left a comment

Follow-up Review

I've reviewed the incremental changes implementing the Recursive Mode and streaming filters.

Assessment of New Changes:
The Phase 6 implementation introduces a critical logic error in the streaming handler (_handle_swarm_streaming) that causes data loss. The current implementation discards parts of the response that share a chunk with the [INTERNAL] or [/INTERNAL] tags. This must be fixed before merge.

Previous Feedback Status:
As requested, I am not re-posting inline comments for previous findings, but I must note that they remain unaddressed and are still valid concerns:

  1. Unsafe Shallow Copies: config_loader.py and manager.py still use shallow copies, risking global state mutation.
  2. Model Shadowing: is_ensemble still shadows provider models with the same name.
  3. Broken Conflict Resolution: _provider_models is still never populated, rendering conflict detection dead code.

Overall Status:
Request Changes due to the new data loss bug in the streaming logic.

This review was generated by an AI assistant.

@Mirrowel

@mirrobot-agent Can you check the current code status against all the reviews and list the outstanding issues here for me to fix? The goal is to collect every problem I need to address in an easy-to-read, easy-to-parse form. I don't need code suggestions, just problem statements with references to where and how. If one of the reviews is no longer relevant, resolve it by marking it resolved.

@mirrobot-agent

@Mirrowel, I'm analyzing the current code state against the previous reviews to compile the consolidated list of issues you requested. I'll post the summary shortly.

@mirrobot-agent

@Mirrowel, I have analyzed the current code against the previous reviews. Here is the consolidated list of unresolved issues you requested.

1. Critical Data Loss in Streaming Logic (Internal Tags)

Location: src/rotator_library/ensemble/manager.py (Lines 806-833)
Problem: The logic for filtering [INTERNAL] and [/INTERNAL] tags causes data loss due to the use of continue statements.

  • Start Tag ([INTERNAL]): When a chunk contains [INTERNAL], the content following the tag (e.g., [INTERNAL] reasoning...) is skipped and never added to the internal_buffer because the loop continues immediately.
  • End Tag ([/INTERNAL]): When a chunk contains [/INTERNAL], the content preceding the tag (e.g., ...end reasoning [/INTERNAL]) is not added to the buffer before the buffer is processed and cleared.
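A split-based filter along the lines this review asks for could look like the sketch below. The `internal_buffer` name follows the review; the function itself is hypothetical, and it assumes each tag arrives whole within a chunk (tags split across chunk boundaries would additionally need partial-tag buffering).

```python
def filter_chunks(chunks):
    """Yield user-visible text while buffering [INTERNAL]...[/INTERNAL] spans.

    Illustrative sketch only: keeps the content on BOTH sides of a tag that
    shares a chunk with it, which is exactly what the `continue`-based
    implementation loses.
    """
    inside = False
    internal_buffer = []
    visible = []
    for content in chunks:
        while content:
            if not inside:
                if '[INTERNAL]' in content:
                    before, content = content.split('[INTERNAL]', 1)
                    if before:
                        visible.append(before)  # keep text before the start tag
                    inside = True
                else:
                    visible.append(content)
                    content = ''
            else:
                if '[/INTERNAL]' in content:
                    hidden, content = content.split('[/INTERNAL]', 1)
                    internal_buffer.append(hidden)  # keep text before the end tag
                    inside = False
                else:
                    internal_buffer.append(content)
                    content = ''
    return ''.join(visible), ''.join(internal_buffer)
```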

2. Unsafe Shallow Copy of Configurations

Location:

  • src/rotator_library/ensemble/config_loader.py (Line 167)
  • src/rotator_library/ensemble/manager.py (Line 927)

Problem: Using .copy() creates a shallow copy. Since configuration dictionaries contain nested structures (like temperature_jitter, arbiter, adversarial_config), modifying these nested fields in a request (e.g., manager.py line 929) effectively mutates the global default configuration for the entire application lifecycle.

3. Provider Model Shadowing

Location: src/rotator_library/ensemble/manager.py (Line 62, is_ensemble)
Problem: The is_ensemble check returns True based solely on the suffix/ID match before checking for conflicts. If a provider offers a native model named gpt-4o[swarm], the EnsembleManager will intercept the request, making the native provider model unreachable by the user. The conflict resolution logic (renaming the ensemble) happens too late (inside handle_request) to prevent this interception.
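A hypothetical shape of the fix, consulting a provider-model cache before the suffix match claims the request (the `_provider_models` name follows the review; the class and method bodies are illustrative):

```python
class EnsembleRouter:
    """Sketch of suffix-based ensemble detection with a provider-model guard."""

    def __init__(self, provider_models):
        # Cache of native provider model IDs; must be populated for the
        # guard to work (the review notes it is currently always empty).
        self._provider_models = set(provider_models)

    def is_ensemble(self, model_id: str) -> bool:
        # A native provider model always wins over the ensemble suffix match,
        # so e.g. a provider's own "gpt-4o[swarm]" stays reachable.
        if model_id in self._provider_models:
            return False
        return model_id.endswith("[swarm]") or model_id.endswith("[fusion]")
```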

4. Conflict Resolution is Non-Functional

Location: src/rotator_library/ensemble/manager.py (Line 168, _load_provider_models)
Problem: The _provider_models set is initialized as empty and never populated (marked with a TODO). Consequently, resolve_conflicts will never detect a collision, rendering the conflict resolution logic completely inactive.

This summary was generated by an AI assistant.

…dle [INTERNAL] markers

Fixes #8 (comment)

- Use copy.deepcopy for swarm defaults, request cloning, and config copies to avoid shared mutable state and accidental cross-request mutations.
- Initialize and populate provider model cache from rotating_client.model_definitions (and call loader on init/lazily) to prevent provider model shadowing when detecting ensembles.
- Improve streaming handling of `[INTERNAL]...[/INTERNAL]` by buffering internal segments, logging internal content via _log_recursive_markers, and yielding surrounding content correctly.
- Move/remove redundant local regex import and add explicit re import at module level for consistency.

@mirrobot-agent mirrobot-agent bot left a comment


Follow-up Review: All Critical Issues Resolved

I have reviewed the latest changes and am pleased to confirm that all critical issues raised in the previous review have been addressed effectively.

Assessment of Fixes:

  1. Critical Data Loss in Streaming: The rewritten [INTERNAL] tag parsing logic in _call_arbiter_streaming is now robust. It correctly handles content before/after tags and buffers incomplete tags across chunks, preventing the data loss scenario.
  2. Unsafe Shallow Copy: The switch to copy.deepcopy() across config_loader.py and manager.py ensures that the global default configuration remains immutable, protecting the application state.
  3. Provider Model Shadowing: The is_ensemble method now correctly checks against self._provider_models before claiming a request. This ensures that native provider models (e.g., gpt-4o[swarm] if it existed) remain accessible to users.
  4. Conflict Resolution: The _load_provider_models method is now fully implemented, populating the provider cache from RotatingClient. This makes the conflict detection and resolution logic functional.

Overall Status:
The PR is now in excellent shape. The "HiveMind" ensemble foundation is solid, safe, and ready for merging.

This review was generated by an AI assistant.


# Handle [INTERNAL] start
if '[INTERNAL]' in content:
    parts = content.split('[INTERNAL]')


Verified: This splitting logic correctly handles the [INTERNAL] tag boundaries and prevents the data loss issue I flagged previously. The buffering ensures partial tags are also handled correctly.

…king

Replace fixed token summation with dynamic aggregation of all numeric usage fields across drone, specialist, and arbiter flows. Track execution start/end times, compute latency (ms), and attempt cost calculation via litellm. Attach a supplementary `hivemind_details` breakdown to `arbiter_response.usage` containing mode, counts, token breakdown, rounded cost, and latency. Iterate safely over usage attributes (skip private/magic, non-numeric, and handle set/get errors) and log latency/cost info. Also add a time import and defensive exception handling for cost calculation.
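The dynamic aggregation this commit describes could be sketched as follows; the helper and its name are illustrative, not the actual implementation:

```python
def aggregate_usage(usage_objects):
    """Sum every public numeric attribute across a list of usage objects.

    Illustrative sketch of "dynamic aggregation of all numeric usage fields":
    instead of hard-coding prompt/completion token names, discover numeric
    attributes at runtime so new fields from the underlying library are
    picked up automatically.
    """
    totals = {}
    for usage in usage_objects:
        for name in dir(usage):
            if name.startswith("_"):
                continue  # skip private/magic attributes
            try:
                value = getattr(usage, name)
            except Exception:
                continue  # defensive: some properties may raise on access
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                continue  # only aggregate plain numerics (bool is an int subclass)
            totals[name] = totals.get(name, 0) + value
    return totals
```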
…e context for arbiter

Add support for specialist weight descriptions and include specialist expertise context when building arbiter prompts.

- Extract `weight_description` from specialist configs and attach it to model params as `_specialist_weight_description`.
- Extend `EnsembleManager.build_arbiter_messages` to accept optional `specialist_metadata` and append a "SPECIALIST EXPERTISE" block to the strategy prompt that lists each specialist's role, model, and weight description; log debug info when added.
- Update dev-team fusion config to include `weight_description` entries for existing specialists.

No breaking changes.
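The "SPECIALIST EXPERTISE" block described above might be assembled along these lines; the field names (`role`, `model`, `weight_description`) follow the commit notes, while the exact formatting is an assumption:

```python
def build_expertise_block(strategy_prompt, specialist_metadata):
    """Append a SPECIALIST EXPERTISE section to the arbiter strategy prompt.

    Illustrative sketch: one line per specialist listing role, model, and
    weight description, as the commit describes.
    """
    if not specialist_metadata:
        return strategy_prompt  # swarm mode: no specialist context to add
    lines = ["", "SPECIALIST EXPERTISE:"]
    for spec in specialist_metadata:
        desc = spec.get("weight_description", "")
        lines.append(f"- {spec['role']} ({spec['model']}): {desc}")
    return strategy_prompt + "\n".join(lines)
```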
…ongside formatted responses

Collect specialist metadata (role, model, weight_description) during arbiter formatting and return it together with the formatted text so arbiter prompts can include specialist expertise context in fusion mode.

- _format_for_arbiter now accumulates arbiter_metadata for each specialist when specialist_metadata is provided.
- Returns a tuple (formatted_text, metadata_for_arbiter) where metadata_for_arbiter is None for swarm mode or a list of dicts for fusion mode.
- Call sites are updated to unpack the tuple and forward specialist metadata into _build_arbiter_prompt.

BREAKING CHANGE: _format_for_arbiter previously returned a plain formatted string; it now returns a tuple (formatted_text, metadata_for_arbiter). Update any external callers to unpack the returned tuple and, if needed, pass the metadata as the specialist_metadata argument to _build_arbiter_prompt.
…sting

Previously get_all_available_models omitted HiveMind fusion models from the returned provider list. When an ensemble_manager is available, the method now fetches fusion IDs via ensemble_manager.config_loader.get_all_fusion_ids(), adds them under the "hivemind_fusion" provider key, and logs the number of fusion models added.
- Add comprehensive HiveMind_API.md covering EnsembleManager, ConfigLoader, response usage (standard + hivemind_details), configuration schemas (swarms/fusions/strategies), error handling, logging, advanced usage, migration, performance, and limitations
- Add HiveMind_User_Guide.md with quick start, swarm and fusion workflows, configuration examples, arbitration strategies, streaming support, usage & cost tracking, best practices, and troubleshooting
- Update docs/HiveMind Task.md to reflect implementation progress:
  - fusion: `_prepare_fusion_models`, role assignment, partial arbiter context/weights parsing
  - recursion: single-call autonomous mode, consensus/conflict parsing, `_trigger_round_2()` replaced by Autonomous Decision Protocol
  - polish: partial failure handling, infinite recursion prevention, latency/token logging, and rate limit mitigation marked implemented

@mirrobot-agent mirrobot-agent bot left a comment


Follow-up Review: Phase 6 & Polish

I have reviewed the latest updates, which include significant enhancements to usage tracking, fusion metadata, and comprehensive documentation.

Assessment of New Changes:

  • Documentation: The new HiveMind_User_Guide.md and API.md are excellent additions that clearly explain the complex features.
  • Usage Aggregation: The refactoring to dynamically sum all numeric usage fields (instead of hardcoding them) is a great move for future-proofing.
  • Observability: The addition of hivemind_details provides much-needed visibility into cost and latency without polluting standard metrics.
  • Safety Note: I've flagged a potential runtime risk regarding the injection of hivemind_details into the response object. If the underlying library returns strict objects, this could raise an AttributeError.

Overall Status:
The PR is in great shape. The safety note is a minor guardrail suggestion; assuming it passes your local tests, the logic is sound.

This review was generated by an AI assistant.

arbiter_usage = chunk["_hivemind_usage"]
continue # Don't yield metadata chunk

# For SSE chunks, check if this is the final chunk with usage


This dynamic aggregation of usage fields is a smart improvement! It makes the code much more robust to changes in the underlying library structures.

…andling

- add ConfigLoader.get_all_swarm_model_ids to discover swarm model variants from preset JSONs (generates IDs like {base_model}-{preset_id}[swarm])
- include discovered hivemind_swarm entries in RotatingClient's available models listing and add informative logging
- update EnsembleManager to detect `[swarm]` suffix and extract base model by recognizing preset IDs (checks swarms_dir/{preset_id}.json)
- update default swarm config to include `id` and `base_models` for preset-based discovery

BREAKING CHANGE: swarm suffix customization via `swarm_default.suffix` is no longer honored. Swarm model IDs must use the `[swarm]` suffix and swarm preset configs must include an `id` and `base_models` array. Update existing swarm configs and any code that relied on a custom swarm suffix.
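The discovery side of this scheme is simple to sketch: each preset declares an `id` and a `base_models` array, and IDs of the form `{base_model}-{preset_id}[swarm]` are generated from them. The helper below is illustrative, not the actual `get_all_swarm_model_ids`:

```python
def swarm_model_ids(presets):
    """Generate {base_model}-{preset_id}[swarm] IDs from preset configs.

    Illustrative sketch of preset-based discovery; `presets` is a list of
    dicts with "id" and "base_models", matching the schema in the commit.
    """
    ids = []
    for preset in presets:
        for base in preset.get("base_models", []):
            ids.append(f"{base}-{preset['id']}[swarm]")
    return ids
```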
…te in task checklist

Update docs/HiveMind Task.md to reflect progress on fusion features and documentation.

- Mark "Role context for Arbiter" and "Weight system" as completed in the Fusion Features section.
- Mark "Documentation" and its subitems (User guide, Example configs, API reference) as completed.
- Accept both single-fusion objects and new {"fusions": [...]} array files when loading fusion configs.
- Add _register_fusion helper to centralize validation, duplicate handling, and logging.
- Emit warnings for missing 'id' fields and for non-list 'fusions' entries.
- Return fusion IDs with a "[fusion]" suffix from get_all_fusion_ids and have EnsembleManager detect fusion IDs by that suffix.

BREAKING CHANGE: get_all_fusion_ids now appends "[fusion]" to fusion IDs and EnsembleManager detects fusions via the "[fusion]" suffix. Consumers that expect plain fusion IDs must update to handle or strip the suffix. Config files may use the new {"fusions": [...]} format; single-object configs remain supported.
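A minimal sketch of a loader that accepts both formats and returns suffixed IDs, under the schema described in this commit (validation and logging are simplified; names are illustrative):

```python
def extract_fusions(raw):
    """Accept either a single fusion object or {"fusions": [...]}."""
    if isinstance(raw, dict) and "fusions" in raw:
        entries = raw["fusions"]
        if not isinstance(entries, list):
            return []  # a real loader would warn about a non-list 'fusions'
    else:
        entries = [raw]
    # Drop entries missing an 'id' (a real loader would warn here too).
    return [e for e in entries if isinstance(e, dict) and "id" in e]

def fusion_ids(raw):
    # IDs are exposed with the "[fusion]" suffix, per the breaking change above.
    return [f"{e['id']}[fusion]" for e in extract_fusions(raw)]
```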

@mirrobot-agent mirrobot-agent bot left a comment


Follow-up Review: Consolidated Issues List

Per your request to poll all outstanding problems, I have reviewed the latest incremental changes and checked against previous feedback.

Assessment of New Changes:
The new dynamic model discovery logic and configuration updates in config_loader.py are solid improvements. The support for the {"fusions": [...]} array format adds good flexibility.

Consolidated Issues List:
Here are the specific items that need to be addressed to clear the queue:

  1. Runtime Safety (Critical): The arbiter_response.usage assignment in manager.py (lines 1290 and 1409) is still unprotected. This is a potential crash vector if the response object is strict. (See inline comment).
  2. Naming Ambiguity (Edge Case): The new get_base_model logic has a potential collision where a model name ending in a hyphenated word (like gpt-4-turbo) could be misparsed if turbo exists as a preset. (See inline comment).

Overall Status:
The logic is maturing well. Addressing these two points (especially the safety check) should make this robust enough for the next phase.

This review was generated by an AI assistant.

…entation

Standardize references and headings to use "HiveMind Ensemble" for clarity and consistency across docs.

- Updated: docs/HiveMind Plan.md
- Updated: docs/HiveMind Task.md
- Updated: docs/HiveMind_API.md
- Updated: docs/HiveMind_User_Guide.md
- Updated: src/rotator_library/ensemble_configs/README.md
…mple configs

Refactor swarm configuration loading to use preset files identified by an "id" and "base_models". Changes include:
- Simplify _load_swarm_configs to assume preset-based discovery and defer per-preset loading to get_swarm_config.
- Replace get_swarm_config(model: str) with get_swarm_config(preset_id: str) which loads <preset_id>.json, validates presence of "id" and "base_models", and falls back to the default preset on error or missing file.
- Remove legacy per-model in-memory discovery/merging logic; presets are loaded on-demand.
- Add example configs: fusions/multi-provider-test.json and swarms/test-gemini.json.

BREAKING CHANGE: ConfigLoader.get_swarm_config signature and behavior changed. Previously callers passed a base model name and received a merged config with model-specific overrides applied; callers must now pass a preset ID (filename without .json) and will receive the preset config (with "id" and "base_models") or the default fallback. Migrate existing per-model configs to the preset format or update call sites to use preset IDs and new discovery helpers.
- add roles_dir and role_templates to ConfigLoader and implement _load_roles
- support single-role JSON and new {"roles": [...]} array format; register roles with normalized IDs and warn on duplicates
- expose get_role_template for runtime lookup
- resolve "role_template" in EnsembleManager by merging template with specialist config (specialist overrides template) and fallback to name-based role labels
- include sample role configs: architect, code-reviewer, security-expert
- update dev-team fusion arbiter to use blind=true and clarify note about hiding model names while preserving roles
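The template-merge rule described above (template provides defaults, specialist config overrides) can be sketched as a plain dict merge; the helper is illustrative:

```python
def resolve_role(specialist, role_templates):
    """Resolve a specialist's "role_template" reference.

    Illustrative sketch: template fields act as defaults and any field set
    on the specialist config wins, per the commit description.
    """
    template_id = specialist.get("role_template")
    template = role_templates.get(template_id, {}) if template_id else {}
    merged = dict(template)    # start from the template's defaults
    merged.update(specialist)  # specialist-config fields override
    merged.pop("role_template", None)  # drop the resolved reference
    return merged
```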
Update ConfigLoader to count swarm presets by enumerating JSON files in the swarms directory and log that value instead of using the outdated self.swarm_configs length. This ensures the startup info reflects the actual number of preset files and avoids misleading logs when presets are used.
- remove redundant initialization log in RotatingClient to avoid duplicate messages
- update EnsembleManager logger message from "[HiveMind] EnsembleManager initialized" to "[HiveMind] Ensemble Manager initialized" for readability and consistency

@mirrobot-agent mirrobot-agent bot left a comment


Follow-up Review: Consolidated Issues List

@Mirrowel, as requested, I have analyzed the current codebase against all previous feedback. Good news: most critical issues (Configuration Safety, Model Shadowing, Conflict Resolution) are now RESOLVED.

Here is the consolidated list of the 2 remaining issues that need to be addressed to clear the queue:

  1. Runtime Safety (High Priority):

    • Location: manager.py lines 1303 and 1422.
    • Problem: The injection of hivemind_details into arbiter_response.usage is still unprotected. While you correctly wrapped the field copy loop, the direct assignment of the dictionary itself can still cause a crash if the response object is strict.
    • Action: Wrap these assignments in a try...except block (see inline suggestions).
  2. Naming Ambiguity (Edge Case):

    • Location: manager.py line 126 (get_base_model).
    • Problem: The logic splits on the last hyphen. If a user provides a base model that contains hyphens (e.g., gpt-4-turbo) and the suffix happens to match a valid preset name (e.g., turbo), the parser will incorrectly split it into gpt-4 (base) + turbo (preset).
    • Action: Acknowledge the limitation or implement a stricter separator (e.g., --) or prefix validation.
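One way the suggested prefix validation could look: only treat the trailing segment as a preset when the remaining prefix is itself a known provider model. The inputs (`provider_models`, `preset_ids`) and the function body are illustrative:

```python
def get_base_model(swarm_id, provider_models, preset_ids):
    """Split {base_model}-{preset_id}[swarm] without misparsing hyphenated models.

    Illustrative sketch of the stricter split: the last segment is accepted
    as a preset only if BOTH it matches a known preset AND the prefix is a
    known provider model, so "gpt-4-turbo" is not split into gpt-4 + turbo.
    """
    name = swarm_id.removesuffix("[swarm]")
    if "-" in name:
        base, candidate = name.rsplit("-", 1)
        if candidate in preset_ids and base in provider_models:
            return base, candidate
    return name, "default"  # fall back to the whole name + default preset
```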

Overall Status:
Addressing these two points (specifically the safety check) will complete the review process for this PR.

This review was generated by an AI assistant.


# Add hivemind_details as SUPPLEMENTARY breakdown information
# This does NOT replace standard fields, but provides additional context
arbiter_response.usage.hivemind_details = hivemind_details


This assignment remains unprotected. While the loop above is now safe, this specific line will still crash if arbiter_response.usage does not allow dynamic attribute assignment (e.g., strict Pydantic models). Please wrap this in a try...except block.

Suggested change
arbiter_response.usage.hivemind_details = hivemind_details
try:
    arbiter_response.usage.hivemind_details = hivemind_details
except (AttributeError, TypeError):
    lib_logger.debug("[HiveMind] Could not attach hivemind_details to usage object")


# Add hivemind_details as SUPPLEMENTARY breakdown information
# This does NOT replace standard fields, but provides additional context
arbiter_response.usage.hivemind_details = hivemind_details


Same safety issue here. This assignment needs to be wrapped in a try...except block to prevent potential runtime crashes on strict usage objects.

Suggested change
arbiter_response.usage.hivemind_details = hivemind_details
try:
    arbiter_response.usage.hivemind_details = hivemind_details
except (AttributeError, TypeError):
    lib_logger.debug("[HiveMind] Could not attach hivemind_details to usage object")

Comment on lines +126 to +131
# Parse: {base_model}-{preset_id}
# preset_id is the last segment after the last hyphen
if "-" in swarm_id:
    # Split and check if last segment is a preset ID
    parts = swarm_id.rsplit("-", 1)
    potential_preset = parts[1]


The naming ambiguity issue remains unresolved. Splitting on the last hyphen can misidentify the base model if the model name itself contains hyphens and ends with a word matching a preset ID (e.g. gpt-4-turbo vs turbo preset).

Consider enforcing a double-hyphen separator (--) for presets or validating the base model prefix against _provider_models.

… and filter ".example" artifacts

- Introduce an omit_id_presets registry in ConfigLoader by scanning presets that declare "omit_id": true so short swarm names like "model[swarm]" can be resolved to a concrete preset.
- Add get_preset_for_model(base_model) to encapsulate preset selection (prefer a registered omit_id mapping, otherwise fall back to "default") and emit debug diagnostics during resolution.
- Update discovery routines for swarms, fusions, strategies, and roles to ignore files whose stem ends with ".example" to avoid exposing sample artifacts.
- Change the discovery output to reflect omit_id behavior: emit the compact form "model[swarm]" for presets that omit their id, and the explicit "model-preset[swarm]" for others. Explicit identifiers remain valid at runtime.
- Modify EnsembleManager to return both the resolved base model and the preset id from get_base_model, and use the resolved preset when loading swarm configs; add clearer logging around routing.
- Add omit_id flags to example swarm configs and extra loader logging, including warnings when multiple presets claim the same base model.

BREAKING CHANGE: EnsembleManager.get_base_model now returns a tuple (base_model, preset_id) instead of a single string. Update call sites to unpack the result and pass the preset id when fetching swarm configs.

Migration example:
- Before: base = get_base_model(swarm_id); config = get_swarm_config(base)
- After: base, preset = get_base_model(swarm_id); config = get_swarm_config(preset)
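The omit_id registry and the two discovery forms described above can be sketched together; the class name and methods are illustrative, though `omit_id_presets` and `get_preset_for_model` follow the commit notes:

```python
class PresetRegistry:
    """Sketch of omit_id preset resolution and discovery-ID generation."""

    def __init__(self, presets):
        # presets: list of dicts with "id", "base_models", optional "omit_id"
        self.presets = presets
        self.omit_id_presets = {}
        for p in presets:
            if p.get("omit_id"):
                for base in p.get("base_models", []):
                    # A real loader would warn if two presets claim one base model.
                    self.omit_id_presets[base] = p["id"]

    def get_preset_for_model(self, base_model):
        # Prefer a registered omit_id mapping, otherwise fall back to "default".
        return self.omit_id_presets.get(base_model, "default")

    def discovery_ids(self):
        ids = []
        for p in self.presets:
            for base in p.get("base_models", []):
                if p.get("omit_id"):
                    ids.append(f"{base}[swarm]")            # compact form
                else:
                    ids.append(f"{base}-{p['id']}[swarm]")  # explicit form
        return ids
```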
…, strategies and examples

- Add comprehensive HiveMind Ensemble documentation across README, DOCUMENTATION.md, and docs/ (overview, swarm/fusion modes, arbitration strategies, recursive mode, usage tracking, architecture).
- Document preset-based swarm configuration and discovery rules for /v1/models; clarify runtime vs discovery behavior.
- Add ensemble_configs examples and samples:
  - fusions/fusion.example.json
  - roles/role.example.json
  - roles/roles-array.example.json
  - strategies/strategy.example.txt
  - swarms/preset.example.json
- Update library README and ensemble_configs README to describe presets, role templates, strategy templates, and usage examples.
- Documentation-only changes and sample config files; no runtime code changes or breaking API modifications.

@mirrobot-agent mirrobot-agent bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up Review

I've reviewed the new Preset System implementation. The configuration logic and omit_id discovery features are well-implemented and add significant flexibility to the HiveMind system.

Assessment of New Changes:

  • Presets: The ConfigLoader updates correctly handle omit_id mapping.
  • Documentation: The updated guides clearly explain the new explicit vs. short syntax.

Critical Action Items (Unresolved):
However, the specific issues flagged in the "Consolidated Issues List" remain unaddressed in this iteration. These must be fixed to ensure stability:

  1. Runtime Safety (Blocking): The hivemind_details injection in manager.py (Lines 1307 & 1426) still lacks try...except protection. This is a crash vector for strict usage objects.
  2. Naming Ambiguity: The get_base_model logic (Line 132) creates a collision risk for models ending in words that match preset names (e.g., gemini-1.5-flash matching a flash preset).

Overall Status:
Request Changes. The preset feature is good, but we cannot merge with the known runtime crash risk in the usage reporting logic.

This review was generated by an AI assistant.

# preset_id is the last segment after the last hyphen
if "-" in swarm_id:
    # Split and check if last segment is a preset ID
    parts = swarm_id.rsplit("-", 1)


The naming ambiguity issue remains unresolved and is critical with the new preset system.

If a model name ends with a word that matches a preset ID (e.g., gemini-1.5-flash where flash.json exists), this logic will incorrectly parse it as Base: gemini-1.5, Preset: flash.

Consider enforcing a double-hyphen separator (e.g. model--preset[swarm]) or checking against the known provider model list before assuming a preset split.


# Add hivemind_details as SUPPLEMENTARY breakdown information
# This does NOT replace standard fields, but provides additional context
arbiter_response.usage.hivemind_details = hivemind_details


⚠️ Runtime Safety Risk (Unresolved)

This assignment is still unprotected. If arbiter_response.usage is a strict object (like a Pydantic model) that doesn't allow arbitrary attribute assignment, this will raise an AttributeError and crash the request.

Suggested change
arbiter_response.usage.hivemind_details = hivemind_details
try:
    arbiter_response.usage.hivemind_details = hivemind_details
except (AttributeError, TypeError):
    lib_logger.debug("[HiveMind] Could not attach hivemind_details to usage object")


# Add hivemind_details as SUPPLEMENTARY breakdown information
# This does NOT replace standard fields, but provides additional context
arbiter_response.usage.hivemind_details = hivemind_details


⚠️ Runtime Safety Risk (Unresolved)

Same safety issue here. This must be wrapped in a try...except block to prevent crashes with strict usage objects.

Suggested change
arbiter_response.usage.hivemind_details = hivemind_details
try:
    arbiter_response.usage.hivemind_details = hivemind_details
except (AttributeError, TypeError):
    lib_logger.debug("[HiveMind] Could not attach hivemind_details to usage object")
