feat: Introduce HiveMind Ensemble Subsystem for Swarm/Fusion Orchestration #8
Conversation
…k list Add two new documentation files: docs/HiveMind Plan.md (detailed architecture and implementation plan for Swarm/Fusion orchestration) and docs/HiveMind Task.md (phase-by-phase task checklist). Includes terminology, request flow, core components, config schemas, feature specs (jitter, adversarial, blind switch, recursive mode), streaming and usage tracking, integration points, error handling, testing strategy, and implementation phases to guide development and verification.
…ault configs Introduce a new HiveMind ensemble subsystem to enable Swarm/Fusion orchestration. - add src/rotator_library/ensemble: EnsembleManager, ConfigLoader, and package init - add ensemble_configs: default swarm config, fusion example (dev-team), and strategy templates - integrate EnsembleManager into RotatingClient to initialize the ensemble manager at startup - implement conflict resolution, config loading, and ensemble detection logic; swarm/fusion execution paths staged as TODOs This lays the groundwork for parallel model execution and intelligent arbitration without changing existing public APIs.
Detect ensemble model identifiers in RotatingClient and delegate handling to self.ensemble_manager.handle_request, enabling HiveMind (swarm/fusion) request routing. Preserve the existing iflow provider workaround (removing stream_options to avoid HTTP 406). Add documentation for ensemble configurations at src/rotator_library/ensemble_configs/README.md.
I'm starting my review of the HiveMind Ensemble foundation. Diving into the new subsystem and integration points now.
Overall Assessment
This PR lays a solid foundation for the HiveMind subsystem. The directory-based configuration strategy (ensemble_configs) is a great choice for extensibility, allowing users to easily share and drop in new Swarm/Fusion definitions. The integration into `RotatingClient` is clean and unobtrusive.
Architectural Feedback
The separation of concerns between the EnsembleManager and the core client is well-handled. The decision to intercept requests based on the model ID (suffix or exact match) is standard for this type of proxy, though the conflict resolution logic warrants a second look to ensure we don't accidentally make real provider models inaccessible.
Key Suggestions
- Config Safety: Ensure the default configuration is deep-copied to prevent state leakage between requests (see comment in `config_loader.py`).
- Conflict Resolution: Clarify the intended precedence when a provider model name collides with an ensemble ID.
Nitpicks
- `_load_provider_models` is currently a placeholder; ensure this is tracked in your Phase 2 tasks so conflict detection actually activates.
This review was generated by an AI assistant.
```python
        Merged configuration dictionary
        """
        # Start with default
        config = self.swarm_default.copy() if self.swarm_default else {}
```
`self.swarm_default.copy()` creates a shallow copy. Since `swarm_default` contains nested dictionaries (e.g., `temperature_jitter`), modifying nested fields in the returned configuration could inadvertently mutate the global default configuration for subsequent requests.
Consider using `copy.deepcopy()` to ensure full isolation:
```diff
-config = self.swarm_default.copy() if self.swarm_default else {}
+import copy
+config = copy.deepcopy(self.swarm_default) if self.swarm_default else {}
```
```python
def resolve_conflicts(self, ensemble_id: str) -> str:
    """
    Resolve naming conflicts by appending numeric suffixes.

    If an ensemble ID conflicts with a real provider model,
    append -1, -2, -3, etc. until unique.

    Args:
        ensemble_id: Original ensemble ID (swarm or fusion)

    Returns:
        Resolved unique ensemble ID
    """
    # Check cache first
    if ensemble_id in self._resolved_names:
        return self._resolved_names[ensemble_id]

    # Load provider models if not cached
    if self._provider_models is None:
        self._load_provider_models()

    # Check for conflict
    if ensemble_id not in self._provider_models:
        # No conflict, use original
        self._resolved_names[ensemble_id] = ensemble_id
        return ensemble_id

    # Conflict detected, find available suffix
    counter = 1
    while True:
        candidate = f"{ensemble_id}-{counter}"
        if candidate not in self._provider_models:
            lib_logger.warning(
                f"[HiveMind] Naming conflict detected. "
                f"Renamed '{ensemble_id}' to '{candidate}'"
            )
            self._resolved_names[ensemble_id] = candidate
            return candidate
        counter += 1
        # Safety check (shouldn't happen in practice)
        if counter > 100:
            lib_logger.error(
                f"[HiveMind] Could not resolve naming conflict for '{ensemble_id}' "
                f"after 100 attempts"
            )
            return f"{ensemble_id}-{counter}"
```
The current conflict resolution logic renames the incoming ensemble request ID if a conflict is found. This effectively "shadows" the underlying provider model.
For example, if a provider actually offers a model named `gpt-4o[swarm]`, this logic (combined with `is_ensemble` returning `True`) ensures the HiveMind subsystem intercepts the request. The user would have no way to bypass HiveMind and access the provider's native `gpt-4o[swarm]` model directly. Is this the intended precedence?
…rallel execution
Add a private _prepare_drones method to EnsembleManager to build configurations for parallel/swarm execution.
- Creates N identical copies of the original request parameters (default count = 3) and enforces the base_model.
- Deep-copies message payloads to prevent mutation and attaches _drone_index and _total_drones metadata for logging.
- Emits debug logs for each prepared drone.
- Leaves hooks for advanced jitter/adversarial behavior to be implemented in Phase 4.
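The drone-preparation steps described above can be sketched as a standalone function (the function name, signature, and defaults here are illustrative, not the PR's actual code):

```python
import copy

def prepare_drones(request: dict, base_model: str, count: int = 3) -> list[dict]:
    """Build N independent drone configs from one request (illustrative sketch)."""
    drones = []
    for i in range(count):
        drone = dict(request)                  # clone top-level params
        drone["model"] = base_model            # enforce the base model
        # Deep-copy messages so one drone's mutation can't leak into another
        if "messages" in request:
            drone["messages"] = copy.deepcopy(request["messages"])
        drone["_drone_index"] = i              # metadata for logging
        drone["_total_drones"] = count
        drones.append(drone)
    return drones

drones = prepare_drones(
    {"model": "x[swarm]", "messages": [{"role": "user", "content": "hi"}]}, "x"
)
assert len(drones) == 3 and all(d["model"] == "x" for d in drones)
assert drones[0]["messages"] is not drones[1]["messages"]  # isolated payloads
```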
Introduce async _execute_parallel to run drone requests concurrently using asyncio.gather, assemble tasks via RotatingClient retry logic, aggregate usage metrics (prompt_tokens, completion_tokens, total_tokens and optional fields), log per-drone outcomes, and return successful responses with aggregated usage. Raises if all drones fail and warns when some fail. Add _format_for_arbiter to normalize and number successful drone responses for the arbiter, extracting OpenAI-style content, skipping empty responses, and producing a single formatted text blob for arbitration. These helpers prepare HiveMind ensemble requests for parallel execution and downstream arbitration.
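A minimal sketch of the gather-based fan-out described above, using a stand-in coroutine in place of the real RotatingClient retry call; the partial-failure and usage-aggregation behavior mirrors the commit's description:

```python
import asyncio

async def fake_drone_call(i: int) -> dict:
    # Stand-in for the real per-drone completion call
    if i == 1:
        raise RuntimeError("drone 1 failed")
    return {"content": f"answer {i}",
            "usage": {"prompt_tokens": 10, "completion_tokens": 5}}

async def execute_parallel(count: int = 3):
    # return_exceptions=True lets surviving drones finish even if some fail
    results = await asyncio.gather(*(fake_drone_call(i) for i in range(count)),
                                   return_exceptions=True)
    successes = [r for r in results if not isinstance(r, Exception)]
    if not successes:
        raise RuntimeError("All drones failed")
    usage = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
    for r in successes:
        for key, value in r["usage"].items():
            usage[key] = usage.get(key, 0) + value
    usage["total_tokens"] = usage["prompt_tokens"] + usage["completion_tokens"]
    return successes, usage

successes, usage = asyncio.run(execute_parallel())
assert len(successes) == 2          # drone 1 failed, 0 and 2 succeeded
assert usage["total_tokens"] == 30
```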
… execution flow Add _build_arbiter_prompt to construct system/user messages from strategy templates and original messages, and _call_arbiter to invoke a non-streaming arbiter model via RotatingClient and extract usage. Complete the swarm branch in handle_request to: - prepare drones and execute them in parallel - format drone responses for the arbiter - build and resolve arbiter messages and model (handle "self" -> base model) - call the arbiter and aggregate token usage (including cached_tokens and reasoning_tokens when present) - attach aggregated usage back to the arbiter response and return it Fallbacks and logging added for missing strategy templates and for observability. Fusion mode remains NotImplementedError (Phase 5).
Follow-up Review
I see you've wired up the core Swarm execution logic (Phase 2F). The structure for parallel drone execution and arbitration looks solid!
However, I've identified a critical blocking bug that will cause the new code to crash immediately upon execution, along with a few unaddressed points from the previous review.
Assessment of New Changes:
- Crash in Parallel Execution: The call to `_execute_with_retry` passes `api_call=None`, which is incompatible with the client's retry logic.
- Configuration Safety: The shallow copy issue in `config_loader.py` remains unaddressed, posing a risk of state mutation.
- Model Shadowing: The conflict resolution logic handles internal uniqueness but does not solve the interception issue. Since `is_ensemble()` returns `True` before checking if the model exists natively, HiveMind will still hijack requests intended for a provider's native model named `foo[swarm]`. To fix this, `is_ensemble()` should return `False` if the model ID exists in `_provider_models`.
Overall Status:
Changes Requested. The code requires a fix for the TypeError crash before it can be run or merged.
This review was generated by an AI assistant.
Import litellm and pass its `acompletion` function to `rotating_client._execute_with_retry` when creating drone tasks. Remove the unused `asyncio` import. This fixes passing `None` as the api_call and ensures drones invoke the litellm API correctly. addresses #8 (comment)
Add async _call_arbiter_streaming to stream arbiter model responses, track per-stream usage, and emit a final _hivemind_usage metadata chunk. Add async _handle_swarm_streaming to execute drones in parallel, format responses for the arbiter, stream the arbiter output, aggregate drone and arbiter token usage, and inject aggregated usage into the final SSE chunk(s). Uses rotating_client._streaming_acompletion_with_retry and preserves existing non-streaming flows.
…credential rotation Normalize the model identifier up front by calling _resolve_model_id before entering the credential rotation loops and write the canonical value into kwargs["model"]. This eliminates per-iteration model lookups and duplicated log entries. - remove inline resolution within rotation loops - propagate the normalized model to litellm kwargs for consistent acquire/release/tracking - reduces redundant work and log noise, and stabilizes model reporting across retries
Follow-up Review
I've reviewed the incremental changes, focusing on the new streaming implementation in manager.py and the refactoring in client.py.
Assessment of New Changes:
- Streaming Support: The new `_handle_swarm_streaming` and `_call_arbiter_streaming` methods provide a good foundation for Phase 3. The usage aggregation logic looks correct.
- Model Resolution: Moving `_resolve_model_id` earlier in `client.py` improves consistency.
Key Concerns (Unresolved & New):
- Mutable Default Configs (Critical): The feedback about using `copy.deepcopy` in `config_loader.py` was not addressed. Furthermore, this pattern has propagated to `manager.py`'s streaming handler. This is a significant risk: modifying a shallow copy of the configuration will mutate the global defaults for all subsequent requests.
- Shadowing Logic: The conflict resolution strategy in `EnsembleManager` renames the internal request ID but does not solve the issue of the client intercepting the request first. Users still cannot access a provider model if it conflicts with an ensemble naming convention.
Overall Status:
The streaming logic is on the right track, but the configuration management needs to be fixed to avoid state pollution.
This review was generated by an AI assistant.
```python
    # Check for conflict
    if ensemble_id not in self._provider_models:
        # No conflict, use original
```
I still have concerns about the shadowing logic.
While `resolve_conflicts` renames the internal ensemble ID (e.g. to `model[swarm]-1`), the interception in `client.py` (`if self.ensemble_manager.is_ensemble(model)`) happens before this resolution.
If a provider has a model named `gpt-4o[swarm]`, `is_ensemble` will return `True`, `RotatingClient` will delegate to `EnsembleManager`, and the user will never be able to reach the actual provider model. This conflict resolution protects the Manager from crashing, but it effectively "shadows" the provider model from the user's perspective.
…=true Introduce an `is_streaming` flag from kwargs and include it in the swarm processing log. When `stream` is true, route execution to `_handle_swarm_streaming` (returning an async generator). Otherwise preserve the existing non-streaming flow (prepare drones, execute in parallel, call arbiter, aggregate and attach usage). This enables streaming arbiter/swarm handling while keeping backward-compatible non-streaming behavior.
Add handling for new ensemble config options: `temperature_jitter` and `adversarial_config`. This enables:
- optional temperature jitter per drone (configurable `delta`, default 0.2), applied via random uniform jitter and clamped to [0.0, 2.0]
- optional adversarial mode where the last N drones inject a provided adversarial system prompt
- per-drone metadata (`_is_adversarial`, `_drone_index`, `_total_drones`) and enhanced debug logging for jitter and adversarial injections

These changes allow controlled diversity of drone behavior and targeted adversarial prompts for critical analysis without changing existing call signatures.
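The jitter-and-clamp behavior described in this commit can be sketched as follows (the function name and shape are illustrative, not the PR's actual code):

```python
import random

def apply_jitter(base_temp: float, delta: float = 0.2) -> float:
    """Apply uniform temperature jitter, clamped to the valid [0.0, 2.0] range."""
    jittered = base_temp + random.uniform(-delta, delta)
    return max(0.0, min(2.0, jittered))

random.seed(0)  # deterministic for the demo
temps = [apply_jitter(1.0) for _ in range(100)]
assert all(0.8 <= t <= 1.2 for t in temps)          # within +/- delta of the base
assert all(0.0 <= apply_jitter(0.05) <= 2.0 for _ in range(100))  # clamp holds near 0
```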
Follow-up Review
I've reviewed the incremental changes adding adversarial mode, temperature jitter, and streaming support. The implementation of these new features is generally sound and aligns with the planned architecture.
Assessment of New Changes:
- Features: The adversarial and jitter logic is implemented correctly within the drone preparation phase.
- Streaming: The split between streaming and non-streaming handlers (`_handle_swarm_streaming`) is a logical approach.
- Unaddressed Feedback: I noticed that the feedback from the previous review regarding model shadowing, configuration safety (deepcopy), and import placement has not been addressed. In fact, the new code introduces additional local imports inside loops.
Overall Status:
The functional logic is good, but the code quality issues (imports, safety) and the architectural shadowing behavior should be resolved before merging to ensure maintainability and correctness.
This review was generated by an AI assistant.
```python
    if self._provider_models is None:
        self._load_provider_models()

    # Check for conflict
```
The shadowing issue previously raised remains unresolved. Because `is_ensemble` checks the name before any conflict resolution happens, a provider's native model named `model[swarm]` is effectively unreachable: the `EnsembleManager` will always intercept it. This logic handles internal collisions but doesn't solve the user-facing availability problem.
…orts Introduce a blind switch in arbiter formatting that defaults to anonymizing model names (blind_mode=true). When blind mode is disabled, model names are included in response labels. Hoist frequently used imports (litellm, asyncio, random, copy) to module scope and remove redundant local imports to reduce repeated imports and clarify code paths. Add debug logging for anonymization and include blind_mode in formatted output.
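A sketch of the blind-switch labeling described above; the label format here is invented for illustration, and the real logic lives inside `_format_for_arbiter`:

```python
def label_responses(responses: list[dict], blind_mode: bool = True) -> str:
    """Number drone responses, anonymizing model names unless blind mode is off."""
    lines = []
    for i, resp in enumerate(responses, start=1):
        name = f"Response {i}" if blind_mode else f"Response {i} ({resp['model']})"
        lines.append(f"--- {name} ---\n{resp['content']}")
    return "\n\n".join(lines)

responses = [{"model": "gpt-4o", "content": "A"},
             {"model": "claude", "content": "B"}]
blind = label_responses(responses)                 # default: anonymized
assert "gpt-4o" not in blind
assert "gpt-4o" in label_responses(responses, blind_mode=False)
```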
Follow-up Review
I've reviewed the latest changes. The cleanup of imports and the addition of `blind_mode` logic in `manager.py` look good. However, the critical issues raised in the previous review regarding configuration safety and model shadowing remain unresolved.
Assessment of New Changes:
- Improvements: The import organization in `manager.py` is much cleaner. The `blind_mode` logic is a good addition for the privacy requirements.
- Persistent Risks:
  - Configuration Mutation: `config_loader.py` still uses shallow copies. This is a latent bug that could cause request cross-contamination.
  - Model Shadowing: The `is_ensemble` logic still prioritizes HiveMind interception over native provider models.
  - Conflict Resolution: I noticed that `_load_provider_models` is currently a no-op, which means the conflict resolution logic (while present) is effectively disabled.
Overall Status:
The PR needs to address the configuration safety issue before merge to prevent runtime bugs. The shadowing and conflict resolution issues should at least be acknowledged or documented if they are deferred to a later phase, though fixing the precedence now is recommended.
This review was generated by an AI assistant.
```python
def is_ensemble(self, model_id: str) -> bool:
    """
    Check if a model ID represents an ensemble request.

    Args:
        model_id: Full model ID from user request

    Returns:
        True if this is an ensemble (swarm or fusion), False otherwise
    """
    # Check for fusion ID (exact match)
    if model_id in self.config_loader.fusion_configs:
        return True

    # Check for swarm suffix
    if self._is_swarm_request(model_id):
        return True

    return False
```
The shadowing issue previously raised remains unresolved. `is_ensemble` returns `True` based solely on the suffix/ID match, which happens before any conflict resolution logic.
This means if a provider has a native model named `gpt-4o[swarm]`, this method will return `True`, and the `EnsembleManager` will intercept the request. The user will be unable to reach the provider's native model.
To fix this, `is_ensemble` should ideally check if the model ID exists in `self._provider_models` (once populated) and return `False` if it's a native model, allowing `RotatingClient` to handle it normally.
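The suggested precedence fix can be sketched as a standalone function for clarity (the `[swarm]` suffix convention is inferred from this thread, and the real method lives on EnsembleManager with internal state rather than parameters):

```python
def is_ensemble(model_id: str, fusion_configs: dict, provider_models: set) -> bool:
    """Ensemble check with native-model precedence: a real provider model wins."""
    if model_id in provider_models:     # native model exists: never intercept
        return False
    if model_id in fusion_configs:      # exact fusion ID match
        return True
    return model_id.endswith("[swarm]") # swarm suffix convention

provider_models = {"gpt-4o[swarm]"}     # hypothetical native model with a clashing name
assert is_ensemble("gpt-4o[swarm]", {}, provider_models) is False  # reaches provider
assert is_ensemble("gpt-4o[swarm]", {}, set()) is True             # no clash: HiveMind
assert is_ensemble("dev-team", {"dev-team": {}}, provider_models) is True
```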
```python
def _load_provider_models(self) -> None:
    """
    Load all provider models from RotatingClient.

    This is used for conflict detection.
    """
    try:
        # Get all available models (this might be async in the actual implementation)
        # For now, we'll use a synchronous approach
        # TODO: Handle async model loading properly
        self._provider_models = set()

        # Note: This will be implemented properly when we integrate with RotatingClient
        # For now, just initialize an empty set
        lib_logger.debug("[HiveMind] Provider models cache initialized (empty)")

    except Exception as e:
        lib_logger.error(f"[HiveMind] Failed to load provider models: {e}")
        self._provider_models = set()
```
I notice `_provider_models` is currently initialized as an empty set and never populated with actual provider data. This renders the `resolve_conflicts` logic (lines 120-167) effectively non-functional, as it will never detect a collision.
While this is a "Foundation" PR, shipping the conflict resolution logic in a state where it cannot work might be misleading. Consider implementing a basic fetch from rotating_client or adding a clear TODO/warning that conflict detection is currently disabled.
Add a private `_prepare_fusion_models` method in EnsembleManager to build per-specialist model configs for fusion execution. - Iterates configured specialists and skips entries missing a model. - Clones request params and deep-copies messages when present. - Injects role-specific system prompts as the first message when provided. - Attaches specialist metadata (`_specialist_index`, `_specialist_role`, `_specialist_weight`, `_total_specialists`) to each model config. - Emits debug and warning logs for visibility. This prepares a list of ready-to-run specialist model parameter sets for the fusion pipeline.
Focus on reporting new issues. All these will be addressed later - they are not critical.
…ting, streaming, and usage aggregation
Implement full Fusion (Phase 5) flow in EnsembleManager:
- add optional specialist_metadata param to _format_for_arbiter and emit role-aware labels (role or "role (model)" depending on blind mode).
- prepare and execute specialist models in parallel, format their outputs for an arbiter, and build arbiter messages.
- support both streaming and non-streaming arbiter calls; stream yields intermediate chunks and aggregates usage when final chunk arrives.
- aggregate usage metrics across specialists and arbiter (prompt_tokens, completion_tokens, total_tokens, plus optional cached_tokens/reasoning_tokens) and attach aggregated usage to returned/streamed responses.
- add detailed logging for fusion lifecycle and usage totals.
- update fusion config handling to load specialists and pass specialist metadata into formatting/execution.

Also update example fusion config (dev-team.json):
- rename "models" -> "specialists"
- replace "system_prompt_append" with "system_prompt"
- normalize "weight" to numeric values
- set arbiter.blind to false by default with explanatory note

BREAKING CHANGE: Fusion configuration schema changed and may break existing fusion configs and integrations. Migration:
- Rename top-level fusion arrays from "models" to "specialists".
- Replace per-specialist key "system_prompt_append" with "system_prompt".
- Convert "weight" values from descriptive strings to numeric weights (e.g., 1.0, 1.2).
- Review arbiter.blind semantics (default now false); adjust configs if anonymous responses were previously required.
- If any external callers invoked the old NotImplemented fusion path or relied on the previous _format_for_arbiter behavior, update calls to accommodate specialist_metadata and new labeling.
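The migration steps for existing fusion configs could be scripted roughly like this; the mapping from descriptive weight strings to numbers is an invented example, so substitute values that match your actual configs:

```python
def migrate_fusion_config(old: dict) -> dict:
    """Migrate a pre-change fusion config to the new specialist schema (sketch)."""
    new = dict(old)
    specialists = []
    for spec in new.pop("models", []):        # "models" -> "specialists"
        spec = dict(spec)
        if "system_prompt_append" in spec:    # per-specialist key rename
            spec["system_prompt"] = spec.pop("system_prompt_append")
        weight = spec.get("weight")
        if isinstance(weight, str):           # descriptive -> numeric weight (assumed map)
            spec["weight"] = {"low": 0.8, "normal": 1.0, "high": 1.2}.get(weight, 1.0)
        specialists.append(spec)
    new["specialists"] = specialists
    return new

old = {"models": [{"model": "m",
                   "system_prompt_append": "You review code.",
                   "weight": "high"}]}
migrated = migrate_fusion_config(old)
assert migrated["specialists"][0]["weight"] == 1.2
assert migrated["specialists"][0]["system_prompt"] == "You review code."
```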
Update checklist in docs/HiveMind Task.md to reflect implemented work and current priorities. - Mark Phase 1 items (design, EnsembleManager creation, client integration, ensemble_config.json) as complete - Mark Phase 2 swarm features and associated tests as complete - Mark Phase 3 temperature jitter, adversarial mode, and blind switch as complete - Move Confidence Scoring to Recursive Mode and leave it pending
… routing - add async `_handle_fusion_streaming` to execute specialists in parallel, format responses for the arbiter, stream arbiter chunks, and aggregate usage (including `cached_tokens` and `reasoning_tokens`) - route streaming fusion in `handle_request` to the new handler and remove duplicated streaming logic - aggregate and inject total usage for non-streaming fusion responses and improve logging of fusion usage
Follow-up Review
The Fusion mode implementation has been reviewed and appears robust, correctly following the architectural patterns established in the Swarm implementation.
Assessment of New Changes:
- Fusion Logic: The implementation of `_handle_fusion_streaming` and the integration into `handle_request` aligns perfectly with the design plan.
- Safety: I noticed `_prepare_fusion_models` uses `copy.deepcopy` for the messages list. This is excellent and prevents the reference modification issues flagged in other parts of the codebase.
- Streaming: The usage aggregation logic in the streaming handler correctly accounts for both specialist and arbiter tokens.
Overall Status:
The new changes look good and are ready to merge.
Note: As discussed, the previously identified issues regarding global config shallow copies and provider model shadowing remain as tracked technical debt to be addressed in a future polish phase.
This review was generated by an AI assistant.
Introduce Phase 6 recursive/autonomous protocol for the arbiter: - Append recursive instructions to the arbiter system prompt when config.recursive_mode.enabled is true. - Honor configurable consensus_threshold and include autonomous decision phases (assessment, decision, synthesis). - Stream handler now filters content wrapped in [INTERNAL]...[/INTERNAL], buffers internal reasoning, and only yields user-facing synthesis to callers. - Add _log_recursive_markers helper to parse and log consensus scores, conflicts, and critique reasoning; improve related logging and default strategy fallback.
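For illustration only, here is a sketch of what parsing the buffered internal reasoning might look like. The actual marker format used by `_log_recursive_markers` is not shown in this thread, so the `CONSENSUS:`/`CONFLICT:` patterns below are hypothetical:

```python
import re

def parse_recursive_markers(internal_text: str) -> dict:
    """Extract a consensus score and conflicts from buffered [INTERNAL] text (sketch)."""
    result = {}
    # Hypothetical marker format: "CONSENSUS: 0.85" and "CONFLICT: <description>"
    score = re.search(r"CONSENSUS:\s*([0-9.]+)", internal_text)
    if score:
        result["consensus"] = float(score.group(1))
    result["conflicts"] = re.findall(r"CONFLICT:\s*(.+)", internal_text)
    return result

markers = parse_recursive_markers(
    "CONSENSUS: 0.85\nCONFLICT: drones disagree on API choice"
)
assert markers["consensus"] == 0.85
assert len(markers["conflicts"]) == 1
```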
Follow-up Review
I've reviewed the incremental changes implementing the Recursive Mode and streaming filters.
Assessment of New Changes:
The Phase 6 implementation introduces a critical logic error in the streaming handler (`_handle_swarm_streaming`) that causes data loss. The current implementation discards parts of the response that share a chunk with the `[INTERNAL]` or `[/INTERNAL]` tags. This must be fixed before merge.
Previous Feedback Status:
As requested, I am not re-posting inline comments for previous findings, but I must note that they remain unaddressed and are still valid concerns:
- Unsafe Shallow Copies: `config_loader.py` and `manager.py` still use shallow copies, risking global state mutation.
- Model Shadowing: `is_ensemble` still shadows provider models with the same name.
- Broken Conflict Resolution: `_provider_models` is still never populated, rendering conflict detection dead code.
Overall Status:
Request Changes due to the new data loss bug in the streaming logic.
This review was generated by an AI assistant.
@mirrobot-agent Can you check the current code status vs all the reviews and list them all here for me to fix. Goal here is to poll all problems I need to address in easy to read and parse form. Don't need code suggestions, just problem statements with references to where and how. If one of the reviews is no longer relevant - resolve it by marking it resolved.
@Mirrowel, I'm analyzing the current code state against the previous reviews to compile the consolidated list of issues you requested. I'll post the summary shortly.
@Mirrowel, I have analyzed the current code against the previous reviews. Here is the consolidated list of unresolved issues you requested.
1. Critical Data Loss in Streaming Logic (Internal Tags). Location:
2. Unsafe Shallow Copy of Configurations. Location:
3. Provider Model Shadowing. Location:
4. Conflict Resolution is Non-Functional. Location:

This summary was generated by an AI assistant.
…dle [INTERNAL] markers Fixes #8 (comment) - Use copy.deepcopy for swarm defaults, request cloning, and config copies to avoid shared mutable state and accidental cross-request mutations. - Initialize and populate provider model cache from rotating_client.model_definitions (and call loader on init/lazily) to prevent provider model shadowing when detecting ensembles. - Improve streaming handling of `[INTERNAL]...[/INTERNAL]` by buffering internal segments, logging internal content via _log_recursive_markers, and yielding surrounding content correctly. - Move/remove redundant local regex import and add explicit re import at module level for consistency.
Follow-up Review: All Critical Issues Resolved
I have reviewed the latest changes and am pleased to confirm that all critical issues raised in the previous review have been addressed effectively.
Assessment of Fixes:
- Critical Data Loss in Streaming: The rewritten `[INTERNAL]` tag parsing logic in `_call_arbiter_streaming` is now robust. It correctly handles content before/after tags and buffers incomplete tags across chunks, preventing the data loss scenario.
- Unsafe Shallow Copy: The switch to `copy.deepcopy()` across `config_loader.py` and `manager.py` ensures that the global default configuration remains immutable, protecting the application state.
- Provider Model Shadowing: The `is_ensemble` method now correctly checks against `self._provider_models` before claiming a request. This ensures that native provider models (e.g., `gpt-4o[swarm]` if it existed) remain accessible to users.
- Conflict Resolution: The `_load_provider_models` method is now fully implemented, populating the provider cache from `RotatingClient`. This makes the conflict detection and resolution logic functional.
Overall Status:
The PR is now in excellent shape. The "HiveMind" ensemble foundation is solid, safe, and ready for merging.
This review was generated by an AI assistant.
```python
            # Handle [INTERNAL] start
            if '[INTERNAL]' in content:
                parts = content.split('[INTERNAL]')
```
Verified: This splitting logic correctly handles the `[INTERNAL]` tag boundaries and prevents the data loss issue I flagged previously. The buffering ensures partial tags are also handled correctly.
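A simplified sketch of tag-boundary-safe filtering in the spirit of the fix above; it assumes tag strings arrive unsplit within a chunk (the merged fix additionally buffers partial tags that straddle chunk boundaries):

```python
def filter_internal(chunks):
    """Yield only user-facing text, buffering [INTERNAL]...[/INTERNAL] spans.

    Assumes tag strings arrive unsplit within a chunk; the full fix also
    handles tags split across chunk boundaries.
    """
    inside = False
    internal = []  # buffered reasoning, e.g. to hand off to a marker logger
    for chunk in chunks:
        while chunk:
            if not inside:
                head, tag, chunk = chunk.partition("[INTERNAL]")
                if head:
                    yield head      # keep user-facing text sharing the chunk
                inside = bool(tag)
            else:
                hidden, tag, chunk = chunk.partition("[/INTERNAL]")
                internal.append(hidden)
                inside = not tag    # leave internal mode once the close tag appears

visible = "".join(filter_internal(
    ["Before [INTERNAL]hidden", " reasoning[/INTERNAL] After"]
))
assert visible == "Before  After"   # nothing lost around the tags
```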
…king Replace fixed token summation with dynamic aggregation of all numeric usage fields across drone, specialist, and arbiter flows. Track execution start/end times, compute latency (ms), and attempt cost calculation via litellm. Attach a supplementary `hivemind_details` breakdown to `arbiter_response.usage` containing mode, counts, token breakdown, rounded cost, and latency. Iterate safely over usage attributes (skip private/magic, non-numeric, and handle set/get errors) and log latency/cost info. Also add a time import and defensive exception handling for cost calculation.
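The dynamic aggregation described above can be sketched as follows (dict-based for brevity; the real code iterates usage-object attributes and also skips private/magic names defensively):

```python
def aggregate_usage(usages: list[dict]) -> dict:
    """Sum every numeric usage field dynamically instead of hardcoding key names."""
    totals: dict[str, float] = {}
    for usage in usages:
        for key, value in usage.items():
            # Skip private keys, booleans, and non-numeric values
            if key.startswith("_") or isinstance(value, bool) \
                    or not isinstance(value, (int, float)):
                continue
            totals[key] = totals.get(key, 0) + value
    return totals

drone = {"prompt_tokens": 10, "completion_tokens": 5, "total_tokens": 15}
arbiter = {"prompt_tokens": 40, "completion_tokens": 20, "total_tokens": 60,
           "reasoning_tokens": 7, "model": "arbiter-x"}  # string field is skipped
totals = aggregate_usage([drone, arbiter])
assert totals == {"prompt_tokens": 50, "completion_tokens": 25,
                  "total_tokens": 75, "reasoning_tokens": 7}
```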
…e context for arbiter Add support for specialist weight descriptions and include specialist expertise context when building arbiter prompts. - Extract `weight_description` from specialist configs and attach it to model params as `_specialist_weight_description`. - Extend `EnsembleManager.build_arbiter_messages` to accept optional `specialist_metadata` and append a "SPECIALIST EXPERTISE" block to the strategy prompt that lists each specialist's role, model, and weight description; log debug info when added. - Update dev-team fusion config to include `weight_description` entries for existing specialists. No breaking changes.
…ongside formatted responses Collect specialist metadata (role, model, weight_description) during arbiter formatting and return it together with the formatted text so arbiter prompts can include specialist expertise context in fusion mode. - _format_for_arbiter now accumulates arbiter_metadata for each specialist when specialist_metadata is provided. - Returns a tuple (formatted_text, metadata_for_arbiter) where metadata_for_arbiter is None for swarm mode or a list of dicts for fusion mode. - Call sites are updated to unpack the tuple and forward specialist metadata into _build_arbiter_prompt. BREAKING CHANGE: _format_for_arbiter previously returned a plain formatted string; it now returns a tuple (formatted_text, metadata_for_arbiter). Update any external callers to unpack the returned tuple and, if needed, pass the metadata as the specialist_metadata argument to _build_arbiter_prompt.
…sting Previously get_all_available_models omitted HiveMind fusion models from the returned provider list. When an ensemble_manager is available, the method now fetches fusion IDs via ensemble_manager.config_loader.get_all_fusion_ids(), adds them under the "hivemind_fusion" provider key, and logs the number of fusion models added.
- Add comprehensive HiveMind_API.md covering EnsembleManager, ConfigLoader, response usage (standard + hivemind_details), configuration schemas (swarms/fusions/strategies), error handling, logging, advanced usage, migration, performance, and limitations
- Add HiveMind_User_Guide.md with quick start, swarm and fusion workflows, configuration examples, arbitration strategies, streaming support, usage & cost tracking, best practices, and troubleshooting
- Update docs/HiveMind Task.md to reflect implementation progress:
  - fusion: `_prepare_fusion_models`, role assignment, partial arbiter context/weights parsing
  - recursion: single-call autonomous mode, consensus/conflict parsing, `_trigger_round_2()` replaced by Autonomous Decision Protocol
  - polish: partial failure handling, infinite recursion prevention, latency/token logging, and rate limit mitigation marked implemented
Follow-up Review: Phase 6 & Polish
I have reviewed the latest updates, which include significant enhancements to usage tracking, fusion metadata, and comprehensive documentation.
Assessment of New Changes:
- Documentation: The new `HiveMind_User_Guide.md` and `HiveMind_API.md` are excellent additions that clearly explain the complex features.
- Usage Aggregation: The refactoring to dynamically sum all numeric usage fields (instead of hardcoding them) is a great move for future-proofing.
- Observability: The addition of `hivemind_details` provides much-needed visibility into cost and latency without polluting standard metrics.
- Safety Note: I've flagged a potential runtime risk regarding the injection of `hivemind_details` into the response object. If the underlying library returns strict objects, this could raise an `AttributeError`.
Overall Status:
The PR is in great shape. The safety note is a minor guardrail suggestion; assuming it passes your local tests, the logic is sound.
This review was generated by an AI assistant.
```python
arbiter_usage = chunk["_hivemind_usage"]
continue  # Don't yield metadata chunk
# ...
# For SSE chunks, check if this is the final chunk with usage
```
This dynamic aggregation of usage fields is a smart improvement! It makes the code much more robust to changes in the underlying library structures.
…andling
- add ConfigLoader.get_all_swarm_model_ids to discover swarm model variants from preset JSONs (generates IDs like {base_model}-{preset_id}[swarm])
- include discovered hivemind_swarm entries in RotatingClient's available models listing and add informative logging
- update EnsembleManager to detect `[swarm]` suffix and extract base model by recognizing preset IDs (checks swarms_dir/{preset_id}.json)
- update default swarm config to include `id` and `base_models` for preset-based discovery
BREAKING CHANGE: swarm suffix customization via `swarm_default.suffix` is no longer honored. Swarm model IDs must use the `[swarm]` suffix and swarm preset configs must include an `id` and `base_models` array. Update existing swarm configs and any code that relied on a custom swarm suffix.
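Under these requirements, a minimal preset file (e.g. `swarms/default.json`; the model names here are placeholders) would declare both mandatory fields:

```json
{
  "id": "default",
  "base_models": ["gemini-2.0-flash", "gpt-4o"]
}
```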
…te in task checklist Update docs/HiveMind Task.md to reflect progress on fusion features and documentation. - Mark "Role context for Arbiter" and "Weight system" as completed in the Fusion Features section. - Mark "Documentation" and its subitems (User guide, Example configs, API reference) as completed.
- Accept both single-fusion objects and new {"fusions": [...]} array files when loading fusion configs.
- Add _register_fusion helper to centralize validation, duplicate handling, and logging.
- Emit warnings for missing 'id' fields and for non-list 'fusions' entries.
- Return fusion IDs with a "[fusion]" suffix from get_all_fusion_ids and have EnsembleManager detect fusion IDs by that suffix.
BREAKING CHANGE: get_all_fusion_ids now appends "[fusion]" to fusion IDs and EnsembleManager detects fusions via the "[fusion]" suffix. Consumers that expect plain fusion IDs must update to handle or strip the suffix. Config files may use the new {"fusions": [...]} format; single-object configs remain supported.
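A sketch of how both accepted file shapes might be normalized and registered. Function names, logging text, and config shapes are illustrative assumptions, not the actual `ConfigLoader` code:

```python
import json
import logging
from pathlib import Path

logger = logging.getLogger("hivemind")


def load_fusion_file(path: Path, registry: dict) -> None:
    """Accept a single fusion object or a {"fusions": [...]} array file."""
    data = json.loads(path.read_text(encoding="utf-8"))
    if isinstance(data, dict) and "fusions" in data:
        entries = data["fusions"]
        if not isinstance(entries, list):
            logger.warning("'fusions' in %s is not a list; skipping file", path.name)
            return
    else:
        entries = [data]  # legacy single-object config
    for entry in entries:
        fusion_id = entry.get("id")
        if not fusion_id:
            logger.warning("Fusion entry in %s is missing 'id'; skipped", path.name)
            continue
        if fusion_id in registry:
            logger.warning("Duplicate fusion id %r; keeping the first", fusion_id)
            continue
        registry[fusion_id] = entry


def get_all_fusion_ids(registry: dict) -> list:
    # Expose IDs with the "[fusion]" suffix used for request routing
    return [f"{fid}[fusion]" for fid in registry]
```

Centralizing validation this way keeps duplicate handling and warnings identical for both file formats.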
Follow-up Review: Consolidated Issues List
Per your request to poll all outstanding problems, I have reviewed the latest incremental changes and checked against previous feedback.
Assessment of New Changes:
The new dynamic model discovery logic and configuration updates in config_loader.py are solid improvements. The support for the {"fusions": [...]} array format adds good flexibility.
Consolidated Issues List:
Here are the specific items that need to be addressed to clear the queue:
- Runtime Safety (Critical): The `arbiter_response.usage` assignment in `manager.py` (lines 1290 and 1409) is still unprotected. This is a potential crash vector if the response object is strict. (See inline comment.)
- Naming Ambiguity (Edge Case): The new `get_base_model` logic has a potential collision where a model name ending in a hyphenated word (like `gpt-4-turbo`) could be misparsed if `turbo` exists as a preset. (See inline comment.)
Overall Status:
The logic is maturing well. Addressing these two points (especially the safety check) should make this robust enough for the next phase.
This review was generated by an AI assistant.
…entation Standardize references and headings to use "HiveMind Ensemble" for clarity and consistency across docs. - Updated: docs/HiveMind Plan.md - Updated: docs/HiveMind Task.md - Updated: docs/HiveMind_API.md - Updated: docs/HiveMind_User_Guide.md - Updated: src/rotator_library/ensemble_configs/README.md
…mple configs Refactor swarm configuration loading to use preset files identified by an "id" and "base_models". Changes include: - Simplify _load_swarm_configs to assume preset-based discovery and defer per-preset loading to get_swarm_config. - Replace get_swarm_config(model: str) with get_swarm_config(preset_id: str) which loads <preset_id>.json, validates presence of "id" and "base_models", and falls back to the default preset on error or missing file. - Remove legacy per-model in-memory discovery/merging logic; presets are loaded on-demand. - Add example configs: fusions/multi-provider-test.json and swarms/test-gemini.json. BREAKING CHANGE: ConfigLoader.get_swarm_config signature and behavior changed. Previously callers passed a base model name and received a merged config with model-specific overrides applied; callers must now pass a preset ID (filename without .json) and will receive the preset config (with "id" and "base_models") or the default fallback. Migrate existing per-model configs to the preset format or update call sites to use preset IDs and new discovery helpers.
- add roles_dir and role_templates to ConfigLoader and implement _load_roles
- support single-role JSON and new {"roles": [...]} array format; register roles with normalized IDs and warn on duplicates
- expose get_role_template for runtime lookup
- resolve "role_template" in EnsembleManager by merging template with specialist config (specialist overrides template) and fallback to name-based role labels
- include sample role configs: architect, code-reviewer, security-expert
- update dev-team fusion arbiter to use blind=true and clarify note about hiding model names while preserving roles
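The merge semantics described above (specialist overrides template, fall back when the template is unknown) could be expressed as the following sketch. Names and config shapes are assumptions for illustration:

```python
def resolve_role_template(specialist: dict, role_templates: dict) -> dict:
    """Merge a named role template into a specialist config.

    Specialist keys win over template keys; an unknown template ID
    returns the specialist unchanged (name-based role labels apply
    upstream).
    """
    template_id = specialist.get("role_template")
    if not template_id:
        return specialist
    # Role IDs are registered normalized, so normalize the lookup too
    template = role_templates.get(template_id.strip().lower())
    if template is None:
        return specialist
    merged = dict(template)
    merged.update({k: v for k, v in specialist.items() if k != "role_template"})
    return merged
```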
Update ConfigLoader to count swarm presets by enumerating JSON files in the swarms directory and log that value instead of using the outdated self.swarm_configs length. This ensures the startup info reflects the actual number of preset files and avoids misleading logs when presets are used.
- remove redundant initialization log in RotatingClient to avoid duplicate messages - update EnsembleManager logger message from "[HiveMind] EnsembleManager initialized" to "[HiveMind] Ensemble Manager initialized" for readability and consistency
Follow-up Review: Consolidated Issues List
@Mirrowel, as requested, I have analyzed the current codebase against all previous feedback. Good news: most critical issues (Configuration Safety, Model Shadowing, Conflict Resolution) are now RESOLVED.
Here is the consolidated list of the 2 remaining issues that need to be addressed to clear the queue:
1. Runtime Safety (High Priority):
   - Location: `manager.py` lines 1303 and 1422.
   - Problem: The injection of `hivemind_details` into `arbiter_response.usage` is still unprotected. While you correctly wrapped the field copy loop, the direct assignment of the dictionary itself can still cause a crash if the response object is strict.
   - Action: Wrap these assignments in a `try...except` block (see inline suggestions).
2. Naming Ambiguity (Edge Case):
   - Location: `manager.py` line 126 (`get_base_model`).
   - Problem: The logic splits on the last hyphen. If a user provides a base model that contains hyphens (e.g., `gpt-4-turbo`) and the suffix happens to match a valid preset name (e.g., `turbo`), the parser will incorrectly split it into `gpt-4` (base) + `turbo` (preset).
   - Action: Acknowledge the limitation or implement a stricter separator (e.g., `--`) or prefix validation.
Overall Status:
Addressing these two points (specifically the safety check) will complete the review process for this PR.
This review was generated by an AI assistant.
```python
# Add hivemind_details as SUPPLEMENTARY breakdown information
# This does NOT replace standard fields, but provides additional context
arbiter_response.usage.hivemind_details = hivemind_details
```
This assignment remains unprotected. While the loop above is now safe, this specific line will still crash if arbiter_response.usage does not allow dynamic attribute assignment (e.g., strict Pydantic models). Please wrap this in a try...except block.
```diff
- arbiter_response.usage.hivemind_details = hivemind_details
+ try:
+     arbiter_response.usage.hivemind_details = hivemind_details
+ except (AttributeError, TypeError):
+     lib_logger.debug("[HiveMind] Could not attach hivemind_details to usage object")
```
```python
# Add hivemind_details as SUPPLEMENTARY breakdown information
# This does NOT replace standard fields, but provides additional context
arbiter_response.usage.hivemind_details = hivemind_details
```
Same safety issue here. This assignment needs to be wrapped in a try...except block to prevent potential runtime crashes on strict usage objects.
```diff
- arbiter_response.usage.hivemind_details = hivemind_details
+ try:
+     arbiter_response.usage.hivemind_details = hivemind_details
+ except (AttributeError, TypeError):
+     lib_logger.debug("[HiveMind] Could not attach hivemind_details to usage object")
```
```python
# Parse: {base_model}-{preset_id}
# preset_id is the last segment after the last hyphen
if "-" in swarm_id:
    # Split and check if last segment is a preset ID
    parts = swarm_id.rsplit("-", 1)
    potential_preset = parts[1]
```
The naming ambiguity issue remains unresolved. Splitting on the last hyphen can misidentify the base model if the model name itself contains hyphens and ends with a word matching a preset ID (e.g. gpt-4-turbo vs turbo preset).
Consider enforcing a double-hyphen separator (--) for presets or validating the base model prefix against _provider_models.
… and filter ".example" artifacts
- Introduce an omit_id_presets registry in ConfigLoader by scanning presets that declare "omit_id": true, so short swarm names like "model[swarm]" can be resolved to a concrete preset.
- Add get_preset_for_model(base_model) to encapsulate preset selection (prefer a registered omit_id mapping, otherwise fall back to "default") and emit debug diagnostics during resolution.
- Update discovery routines for swarms, fusions, strategies, and roles to ignore files whose stem ends with ".example" to avoid exposing sample artifacts.
- Change the discovery output to reflect omit_id behavior: emit the compact form "model[swarm]" for presets that omit their id, and the explicit "model-preset[swarm]" for others. Explicit identifiers remain valid at runtime.
- Modify EnsembleManager to return both the resolved base model and the preset id from get_base_model, and use the resolved preset when loading swarm configs; add clearer logging around routing.
- Add omit_id flags to example swarm configs and extra loader logging, including warnings when multiple presets claim the same base model.

BREAKING CHANGE: EnsembleManager.get_base_model now returns a tuple (base_model, preset_id) instead of a single string. Update call sites to unpack the result and pass the preset id when fetching swarm configs. Migration example:
- Before: base = get_base_model(swarm_id); config = get_swarm_config(base)
- After: base, preset = get_base_model(swarm_id); config = get_swarm_config(preset)
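The omit_id resolution described in this commit might be sketched as follows. Config shapes, function names, and warning text are assumptions, not the actual loader code:

```python
import logging

logger = logging.getLogger("hivemind")


def build_omit_id_registry(presets: dict) -> dict:
    """Map base_model -> preset_id for presets declaring "omit_id": true."""
    registry = {}
    for preset_id, cfg in presets.items():
        if not cfg.get("omit_id"):
            continue
        for base in cfg.get("base_models", []):
            if base in registry:
                # Warn when multiple presets claim the same base model
                logger.warning("Multiple omit_id presets claim %s; keeping %s",
                               base, registry[base])
                continue
            registry[base] = preset_id
    return registry


def get_preset_for_model(base_model: str, registry: dict) -> str:
    # Prefer a registered omit_id mapping, else fall back to "default"
    return registry.get(base_model, "default")
```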
…, strategies and examples - Add comprehensive HiveMind Ensemble documentation across README, DOCUMENTATION.md, and docs/ (overview, swarm/fusion modes, arbitration strategies, recursive mode, usage tracking, architecture). - Document preset-based swarm configuration and discovery rules for /v1/models; clarify runtime vs discovery behavior. - Add ensemble_configs examples and samples: - fusions/fusion.example.json - roles/role.example.json - roles/roles-array.example.json - strategies/strategy.example.txt - swarms/preset.example.json - Update library README and ensemble_configs README to describe presets, role templates, strategy templates, and usage examples. - Documentation-only changes and sample config files; no runtime code changes or breaking API modifications.
Follow-up Review
I've reviewed the new Preset System implementation. The configuration logic and omit_id discovery features are well-implemented and add significant flexibility to the HiveMind system.
Assessment of New Changes:
- Presets: The `ConfigLoader` updates correctly handle `omit_id` mapping.
- Documentation: The updated guides clearly explain the new explicit vs. short syntax.
Critical Action Items (Unresolved):
However, the specific issues flagged in the "Consolidated Issues List" remain unaddressed in this iteration. These must be fixed to ensure stability:
- Runtime Safety (Blocking): The `hivemind_details` injection in `manager.py` (lines 1307 & 1426) still lacks `try...except` protection. This is a crash vector for strict usage objects.
- Naming Ambiguity: The `get_base_model` logic (line 132) creates a collision risk for models ending in words that match preset names (e.g., `gemini-1.5-flash` matching a `flash` preset).
Overall Status:
Request Changes. The preset feature is good, but we cannot merge with the known runtime crash risk in the usage reporting logic.
This review was generated by an AI assistant.
```python
# preset_id is the last segment after the last hyphen
if "-" in swarm_id:
    # Split and check if last segment is a preset ID
    parts = swarm_id.rsplit("-", 1)
```
The naming ambiguity issue remains unresolved and is critical with the new preset system.
If a model name ends with a word that matches a preset ID (e.g., gemini-1.5-flash where flash.json exists), this logic will incorrectly parse it as Base: gemini-1.5, Preset: flash.
Consider enforcing a double-hyphen separator (e.g. model--preset[swarm]) or checking against the known provider model list before assuming a preset split.
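One way the suggested prefix validation could be implemented is sketched below. The helper name and the idea of passing in known model/preset sets are assumptions for illustration, not the project's actual code:

```python
def parse_swarm_id(swarm_id: str, known_models: set, known_presets: set):
    """Resolve (base_model, preset_id), preferring whole-model matches.

    An exact known-model match wins, so "gemini-1.5-flash" is never
    split even when a "flash" preset exists.
    """
    name = swarm_id[:-len("[swarm]")] if swarm_id.endswith("[swarm]") else swarm_id
    if name in known_models:
        return name, None  # whole string is a model; omit_id/default preset applies
    if "-" in name:
        base, candidate = name.rsplit("-", 1)
        # Only split when BOTH halves validate against known sets
        if candidate in known_presets and base in known_models:
            return base, candidate
    return name, None
```

Validating both halves against the provider model list removes the ambiguity without requiring a new `--` separator, at the cost of needing the model list at parse time.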
```python
# Add hivemind_details as SUPPLEMENTARY breakdown information
# This does NOT replace standard fields, but provides additional context
arbiter_response.usage.hivemind_details = hivemind_details
```
This assignment is still unprotected. If arbiter_response.usage is a strict object (like a Pydantic model) that doesn't allow arbitrary attribute assignment, this will raise an AttributeError and crash the request.
```diff
- arbiter_response.usage.hivemind_details = hivemind_details
+ try:
+     arbiter_response.usage.hivemind_details = hivemind_details
+ except (AttributeError, TypeError):
+     lib_logger.debug("[HiveMind] Could not attach hivemind_details to usage object")
```
```python
# Add hivemind_details as SUPPLEMENTARY breakdown information
# This does NOT replace standard fields, but provides additional context
arbiter_response.usage.hivemind_details = hivemind_details
```
Same safety issue here. This must be wrapped in a try...except block to prevent crashes with strict usage objects.
```diff
- arbiter_response.usage.hivemind_details = hivemind_details
+ try:
+     arbiter_response.usage.hivemind_details = hivemind_details
+ except (AttributeError, TypeError):
+     lib_logger.debug("[HiveMind] Could not attach hivemind_details to usage object")
```
Overview
This pull request implements the HiveMind Ensemble subsystem, a sophisticated orchestration engine for parallel model execution. It introduces two modes: Swarm and Fusion.
This PR delivers a complete implementation of Swarm Mode (with preset-based configuration and omit_id discovery), Recursive Mode (autonomous refinement), Fusion Mode (including role templates and weight context), and comprehensive documentation with detailed example configurations.
Key Features
Preset-Based Swarm System:
- `base_models` for discovery control
- `omit_id` field to control `/v1/models` display format (short vs explicit)
- Single-object and array (`[{}, {}]`) config formats

Role Templates:
- Reusable role definitions in `ensemble_configs/roles/`
- Referenced from specialists via the `role_template` field

Comprehensive Example Configurations:
- `preset.example.json`: Fully documented swarm preset with all fields explained
- `fusion.example.json`: Fusion with inline and template-based specialist examples
- `role.example.json` & `roles-array.example.json`: Role template formats
- `strategy.example.txt`: Strategy template with field documentation

Implementation Details
This PR introduces the following key changes:
New Ensemble Subsystem (`src/rotator_library/ensemble`):
- `EnsembleManager` (`manager.py`): Central orchestrator with preset resolution and omit_id support
- `ConfigLoader` (`config_loader.py`): Manages loading configurations with omit_id mapping and role templates

Preset-Based Configuration (`ensemble_configs`):
- `id`, `base_models`, and `omit_id` fields; supports both single and array formats
- `.example` files with detailed field documentation

Core Integration:
- `RotatingClient` with enhanced `is_ensemble()` detection
- `get_preset_for_model()` for omit_id mapping

Complete Swarm Mode:
- `/v1/models` population based on `omit_id` and `base_models`

Recursive Mode:
- `[INTERNAL]` tags for transparent monologue handling

Fusion Mode:

Documentation:
- `HiveMind_User_Guide.md`: Comprehensive guide with preset-based system explanation
- `HiveMind_API.md`: Detailed API reference with updated signatures
- `ensemble_configs/README.md`: Complete configuration guide

Technical Highlights
Discovery Logic:
- `base_models` + `omit_id: true` → Shows as `{model}[swarm]` (clean, short form)
- `base_models` + `omit_id: false` → Shows as `{model}-{preset}[swarm]` (explicit)
- No `base_models` → Not shown (invisible but usable)

Runtime Flexibility:
- `model[swarm]` → Uses omit_id preset OR default
- `model-preset[swarm]` → ALWAYS works with ANY model/preset combo
- `base_models` controls ONLY discovery, NOT runtime restrictions

Role Template Resolution:
- Specialists reference reusable roles via `"role_template": "security-expert"`

Future Work
See the full Implementation Plan and Task Checklist for the complete roadmap.