feat(hooks/plugin): add lifecycle hooks + phase-1 plugin contract#473
feat(hooks/plugin): add lifecycle hooks + phase-1 plugin contract#473gh-xj wants to merge 19 commits intosipeed:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Adds a lightweight, typed lifecycle hook registry to PicoClaw and integrates hook trigger points throughout the agent loop to enable observability, filtering, and guardrails with minimal overhead.
Changes:
- Introduces
pkg/hookswith typed event structs, aHookRegistry, and trigger/registration APIs (void + modifying hooks). - Integrates hook triggers into
pkg/agent/loop.goaround inbound messages, session boundaries, LLM calls, tool calls, and outbound publishing. - Adds unit tests for hook ordering, cancellation, concurrency, and concurrent registration/trigger behavior, plus a design doc.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/hooks/types.go | Defines the 8 hook event payload types used across the system. |
| pkg/hooks/hooks.go | Implements the hook registry, priority ordering, and void/modifying execution semantics. |
| pkg/hooks/hooks_test.go | Adds tests for concurrency, ordering, cancellation, and race-oriented behavior. |
| pkg/agent/loop.go | Wires hook triggers into the agent loop and wraps outbound publishing via sendOutbound. |
| cmd/picoclaw/main.go | Creates and attaches a hook registry during startup (agent + gateway commands). |
| docs/plans/2026-02-19-hook-system-design.md | Documents the hook model, execution patterns, and integration points. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| go func(reg HookRegistration[T]) { | ||
| defer wg.Done() | ||
| if err := reg.Handler(ctx, event); err != nil { | ||
| logger.WarnCF("hooks", "Hook error", | ||
| map[string]any{ | ||
| "hook": hookName, | ||
| "handler": reg.Name, | ||
| "error": err.Error(), | ||
| }) | ||
| } | ||
| }(h) |
There was a problem hiding this comment.
Hook handlers can panic (especially since they’re user/extension code), and right now a panic in any handler will crash the whole process (including from the goroutines in triggerVoid). Consider recovering around reg.Handler invocation and logging the panic as a hook failure so hooks can’t take down the agent.
pkg/agent/loop.go
Outdated
| mt.SetSendCallback(func(channel, chatID, content string) error { | ||
| al.sendOutbound(context.Background(), bus.OutboundMessage{ | ||
| Channel: channel, | ||
| ChatID: chatID, | ||
| Content: content, | ||
| }) | ||
| return nil | ||
| }) |
There was a problem hiding this comment.
In SetHooks, the MessageTool callback always returns nil even when message_sending cancels delivery (sendOutbound just returns). That makes MessageTool report success and set sentInRound=true while the message was never published, which can also suppress the loop’s normal outbound response (alreadySent becomes true). Consider having sendOutbound return a “sent/canceled” status (and reason) so this callback can return an error on cancel and avoid marking the tool call as successful.
| if al.hooks != nil { | ||
| event := &hooks.MessageSendingEvent{Channel: msg.Channel, ChatID: msg.ChatID, Content: msg.Content} | ||
| al.hooks.TriggerMessageSending(ctx, event) | ||
| if event.Cancel { |
There was a problem hiding this comment.
When message_sending cancels, sendOutbound silently drops the message. Since MessageSendingEvent includes CancelReason, consider logging (or otherwise surfacing) the cancel reason here so operators can understand why messages disappeared (and so cancellations are debuggable).
| if event.Cancel { | |
| if event.Cancel { | |
| reason := strings.TrimSpace(event.CancelReason) | |
| if reason == "" { | |
| reason = "unspecified" | |
| } | |
| logger.Warnf("Outbound message to channel %s (chatID=%s) canceled by hook: %s", msg.Channel, msg.ChatID, reason) |
cmd/picoclaw/main.go
Outdated
| msgBus := bus.NewMessageBus() | ||
| agentLoop := agent.NewAgentLoop(cfg, msgBus, provider) | ||
| agentLoop.SetHooks(hooks.NewHookRegistry()) | ||
|
|
There was a problem hiding this comment.
agentLoop.SetHooks(hooks.NewHookRegistry()) installs an empty registry unconditionally. With this wiring, al.hooks is non-nil even when no handlers are registered, so the loop will still construct hook events and call Trigger* on every iteration (adding overhead and undermining the stated “zero-cost when unused” goal). Consider leaving hooks nil by default and only calling SetHooks when hooks are actually configured/registered, or add a cheap “has handlers” check before building events.
|
|
||
| for i := range 5 { | ||
| r.OnMessageReceived("hook-"+string(rune('A'+i)), i, func(_ context.Context, _ *MessageReceivedEvent) error { | ||
| count.Add(1) | ||
| // Small sleep to verify concurrency (all run in parallel). | ||
| time.Sleep(10 * time.Millisecond) | ||
| return nil | ||
| }) | ||
| } | ||
|
|
||
| start := time.Now() | ||
| r.TriggerMessageReceived(ctx, &MessageReceivedEvent{Content: "test"}) | ||
| elapsed := time.Since(start) | ||
|
|
||
| if count.Load() != 5 { | ||
| t.Errorf("Expected 5 handlers called, got %d", count.Load()) | ||
| } | ||
| // If truly concurrent, total time should be much less than 5*10ms = 50ms. | ||
| if elapsed > 40*time.Millisecond { | ||
| t.Errorf("Handlers appear sequential: elapsed %v", elapsed) | ||
| } |
There was a problem hiding this comment.
TestVoidHooksConcurrent relies on timing (sleeping 10ms per handler and asserting total elapsed < 40ms). This is likely to be flaky under CI load or on slower machines, even if handlers are concurrent. Prefer a synchronization-based assertion (e.g., a barrier channel all handlers must reach before any can proceed) so the test doesn’t depend on wall-clock thresholds.
| for i := range 5 { | |
| r.OnMessageReceived("hook-"+string(rune('A'+i)), i, func(_ context.Context, _ *MessageReceivedEvent) error { | |
| count.Add(1) | |
| // Small sleep to verify concurrency (all run in parallel). | |
| time.Sleep(10 * time.Millisecond) | |
| return nil | |
| }) | |
| } | |
| start := time.Now() | |
| r.TriggerMessageReceived(ctx, &MessageReceivedEvent{Content: "test"}) | |
| elapsed := time.Since(start) | |
| if count.Load() != 5 { | |
| t.Errorf("Expected 5 handlers called, got %d", count.Load()) | |
| } | |
| // If truly concurrent, total time should be much less than 5*10ms = 50ms. | |
| if elapsed > 40*time.Millisecond { | |
| t.Errorf("Handlers appear sequential: elapsed %v", elapsed) | |
| } | |
| started := make(chan struct{}, 5) | |
| release := make(chan struct{}) | |
| done := make(chan struct{}) | |
| for i := range 5 { | |
| r.OnMessageReceived("hook-"+string(rune('A'+i)), i, func(_ context.Context, _ *MessageReceivedEvent) error { | |
| started <- struct{}{} | |
| <-release | |
| count.Add(1) | |
| return nil | |
| }) | |
| } | |
| go func() { | |
| r.TriggerMessageReceived(ctx, &MessageReceivedEvent{Content: "test"}) | |
| close(done) | |
| }() | |
| // Ensure all handlers have started and reached the barrier. | |
| for i := 0; i < 5; i++ { | |
| select { | |
| case <-started: | |
| case <-time.After(1 * time.Second): | |
| t.Fatalf("timeout waiting for handler %d to start", i+1) | |
| } | |
| } | |
| // Release all handlers to complete. | |
| close(release) | |
| // Wait for TriggerMessageReceived to finish. | |
| select { | |
| case <-done: | |
| case <-time.After(1 * time.Second): | |
| t.Fatal("timeout waiting for handlers to complete") | |
| } | |
| if count.Load() != 5 { | |
| t.Errorf("Expected 5 handlers called, got %d", count.Load()) | |
| } |
pkg/hooks/hooks.go
Outdated
| for _, h := range hooks { | ||
| if err := h.Handler(ctx, event); err != nil { | ||
| logger.WarnCF("hooks", "Hook error", | ||
| map[string]any{ | ||
| "hook": hookName, | ||
| "handler": h.Name, | ||
| "error": err.Error(), | ||
| }) | ||
| } |
There was a problem hiding this comment.
A panic inside a modifying hook handler will crash the agent loop. Since these are intended as extension points, consider adding panic recovery around h.Handler(ctx, event) (and then logging) so a single bad hook can’t bring down the whole process.
pkg/hooks/types.go
Outdated
| ToolName string | ||
| Args map[string]any // Modifiable | ||
| Channel string | ||
| ChatID string | ||
| Cancel bool | ||
| CancelMsg string // Message returned to LLM when canceled |
There was a problem hiding this comment.
Cancellation fields are named inconsistently across events (MessageSendingEvent has CancelReason, BeforeToolCallEvent has CancelMsg). Since these structs are part of the public hook API, consider standardizing naming (e.g., CancelReason vs CancelMessage) so hook authors don’t have to remember special cases.
| ToolName string | |
| Args map[string]any // Modifiable | |
| Channel string | |
| ChatID string | |
| Cancel bool | |
| CancelMsg string // Message returned to LLM when canceled | |
| ToolName string | |
| Args map[string]any // Modifiable | |
| Channel string | |
| ChatID string | |
| Cancel bool | |
| CancelReason string // Reason for cancellation; preferred over CancelMsg for new hooks | |
| CancelMsg string // DEPRECATED: use CancelReason; kept for backward compatibility |
pkg/agent/loop.go
Outdated
| func (al *AgentLoop) SetHooks(h *hooks.HookRegistry) { | ||
| al.hooks = h | ||
|
|
||
| // Rewire MessageTool callbacks to route through sendOutbound for hook interception. | ||
| for _, agentID := range al.registry.ListAgentIDs() { | ||
| if agent, ok := al.registry.GetAgent(agentID); ok { | ||
| if tool, ok := agent.Tools.Get("message"); ok { | ||
| if mt, ok := tool.(*tools.MessageTool); ok { | ||
| mt.SetSendCallback(func(channel, chatID, content string) error { | ||
| al.sendOutbound(context.Background(), bus.OutboundMessage{ | ||
| Channel: channel, | ||
| ChatID: chatID, | ||
| Content: content, | ||
| }) | ||
| return nil | ||
| }) | ||
| } |
There was a problem hiding this comment.
SetHooks mutates each MessageTool’s sendCallback without any synchronization, while MessageTool.Execute reads sendCallback without locking. If SetHooks could be called while the agent loop is running, this is a data race. Either document/enforce that SetHooks must be called before Run starts, or add synchronization (e.g., protect sendCallback with a mutex/atomic pointer or rewire at tool construction time only).
cmd/picoclaw/main.go
Outdated
| msgBus := bus.NewMessageBus() | ||
| agentLoop := agent.NewAgentLoop(cfg, msgBus, provider) | ||
| agentLoop.SetHooks(hooks.NewHookRegistry()) | ||
|
|
There was a problem hiding this comment.
agentLoop.SetHooks(hooks.NewHookRegistry()) installs an empty registry unconditionally. With this wiring, al.hooks is non-nil even when no handlers are registered, so the loop will still construct hook events and call Trigger* on every iteration (adding overhead and undermining the stated “zero-cost when unused” goal). Consider leaving hooks nil by default and only calling SetHooks when hooks are actually configured/registered, or add a cheap “has handlers” check before building events.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
Comments suppressed due to low confidence (1)
pkg/agent/loop.go:848
maybeSummarizelaunches a goroutine but captures and reuses the requestctxwhen callingsendOutbound. Since the goroutine may run after the parent request is canceled, hooks may see a canceled context and behave differently (e.g., skip work / cancel the message). Use a fresh context (e.g.,context.Background()or a derived context with timeout) for this background notification, or explicitly document that hooks will receive the caller's ctx even in background work.
go func() {
defer al.summarizing.Delete(summarizeKey)
if !constants.IsInternalChannel(channel) {
al.sendOutbound(ctx, bus.OutboundMessage{
Channel: channel,
ChatID: chatID,
Content: "Memory threshold reached. Optimizing conversation history...",
})
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
cmd/picoclaw/main.go
Outdated
| msgBus := bus.NewMessageBus() | ||
| agentLoop := agent.NewAgentLoop(cfg, msgBus, provider) | ||
| // Hook registry is nil by default for true zero-cost. | ||
| // Call agentLoop.SetHooks(hooks.NewHookRegistry()) to enable hooks. | ||
|
|
There was a problem hiding this comment.
Same as earlier in agentCmd: this is comment-only guidance and does not actually wire hooks. If the intent is to support hooks in gateway mode, consider adding an explicit enablement mechanism (config/env/flag) and calling agentLoop.SetHooks(...), or adjust the PR description accordingly.
| // BeforeToolCallEvent is fired before a tool is executed. | ||
| // Handlers can modify Args, or set Cancel to block execution. | ||
| type BeforeToolCallEvent struct { | ||
| ToolName string | ||
| Args map[string]any // Modifiable | ||
| Channel string | ||
| ChatID string | ||
| Cancel bool | ||
| CancelReason string // Message returned to LLM when canceled | ||
| } |
There was a problem hiding this comment.
BeforeToolCallEvent.Args is documented as modifiable but may be nil depending on the upstream tool call (arguments are omitempty). If the API expects modification, consider documenting that Args is always non-nil (and enforce it in the integration/trigger path), or explicitly state that handlers must handle nil.
| if al.hooks != nil { | ||
| btcEvent := &hooks.BeforeToolCallEvent{ | ||
| ToolName: tc.Name, | ||
| Args: tc.Arguments, | ||
| Channel: opts.Channel, | ||
| ChatID: opts.ChatID, | ||
| } | ||
| al.hooks.TriggerBeforeToolCall(ctx, btcEvent) | ||
| if btcEvent.Cancel { | ||
| toolCanceled = true | ||
| toolResult = tools.ErrorResult(btcEvent.CancelReason) | ||
| } | ||
| tc.Arguments = btcEvent.Args | ||
| } |
There was a problem hiding this comment.
BeforeToolCallEvent.Args is populated from tc.Arguments, which can be nil (tool call arguments are omitempty). If a hook tries to modify args (the documented use-case), writing to a nil map will panic and the modification will be silently dropped (panic is recovered in the hook runner). Initialize args to a non-nil map before triggering the hook (and/or when constructing btcEvent).
pkg/agent/loop.go
Outdated
| al.hooks.TriggerBeforeToolCall(ctx, btcEvent) | ||
| if btcEvent.Cancel { | ||
| toolCanceled = true | ||
| toolResult = tools.ErrorResult(btcEvent.CancelReason) |
There was a problem hiding this comment.
When a hook cancels a tool call, tools.ErrorResult(btcEvent.CancelReason) is used even if CancelReason is empty. That can produce an empty tool result message fed back into the LLM, which is hard to debug. Consider defaulting to a non-empty message (and/or including the tool name / handler) when CancelReason == "".
| toolResult = tools.ErrorResult(btcEvent.CancelReason) | |
| reason := btcEvent.CancelReason | |
| if strings.TrimSpace(reason) == "" { | |
| reason = fmt.Sprintf("tool call %q was cancelled by before_tool_call hook", tc.Name) | |
| } | |
| toolResult = tools.ErrorResult(reason) |
cmd/picoclaw/main.go
Outdated
| msgBus := bus.NewMessageBus() | ||
| agentLoop := agent.NewAgentLoop(cfg, msgBus, provider) | ||
| // Hook registry is nil by default for true zero-cost. | ||
| // Call agentLoop.SetHooks(hooks.NewHookRegistry()) to enable hooks. | ||
|
|
There was a problem hiding this comment.
The PR description claims main wiring will "create and set HookRegistry", but the code only adds comments and never actually enables hooks (no hooks.NewHookRegistry() import/call). Either add real wiring (e.g., a config/env/flag to enable and call agentLoop.SetHooks(...)) or update the PR description/checklist to reflect that hooks are intentionally opt-in and not wired by default.
Add a typed lifecycle hook system inspired by OpenClaw, designed for PicoClaw's ultra-lightweight philosophy. Provides 8 interception points around the agent loop for observability, content filtering, and guardrails. Two execution patterns: - Void hooks (concurrent): message_received, after_tool_call, llm_input, llm_output, session_start, session_end - Modifying hooks (sequential by priority, with cancel): message_sending, before_tool_call Key design choices: - Zero-cost when unused: all triggers check len==0 and return immediately - Copy-on-write registration: insertSorted allocates new backing array so concurrent readers never race with writers - Panic recovery in all handler dispatch paths - sendOutbound wrapper returns cancel status to callers - MessageTool callback rewired via SetHooks for content filtering 15 tests covering execution, priority ordering, cancel semantics, concurrency (barrier-based), panic recovery, and error swallowing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
96dc911 to
f73fb27
Compare
nikolasdehor
left a comment
There was a problem hiding this comment.
Well-designed hook system. The generic approach with HookHandler[T] and HookRegistration[T] is clean and type-safe.
Key strengths:
- Zero-cost when unused: The
len==0early return in trigger methods means no allocations or goroutines when no hooks are registered. Thehooksfield defaults to nil onAgentLoop, which is the right choice. - Copy-on-write
insertSorted: Allocating a new backing array on registration avoids data races between concurrent readers (trigger under RLock) and writers (register under Lock). Good design. - Panic recovery: Both
triggerVoid(concurrent goroutines) andtriggerModifying(sequential loop) havedefer recover()— prevents a misbehaving hook from crashing the agent loop. sendOutboundreturns bool: The caller (including the MessageTool callback path) can detect when delivery was canceled by a hook. Clean API.- Test coverage: 15 tests covering execution order, priority, cancellation, concurrency, and panic recovery.
Minor observations:
- The
SetHooksmethod rewires the MessageTool callback, which is a good approach for intercepting tool-generated outbound messages. Thecontext.Background()usage there is acceptable since the message context is not available at hook registration time. - The
maybeSummarizesignature change to includectx context.Contextis a clean way to thread context through to the hook triggers.
Solid addition to the codebase.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
pkg/agent/loop.go
Outdated
| ChatID: chatID, | ||
| Content: content, | ||
| }) { | ||
| return fmt.Errorf("message canceled by hook") |
There was a problem hiding this comment.
When sendOutbound returns false, the MessageTool callback returns a generic "message canceled by hook" error. This drops the hook-provided CancelReason, making it hard for the LLM/operator to understand what policy blocked the message. Consider having sendOutbound return the cancel reason (or an error type) so it can be propagated here.
| // triggerVoid runs all handlers concurrently and waits for completion. | ||
| // Handlers MUST NOT mutate the event — it is shared across goroutines. | ||
| // Errors are logged but do not propagate to the caller. | ||
| func triggerVoid[T any](ctx context.Context, hooks []HookRegistration[T], event *T, hookName string) { |
There was a problem hiding this comment.
triggerVoid passes the same event pointer to multiple goroutines. Even with the comment saying handlers must not mutate, it only takes one handler to accidentally write to the struct/map/slice fields to introduce a data race between hooks. To make the system safer for third-party hooks, consider passing each handler its own (shallow or deep) copy of the event, or running void hooks sequentially when the event contains reference types.
pkg/agent/loop.go
Outdated
| } | ||
|
|
||
| // SetHooks installs a hook registry. Must be called before Run starts. | ||
| func (al *AgentLoop) SetHooks(h *hooks.HookRegistry) { |
There was a problem hiding this comment.
SetHooks mutates al.hooks and rewires MessageTool callbacks without any guard against the loop already running. If SetHooks is called after Run starts (despite the comment), this can race with sendOutbound/trigger calls and tool execution. Consider enforcing this contract (e.g., return an error or panic when al.running is true, or protect the hooks pointer/callback rewiring with a mutex/atomic).
| func (al *AgentLoop) SetHooks(h *hooks.HookRegistry) { | |
| func (al *AgentLoop) SetHooks(h *hooks.HookRegistry) { | |
| if al.running.Load() { | |
| panic("SetHooks must be called before Run starts") | |
| } |
|
Rebased/merged with latest upstream Local checks all pass:
Current GitHub Actions status is |
|
Phase 1 plugin direction has now been implemented in this PR as requested, in separate commits:
Key docs:
This keeps risk bounded: dynamic plugin loading remains out-of-scope; current model is compile-time registration on top of typed lifecycle hooks. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 7 comments.
Comments suppressed due to low confidence (1)
pkg/agent/loop.go:901
- The context passed to sendOutbound in the goroutine may be canceled before the message is sent. The ctx parameter comes from the parent request and could be canceled if the original request completes or times out. Consider using context.Background() or a detached context for goroutines that should continue after the request ends, similar to line 227 in SetHooks.
if st, ok := tool.(tools.ContextualTool); ok {
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
pkg/hooks/hooks_test.go
Outdated
| wg.Add(1) | ||
| go func() { | ||
| defer wg.Done() | ||
| r.OnMessageReceived("reg-hook", i, func(_ context.Context, _ *MessageReceivedEvent) error { |
There was a problem hiding this comment.
All registration goroutines use the same handler name "reg-hook" but different priorities. When multiple hooks with the same name are registered, there's no deduplication, so this test registers 10 hooks all named "reg-hook". While this tests concurrent registration safety, it doesn't reflect realistic usage. Consider using unique names like fmt.Sprintf("reg-hook-%d", i) to better represent production scenarios.
| r.OnMessageReceived("reg-hook", i, func(_ context.Context, _ *MessageReceivedEvent) error { | |
| r.OnMessageReceived(fmt.Sprintf("reg-hook-%d", i), i, func(_ context.Context, _ *MessageReceivedEvent) error { |
| } | ||
|
|
||
| // SetHooks installs a hook registry. Must be called before Run starts. | ||
| func (al *AgentLoop) SetHooks(h *hooks.HookRegistry) { | ||
| al.hooks = h | ||
|
|
||
| // Rewire MessageTool callbacks to route through sendOutbound for hook interception. | ||
| for _, agentID := range al.registry.ListAgentIDs() { | ||
| if agent, ok := al.registry.GetAgent(agentID); ok { | ||
| if tool, ok := agent.Tools.Get("message"); ok { | ||
| if mt, ok := tool.(*tools.MessageTool); ok { | ||
| mt.SetSendCallback(func(channel, chatID, content string) error { | ||
| if !al.sendOutbound(context.Background(), bus.OutboundMessage{ | ||
| Channel: channel, | ||
| ChatID: chatID, | ||
| Content: content, | ||
| }) { | ||
| return fmt.Errorf("message canceled by hook") | ||
| } | ||
| return nil | ||
| }) | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| // SetPluginManager installs a plugin manager and routes its hook registry into the loop. | ||
| // Must be called before Run starts. | ||
| func (al *AgentLoop) SetPluginManager(pm *plugin.Manager) { | ||
| al.pluginManager = pm | ||
| if pm == nil { | ||
| al.SetHooks(nil) | ||
| return | ||
| } | ||
| al.SetHooks(pm.HookRegistry()) | ||
| } | ||
|
|
||
| // EnablePlugins is a convenience helper to build and install a plugin manager. | ||
| func (al *AgentLoop) EnablePlugins(plugins ...plugin.Plugin) error { | ||
| pm := plugin.NewManager() | ||
| if err := pm.RegisterAll(plugins...); err != nil { | ||
| return err | ||
| } | ||
| al.SetPluginManager(pm) | ||
| return nil | ||
| } | ||
|
|
||
| // sendOutbound wraps bus.PublishOutbound with the message_sending hook. |
There was a problem hiding this comment.
The hook integration in AgentLoop lacks test coverage. While pkg/hooks has comprehensive unit tests, there are no integration tests verifying that hooks are correctly triggered during message processing, tool execution, or LLM calls. Consider adding tests that verify hook execution in realistic agent loop scenarios, especially for the critical paths like before_tool_call cancellation and message_sending filtering.
| // triggerVoid runs all handlers concurrently and waits for completion. | ||
| // Handlers MUST NOT mutate the event — it is shared across goroutines. | ||
| // Errors are logged but do not propagate to the caller. | ||
| func triggerVoid[T any](ctx context.Context, hooks []HookRegistration[T], event *T, hookName string) { | ||
| if len(hooks) == 0 { | ||
| return | ||
| } | ||
| var wg sync.WaitGroup | ||
| for _, h := range hooks { | ||
| wg.Add(1) | ||
| go func(reg HookRegistration[T]) { | ||
| defer wg.Done() | ||
| defer func() { | ||
| if r := recover(); r != nil { | ||
| logger.ErrorCF("hooks", "Hook panic", | ||
| map[string]any{ | ||
| "hook": hookName, | ||
| "handler": reg.Name, | ||
| "panic": fmt.Sprintf("%v", r), | ||
| }) | ||
| } | ||
| }() | ||
| if err := reg.Handler(ctx, event); err != nil { | ||
| logger.WarnCF("hooks", "Hook error", | ||
| map[string]any{ | ||
| "hook": hookName, | ||
| "handler": reg.Name, | ||
| "error": err.Error(), | ||
| }) | ||
| } | ||
| }(h) | ||
| } | ||
| wg.Wait() | ||
| } |
There was a problem hiding this comment.
Void hooks documentation states "Handlers MUST NOT mutate the event" but there's no enforcement mechanism. Since event data is shared across concurrent goroutines, mutations could cause data races. Consider adding a note in public-facing documentation that hook handlers must be read-only for observe-only hooks, or consider using immutable event types (though this would add overhead).
| "iteration": iteration, | ||
| }) | ||
|
|
||
| // Create async callback for tools that implement AsyncTool | ||
| // NOTE: Following openclaw's design, async tools do NOT send results directly to users. | ||
| // Instead, they notify the agent via PublishInbound, and the agent decides | ||
| // whether to forward the result to the user (in processSystemMessage). | ||
| asyncCallback := func(callbackCtx context.Context, result *tools.ToolResult) { | ||
| // Log the async completion but don't send directly to user | ||
| // The agent will handle user notification via processSystemMessage | ||
| if !result.Silent && result.ForUser != "" { | ||
| logger.InfoCF("agent", "Async tool completed, agent will handle notification", | ||
| map[string]any{ | ||
| "tool": tc.Name, | ||
| "content_len": len(result.ForUser), | ||
| }) | ||
| } | ||
| } | ||
|
|
||
| toolResult := agent.Tools.ExecuteWithContext( | ||
| ctx, | ||
| tc.Name, | ||
| tc.Arguments, | ||
| opts.Channel, | ||
| opts.ChatID, | ||
| asyncCallback, | ||
| ) | ||
| // Fire before_tool_call hook |
There was a problem hiding this comment.
When tc.Arguments is nil, a new empty map is created for the hook event (line 789), but modifications to this map are not propagated back to tc.Arguments because args is reassigned on line 806. However, if tc.Arguments was originally nil, line 806 correctly assigns the (possibly modified) map back to tc.Arguments. This logic is correct but subtle. Consider adding a comment explaining that the empty map creation ensures hooks always receive a non-nil map that can be safely modified.
| if tool, ok := agent.Tools.Get("message"); ok { | ||
| if mt, ok := tool.(*tools.MessageTool); ok { | ||
| mt.SetSendCallback(func(channel, chatID, content string) error { | ||
| if !al.sendOutbound(context.Background(), bus.OutboundMessage{ | ||
| Channel: channel, | ||
| ChatID: chatID, | ||
| Content: content, | ||
| }) { | ||
| return fmt.Errorf("message canceled by hook") | ||
| } |
There was a problem hiding this comment.
The SetSendCallback method on MessageTool is being called without synchronization. If a tool execution is in progress during SetHooks, there's a race condition between reading sendCallback in Execute() and writing it in SetSendCallback(). While unlikely due to expected pre-Run() usage, this violates Go's race detection rules.
pkg/hooks/hooks_test.go
Outdated
| // Register with priorities: 50, 10, 30, 20, 40 | ||
| priorities := []int{50, 10, 30, 20, 40} | ||
| for _, p := range priorities { | ||
| r.OnBeforeToolCall("p-"+string(rune('0'+p)), p, func(_ context.Context, _ *BeforeToolCallEvent) error { |
There was a problem hiding this comment.
The test name creation logic "p-"+string(rune('0'+p)) doesn't produce readable test names. For priority 50, rune('0'+50) equals 98 which is 'b', not a meaningful name. This should use fmt.Sprintf("p-%d", p) instead to generate names like "p-50", "p-10", etc.
| r.OnBeforeToolCall("p-"+string(rune('0'+p)), p, func(_ context.Context, _ *BeforeToolCallEvent) error { | |
| r.OnBeforeToolCall(fmt.Sprintf("p-%d", p), p, func(_ context.Context, _ *BeforeToolCallEvent) error { |
|
Follow-up for style/quality validation: GitHub Actions is still Could a maintainer approve workflow execution for PR #473 so CI (including lint) can run? Latest head: |
|
Reviewed Copilot feedback on the latest head and triaged stale vs actionable. Addressed in follow-up commits:
Items considered stale/outdated:
Remaining open by design (defer):
If maintainers want, I can do a separate follow-up PR for an optional "safe mode" that deep-copies event payloads for observe-only hooks (with measured overhead tradeoff). |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 8 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
pkg/agent/loop.go
Outdated
| for _, agentID := range al.registry.ListAgentIDs() { | ||
| if agent, ok := al.registry.GetAgent(agentID); ok { | ||
| if tool, ok := agent.Tools.Get("message"); ok { | ||
| if mt, ok := tool.(*tools.MessageTool); ok { |
There was a problem hiding this comment.
Inconsistent indentation: this if statement should be aligned with the surrounding code. It appears to have extra indentation (starts with tabs when it should match line 229).
| if mt, ok := tool.(*tools.MessageTool); ok { | |
| if mt, ok := tool.(*tools.MessageTool); ok { |
pkg/agent/loop.go
Outdated
| toolStart := time.Now() | ||
| if !toolCanceled { | ||
| toolResult = agent.Tools.ExecuteWithContext( | ||
| ctx, | ||
| tc.Name, | ||
| tc.Arguments, | ||
| opts.Channel, | ||
| opts.ChatID, | ||
| asyncCallback, | ||
| ) | ||
| } | ||
| toolDuration := time.Since(toolStart) |
There was a problem hiding this comment.
The toolStart timer is set before checking if the tool is canceled, but when toolCanceled is true, the tool never actually executes. This means the Duration in the after_tool_call hook will be near-zero for canceled tools, which is misleading. Consider either moving toolStart to line 839 (inside the if block), or documenting that Duration will be near-zero for canceled operations.
docs/design/plugin-system-roadmap.md
Outdated
|
|
||
| ## Current Status (Phase 0: Foundation) | ||
|
|
||
| Implemented in current hooks MR: |
There was a problem hiding this comment.
Inconsistent terminology: "MR" (Merge Request) is GitLab terminology, while this is a GitHub repository. The codebase uses "PR" (Pull Request) elsewhere in documentation. Should use "PR" for consistency.
docs/design/plugin-system-roadmap.md
Outdated
|
|
||
| ## Maintainer Review Notes | ||
|
|
||
| The current hooks MR should be reviewed as Phase 0 only. It intentionally establishes extension points while avoiding high-risk runtime plugin mechanics. |
There was a problem hiding this comment.
Inconsistent terminology: "MR" (Merge Request) is GitLab terminology, while this is a GitHub repository. The codebase uses "PR" (Pull Request) elsewhere in documentation. Should use "PR" for consistency.
| al.hooks = h | ||
|
|
||
| // Rewire MessageTool callbacks to route through sendOutbound for hook interception. | ||
| for _, agentID := range al.registry.ListAgentIDs() { | ||
| if agent, ok := al.registry.GetAgent(agentID); ok { | ||
| if tool, ok := agent.Tools.Get("message"); ok { | ||
| if mt, ok := tool.(*tools.MessageTool); ok { | ||
| mt.SetSendCallback(func(channel, chatID, content string) error { | ||
| if sent, reason := al.sendOutbound(context.Background(), bus.OutboundMessage{ | ||
| Channel: channel, | ||
| ChatID: chatID, | ||
| Content: content, | ||
| }); !sent { | ||
| if strings.TrimSpace(reason) == "" { | ||
| reason = "unspecified" | ||
| } | ||
| return fmt.Errorf("message canceled by hook: %s", reason) | ||
| } | ||
| return nil | ||
| }) | ||
| } | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
When SetPluginManager(nil) is called after hooks were previously installed, the MessageTool callbacks are not restored to their original state. They remain pointing to sendOutbound, which will use the now-nil hooks registry. Consider resetting the MessageTool callbacks to the original direct bus.PublishOutbound behavior when h is nil, or document that SetHooks should not be called with nil after hooks were set.
| al.hooks = h | |
| // Rewire MessageTool callbacks to route through sendOutbound for hook interception. | |
| for _, agentID := range al.registry.ListAgentIDs() { | |
| if agent, ok := al.registry.GetAgent(agentID); ok { | |
| if tool, ok := agent.Tools.Get("message"); ok { | |
| if mt, ok := tool.(*tools.MessageTool); ok { | |
| mt.SetSendCallback(func(channel, chatID, content string) error { | |
| if sent, reason := al.sendOutbound(context.Background(), bus.OutboundMessage{ | |
| Channel: channel, | |
| ChatID: chatID, | |
| Content: content, | |
| }); !sent { | |
| if strings.TrimSpace(reason) == "" { | |
| reason = "unspecified" | |
| } | |
| return fmt.Errorf("message canceled by hook: %s", reason) | |
| } | |
| return nil | |
| }) | |
| } | |
| } | |
| } | |
| } | |
| // If hooks are being cleared, restore MessageTool callbacks to direct bus publishing. | |
| if h == nil { | |
| al.hooks = nil | |
| for _, agentID := range al.registry.ListAgentIDs() { | |
| if agent, ok := al.registry.GetAgent(agentID); ok { | |
| if tool, ok := agent.Tools.Get("message"); ok { | |
| if mt, ok := tool.(*tools.MessageTool); ok { | |
| mt.SetSendCallback(func(channel, chatID, content string) error { | |
| // Bypass hook interception and publish directly to the bus. | |
| return al.bus.PublishOutbound(context.Background(), bus.OutboundMessage{ | |
| Channel: channel, | |
| ChatID: chatID, | |
| Content: content, | |
| }) | |
| }) | |
| } | |
| } | |
| } | |
| } | |
| return | |
| } | |
| al.hooks = h | |
| // Rewire MessageTool callbacks to route through sendOutbound for hook interception. | |
| for _, agentID := range al.registry.ListAgentIDs() { | |
| if agent, ok := al.registry.GetAgent(agentID); ok { | |
| if tool, ok := agent.Tools.Get("message"); ok { | |
| if mt, ok := tool.(*tools.MessageTool); ok { | |
| mt.SetSendCallback(func(channel, chatID, content string) error { | |
| if sent, reason := al.sendOutbound(context.Background(), bus.OutboundMessage{ | |
| Channel: channel, | |
| ChatID: chatID, | |
| Content: content, | |
| }); !sent { | |
| if strings.TrimSpace(reason) == "" { | |
| reason = "unspecified" | |
| } | |
| return fmt.Errorf("message canceled by hook: %s", reason) | |
| } | |
| return nil | |
| }) | |
| } | |
| } | |
| } | |
| } |
docs/design/plugin-system-roadmap.md
Outdated
| ## Phase 1: Static Plugin Contract (Compile-time) | ||
|
|
||
| Goal: define a minimal public plugin contract for Go modules. | ||
|
|
||
| Proposed: | ||
|
|
||
| - Add `pkg/plugin` with a small interface: | ||
| - `Name() string` | ||
| - `Register(*hooks.HookRegistry) error` | ||
| - Register plugins at startup in code. | ||
| - Add compatibility metadata (`PluginAPIVersion`) for forward checks. | ||
|
|
||
| Exit criteria: | ||
|
|
||
| - Example plugin module builds against the contract. |
There was a problem hiding this comment.
The PR description states that Phase 1 has been implemented in this PR (commit 8d4cbd7), but this roadmap document still lists Phase 1 as "Proposed" with future tense. The document should be updated to reflect that Phase 1 is now implemented, moving it to a "Current Status" or "Completed" section to match the actual state of the codebase.
| ## Phase 1: Static Plugin Contract (Compile-time) | |
| Goal: define a minimal public plugin contract for Go modules. | |
| Proposed: | |
| - Add `pkg/plugin` with a small interface: | |
| - `Name() string` | |
| - `Register(*hooks.HookRegistry) error` | |
| - Register plugins at startup in code. | |
| - Add compatibility metadata (`PluginAPIVersion`) for forward checks. | |
| Exit criteria: | |
| - Example plugin module builds against the contract. | |
| ## Phase 1: Static Plugin Contract (Compile-time) — Completed | |
| Goal: define a minimal public plugin contract for Go modules. | |
| Status: Implemented in commit 8d4cbd7: | |
| - `pkg/plugin` provides a small interface: | |
| - `Name() string` | |
| - `Register(*hooks.HookRegistry) error` | |
| - Plugins are registered at startup in code. | |
| - Compatibility metadata (`PluginAPIVersion`) enables forward checks. | |
| Exit criteria (met): | |
| - An example plugin module builds against the contract. |
| - Attach once via `agentLoop.SetHooks(registry)` before `Run()`. | ||
| - If hooks are not set, default behavior is unchanged. | ||
|
|
||
| See runnable examples: [docs/hooks-plugin-examples.md](docs/hooks-plugin-examples.md) |
There was a problem hiding this comment.
Missing blank line between documentation links. For better readability and consistency with Markdown best practices, add a blank line between these two documentation references.
| See runnable examples: [docs/hooks-plugin-examples.md](docs/hooks-plugin-examples.md) | |
| See runnable examples: [docs/hooks-plugin-examples.md](docs/hooks-plugin-examples.md) |
pkg/agent/loop.go
Outdated
| if tool, ok := agent.Tools.Get("message"); ok { | ||
| if mt, ok := tool.(*tools.MessageTool); ok { | ||
| mt.SetSendCallback(func(channel, chatID, content string) error { | ||
| if sent, reason := al.sendOutbound(context.Background(), bus.OutboundMessage{ |
There was a problem hiding this comment.
Using context.Background() here discards the original context that may contain cancellation signals, deadlines, or tracing information. The MessageTool callback should accept and propagate a context parameter instead of creating a new background context. This could lead to operations continuing even after the parent context is cancelled.
| if sent, reason := al.sendOutbound(context.Background(), bus.OutboundMessage{ | |
| ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second) | |
| defer cancel() | |
| if sent, reason := al.sendOutbound(ctx, bus.OutboundMessage{ |
|
Updated roadmap with explicit runtime plugin direction. Short version:
This keeps current PR scope unchanged (hooks + compile-time contract) while documenting a safer long-term runtime path. |
|
Added plugin API compatibility guard in this PR. What changed:
Reason:
|
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| case uint64: | ||
| if n > maxIntU64 { | ||
| return 0, false | ||
| } |
There was a problem hiding this comment.
Float to int conversion truncates the decimal part without rounding. For timeout values, this could lead to unexpected behavior. For example, timeout_seconds: 30.9 would become 30 instead of 31. Consider documenting this truncation behavior or using rounding (e.g., int(n + 0.5)) to avoid surprising users with values just below a threshold.
| return int(n), true | ||
| case int16: |
There was a problem hiding this comment.
The conversion from int64 to int can overflow on 32-bit systems if the value exceeds MaxInt32. The function should check if the value is within the valid range for int before converting, similar to how it's done for unsigned types. This could cause silent overflow and incorrect behavior.
pkg/agent/loop.go
Outdated
| func (al *AgentLoop) SetHooks(h *hooks.HookRegistry) { | ||
| if al.running.Load() { | ||
| panic("SetHooks must be called before Run starts") | ||
| } | ||
| al.hooks = h | ||
|
|
||
| // Rewire MessageTool callbacks to route through sendOutbound for hook interception. | ||
| for _, agentID := range al.registry.ListAgentIDs() { | ||
| if agent, ok := al.registry.GetAgent(agentID); ok { | ||
| if tool, ok := agent.Tools.Get("message"); ok { | ||
| if mt, ok := tool.(*tools.MessageTool); ok { | ||
| mt.SetSendCallback(func(channel, chatID, content string) error { | ||
| if sent, reason := al.sendOutbound(context.Background(), bus.OutboundMessage{ | ||
| Channel: channel, | ||
| ChatID: chatID, | ||
| Content: content, | ||
| }); !sent { | ||
| if strings.TrimSpace(reason) == "" { | ||
| reason = "unspecified" | ||
| } | ||
| return fmt.Errorf("message canceled by hook: %s", reason) | ||
| } | ||
| return nil | ||
| }) | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
When SetHooks(nil) is called (e.g., via SetPluginManager(nil)), the MessageTool callback is still rewired to route through sendOutbound. This means even with hooks disabled, message tool sends go through an extra layer of indirection. Consider adding a check to skip the MessageTool rewiring when h == nil, or restore the original callback behavior.
| -> outbound publish | ||
| -> session_end | ||
| ``` | ||
|
|
There was a problem hiding this comment.
The lifecycle map shows a linear flow, but in reality, llm_input, llm_output, before_tool_call, and after_tool_call can occur multiple times in an iteration loop (up to MaxToolIterations). Consider adding a note about iteration/looping behavior to avoid confusion for plugin developers who need to understand when their hooks will fire.
| Note: The map above is shown as a single linear pass for readability. In practice, the | |
| agent loop may iterate up to `MaxToolIterations`, and the following hooks can fire | |
| multiple times within a single overall lifecycle: | |
| `llm_input`, `llm_output`, `before_tool_call`, and `after_tool_call`. |
pkg/agent/loop.go
Outdated
| for _, agentID := range al.registry.ListAgentIDs() { | ||
| if agent, ok := al.registry.GetAgent(agentID); ok { | ||
| if tool, ok := agent.Tools.Get("message"); ok { | ||
| if mt, ok := tool.(*tools.MessageTool); ok { |
There was a problem hiding this comment.
Incorrect indentation: this line has an extra tab character. The if mt, ok := tool.(*tools.MessageTool); ok { should align with the line above it at the same indentation level.
pkg/agent/loop.go
Outdated
| defer al.summarizing.Delete(summarizeKey) | ||
| if !constants.IsInternalChannel(channel) { | ||
| al.bus.PublishOutbound(bus.OutboundMessage{ | ||
| al.sendOutbound(ctx, bus.OutboundMessage{ |
There was a problem hiding this comment.
The context from the parent function is captured in a goroutine, which can lead to the goroutine using a canceled or expired context if the parent function completes before the goroutine. Consider using context.Background() or context.TODO() for the sendOutbound call within the goroutine, or ensure the context lifetime extends beyond the goroutine's execution.
| al.sendOutbound(ctx, bus.OutboundMessage{ | |
| al.sendOutbound(context.Background(), bus.OutboundMessage{ |
|
Addressed the two highest-priority quality items in
Also added a context propagation test in |
|
Follow-up cleanup in Addressed:
Still deferred by design:
|
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
pkg/agent/loop.go:933
- maybeSummarize now accepts a context but discards it and uses context.Background() when sending the outbound notification. This prevents hook handlers (and MessageTool callbacks) from seeing request-scoped context values (trace IDs, deadlines, auth markers) for summarization notifications. Consider passing the caller ctx through (or using context.WithoutCancel(ctx) if you want to avoid cancellation) instead of context.Background().
// maybeSummarize triggers summarization if the session history exceeds thresholds.
func (al *AgentLoop) maybeSummarize(_ context.Context, agent *AgentInstance, sessionKey, channel, chatID string) {
newHistory := agent.Sessions.GetHistory(sessionKey)
tokenEstimate := al.estimateTokens(newHistory)
threshold := agent.ContextWindow * 75 / 100
if len(newHistory) > 20 || tokenEstimate > threshold {
summarizeKey := agent.ID + ":" + sessionKey
if _, loading := al.summarizing.LoadOrStore(summarizeKey, true); !loading {
go func() {
defer al.summarizing.Delete(summarizeKey)
if !constants.IsInternalChannel(channel) {
al.sendOutbound(context.Background(), bus.OutboundMessage{
Channel: channel,
ChatID: chatID,
Content: "Memory threshold reached. Optimizing conversation history...",
})
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
pkg/agent/loop.go
Outdated
| al.pluginManager = pm | ||
| if pm == nil { | ||
| return al.SetHooks(nil) | ||
| } | ||
| return al.SetHooks(pm.HookRegistry()) |
There was a problem hiding this comment.
SetPluginManager assigns al.pluginManager before calling SetHooks. If SetHooks returns an error (e.g., Run already started), this leaves the AgentLoop in a partially-updated state (pluginManager set but hooks not installed). Consider only setting al.pluginManager after SetHooks succeeds, or rolling it back on error.
| al.pluginManager = pm | |
| if pm == nil { | |
| return al.SetHooks(nil) | |
| } | |
| return al.SetHooks(pm.HookRegistry()) | |
| if pm == nil { | |
| // Clear hooks first; only update pluginManager if this succeeds. | |
| if err := al.SetHooks(nil); err != nil { | |
| return err | |
| } | |
| al.pluginManager = nil | |
| return nil | |
| } | |
| // Install hooks from the plugin manager first; only record the manager on success. | |
| if err := al.SetHooks(pm.HookRegistry()); err != nil { | |
| return err | |
| } | |
| al.pluginManager = pm | |
| return nil |
| if al.hooks != nil { | ||
| args := tc.Arguments | ||
| if args == nil { | ||
| args = make(map[string]any) | ||
| } | ||
| btcEvent := &hooks.BeforeToolCallEvent{ | ||
| ToolName: tc.Name, | ||
| Args: args, | ||
| Channel: opts.Channel, | ||
| ChatID: opts.ChatID, | ||
| } | ||
| al.hooks.TriggerBeforeToolCall(ctx, btcEvent) | ||
| if btcEvent.Cancel { | ||
| toolCanceled = true | ||
| reason := btcEvent.CancelReason | ||
| if strings.TrimSpace(reason) == "" { | ||
| reason = fmt.Sprintf("tool call %q was canceled by before_tool_call hook", tc.Name) | ||
| } | ||
| toolResult = tools.ErrorResult(reason) | ||
| } | ||
| tc.Arguments = btcEvent.Args | ||
| } |
There was a problem hiding this comment.
After TriggerBeforeToolCall, tc.Arguments is set to btcEvent.Args without re-validating it. A hook could set Args to nil, which would later cause tools (or other hooks) that mutate args to panic when writing into a nil map. Consider ensuring Args remains non-nil after hooks run (e.g., re-initialize to an empty map if nil).
| case float32: | ||
| // Truncation is intentional for timeout normalization. | ||
| return int(n), true | ||
| case float64: | ||
| // Truncation is intentional for timeout normalization. | ||
| return int(n), true |
There was a problem hiding this comment.
toInt converts float32/float64 to int without any bounds checks. Per the Go spec, converting a float outside the target int range produces an implementation-specific value, which can break timeout clamping (e.g., huge values may wrap/turn negative and bypass the clamp). Consider checking against min/max int before converting floats, similar to the int64/uint64 cases.
|
Updated What I changed:
Tests added/updated:
Validation:
Current PR state:
|
|
Follow-up push:
Current CI status on PR #473:
|
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| args := tc.Arguments | ||
| if args == nil { | ||
| args = make(map[string]any) | ||
| } | ||
| btcEvent := &hooks.BeforeToolCallEvent{ | ||
| ToolName: tc.Name, | ||
| Args: args, | ||
| Channel: opts.Channel, |
There was a problem hiding this comment.
When a before_tool_call hook cancels a tool invocation, the cancel reason is used to build an error ToolResult but is not logged anywhere with the reason. The PR description says cancellation paths surface reasons and are logged; consider adding a log entry here (tool name + reason + channel/chatID) so operational diagnostics match the documented behavior.
| m.mu.Lock() | ||
| defer m.mu.Unlock() | ||
| if _, exists := m.seen[name]; exists { | ||
| return fmt.Errorf("plugin %q already registered", name) | ||
| } | ||
| if err := p.Register(m.registry); err != nil { | ||
| return fmt.Errorf("register plugin %q: %w", name, err) | ||
| } | ||
| m.seen[name] = struct{}{} | ||
| m.names = append(m.names, name) |
There was a problem hiding this comment.
Manager.Register holds m.mu while calling the plugin’s Register method. Since Register executes plugin-provided code, holding the manager lock during that call can cause unnecessary contention and can deadlock if future plugin code indirectly calls back into Manager APIs. Prefer checking/marking the name under the lock, releasing the lock before calling p.Register, and then reacquiring to append name (with rollback if Register fails).
| m.mu.Lock() | |
| defer m.mu.Unlock() | |
| if _, exists := m.seen[name]; exists { | |
| return fmt.Errorf("plugin %q already registered", name) | |
| } | |
| if err := p.Register(m.registry); err != nil { | |
| return fmt.Errorf("register plugin %q: %w", name, err) | |
| } | |
| m.seen[name] = struct{}{} | |
| m.names = append(m.names, name) | |
| // First, check and mark the plugin as seen under the lock to avoid races. | |
| m.mu.Lock() | |
| if _, exists := m.seen[name]; exists { | |
| m.mu.Unlock() | |
| return fmt.Errorf("plugin %q already registered", name) | |
| } | |
| m.seen[name] = struct{}{} | |
| m.mu.Unlock() | |
| // Call plugin-provided Register without holding the manager lock to | |
| // avoid contention and potential deadlocks if the plugin calls back | |
| // into Manager APIs. | |
| if err := p.Register(m.registry); err != nil { | |
| // Roll back the optimistic seen-mark if registration fails. | |
| m.mu.Lock() | |
| delete(m.seen, name) | |
| m.mu.Unlock() | |
| return fmt.Errorf("register plugin %q: %w", name, err) | |
| } | |
| // Finalize registration under the lock by recording the plugin name. | |
| m.mu.Lock() | |
| m.names = append(m.names, name) | |
| m.mu.Unlock() |
| type SendCallbackWithContext func(ctx context.Context, channel, chatID, content string) error | ||
|
|
||
| type MessageTool struct { | ||
| sendCallback SendCallback | ||
| sendCallback SendCallbackWithContext | ||
| defaultChannel string |
There was a problem hiding this comment.
This changes the exported MessageTool send-callback API by removing the old SendCallback type and requiring a context-aware callback. If pkg/tools is consumed outside this repo, this is a breaking change; consider preserving the old SetSendCallback signature (deprecated) and adding a new SetSendCallbackWithContext, or provide an adapter so existing integrations keep compiling.
|
|
||
| - Observe-only hooks (`message_received`, `after_tool_call`, `llm_input`, `llm_output`, `session_start`, `session_end`) | ||
| - run concurrently | ||
| - cannot block core behavior |
There was a problem hiding this comment.
This documentation claims observe-only hooks “cannot block core behavior”, but the current implementation of observe-only hooks (triggerVoid) waits for all handlers to complete via a WaitGroup. This means slow/blocked handlers will delay the agent loop. Update the wording to reflect that observe-only hooks cannot cancel/modify the operation but can still add latency (or change triggerVoid to be fire-and-forget if that’s the intended contract).
| - cannot block core behavior | |
| - cannot cancel or modify core behavior (observe-only only) | |
| - may still add latency because handlers are awaited by the agent loop |
Description
This PR introduces a bounded plugin foundation for PicoClaw:
pkg/plugin) with explicit startup wiring.The goal is to enable extension without expanding core dependencies or adding runtime plugin complexity.
Scope
Included:
pkg/hooks: typed hook events + registry + trigger semantics.pkg/agent/loop.go: hook integration + plugin manager wiring.pkg/plugin: compile-time plugin interface and manager.docs/hooks-plugin-examples.mddocs/plugin-system-roadmap.mdNot included in this PR:
Behavior & Risk
SetHooksis enforced as pre-run only.Validation
Local:
make fmtgo generate ./...go vet ./...go test ./...External context (references)
Type of Change
AI Code Generation
Related Issue
N/A
Technical Context
Test Environment
Checklist