feat(hooks/plugin): add lifecycle hooks + phase-1 plugin contract #473

Open
gh-xj wants to merge 19 commits into sipeed:main from gh-xj:feat/hook-system

Conversation

gh-xj (Collaborator) commented Feb 19, 2026

Description

This PR introduces a bounded plugin foundation for PicoClaw:

  • Typed lifecycle hooks for agent interception points.
  • Phase-1 compile-time plugin contract (pkg/plugin) with explicit startup wiring.

The goal is to enable extension without expanding core dependencies or adding runtime plugin complexity.

Scope

Included:

  • pkg/hooks: typed hook events + registry + trigger semantics.
  • pkg/agent/loop.go: hook integration + plugin manager wiring.
  • pkg/plugin: compile-time plugin interface and manager.
  • Docs:
    • docs/hooks-plugin-examples.md
    • docs/plugin-system-roadmap.md

Not included in this PR:

  • Dynamic runtime plugin loading
  • Plugin marketplace/distribution
  • Sandbox/permission model

Behavior & Risk

  • Default behavior remains unchanged when hooks/plugins are not enabled.
  • Hook cancellation paths surface reasons and are logged.
  • SetHooks is enforced as pre-run only.
  • Phase-1 plugin model is compile-time registration, not default-open runtime loading.

Validation

Local:

  • make fmt
  • go generate ./...
  • go vet ./...
  • go test ./...

External context (references)

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Documentation update
  • Code refactoring (no functional changes, no api changes)

AI Code Generation

  • Fully AI-generated (100% AI, 0% Human)
  • Mostly AI-generated (AI draft, Human verified/modified)
  • Mostly Human-written (Human lead, AI assisted or none)

Related Issue

N/A

Technical Context

  • Reference: OpenClaw hooks/plugins design
  • Reasoning: provide extension points with explicit contract and bounded rollout path

Test Environment

  • Hardware: MacBook Pro (Apple Silicon)
  • OS: macOS Darwin 25.2.0
  • Model/Provider: N/A (unit/integration tests)
  • Channels: N/A

Checklist

  • My code/docs follow the style of this project.
  • I have performed a self-review of my own changes.
  • I have updated the documentation accordingly.

Copilot AI review requested due to automatic review settings February 19, 2026 09:38

Copilot AI (Contributor) left a comment

Pull request overview

Adds a lightweight, typed lifecycle hook registry to PicoClaw and integrates hook trigger points throughout the agent loop to enable observability, filtering, and guardrails with minimal overhead.

Changes:

  • Introduces pkg/hooks with typed event structs, a HookRegistry, and trigger/registration APIs (void + modifying hooks).
  • Integrates hook triggers into pkg/agent/loop.go around inbound messages, session boundaries, LLM calls, tool calls, and outbound publishing.
  • Adds unit tests for hook ordering, cancellation, concurrency, and concurrent registration/trigger behavior, plus a design doc.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 9 comments.

File | Description
pkg/hooks/types.go | Defines the 8 hook event payload types used across the system.
pkg/hooks/hooks.go | Implements the hook registry, priority ordering, and void/modifying execution semantics.
pkg/hooks/hooks_test.go | Adds tests for concurrency, ordering, cancellation, and race-oriented behavior.
pkg/agent/loop.go | Wires hook triggers into the agent loop and wraps outbound publishing via sendOutbound.
cmd/picoclaw/main.go | Creates and attaches a hook registry during startup (agent + gateway commands).
docs/plans/2026-02-19-hook-system-design.md | Documents the hook model, execution patterns, and integration points.

Comment on lines 136 to 146
go func(reg HookRegistration[T]) {
defer wg.Done()
if err := reg.Handler(ctx, event); err != nil {
logger.WarnCF("hooks", "Hook error",
map[string]any{
"hook": hookName,
"handler": reg.Name,
"error": err.Error(),
})
}
}(h)

Copilot AI Feb 19, 2026

Hook handlers can panic (especially since they’re user/extension code), and right now a panic in any handler will crash the whole process (including from the goroutines in triggerVoid). Consider recovering around reg.Handler invocation and logging the panic as a hook failure so hooks can’t take down the agent.

Comment on lines 206 to 213
mt.SetSendCallback(func(channel, chatID, content string) error {
al.sendOutbound(context.Background(), bus.OutboundMessage{
Channel: channel,
ChatID: chatID,
Content: content,
})
return nil
})

Copilot AI Feb 19, 2026

In SetHooks, the MessageTool callback always returns nil even when message_sending cancels delivery (sendOutbound just returns). That makes MessageTool report success and set sentInRound=true while the message was never published, which can also suppress the loop’s normal outbound response (alreadySent becomes true). Consider having sendOutbound return a “sent/canceled” status (and reason) so this callback can return an error on cancel and avoid marking the tool call as successful.

if al.hooks != nil {
event := &hooks.MessageSendingEvent{Channel: msg.Channel, ChatID: msg.ChatID, Content: msg.Content}
al.hooks.TriggerMessageSending(ctx, event)
if event.Cancel {

Copilot AI Feb 19, 2026

When message_sending cancels, sendOutbound silently drops the message. Since MessageSendingEvent includes CancelReason, consider logging (or otherwise surfacing) the cancel reason here so operators can understand why messages disappeared (and so cancellations are debuggable).

Suggested change
- if event.Cancel {
+ if event.Cancel {
+ 	reason := strings.TrimSpace(event.CancelReason)
+ 	if reason == "" {
+ 		reason = "unspecified"
+ 	}
+ 	logger.Warnf("Outbound message to channel %s (chatID=%s) canceled by hook: %s", msg.Channel, msg.ChatID, reason)

Comment on lines 545 to 548
msgBus := bus.NewMessageBus()
agentLoop := agent.NewAgentLoop(cfg, msgBus, provider)
agentLoop.SetHooks(hooks.NewHookRegistry())

Copilot AI Feb 19, 2026

agentLoop.SetHooks(hooks.NewHookRegistry()) installs an empty registry unconditionally. With this wiring, al.hooks is non-nil even when no handlers are registered, so the loop will still construct hook events and call Trigger* on every iteration (adding overhead and undermining the stated “zero-cost when unused” goal). Consider leaving hooks nil by default and only calling SetHooks when hooks are actually configured/registered, or add a cheap “has handlers” check before building events.
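
A minimal sketch of the cheap "has handlers" check suggested above. The `Registry` type, its single hook point, and `HasMessageReceived` are hypothetical stand-ins rather than the PR's actual API, and locking is omitted for brevity. The point is only that a nil-safe method lets the caller skip event construction for both a nil and an empty registry.

```go
package main

import "fmt"

// MessageReceivedEvent is a stand-in payload; the real struct lives in pkg/hooks.
type MessageReceivedEvent struct{ Content string }

// Registry is a hypothetical, simplified registry with one hook point.
// The real registry would also need a lock around the handler slice.
type Registry struct {
	messageReceived []func(*MessageReceivedEvent)
}

// HasMessageReceived reports whether any handler is registered.
// It is safe to call on a nil *Registry.
func (r *Registry) HasMessageReceived() bool {
	return r != nil && len(r.messageReceived) > 0
}

// built counts constructed events so the saving is observable.
var built int

// onInbound only builds the event when somebody is listening.
func onInbound(r *Registry, content string) {
	if !r.HasMessageReceived() {
		return // no allocation, no dispatch
	}
	built++
	ev := &MessageReceivedEvent{Content: content}
	for _, h := range r.messageReceived {
		h(ev)
	}
}

func main() {
	onInbound(nil, "a")         // nil registry: skipped
	onInbound(&Registry{}, "b") // empty registry: skipped

	r := &Registry{}
	r.messageReceived = append(r.messageReceived, func(e *MessageReceivedEvent) {
		fmt.Println("got:", e.Content)
	})
	onInbound(r, "c")
	fmt.Println("events built:", built)
}
```

With a guard like this, the call site does not need to distinguish "hooks disabled" from "hooks enabled but unused".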

Comment on lines 57 to 77

for i := range 5 {
r.OnMessageReceived("hook-"+string(rune('A'+i)), i, func(_ context.Context, _ *MessageReceivedEvent) error {
count.Add(1)
// Small sleep to verify concurrency (all run in parallel).
time.Sleep(10 * time.Millisecond)
return nil
})
}

start := time.Now()
r.TriggerMessageReceived(ctx, &MessageReceivedEvent{Content: "test"})
elapsed := time.Since(start)

if count.Load() != 5 {
t.Errorf("Expected 5 handlers called, got %d", count.Load())
}
// If truly concurrent, total time should be much less than 5*10ms = 50ms.
if elapsed > 40*time.Millisecond {
t.Errorf("Handlers appear sequential: elapsed %v", elapsed)
}

Copilot AI Feb 19, 2026

TestVoidHooksConcurrent relies on timing (sleeping 10ms per handler and asserting total elapsed < 40ms). This is likely to be flaky under CI load or on slower machines, even if handlers are concurrent. Prefer a synchronization-based assertion (e.g., a barrier channel all handlers must reach before any can proceed) so the test doesn’t depend on wall-clock thresholds.

Suggested change
- for i := range 5 {
- 	r.OnMessageReceived("hook-"+string(rune('A'+i)), i, func(_ context.Context, _ *MessageReceivedEvent) error {
- 		count.Add(1)
- 		// Small sleep to verify concurrency (all run in parallel).
- 		time.Sleep(10 * time.Millisecond)
- 		return nil
- 	})
- }
- start := time.Now()
- r.TriggerMessageReceived(ctx, &MessageReceivedEvent{Content: "test"})
- elapsed := time.Since(start)
- if count.Load() != 5 {
- 	t.Errorf("Expected 5 handlers called, got %d", count.Load())
- }
- // If truly concurrent, total time should be much less than 5*10ms = 50ms.
- if elapsed > 40*time.Millisecond {
- 	t.Errorf("Handlers appear sequential: elapsed %v", elapsed)
- }
+ started := make(chan struct{}, 5)
+ release := make(chan struct{})
+ done := make(chan struct{})
+ for i := range 5 {
+ 	r.OnMessageReceived("hook-"+string(rune('A'+i)), i, func(_ context.Context, _ *MessageReceivedEvent) error {
+ 		started <- struct{}{}
+ 		<-release
+ 		count.Add(1)
+ 		return nil
+ 	})
+ }
+ go func() {
+ 	r.TriggerMessageReceived(ctx, &MessageReceivedEvent{Content: "test"})
+ 	close(done)
+ }()
+ // Ensure all handlers have started and reached the barrier.
+ for i := 0; i < 5; i++ {
+ 	select {
+ 	case <-started:
+ 	case <-time.After(1 * time.Second):
+ 		t.Fatalf("timeout waiting for handler %d to start", i+1)
+ 	}
+ }
+ // Release all handlers to complete.
+ close(release)
+ // Wait for TriggerMessageReceived to finish.
+ select {
+ case <-done:
+ case <-time.After(1 * time.Second):
+ 	t.Fatal("timeout waiting for handlers to complete")
+ }
+ if count.Load() != 5 {
+ 	t.Errorf("Expected 5 handlers called, got %d", count.Load())
+ }

Comment on lines 157 to 165
for _, h := range hooks {
if err := h.Handler(ctx, event); err != nil {
logger.WarnCF("hooks", "Hook error",
map[string]any{
"hook": hookName,
"handler": h.Name,
"error": err.Error(),
})
}

Copilot AI Feb 19, 2026

A panic inside a modifying hook handler will crash the agent loop. Since these are intended as extension points, consider adding panic recovery around h.Handler(ctx, event) (and then logging) so a single bad hook can’t bring down the whole process.

Comment on lines 39 to 44
ToolName string
Args map[string]any // Modifiable
Channel string
ChatID string
Cancel bool
CancelMsg string // Message returned to LLM when canceled

Copilot AI Feb 19, 2026

Cancellation fields are named inconsistently across events (MessageSendingEvent has CancelReason, BeforeToolCallEvent has CancelMsg). Since these structs are part of the public hook API, consider standardizing naming (e.g., CancelReason vs CancelMessage) so hook authors don’t have to remember special cases.

Suggested change
- ToolName string
- Args map[string]any // Modifiable
- Channel string
- ChatID string
- Cancel bool
- CancelMsg string // Message returned to LLM when canceled
+ ToolName string
+ Args map[string]any // Modifiable
+ Channel string
+ ChatID string
+ Cancel bool
+ CancelReason string // Reason for cancellation; preferred over CancelMsg for new hooks
+ CancelMsg string // DEPRECATED: use CancelReason; kept for backward compatibility

Comment on lines 198 to 214
func (al *AgentLoop) SetHooks(h *hooks.HookRegistry) {
al.hooks = h

// Rewire MessageTool callbacks to route through sendOutbound for hook interception.
for _, agentID := range al.registry.ListAgentIDs() {
if agent, ok := al.registry.GetAgent(agentID); ok {
if tool, ok := agent.Tools.Get("message"); ok {
if mt, ok := tool.(*tools.MessageTool); ok {
mt.SetSendCallback(func(channel, chatID, content string) error {
al.sendOutbound(context.Background(), bus.OutboundMessage{
Channel: channel,
ChatID: chatID,
Content: content,
})
return nil
})
}

Copilot AI Feb 19, 2026

SetHooks mutates each MessageTool’s sendCallback without any synchronization, while MessageTool.Execute reads sendCallback without locking. If SetHooks could be called while the agent loop is running, this is a data race. Either document/enforce that SetHooks must be called before Run starts, or add synchronization (e.g., protect sendCallback with a mutex/atomic pointer or rewire at tool construction time only).
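
If synchronization were preferred over a pre-run contract, the atomic-pointer option mentioned above might look roughly like this. `MessageTool` here is a stripped-down stand-in with only the callback field; the real type in pkg/tools has more state, and `SendFunc` is a name introduced for the sketch. `atomic.Pointer` requires Go 1.19 or later.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// SendFunc mirrors the callback shape used in the diff.
type SendFunc func(channel, chatID, content string) error

// MessageTool is a hypothetical stand-in for tools.MessageTool.
type MessageTool struct {
	sendCallback atomic.Pointer[SendFunc]
}

// SetSendCallback publishes a new callback atomically, so it may be
// called while Execute is running on another goroutine.
func (t *MessageTool) SetSendCallback(f SendFunc) {
	t.sendCallback.Store(&f)
}

// Execute loads the current callback once and uses that snapshot.
func (t *MessageTool) Execute(channel, chatID, content string) error {
	p := t.sendCallback.Load()
	if p == nil {
		return errors.New("message tool: no send callback configured")
	}
	return (*p)(channel, chatID, content)
}

func main() {
	var mt MessageTool
	fmt.Println(mt.Execute("cli", "1", "hi")) // no callback yet

	mt.SetSendCallback(func(channel, chatID, content string) error {
		fmt.Printf("send %s/%s: %s\n", channel, chatID, content)
		return nil
	})
	fmt.Println(mt.Execute("cli", "1", "hi"))
}
```

This removes the data race on the field itself; it does not make the callback body safe if that body touches other unsynchronized state.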

Comment on lines 409 to 412
msgBus := bus.NewMessageBus()
agentLoop := agent.NewAgentLoop(cfg, msgBus, provider)
agentLoop.SetHooks(hooks.NewHookRegistry())

Copilot AI Feb 19, 2026

agentLoop.SetHooks(hooks.NewHookRegistry()) installs an empty registry unconditionally. With this wiring, al.hooks is non-nil even when no handlers are registered, so the loop will still construct hook events and call Trigger* on every iteration (adding overhead and undermining the stated “zero-cost when unused” goal). Consider leaving hooks nil by default and only calling SetHooks when hooks are actually configured/registered, or add a cheap “has handlers” check before building events.

Copilot AI review requested due to automatic review settings February 19, 2026 09:51

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

pkg/agent/loop.go:848

  • maybeSummarize launches a goroutine but captures and reuses the request ctx when calling sendOutbound. Since the goroutine may run after the parent request is canceled, hooks may see a canceled context and behave differently (e.g., skip work / cancel the message). Use a fresh context (e.g., context.Background() or a derived context with timeout) for this background notification, or explicitly document that hooks will receive the caller's ctx even in background work.
			go func() {
				defer al.summarizing.Delete(summarizeKey)
				if !constants.IsInternalChannel(channel) {
					al.sendOutbound(ctx, bus.OutboundMessage{
						Channel: channel,
						ChatID:  chatID,
						Content: "Memory threshold reached. Optimizing conversation history...",
					})
				}

Comment on lines 545 to 549
msgBus := bus.NewMessageBus()
agentLoop := agent.NewAgentLoop(cfg, msgBus, provider)
// Hook registry is nil by default for true zero-cost.
// Call agentLoop.SetHooks(hooks.NewHookRegistry()) to enable hooks.

Copilot AI Feb 19, 2026

Same as earlier in agentCmd: this is comment-only guidance and does not actually wire hooks. If the intent is to support hooks in gateway mode, consider adding an explicit enablement mechanism (config/env/flag) and calling agentLoop.SetHooks(...), or adjust the PR description accordingly.

Comment on lines 36 to 45
// BeforeToolCallEvent is fired before a tool is executed.
// Handlers can modify Args, or set Cancel to block execution.
type BeforeToolCallEvent struct {
ToolName string
Args map[string]any // Modifiable
Channel string
ChatID string
Cancel bool
CancelReason string // Message returned to LLM when canceled
}

Copilot AI Feb 19, 2026

BeforeToolCallEvent.Args is documented as modifiable but may be nil depending on the upstream tool call (arguments are omitempty). If the API expects modification, consider documenting that Args is always non-nil (and enforce it in the integration/trigger path), or explicitly state that handlers must handle nil.

Comment on lines 743 to 756
if al.hooks != nil {
btcEvent := &hooks.BeforeToolCallEvent{
ToolName: tc.Name,
Args: tc.Arguments,
Channel: opts.Channel,
ChatID: opts.ChatID,
}
al.hooks.TriggerBeforeToolCall(ctx, btcEvent)
if btcEvent.Cancel {
toolCanceled = true
toolResult = tools.ErrorResult(btcEvent.CancelReason)
}
tc.Arguments = btcEvent.Args
}

Copilot AI Feb 19, 2026

BeforeToolCallEvent.Args is populated from tc.Arguments, which can be nil (tool call arguments are omitempty). If a hook tries to modify args (the documented use-case), writing to a nil map will panic and the modification will be silently dropped (panic is recovered in the hook runner). Initialize args to a non-nil map before triggering the hook (and/or when constructing btcEvent).
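
A minimal illustration of the normalization being asked for. The constructor `newBeforeToolCallEvent` is a hypothetical helper, and the struct is trimmed to the two relevant fields; the idea is simply that the event never exposes a nil map to handlers.

```go
package main

import "fmt"

// BeforeToolCallEvent is trimmed to the fields relevant here.
type BeforeToolCallEvent struct {
	ToolName string
	Args     map[string]any
}

// newBeforeToolCallEvent guarantees a non-nil Args map so handlers
// can always write to it without panicking.
func newBeforeToolCallEvent(name string, args map[string]any) *BeforeToolCallEvent {
	if args == nil {
		args = make(map[string]any)
	}
	return &BeforeToolCallEvent{ToolName: name, Args: args}
}

func main() {
	// The upstream tool call carried no arguments at all.
	ev := newBeforeToolCallEvent("exec", nil)

	// A hook injects a default; with a nil map this write would panic.
	ev.Args["timeout_seconds"] = 30

	fmt.Println(ev.ToolName, len(ev.Args), ev.Args["timeout_seconds"])
}
```

Assigning `ev.Args` back to the tool call after the hook runs then propagates the hook's changes whether or not the original map was nil.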

al.hooks.TriggerBeforeToolCall(ctx, btcEvent)
if btcEvent.Cancel {
toolCanceled = true
toolResult = tools.ErrorResult(btcEvent.CancelReason)

Copilot AI Feb 19, 2026

When a hook cancels a tool call, tools.ErrorResult(btcEvent.CancelReason) is used even if CancelReason is empty. That can produce an empty tool result message fed back into the LLM, which is hard to debug. Consider defaulting to a non-empty message (and/or including the tool name / handler) when CancelReason == "".

Suggested change
- toolResult = tools.ErrorResult(btcEvent.CancelReason)
+ reason := btcEvent.CancelReason
+ if strings.TrimSpace(reason) == "" {
+ 	reason = fmt.Sprintf("tool call %q was cancelled by before_tool_call hook", tc.Name)
+ }
+ toolResult = tools.ErrorResult(reason)

Comment on lines 408 to 412
msgBus := bus.NewMessageBus()
agentLoop := agent.NewAgentLoop(cfg, msgBus, provider)
// Hook registry is nil by default for true zero-cost.
// Call agentLoop.SetHooks(hooks.NewHookRegistry()) to enable hooks.

Copilot AI Feb 19, 2026

The PR description claims main wiring will "create and set HookRegistry", but the code only adds comments and never actually enables hooks (no hooks.NewHookRegistry() import/call). Either add real wiring (e.g., a config/env/flag to enable and call agentLoop.SetHooks(...)) or update the PR description/checklist to reflect that hooks are intentionally opt-in and not wired by default.

Add a typed lifecycle hook system inspired by OpenClaw, designed for
PicoClaw's ultra-lightweight philosophy. Provides 8 interception points
around the agent loop for observability, content filtering, and guardrails.

Two execution patterns:
- Void hooks (concurrent): message_received, after_tool_call,
  llm_input, llm_output, session_start, session_end
- Modifying hooks (sequential by priority, with cancel):
  message_sending, before_tool_call

Key design choices:
- Zero-cost when unused: all triggers check len==0 and return immediately
- Copy-on-write registration: insertSorted allocates new backing array
  so concurrent readers never race with writers
- Panic recovery in all handler dispatch paths
- sendOutbound wrapper returns cancel status to callers
- MessageTool callback rewired via SetHooks for content filtering

15 tests covering execution, priority ordering, cancel semantics,
concurrency (barrier-based), panic recovery, and error swallowing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

nikolasdehor left a comment

Well-designed hook system. The generic approach with HookHandler[T] and HookRegistration[T] is clean and type-safe.

Key strengths:

  • Zero-cost when unused: The len==0 early return in trigger methods means no allocations or goroutines when no hooks are registered. The hooks field defaults to nil on AgentLoop, which is the right choice.
  • Copy-on-write insertSorted: Allocating a new backing array on registration avoids data races between concurrent readers (trigger under RLock) and writers (register under Lock). Good design.
  • Panic recovery: Both triggerVoid (concurrent goroutines) and triggerModifying (sequential loop) call recover() inside a deferred function, which prevents a misbehaving hook from crashing the agent loop.
  • sendOutbound returns bool: The caller (including the MessageTool callback path) can detect when delivery was canceled by a hook. Clean API.
  • Test coverage: 15 tests covering execution order, priority, cancellation, concurrency, and panic recovery.

Minor observations:

  • The SetHooks method rewires the MessageTool callback, which is a good approach for intercepting tool-generated outbound messages. The context.Background() usage there is acceptable since the message context is not available at hook registration time.
  • The maybeSummarize signature change to include ctx context.Context is a clean way to thread context through to the hook triggers.

Solid addition to the codebase.
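
For readers unfamiliar with the copy-on-write registration praised above, here is a rough sketch of the idea. The `Registration` struct is simplified, and ascending priority order is an assumption; the PR's real `insertSorted` is generic and may order differently. What matters is that the input slice's backing array is never written to, so a reader holding the old slice keeps a consistent snapshot.

```go
package main

import "fmt"

// Registration is a simplified stand-in for HookRegistration[T].
type Registration struct {
	Name     string
	Priority int
}

// insertSorted returns a NEW slice with reg inserted by ascending
// priority (an assumed order). It never writes into regs' backing
// array, so existing readers of regs are unaffected.
func insertSorted(regs []Registration, reg Registration) []Registration {
	i := 0
	for i < len(regs) && regs[i].Priority <= reg.Priority {
		i++ // equal priorities keep registration order
	}
	out := make([]Registration, 0, len(regs)+1)
	out = append(out, regs[:i]...)
	out = append(out, reg)
	out = append(out, regs[i:]...)
	return out
}

func main() {
	regs := insertSorted(nil, Registration{"filter", 10})
	regs = insertSorted(regs, Registration{"audit", 20})

	snapshot := regs // what a concurrent reader might be iterating
	regs = insertSorted(regs, Registration{"auth", 5})

	fmt.Println("snapshot:", snapshot)
	fmt.Println("current: ", regs)
}
```

A plain `append` followed by an in-place shift would be cheaper per registration but could overwrite elements a reader is still iterating; registration is rare, so the extra allocation is a reasonable trade.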


Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.



ChatID: chatID,
Content: content,
}) {
return fmt.Errorf("message canceled by hook")

Copilot AI Feb 22, 2026

When sendOutbound returns false, the MessageTool callback returns a generic "message canceled by hook" error. This drops the hook-provided CancelReason, making it hard for the LLM/operator to understand what policy blocked the message. Consider having sendOutbound return the cancel reason (or an error type) so it can be propagated here.

Comment on lines 127 to 130
// triggerVoid runs all handlers concurrently and waits for completion.
// Handlers MUST NOT mutate the event — it is shared across goroutines.
// Errors are logged but do not propagate to the caller.
func triggerVoid[T any](ctx context.Context, hooks []HookRegistration[T], event *T, hookName string) {

Copilot AI Feb 22, 2026

triggerVoid passes the same event pointer to multiple goroutines. Even with the comment saying handlers must not mutate, it only takes one handler to accidentally write to the struct/map/slice fields to introduce a data race between hooks. To make the system safer for third-party hooks, consider passing each handler its own (shallow or deep) copy of the event, or running void hooks sequentially when the event contains reference types.
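
A sketch of the per-handler shallow copy idea. `triggerVoidCopy` and the handler signature are simplified, hypothetical versions of the PR's `triggerVoid` (no context, no error return, no panic recovery). Note the limitation called out in the comment: a shallow copy isolates scalar and string fields only, so maps and slices inside the event would still be shared.

```go
package main

import (
	"fmt"
	"sync"
)

// MessageReceivedEvent is a simplified payload with value-type fields.
type MessageReceivedEvent struct {
	Channel string
	Content string
}

// triggerVoidCopy runs handlers concurrently, giving each its own
// shallow copy of the event. Reference-typed fields (maps, slices,
// pointers) inside T would still alias the original.
func triggerVoidCopy[T any](handlers []func(*T), event *T) {
	var wg sync.WaitGroup
	for _, h := range handlers {
		wg.Add(1)
		go func(h func(*T)) {
			defer wg.Done()
			local := *event // concurrent reads of *event are safe
			h(&local)
		}(h)
	}
	wg.Wait()
}

func main() {
	ev := &MessageReceivedEvent{Channel: "cli", Content: "hello"}
	handlers := []func(*MessageReceivedEvent){
		func(e *MessageReceivedEvent) { e.Content = "mutated by A" },
		func(e *MessageReceivedEvent) { e.Content = "mutated by B" },
	}
	triggerVoidCopy(handlers, ev)
	fmt.Println(ev.Content) // the original is untouched
}
```

For events that carry maps, a deep copy or sequential execution would be needed to get the same isolation.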

}

// SetHooks installs a hook registry. Must be called before Run starts.
func (al *AgentLoop) SetHooks(h *hooks.HookRegistry) {

Copilot AI Feb 22, 2026

SetHooks mutates al.hooks and rewires MessageTool callbacks without any guard against the loop already running. If SetHooks is called after Run starts (despite the comment), this can race with sendOutbound/trigger calls and tool execution. Consider enforcing this contract (e.g., return an error or panic when al.running is true, or protect the hooks pointer/callback rewiring with a mutex/atomic).

Suggested change
- func (al *AgentLoop) SetHooks(h *hooks.HookRegistry) {
+ func (al *AgentLoop) SetHooks(h *hooks.HookRegistry) {
+ 	if al.running.Load() {
+ 		panic("SetHooks must be called before Run starts")
+ 	}

gh-xj (Collaborator, Author) commented Feb 22, 2026

Rebased/merged with latest upstream main on 2026-02-22 and resolved all conflicts in commit f26d207.

Local checks all pass:

  • make fmt
  • go generate ./...
  • go vet ./...
  • go test ./...

Current GitHub Actions status is action_required with no jobs started. A maintainer approval for fork PR workflow execution appears to be needed to run CI.

Copilot AI review requested due to automatic review settings February 22, 2026 10:10
gh-xj changed the title from "feat(hooks): add lightweight lifecycle hook system" to "feat(hooks/plugin): add lifecycle hooks + phase-1 plugin contract" on Feb 22, 2026

gh-xj (Collaborator, Author) commented Feb 22, 2026

Phase 1 plugin direction has now been implemented in this PR as requested, in separate commits:

  1. 8d4cbd7 feat(plugin): add phase-1 compile-time plugin contract

    • adds pkg/plugin manager + Plugin interface
    • integrates plugin manager with AgentLoop (SetPluginManager, EnablePlugins)
    • adds tests in pkg/plugin/manager_test.go and pkg/agent/plugin_test.go
  2. 676f50f docs(plugin): document plugin model and phased roadmap

    • adds plugin model semantics for users/developers
    • adds phased roadmap doc with explicit non-goals for this PR

Key docs:

  • docs/hooks-plugin-examples.md
  • docs/design/plugin-system-roadmap.md

This keeps risk bounded: dynamic plugin loading remains out-of-scope; current model is compile-time registration on top of typed lifecycle hooks.
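
To make "compile-time registration on top of typed lifecycle hooks" concrete, here is a rough sketch. Only the names `NewManager`, `RegisterAll`, and `HookRegistry` appear in the diff; the `Plugin` interface methods (`Name`, `Register`), the `Registry` type, and the duplicate-name check are assumptions made for illustration and may not match pkg/plugin.

```go
package main

import (
	"errors"
	"fmt"
)

// Registry is a hypothetical, single-hook stand-in for hooks.HookRegistry.
type Registry struct {
	onMessage []func(content string)
}

func (r *Registry) OnMessage(h func(content string)) {
	r.onMessage = append(r.onMessage, h)
}

// Plugin is an assumed contract: a name plus a hook-registration step.
type Plugin interface {
	Name() string
	Register(r *Registry) error
}

// Manager owns the registry that plugins register their hooks into.
type Manager struct {
	registry *Registry
	seen     map[string]bool
}

func NewManager() *Manager {
	return &Manager{registry: &Registry{}, seen: make(map[string]bool)}
}

func (m *Manager) HookRegistry() *Registry { return m.registry }

// RegisterAll wires each plugin at startup and fails fast on errors.
func (m *Manager) RegisterAll(plugins ...Plugin) error {
	for _, p := range plugins {
		if p == nil {
			return errors.New("plugin: nil plugin")
		}
		name := p.Name()
		if m.seen[name] {
			return fmt.Errorf("plugin %q: already registered", name)
		}
		if err := p.Register(m.registry); err != nil {
			return fmt.Errorf("plugin %q: %w", name, err)
		}
		m.seen[name] = true
	}
	return nil
}

// auditPlugin is linked into the binary; nothing is loaded at runtime.
type auditPlugin struct{}

func (auditPlugin) Name() string { return "audit" }
func (auditPlugin) Register(r *Registry) error {
	r.OnMessage(func(content string) { fmt.Println("audit:", content) })
	return nil
}

func main() {
	m := NewManager()
	if err := m.RegisterAll(auditPlugin{}); err != nil {
		fmt.Println("startup failed:", err)
		return
	}
	for _, h := range m.HookRegistry().onMessage {
		h("hello")
	}
}
```

Because plugins are ordinary Go values passed in at startup, the set of extensions is fixed when the binary is built, which is what keeps dynamic loading out of scope.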

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 7 comments.

Comments suppressed due to low confidence (1)

pkg/agent/loop.go:901

  • The context passed to sendOutbound in the goroutine may be canceled before the message is sent. The ctx parameter comes from the parent request and could be canceled if the original request completes or times out. Consider using context.Background() or a detached context for goroutines that should continue after the request ends, similar to line 227 in SetHooks.
		if st, ok := tool.(tools.ContextualTool); ok {


wg.Add(1)
go func() {
defer wg.Done()
r.OnMessageReceived("reg-hook", i, func(_ context.Context, _ *MessageReceivedEvent) error {

Copilot AI Feb 22, 2026

All registration goroutines use the same handler name "reg-hook" but different priorities. When multiple hooks with the same name are registered, there's no deduplication, so this test registers 10 hooks all named "reg-hook". While this tests concurrent registration safety, it doesn't reflect realistic usage. Consider using unique names like fmt.Sprintf("reg-hook-%d", i) to better represent production scenarios.

Suggested change
- r.OnMessageReceived("reg-hook", i, func(_ context.Context, _ *MessageReceivedEvent) error {
+ r.OnMessageReceived(fmt.Sprintf("reg-hook-%d", i), i, func(_ context.Context, _ *MessageReceivedEvent) error {

Comment on lines 217 to 265
}

// SetHooks installs a hook registry. Must be called before Run starts.
func (al *AgentLoop) SetHooks(h *hooks.HookRegistry) {
al.hooks = h

// Rewire MessageTool callbacks to route through sendOutbound for hook interception.
for _, agentID := range al.registry.ListAgentIDs() {
if agent, ok := al.registry.GetAgent(agentID); ok {
if tool, ok := agent.Tools.Get("message"); ok {
if mt, ok := tool.(*tools.MessageTool); ok {
mt.SetSendCallback(func(channel, chatID, content string) error {
if !al.sendOutbound(context.Background(), bus.OutboundMessage{
Channel: channel,
ChatID: chatID,
Content: content,
}) {
return fmt.Errorf("message canceled by hook")
}
return nil
})
}
}
}
}
}

// SetPluginManager installs a plugin manager and routes its hook registry into the loop.
// Must be called before Run starts.
func (al *AgentLoop) SetPluginManager(pm *plugin.Manager) {
al.pluginManager = pm
if pm == nil {
al.SetHooks(nil)
return
}
al.SetHooks(pm.HookRegistry())
}

// EnablePlugins is a convenience helper to build and install a plugin manager.
func (al *AgentLoop) EnablePlugins(plugins ...plugin.Plugin) error {
pm := plugin.NewManager()
if err := pm.RegisterAll(plugins...); err != nil {
return err
}
al.SetPluginManager(pm)
return nil
}

// sendOutbound wraps bus.PublishOutbound with the message_sending hook.

Copilot AI Feb 22, 2026

The hook integration in AgentLoop lacks test coverage. While pkg/hooks has comprehensive unit tests, there are no integration tests verifying that hooks are correctly triggered during message processing, tool execution, or LLM calls. Consider adding tests that verify hook execution in realistic agent loop scenarios, especially for the critical paths like before_tool_call cancellation and message_sending filtering.

Comment on lines 127 to 160
// triggerVoid runs all handlers concurrently and waits for completion.
// Handlers MUST NOT mutate the event — it is shared across goroutines.
// Errors are logged but do not propagate to the caller.
func triggerVoid[T any](ctx context.Context, hooks []HookRegistration[T], event *T, hookName string) {
if len(hooks) == 0 {
return
}
var wg sync.WaitGroup
for _, h := range hooks {
wg.Add(1)
go func(reg HookRegistration[T]) {
defer wg.Done()
defer func() {
if r := recover(); r != nil {
logger.ErrorCF("hooks", "Hook panic",
map[string]any{
"hook": hookName,
"handler": reg.Name,
"panic": fmt.Sprintf("%v", r),
})
}
}()
if err := reg.Handler(ctx, event); err != nil {
logger.WarnCF("hooks", "Hook error",
map[string]any{
"hook": hookName,
"handler": reg.Name,
"error": err.Error(),
})
}
}(h)
}
wg.Wait()
}

Copilot AI Feb 22, 2026

Void hooks documentation states "Handlers MUST NOT mutate the event" but there's no enforcement mechanism. Since event data is shared across concurrent goroutines, mutations could cause data races. Consider adding a note in public-facing documentation that hook handlers must be read-only for observe-only hooks, or consider using immutable event types (though this would add overhead).

Comment on lines +787 to +806
"iteration": iteration,
})

// Create async callback for tools that implement AsyncTool
// NOTE: Following openclaw's design, async tools do NOT send results directly to users.
// Instead, they notify the agent via PublishInbound, and the agent decides
// whether to forward the result to the user (in processSystemMessage).
asyncCallback := func(callbackCtx context.Context, result *tools.ToolResult) {
// Log the async completion but don't send directly to user
// The agent will handle user notification via processSystemMessage
if !result.Silent && result.ForUser != "" {
logger.InfoCF("agent", "Async tool completed, agent will handle notification",
map[string]any{
"tool": tc.Name,
"content_len": len(result.ForUser),
})
}
}

toolResult := agent.Tools.ExecuteWithContext(
ctx,
tc.Name,
tc.Arguments,
opts.Channel,
opts.ChatID,
asyncCallback,
)
// Fire before_tool_call hook

Copilot AI Feb 22, 2026

When tc.Arguments is nil, a new empty map is created for the hook event (line 789), but modifications to this map are not propagated back to tc.Arguments because args is reassigned on line 806. However, if tc.Arguments was originally nil, line 806 correctly assigns the (possibly modified) map back to tc.Arguments. This logic is correct but subtle. Consider adding a comment explaining that the empty map creation ensures hooks always receive a non-nil map that can be safely modified.

Comment on lines 226 to 235
if tool, ok := agent.Tools.Get("message"); ok {
if mt, ok := tool.(*tools.MessageTool); ok {
mt.SetSendCallback(func(channel, chatID, content string) error {
if !al.sendOutbound(context.Background(), bus.OutboundMessage{
Channel: channel,
ChatID: chatID,
Content: content,
}) {
return fmt.Errorf("message canceled by hook")
}

Copilot AI Feb 22, 2026

The SetSendCallback method on MessageTool is called without synchronization. If a tool execution is in progress during SetHooks, reading sendCallback in Execute() races with writing it in SetSendCallback(). This is unlikely given the expected pre-Run() usage, but it is still a data race under the Go memory model and would be flagged by the race detector.

// Register with priorities: 50, 10, 30, 20, 40
priorities := []int{50, 10, 30, 20, 40}
for _, p := range priorities {
r.OnBeforeToolCall("p-"+string(rune('0'+p)), p, func(_ context.Context, _ *BeforeToolCallEvent) error {

Copilot AI Feb 22, 2026

The test name creation logic "p-"+string(rune('0'+p)) doesn't produce readable test names. For priority 50, rune('0'+50) equals 98 which is 'b', not a meaningful name. This should use fmt.Sprintf("p-%d", p) instead to generate names like "p-50", "p-10", etc.

Suggested change
r.OnBeforeToolCall("p-"+string(rune('0'+p)), p, func(_ context.Context, _ *BeforeToolCallEvent) error {
r.OnBeforeToolCall(fmt.Sprintf("p-%d", p), p, func(_ context.Context, _ *BeforeToolCallEvent) error {
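
To make the naming problem concrete, the following standalone sketch compares the two expressions for the priorities used in the test. The helper names `runeName` and `fmtName` are made up for this example.

```go
package main

import "fmt"

// runeName reproduces the original expression: it adds p to the code point
// of '0' and converts the result to a one-character string.
func runeName(p int) string {
	return "p-" + string(rune('0'+p))
}

// fmtName is the suggested replacement: it formats p as a decimal number.
func fmtName(p int) string {
	return fmt.Sprintf("p-%d", p)
}

func main() {
	for _, p := range []int{50, 10, 30, 20, 40} {
		fmt.Printf("%d: %q vs %q\n", p, runeName(p), fmtName(p))
	}
}
```

Only single-digit priorities would produce a digit with the rune-based form; anything from 10 upward lands on some other character.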

Collaborator Author

gh-xj commented Feb 22, 2026

Follow-up for style/quality validation: GitHub Actions is still action_required for this fork PR, so lint/test jobs have not started yet.

Could a maintainer approve workflow execution for PR #473 so CI (including lint) can run?

Latest head: 676f50f.

Copilot AI review requested due to automatic review settings February 22, 2026 10:21
Collaborator Author

gh-xj commented Feb 22, 2026

Reviewed Copilot feedback on the latest head and triaged stale vs actionable.

Addressed in follow-up commits:

  • 50d6562 fix(hooks): address copilot race and diagnostics feedback
    • Enforced pre-run contract in SetHooks (panic if called after run starts).
    • Propagated hook cancel reason in MessageTool callback path.
    • Kept cancel-reason logging on outbound cancellation.
    • Added/updated tests for integration behavior and naming clarity.
  • 8aa0a30 docs(hooks): clarify BeforeToolCallEvent args non-nil contract

Items considered stale/outdated:

  • Earlier panic-recovery concerns in hook runners.
  • Earlier sendOutbound bool/cancel handling concerns.
  • Earlier nil-args concern in loop integration (now explicitly documented).

Remaining open by design (defer):

  • triggerVoid shared-event pointer for observe-only hooks: current design prioritizes low overhead; docs state observe-only handlers must not mutate shared event payloads.

If maintainers want, I can do a separate follow-up PR for an optional "safe mode" that deep-copies event payloads for observe-only hooks (with measured overhead tradeoff).

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 8 comments.


for _, agentID := range al.registry.ListAgentIDs() {
if agent, ok := al.registry.GetAgent(agentID); ok {
if tool, ok := agent.Tools.Get("message"); ok {
if mt, ok := tool.(*tools.MessageTool); ok {

Copilot AI Feb 22, 2026

Inconsistent indentation: this if statement should be aligned with the surrounding code. It appears to have extra indentation (starts with tabs when it should match line 229).

Suggested change
if mt, ok := tool.(*tools.MessageTool); ok {
if mt, ok := tool.(*tools.MessageTool); ok {

Comment on lines 838 to 849
toolStart := time.Now()
if !toolCanceled {
toolResult = agent.Tools.ExecuteWithContext(
ctx,
tc.Name,
tc.Arguments,
opts.Channel,
opts.ChatID,
asyncCallback,
)
}
toolDuration := time.Since(toolStart)

Copilot AI Feb 22, 2026

The toolStart timer is set before checking if the tool is canceled, but when toolCanceled is true, the tool never actually executes. This means the Duration in the after_tool_call hook will be near-zero for canceled tools, which is misleading. Consider either moving toolStart to line 839 (inside the if block), or documenting that Duration will be near-zero for canceled operations.
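
One way to read the first option is sketched below, under the assumption that a canceled call should report exactly zero duration. `timedExec` is a hypothetical helper for illustration, not a function from `pkg/agent`.

```go
package main

import (
	"fmt"
	"time"
)

// timedExec starts the clock only when the call actually runs, so a canceled
// call reports a zero duration instead of a misleading near-zero measurement.
func timedExec(canceled bool, exec func()) time.Duration {
	if canceled {
		return 0
	}
	start := time.Now()
	exec()
	return time.Since(start)
}

func main() {
	work := func() { time.Sleep(10 * time.Millisecond) }
	skipped := timedExec(true, work)
	ran := timedExec(false, work)
	fmt.Println(skipped == 0, ran >= 10*time.Millisecond)
}
```

A handler for `after_tool_call` could then treat a zero duration as "not executed", provided that convention is documented.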


## Current Status (Phase 0: Foundation)

Implemented in current hooks MR:

Copilot AI Feb 22, 2026

Inconsistent terminology: "MR" (Merge Request) is GitLab terminology, while this is a GitHub repository. The codebase uses "PR" (Pull Request) elsewhere in documentation. Should use "PR" for consistency.


## Maintainer Review Notes

The current hooks MR should be reviewed as Phase 0 only. It intentionally establishes extension points while avoiding high-risk runtime plugin mechanics.

Copilot AI Feb 22, 2026

Inconsistent terminology: "MR" (Merge Request) is GitLab terminology, while this is a GitHub repository. The codebase uses "PR" (Pull Request) elsewhere in documentation. Should use "PR" for consistency.

Comment on lines 224 to 247
al.hooks = h

// Rewire MessageTool callbacks to route through sendOutbound for hook interception.
for _, agentID := range al.registry.ListAgentIDs() {
if agent, ok := al.registry.GetAgent(agentID); ok {
if tool, ok := agent.Tools.Get("message"); ok {
if mt, ok := tool.(*tools.MessageTool); ok {
mt.SetSendCallback(func(channel, chatID, content string) error {
if sent, reason := al.sendOutbound(context.Background(), bus.OutboundMessage{
Channel: channel,
ChatID: chatID,
Content: content,
}); !sent {
if strings.TrimSpace(reason) == "" {
reason = "unspecified"
}
return fmt.Errorf("message canceled by hook: %s", reason)
}
return nil
})
}
}
}
}

Copilot AI Feb 22, 2026

When SetPluginManager(nil) is called after hooks were previously installed, the MessageTool callbacks are not restored to their original state. They remain pointing to sendOutbound, which will use the now-nil hooks registry. Consider resetting the MessageTool callbacks to the original direct bus.PublishOutbound behavior when h is nil, or document that SetHooks should not be called with nil after hooks were set.

Suggested change
al.hooks = h
// Rewire MessageTool callbacks to route through sendOutbound for hook interception.
for _, agentID := range al.registry.ListAgentIDs() {
if agent, ok := al.registry.GetAgent(agentID); ok {
if tool, ok := agent.Tools.Get("message"); ok {
if mt, ok := tool.(*tools.MessageTool); ok {
mt.SetSendCallback(func(channel, chatID, content string) error {
if sent, reason := al.sendOutbound(context.Background(), bus.OutboundMessage{
Channel: channel,
ChatID: chatID,
Content: content,
}); !sent {
if strings.TrimSpace(reason) == "" {
reason = "unspecified"
}
return fmt.Errorf("message canceled by hook: %s", reason)
}
return nil
})
}
}
}
}
// If hooks are being cleared, restore MessageTool callbacks to direct bus publishing.
if h == nil {
al.hooks = nil
for _, agentID := range al.registry.ListAgentIDs() {
if agent, ok := al.registry.GetAgent(agentID); ok {
if tool, ok := agent.Tools.Get("message"); ok {
if mt, ok := tool.(*tools.MessageTool); ok {
mt.SetSendCallback(func(channel, chatID, content string) error {
// Bypass hook interception and publish directly to the bus.
return al.bus.PublishOutbound(context.Background(), bus.OutboundMessage{
Channel: channel,
ChatID: chatID,
Content: content,
})
})
}
}
}
}
return
}
al.hooks = h
// Rewire MessageTool callbacks to route through sendOutbound for hook interception.
for _, agentID := range al.registry.ListAgentIDs() {
if agent, ok := al.registry.GetAgent(agentID); ok {
if tool, ok := agent.Tools.Get("message"); ok {
if mt, ok := tool.(*tools.MessageTool); ok {
mt.SetSendCallback(func(channel, chatID, content string) error {
if sent, reason := al.sendOutbound(context.Background(), bus.OutboundMessage{
Channel: channel,
ChatID: chatID,
Content: content,
}); !sent {
if strings.TrimSpace(reason) == "" {
reason = "unspecified"
}
return fmt.Errorf("message canceled by hook: %s", reason)
}
return nil
})
}
}
}
}

Comment on lines 29 to 43
## Phase 1: Static Plugin Contract (Compile-time)

Goal: define a minimal public plugin contract for Go modules.

Proposed:

- Add `pkg/plugin` with a small interface:
- `Name() string`
- `Register(*hooks.HookRegistry) error`
- Register plugins at startup in code.
- Add compatibility metadata (`PluginAPIVersion`) for forward checks.

Exit criteria:

- Example plugin module builds against the contract.

Copilot AI Feb 22, 2026

The PR description states that Phase 1 has been implemented in this PR (commit 8d4cbd7), but this roadmap document still lists Phase 1 as "Proposed" with future tense. The document should be updated to reflect that Phase 1 is now implemented, moving it to a "Current Status" or "Completed" section to match the actual state of the codebase.

Suggested change
## Phase 1: Static Plugin Contract (Compile-time)
Goal: define a minimal public plugin contract for Go modules.
Proposed:
- Add `pkg/plugin` with a small interface:
- `Name() string`
- `Register(*hooks.HookRegistry) error`
- Register plugins at startup in code.
- Add compatibility metadata (`PluginAPIVersion`) for forward checks.
Exit criteria:
- Example plugin module builds against the contract.
## Phase 1: Static Plugin Contract (Compile-time) — Completed
Goal: define a minimal public plugin contract for Go modules.
Status: Implemented in commit 8d4cbd7:
- `pkg/plugin` provides a small interface:
- `Name() string`
- `Register(*hooks.HookRegistry) error`
- Plugins are registered at startup in code.
- Compatibility metadata (`PluginAPIVersion`) enables forward checks.
Exit criteria (met):
- An example plugin module builds against the contract.

- Attach once via `agentLoop.SetHooks(registry)` before `Run()`.
- If hooks are not set, default behavior is unchanged.

See runnable examples: [docs/hooks-plugin-examples.md](docs/hooks-plugin-examples.md)

Copilot AI Feb 22, 2026

Missing blank line between documentation links. For better readability and consistency with Markdown best practices, add a blank line between these two documentation references.

Suggested change
See runnable examples: [docs/hooks-plugin-examples.md](docs/hooks-plugin-examples.md)
See runnable examples: [docs/hooks-plugin-examples.md](docs/hooks-plugin-examples.md)

if tool, ok := agent.Tools.Get("message"); ok {
if mt, ok := tool.(*tools.MessageTool); ok {
mt.SetSendCallback(func(channel, chatID, content string) error {
if sent, reason := al.sendOutbound(context.Background(), bus.OutboundMessage{

Copilot AI Feb 22, 2026

Using context.Background() here discards the original context that may contain cancellation signals, deadlines, or tracing information. The MessageTool callback should accept and propagate a context parameter instead of creating a new background context. This could lead to operations continuing even after the parent context is cancelled.

Suggested change
- if sent, reason := al.sendOutbound(context.Background(), bus.OutboundMessage{
+ ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+ defer cancel()
+ if sent, reason := al.sendOutbound(ctx, bus.OutboundMessage{

Copilot AI review requested due to automatic review settings February 23, 2026 03:07
Collaborator Author

gh-xj commented Feb 23, 2026

Updated roadmap with explicit runtime plugin direction.

Short version:

  • We are not adopting Go .so plugins as the default runtime path (toolchain/ABI coupling risk).
  • If runtime plugins are pursued later, the preferred model is subprocess + RPC/gRPC with a host-managed lifecycle and a versioned capability handshake.

This keeps current PR scope unchanged (hooks + compile-time contract) while documenting a safer long-term runtime path.

Collaborator Author

gh-xj commented Feb 23, 2026

Added plugin API compatibility guard in this PR.

What changed:

  • plugin.Plugin now requires APIVersion() string.
  • plugin.Manager.Register rejects plugins whose API version does not match host plugin.APIVersion.
  • Added tests for version mismatch rejection.
  • Updated demo plugin implementation to declare host API version.

Reason:

  • Main program evolves quickly; fail-fast compatibility checks are safer than silent behavior drift.
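
The fail-fast check described above can be pictured roughly as follows. This is a simplified sketch, not the code in `pkg/plugin`: `HostAPIVersion` and its value are placeholders for the host's `plugin.APIVersion`, the `Plugin` interface here omits the real `Register(*hooks.HookRegistry) error` method, and `demoPlugin` is hypothetical.

```go
package main

import "fmt"

// HostAPIVersion is a placeholder; its value is an assumption for this sketch.
const HostAPIVersion = "v1"

// Plugin is a reduced stand-in for the plugin contract.
type Plugin interface {
	Name() string
	APIVersion() string
}

// Manager records the names of accepted plugins.
type Manager struct {
	names []string
}

// Register rejects any plugin whose declared API version differs from the host's.
func (m *Manager) Register(p Plugin) error {
	if got := p.APIVersion(); got != HostAPIVersion {
		return fmt.Errorf("plugin %q: API version %q does not match host %q",
			p.Name(), got, HostAPIVersion)
	}
	m.names = append(m.names, p.Name())
	return nil
}

// demoPlugin is a hypothetical plugin used only to exercise the check.
type demoPlugin struct {
	name    string
	version string
}

func (d demoPlugin) Name() string       { return d.name }
func (d demoPlugin) APIVersion() string { return d.version }

func main() {
	m := &Manager{}
	fmt.Println(m.Register(demoPlugin{name: "current", version: HostAPIVersion}))
	fmt.Println(m.Register(demoPlugin{name: "stale", version: "v0"}))
}
```

Exact string equality is the strictest possible policy; a real host might later relax it to a compatible range.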

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.


Comment on lines +247 to +250
case uint64:
if n > maxIntU64 {
return 0, false
}

Copilot AI Feb 23, 2026

Float to int conversion truncates the decimal part without rounding. For timeout values, this could lead to unexpected behavior. For example, timeout_seconds: 30.9 would become 30 instead of 31. Consider documenting this truncation behavior or using rounding (e.g., int(n + 0.5)) to avoid surprising users with values just below a threshold.

Comment on lines +226 to +227
return int(n), true
case int16:

Copilot AI Feb 23, 2026

The conversion from int64 to int can overflow on 32-bit systems if the value exceeds MaxInt32. The function should check if the value is within the valid range for int before converting, similar to how it's done for unsigned types. This could cause silent overflow and incorrect behavior.

Comment on lines 220 to 248
func (al *AgentLoop) SetHooks(h *hooks.HookRegistry) {
if al.running.Load() {
panic("SetHooks must be called before Run starts")
}
al.hooks = h

// Rewire MessageTool callbacks to route through sendOutbound for hook interception.
for _, agentID := range al.registry.ListAgentIDs() {
if agent, ok := al.registry.GetAgent(agentID); ok {
if tool, ok := agent.Tools.Get("message"); ok {
if mt, ok := tool.(*tools.MessageTool); ok {
mt.SetSendCallback(func(channel, chatID, content string) error {
if sent, reason := al.sendOutbound(context.Background(), bus.OutboundMessage{
Channel: channel,
ChatID: chatID,
Content: content,
}); !sent {
if strings.TrimSpace(reason) == "" {
reason = "unspecified"
}
return fmt.Errorf("message canceled by hook: %s", reason)
}
return nil
})
}
}
}
}
}

Copilot AI Feb 23, 2026

When SetHooks(nil) is called (e.g., via SetPluginManager(nil)), the MessageTool callback is still rewired to route through sendOutbound. This means even with hooks disabled, message tool sends go through an extra layer of indirection. Consider adding a check to skip the MessageTool rewiring when h == nil, or restore the original callback behavior.

-> outbound publish
-> session_end
```


Copilot AI Feb 23, 2026

The lifecycle map shows a linear flow, but in reality, llm_input, llm_output, before_tool_call, and after_tool_call can occur multiple times in an iteration loop (up to MaxToolIterations). Consider adding a note about iteration/looping behavior to avoid confusion for plugin developers who need to understand when their hooks will fire.

Suggested change
Note: The map above is shown as a single linear pass for readability. In practice, the
agent loop may iterate up to `MaxToolIterations`, and the following hooks can fire
multiple times within a single overall lifecycle:
`llm_input`, `llm_output`, `before_tool_call`, and `after_tool_call`.

for _, agentID := range al.registry.ListAgentIDs() {
if agent, ok := al.registry.GetAgent(agentID); ok {
if tool, ok := agent.Tools.Get("message"); ok {
if mt, ok := tool.(*tools.MessageTool); ok {

Copilot AI Feb 23, 2026

Incorrect indentation: this line has an extra tab character. The if mt, ok := tool.(*tools.MessageTool); ok { should align with the line above it at the same indentation level.

  defer al.summarizing.Delete(summarizeKey)
  if !constants.IsInternalChannel(channel) {
- 	al.bus.PublishOutbound(bus.OutboundMessage{
+ 	al.sendOutbound(ctx, bus.OutboundMessage{

Copilot AI Feb 23, 2026

The context from the parent function is captured in a goroutine, which can lead to the goroutine using a canceled or expired context if the parent function completes before the goroutine. Consider using context.Background() or context.TODO() for the sendOutbound call within the goroutine, or ensure the context lifetime extends beyond the goroutine's execution.

Suggested change
- al.sendOutbound(ctx, bus.OutboundMessage{
+ al.sendOutbound(context.Background(), bus.OutboundMessage{

Collaborator Author

gh-xj commented Feb 23, 2026

Addressed the two highest-priority quality items in 96097ff:

  • Replaced SetHooks panic behavior with recoverable error return (SetHooks(...) error), and updated plugin-manager wiring/tests.
  • Preserved request context through MessageTool -> sendCallback -> sendOutbound (removed detached context.Background() in hook send path).

Also added a context propagation test in pkg/tools/message_test.go.
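
The shape of the recoverable pre-run guard might look like the sketch below. It is a reduced stand-in, not the actual `AgentLoop`: `HookRegistry` is an empty placeholder type, and `ErrAlreadyRunning` and `markRunning` are names invented here. It assumes Go 1.19 or newer for `atomic.Bool`.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// HookRegistry is an empty placeholder for hooks.HookRegistry.
type HookRegistry struct{}

// ErrAlreadyRunning is a hypothetical sentinel error for this sketch.
var ErrAlreadyRunning = errors.New("SetHooks must be called before Run starts")

// AgentLoop is a reduced stand-in holding only the fields the guard needs.
type AgentLoop struct {
	running atomic.Bool
	hooks   *HookRegistry
}

// SetHooks returns an error instead of panicking when called after startup,
// and leaves the previously installed registry untouched in that case.
func (al *AgentLoop) SetHooks(h *HookRegistry) error {
	if al.running.Load() {
		return ErrAlreadyRunning
	}
	al.hooks = h
	return nil
}

// markRunning simulates the flag flip that Run would perform on startup.
func (al *AgentLoop) markRunning() {
	al.running.Store(true)
}

func main() {
	al := &AgentLoop{}
	fmt.Println(al.SetHooks(&HookRegistry{}))
	al.markRunning()
	fmt.Println(al.SetHooks(nil))
}
```

Returning an error lets callers such as a plugin-manager setter decide how to react, rather than crashing the process.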

Copilot AI review requested due to automatic review settings February 23, 2026 03:20
Collaborator Author

gh-xj commented Feb 23, 2026

Follow-up cleanup in 2c144ff:

Addressed:

  • maybeSummarize outbound notify now uses detached context for background goroutine (context.Background() path).
  • Roadmap/docs wording cleanup (PR terminology, Phase 1 marked implemented, lifecycle iteration note).
  • README spacing nit between plugin docs links.
  • Demo plugin numeric conversion:
    • added int64 -> int overflow guard for 32-bit.
    • documented float-to-int truncation behavior for timeout normalization.
    • added test for 32-bit overflow case.

Still deferred by design:

  • triggerVoid shared event pointer for observe-only hooks (kept for low overhead; documented read-only contract).

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated no new comments.


Copilot AI review requested due to automatic review settings February 23, 2026 03:30
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

pkg/agent/loop.go:933

  • maybeSummarize now accepts a context but discards it and uses context.Background() when sending the outbound notification. This prevents hook handlers (and MessageTool callbacks) from seeing request-scoped context values (trace IDs, deadlines, auth markers) for summarization notifications. Consider passing the caller ctx through (or using context.WithoutCancel(ctx) if you want to avoid cancellation) instead of context.Background().
// maybeSummarize triggers summarization if the session history exceeds thresholds.
func (al *AgentLoop) maybeSummarize(_ context.Context, agent *AgentInstance, sessionKey, channel, chatID string) {
	newHistory := agent.Sessions.GetHistory(sessionKey)
	tokenEstimate := al.estimateTokens(newHistory)
	threshold := agent.ContextWindow * 75 / 100

	if len(newHistory) > 20 || tokenEstimate > threshold {
		summarizeKey := agent.ID + ":" + sessionKey
		if _, loading := al.summarizing.LoadOrStore(summarizeKey, true); !loading {
			go func() {
				defer al.summarizing.Delete(summarizeKey)
				if !constants.IsInternalChannel(channel) {
					al.sendOutbound(context.Background(), bus.OutboundMessage{
						Channel: channel,
						ChatID:  chatID,
						Content: "Memory threshold reached. Optimizing conversation history...",
					})
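
For reference, the difference between the two options mentioned in this comment can be shown with the standard library alone. The sketch assumes Go 1.21 or newer, where `context.WithoutCancel` is available; `ctxKey`, the `"trace"` key, and `detachDemo` are made up for illustration.

```go
package main

import (
	"context"
	"fmt"
)

// ctxKey is a private key type, as recommended for context values.
type ctxKey string

// detachDemo cancels a parent context and reports what a detached child
// created with context.WithoutCancel still sees afterwards.
func detachDemo() (trace any, parentErr, detachedErr error) {
	base := context.WithValue(context.Background(), ctxKey("trace"), "abc123")
	parent, cancel := context.WithCancel(base)
	detached := context.WithoutCancel(parent)
	cancel() // simulates the request finishing before the goroutine runs
	return detached.Value(ctxKey("trace")), parent.Err(), detached.Err()
}

func main() {
	trace, parentErr, detachedErr := detachDemo()
	fmt.Println(trace, parentErr, detachedErr)
}
```

The detached context keeps request-scoped values while ignoring the parent's cancellation, whereas `context.Background()` carries neither.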

Comment on lines 254 to 258
al.pluginManager = pm
if pm == nil {
return al.SetHooks(nil)
}
return al.SetHooks(pm.HookRegistry())

Copilot AI Feb 23, 2026

SetPluginManager assigns al.pluginManager before calling SetHooks. If SetHooks returns an error (e.g., Run already started), this leaves the AgentLoop in a partially-updated state (pluginManager set but hooks not installed). Consider only setting al.pluginManager after SetHooks succeeds, or rolling it back on error.

Suggested change
- al.pluginManager = pm
- if pm == nil {
- 	return al.SetHooks(nil)
- }
- return al.SetHooks(pm.HookRegistry())
+ if pm == nil {
+ 	// Clear hooks first; only update pluginManager if this succeeds.
+ 	if err := al.SetHooks(nil); err != nil {
+ 		return err
+ 	}
+ 	al.pluginManager = nil
+ 	return nil
+ }
+ // Install hooks from the plugin manager first; only record the manager on success.
+ if err := al.SetHooks(pm.HookRegistry()); err != nil {
+ 	return err
+ }
+ al.pluginManager = pm
+ return nil

Comment on lines 814 to 835
if al.hooks != nil {
args := tc.Arguments
if args == nil {
args = make(map[string]any)
}
btcEvent := &hooks.BeforeToolCallEvent{
ToolName: tc.Name,
Args: args,
Channel: opts.Channel,
ChatID: opts.ChatID,
}
al.hooks.TriggerBeforeToolCall(ctx, btcEvent)
if btcEvent.Cancel {
toolCanceled = true
reason := btcEvent.CancelReason
if strings.TrimSpace(reason) == "" {
reason = fmt.Sprintf("tool call %q was canceled by before_tool_call hook", tc.Name)
}
toolResult = tools.ErrorResult(reason)
}
tc.Arguments = btcEvent.Args
}

Copilot AI Feb 23, 2026

After TriggerBeforeToolCall, tc.Arguments is set to btcEvent.Args without re-validating it. A hook could set Args to nil, which would later cause tools (or other hooks) that mutate args to panic when writing into a nil map. Consider ensuring Args remains non-nil after hooks run (e.g., re-initialize to an empty map if nil).
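
A minimal sketch of the re-validation suggested here, assuming the goal is that tool arguments are never nil before or after hooks run. `BeforeToolCallEvent` below is a reduced stand-in for the real event type, and `runBeforeToolCall` is a hypothetical helper.

```go
package main

import "fmt"

// BeforeToolCallEvent is a reduced stand-in for the hooks event type.
type BeforeToolCallEvent struct {
	ToolName string
	Args     map[string]any
}

// runBeforeToolCall normalizes args to a non-nil map before the hook runs,
// then normalizes again afterwards in case the hook replaced Args with nil.
func runBeforeToolCall(toolName string, args map[string]any, hook func(*BeforeToolCallEvent)) map[string]any {
	if args == nil {
		args = make(map[string]any)
	}
	ev := &BeforeToolCallEvent{ToolName: toolName, Args: args}
	hook(ev)
	if ev.Args == nil {
		ev.Args = make(map[string]any)
	}
	return ev.Args
}

func main() {
	// A misbehaving hook that drops the argument map entirely.
	dropArgs := func(e *BeforeToolCallEvent) { e.Args = nil }
	got := runBeforeToolCall("shell", nil, dropArgs)
	got["timeout_seconds"] = 30 // safe: writing to a nil map would panic
	fmt.Println(len(got))
}
```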

Comment on lines 257 to 262
case float32:
// Truncation is intentional for timeout normalization.
return int(n), true
case float64:
// Truncation is intentional for timeout normalization.
return int(n), true

Copilot AI Feb 23, 2026

toInt converts float32/float64 to int without any bounds checks. Per the Go spec, converting a float outside the target int range produces an implementation-specific value, which can break timeout clamping (e.g., huge values may wrap/turn negative and bypass the clamp). Consider checking against min/max int before converting floats, similar to the int64/uint64 cases.
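
The bounds checks discussed in this and the earlier numeric-conversion comments could be sketched as below. This is not the demo plugin's `toInt`; `floatToInt`, `int64ToInt`, and the two bound constants are names chosen for this example. The upper bound is expressed as an exclusive power of two because `math.MaxInt` itself is not exactly representable as a `float64` on 64-bit platforms.

```go
package main

import (
	"fmt"
	"math"
)

const (
	// minIntInclusive is -2^(bits-1), which float64 represents exactly.
	minIntInclusive = float64(math.MinInt)
	// maxIntExclusive is 2^(bits-1), the first value too large for int.
	maxIntExclusive = float64(math.MaxInt/2+1) * 2
)

// floatToInt truncates toward zero and rejects NaN, infinities, and values
// whose truncated form does not fit in int.
func floatToInt(f float64) (int, bool) {
	if math.IsNaN(f) || math.IsInf(f, 0) {
		return 0, false
	}
	t := math.Trunc(f)
	if t < minIntInclusive || t >= maxIntExclusive {
		return 0, false
	}
	return int(t), true
}

// int64ToInt guards against overflow on platforms where int is 32 bits wide.
func int64ToInt(n int64) (int, bool) {
	if n < math.MinInt || n > math.MaxInt {
		return 0, false
	}
	return int(n), true
}

func main() {
	n, ok := floatToInt(30.9)
	fmt.Println(n, ok)
	_, ok = floatToInt(1e300)
	fmt.Println(ok)
	m, ok := int64ToInt(42)
	fmt.Println(m, ok)
}
```

Truncation is kept here to match the documented behavior; rounding would be a separate policy decision.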

@lxowalle lxowalle mentioned this pull request Feb 23, 2026
5 tasks
Collaborator Author

gh-xj commented Feb 26, 2026

Updated feat/hook-system with merge + focused follow-up fixes (head: 76efd6c).

What I changed:

  • Merged latest upstream/main and resolved the only conflict in pkg/agent/loop.go.
  • Fixed SetPluginManager partial-update risk: pluginManager is now assigned only after SetHooks(...) succeeds.
  • Hardened before_tool_call integration: reinitialize tc.Arguments to non-nil after hook execution in case a hook sets it to nil.
  • Clarified canceled-tool duration behavior: after_tool_call duration is now measured only for executed tool calls (canceled calls report zero duration).
  • Hardened demo policy numeric conversion: toInt now rejects NaN/Inf/out-of-range float values before conversion.

Tests added/updated:

  • pkg/agent/plugin_test.go
    • TestSetPluginManagerDoesNotPartiallyUpdateOnError
    • TestBeforeToolCallHooksCannotLeaveToolArgsNil
  • pkg/plugin/demoplugin/policy_demo_test.go
    • TestToIntRejectsInvalidFloatValues

Validation:

  • go generate ./...
  • make check

Current PR state:

  • mergeable: MERGEABLE

Copilot AI review requested due to automatic review settings February 26, 2026 06:23
Collaborator Author

gh-xj commented Feb 26, 2026

Follow-up push: e3094ec (lint parity fixes for CI config)

  • Fixed golines formatting in pkg/hooks/hooks.go and pkg/hooks/hooks_test.go
  • Fixed govet shadowing in pkg/agent/plugin_test.go
  • Added named interface param in pkg/plugin/manager.go for inamedparam
  • Renamed clampArgNumber parameter (max -> limit) in pkg/plugin/demoplugin/policy_demo.go for predeclared

Current CI status on PR #473:

  • Linter: pass
  • Tests: pass

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.


Comment on lines +838 to +845
args := tc.Arguments
if args == nil {
args = make(map[string]any)
}
btcEvent := &hooks.BeforeToolCallEvent{
ToolName: tc.Name,
Args: args,
Channel: opts.Channel,

Copilot AI Feb 26, 2026

When a before_tool_call hook cancels a tool invocation, the cancel reason is used to build an error ToolResult but is not logged anywhere with the reason. The PR description says cancellation paths surface reasons and are logged; consider adding a log entry here (tool name + reason + channel/chatID) so operational diagnostics match the documented behavior.

Comment on lines +78 to +87
m.mu.Lock()
defer m.mu.Unlock()
if _, exists := m.seen[name]; exists {
return fmt.Errorf("plugin %q already registered", name)
}
if err := p.Register(m.registry); err != nil {
return fmt.Errorf("register plugin %q: %w", name, err)
}
m.seen[name] = struct{}{}
m.names = append(m.names, name)

Copilot AI Feb 26, 2026

Manager.Register holds m.mu while calling the plugin’s Register method. Since Register executes plugin-provided code, holding the manager lock during that call can cause unnecessary contention and can deadlock if future plugin code indirectly calls back into Manager APIs. Prefer checking/marking the name under the lock, releasing the lock before calling p.Register, and then reacquiring to append name (with rollback if Register fails).

Suggested change
-	m.mu.Lock()
-	defer m.mu.Unlock()
-	if _, exists := m.seen[name]; exists {
-		return fmt.Errorf("plugin %q already registered", name)
-	}
-	if err := p.Register(m.registry); err != nil {
-		return fmt.Errorf("register plugin %q: %w", name, err)
-	}
-	m.seen[name] = struct{}{}
-	m.names = append(m.names, name)
+	// First, check and mark the plugin as seen under the lock to avoid races.
+	m.mu.Lock()
+	if _, exists := m.seen[name]; exists {
+		m.mu.Unlock()
+		return fmt.Errorf("plugin %q already registered", name)
+	}
+	m.seen[name] = struct{}{}
+	m.mu.Unlock()
+	// Call plugin-provided Register without holding the manager lock to
+	// avoid contention and potential deadlocks if the plugin calls back
+	// into Manager APIs.
+	if err := p.Register(m.registry); err != nil {
+		// Roll back the optimistic seen-mark if registration fails.
+		m.mu.Lock()
+		delete(m.seen, name)
+		m.mu.Unlock()
+		return fmt.Errorf("register plugin %q: %w", name, err)
+	}
+	// Finalize registration under the lock by recording the plugin name.
+	m.mu.Lock()
+	m.names = append(m.names, name)
+	m.mu.Unlock()
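
To illustrate why releasing the lock matters, here is a self-contained sketch of the pattern with a plugin that calls back into the manager during `Register`. The `Plugin` interface, `Registry`, `NewManager`, `Names`, and `callbackPlugin` are assumptions made for this example and are not taken from the PR; only the body of `Manager.Register` follows the suggestion.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Registry is a hypothetical stand-in for the hook registry.
type Registry struct{}

// Plugin is an assumed interface shape (Name + Register).
type Plugin interface {
	Name() string
	Register(r *Registry) error
}

type Manager struct {
	mu       sync.Mutex
	registry *Registry
	seen     map[string]struct{}
	names    []string
}

func NewManager() *Manager {
	return &Manager{registry: &Registry{}, seen: make(map[string]struct{})}
}

// Names returns a copy of the registered names; it takes the manager lock.
func (m *Manager) Names() []string {
	m.mu.Lock()
	defer m.mu.Unlock()
	return append([]string(nil), m.names...)
}

// Register reserves the name under the lock, runs plugin code unlocked,
// then either records the name or rolls the reservation back.
func (m *Manager) Register(p Plugin) error {
	name := p.Name()

	m.mu.Lock()
	if _, exists := m.seen[name]; exists {
		m.mu.Unlock()
		return fmt.Errorf("plugin %q already registered", name)
	}
	m.seen[name] = struct{}{}
	m.mu.Unlock()

	if err := p.Register(m.registry); err != nil {
		m.mu.Lock()
		delete(m.seen, name)
		m.mu.Unlock()
		return fmt.Errorf("register plugin %q: %w", name, err)
	}

	m.mu.Lock()
	m.names = append(m.names, name)
	m.mu.Unlock()
	return nil
}

// callbackPlugin calls back into the manager while it is being registered.
type callbackPlugin struct {
	name string
	m    *Manager
	fail bool
}

func (p callbackPlugin) Name() string { return p.name }

func (p callbackPlugin) Register(*Registry) error {
	// sync.Mutex is not reentrant: this call would block forever if
	// Manager.Register still held m.mu at this point.
	_ = p.m.Names()
	if p.fail {
		return errors.New("boom")
	}
	return nil
}

func main() {
	m := NewManager()
	fmt.Println(m.Register(callbackPlugin{name: "a", m: m}))
	fmt.Println(m.Register(callbackPlugin{name: "a", m: m}))
	fmt.Println(m.Register(callbackPlugin{name: "b", m: m, fail: true}))
	fmt.Println(m.Names())
}
```

One trade-off of this pattern: under concurrent registration, the order of `names` reflects the order in which `Register` calls finish rather than the order in which they start.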

Comment on lines +8 to 12
+type SendCallbackWithContext func(ctx context.Context, channel, chatID, content string) error
 
 type MessageTool struct {
-	sendCallback   SendCallback
+	sendCallback   SendCallbackWithContext
 	defaultChannel string

Copilot AI Feb 26, 2026

This changes the exported MessageTool send-callback API by removing the old SendCallback type and requiring a context-aware callback. If pkg/tools is consumed outside this repo, this is a breaking change; consider preserving the old SetSendCallback signature (deprecated) and adding a new SetSendCallbackWithContext, or provide an adapter so existing integrations keep compiling.


- Observe-only hooks (`message_received`, `after_tool_call`, `llm_input`, `llm_output`, `session_start`, `session_end`)
  - run concurrently
  - cannot block core behavior

Copilot AI Feb 26, 2026

This documentation claims observe-only hooks “cannot block core behavior”, but the current implementation of observe-only hooks (triggerVoid) waits for all handlers to complete via a WaitGroup. This means slow/blocked handlers will delay the agent loop. Update the wording to reflect that observe-only hooks cannot cancel/modify the operation but can still add latency (or change triggerVoid to be fire-and-forget if that’s the intended contract).

Suggested change
-  - cannot block core behavior
+  - cannot cancel or modify core behavior (observe-only)
+  - may still add latency because handlers are awaited by the agent loop
