diff --git a/plans/OPENCLAW_TOOLS_FILTER_HOOK.md b/plans/OPENCLAW_TOOLS_FILTER_HOOK.md
new file mode 100644
index 0000000..b194e95
--- /dev/null
+++ b/plans/OPENCLAW_TOOLS_FILTER_HOOK.md
@@ -0,0 +1,92 @@
+# Proposal: openclaw `before_outgoing_tools` plugin hook
+
+**Status**: draft proposal for upstream openclaw
+**Owner**: hzhang
+**Last updated**: 2026-06-04
+**Consumer**: PaddedCell, which adds the `dynamic-cache-tools` family (4 tools) to gate model tool visibility per session
+
+## Problem
+
+A regular openclaw plugin can register its own tools (via `OpenClawPluginToolFactory`, returning `null` to skip) but has **no way to filter the tools that other plugins or the host advertise to the model** before model dispatch.
+
+Surveyed surface (as of the openclaw release that ships `PluginHookName | applyToolPolicyPipeline`):
+
+| Candidate | Limitation |
+|---|---|
+| `OpenClawPluginToolFactory(ctx) => null` | only controls the calling plugin's own tools |
+| `before_prompt_build` hook result | `{ systemPrompt?, prependContext?, appendContext?, prependSystemContext?, appendSystemContext? }`; no tools field |
+| `agent_turn_prepare` hook result | `{ prependContext?, appendContext? }`; no tools field |
+| `llm_input` hook event | `tools?: unknown[]` is observation-only (no Result) |
+| `applyToolPolicyPipeline(steps)` | policy steps are constructed statically from agent / group / sender / profile policies; no plugin-side step contribution surface |
+| `ProviderNormalizeToolSchemasContext` | provider-plugin only; one normalizer per provider; not the right hook for cross-provider session policy |
+
+So today, **per-session agent-driven tool filtering cannot be implemented as a regular plugin**. Hangman-Lab's PaddedCell wants to ship a per-session "tools-cache" mechanism (Plexum decision #37, mirrored on the openclaw side) and is blocked on this gap.
+
+## Proposed hook
+
+```ts
+export type PluginHookBeforeOutgoingToolsEvent = {
+  agentId: string;
+  sessionId: string;
+  model: string;
+  tools: AnyAgentTool[]; // snapshot AFTER applyToolPolicyPipeline
+};
+
+export type PluginHookBeforeOutgoingToolsResult = {
+  // If returned, replaces the tools array passed to the provider. Order
+  // is preserved as the host receives it. Returning the original array
+  // unchanged (or omitting the field) leaves it untouched.
+  tools?: AnyAgentTool[];
+};
+```
+
+Hook name: `before_outgoing_tools`. Fires once per turn, AFTER `applyToolPolicyPipeline` and BEFORE the provider adapter payload (`payload.tools = …`) is assembled.
+
+If multiple plugins subscribe, results compose left-to-right: each plugin sees the array as left by the previous plugin and can filter it further. Plugins MUST NOT add tools that weren't present in the input array (no synthesizing): the host wraps the result and asserts `setof(out.names) ⊆ setof(in.names)`. Returning any tool that was not in the input is a hook-level error (logged, original array used).
+
+## Why not extend `applyToolPolicyPipeline`?
+
+`ToolPolicyPipelineStep` is the natural place architecturally, but it's a synchronous `(policy, label)` shape consumed by static profile / agent / group resolution. Plugins would need:
+
+- a way to register pipeline steps (no current API);
+- access from their step to per-session mutable state (the current API only reads from agent.json / plexum.json / group config);
+- a defined step ordering, i.e. where to slot a "session-cache" policy relative to the profile / sender steps.
+
+Building all that out is wider than this proposal. A dedicated hook with `{agentId, sessionId, tools}` lets a plugin do whatever lookup it needs (file, memory, RPC) and return the trimmed array. If a future refactor folds it into the policy pipeline, the hook can become a thin adapter.
+
+## Concrete use case: `dynamic-cache-tools` family
+
+PaddedCell intends to register 4 host tools:
+
+| Tool | Role |
+|---|---|
+| `dynamic-list-tools` | browse: returns the full catalog `[{name, description, source}, …]` where `source ∈ {essentials, cached, available}` |
+| `dynamic-search-tools` | browse: substring search over name+description |
+| `dynamic-cache-tools` | commit: adds names to the per-session whitelist; consumes a prior browse via `previous-dynamic-id` |
+| `dynamic-evict-tools` | mutate: removes names from the per-session whitelist |
+
+Per-session state: `~/.openclaw/agents//sessions//dynamic-tools-cache.json` (atomic tmp+rename).
+
+Without `before_outgoing_tools`, PaddedCell cannot enforce the filter: `dynamic-cache-tools` would silently no-op as far as the model's actual tool visibility is concerned.
+
+The same design has been implemented end-to-end on the Plexum side (a sibling Go agent host that owns its own dispatch path): see `Plexum/docs/DESIGN.md` decision #37 + §5.6 "Tools-cache family". Plexum's `internal/agentloop/run.go` does the filter inline; that's the shape the openclaw hook needs to mirror.
+
+## Backwards compatibility
+
+- No existing hook is modified.
+- Plugins that don't subscribe see no behavior change.
+- Hosts that predate the hook fall back to passing tools unchanged (PaddedCell's filter becomes a no-op until upgrade).
+
+## Out of scope
+
+- Per-iteration filtering inside one turn. Tools are fixed for the turn (mirrors Plexum semantics; same-turn cache changes take effect on the next `Run`).
+- Adding tools (synthesizing). The hook is filter-only.
+- Cross-session shared cache (any plugin wanting that can read/write its own file in the hook).
+
+## Open questions for upstream maintainers
+
+1. Naming: `before_outgoing_tools` or `tools_resolved`? The first reads as "right before the wire", the second matches `before_model_call` / `model_call_started` naming.
+2. Multi-subscriber composition order: alphabetical by plugin id? Registration order? An explicit `priority: number` field?
+3. The event above already carries `model`; should the hook also receive model capabilities, in case a plugin wants to vary filter behavior per-model?
+
+Happy to draft the PR once direction is locked.
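
Note on the composition rule: the left-to-right, filter-only semantics described under "Proposed hook" could be sketched on the host side roughly as follows. This is a minimal illustration, not openclaw code. `runBeforeOutgoingTools`, the handler type, the `logError` callback, and the reduced `AnyAgentTool` shape are all hypothetical names introduced here. It also assumes synchronous handlers, and that a rejected result falls back to the array that the offending plugin received, which is one reading of "original array used".

```typescript
// Hypothetical stand-in for openclaw's AnyAgentTool; only `name` matters here.
type AnyAgentTool = { name: string; description?: string };

type BeforeOutgoingToolsEvent = {
  agentId: string;
  sessionId: string;
  model: string;
  tools: AnyAgentTool[];
};

type BeforeOutgoingToolsResult = { tools?: AnyAgentTool[] };

// Hypothetical handler shape (assumed synchronous for this sketch).
type BeforeOutgoingToolsHandler = (
  event: BeforeOutgoingToolsEvent,
) => BeforeOutgoingToolsResult;

// Runs every subscriber in order. Each one sees the array left by the
// previous subscriber and may only shrink it.
function runBeforeOutgoingTools(
  base: BeforeOutgoingToolsEvent,
  handlers: BeforeOutgoingToolsHandler[],
  logError: (msg: string) => void = console.error,
): AnyAgentTool[] {
  let current = base.tools;
  for (const handler of handlers) {
    // Pass a copy so a handler cannot mutate the host's array in place.
    const result = handler({ ...base, tools: [...current] });
    const next = result.tools;
    if (next === undefined) continue; // field omitted: leave tools untouched

    // Enforce setof(out.names) ⊆ setof(in.names).
    const allowed = new Set(current.map((t) => t.name));
    const extra = next.filter((t) => !allowed.has(t.name));
    if (extra.length > 0) {
      logError(
        `before_outgoing_tools: ignoring result that adds tools: ` +
          extra.map((t) => t.name).join(", "),
      );
      continue; // assumed fallback: keep the array this handler received
    }
    current = next;
  }
  return current;
}
```

With this shape, a subscriber that tries to synthesize a tool is logged and skipped, while later subscribers still run against the last valid array. Whether the fallback should instead be the pre-hook array is a question for the maintainers.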