docs(plans): proposal for openclaw before_outgoing_tools hook

Captures the design for the new openclaw plugin hook PaddedCell needs to implement its half of Plexum decision #37 (per-session tools-cache filter). Documents the gap in the current openclaw plugin SDK surface, the proposed hook signature, the use case for the dynamic-cache-tools family of 4 host tools, and three open questions for upstream maintainers (naming, multi-subscriber order, model-info access). Plexum's implementation is the reference shape; the openclaw side ships once the hook lands upstream.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-04 15:20:05 +01:00
parent 0b7f18253d
commit a7ee414bca
1 changed files with 92 additions and 0 deletions
--- a/plans/OPENCLAW_TOOLS_FILTER_HOOK.md
+++ b/plans/OPENCLAW_TOOLS_FILTER_HOOK.md
@@ -0,0 +1,92 @@
+# Proposal: openclaw `before_outgoing_tools` plugin hook
+
+**Status**: draft proposal for upstream openclaw
+**Owner**: hzhang
+**Last updated**: 2026-06-04
+**Consumer**: PaddedCell — adds `dynamic-cache-tools` family (4 tools) to gate model tool visibility per session
+
+## Problem
+
+A regular openclaw plugin can register its own tools (via `OpenClawPluginToolFactory`, which may return `null` to skip) but has **no way to filter the tools that other plugins or the host advertise to the model** before model dispatch.
+
+Surveyed surface (as of the openclaw release that ships `PluginHookName | applyToolPolicyPipeline`):
+
+| Candidate | Limitation |
+|---|---|
+| `OpenClawPluginToolFactory(ctx) => null` | only controls the calling plugin's own tools |
+| `before_prompt_build` hook result | `{ systemPrompt?, prependContext?, appendContext?, prependSystemContext?, appendSystemContext? }` — no tools field |
+| `agent_turn_prepare` hook result | `{ prependContext?, appendContext? }` — no tools field |
+| `llm_input` hook event | `tools?: unknown[]` is observation-only (no Result) |
+| `applyToolPolicyPipeline(steps)` | policy steps are constructed statically from agent / group / sender / profile policies; no plugin-side step contribution surface |
+| `ProviderNormalizeToolSchemasContext` | provider-plugin only; one normalizer per provider; not the right hook for cross-provider session policy |
+
+So today, **per-session agent-driven tool filtering cannot be implemented as a regular plugin**. Hangman-Lab's PaddedCell wants to ship a per-session "tools-cache" mechanism (Plexum decision #37, mirrored on the openclaw side) and is blocked on this gap.
+
+## Proposed hook
+
+```ts
+export type PluginHookBeforeOutgoingToolsEvent = {
+  agentId: string;
+  sessionId: string;
+  model: string;
+  tools: AnyAgentTool[];   // snapshot AFTER applyToolPolicyPipeline
+};
+
+export type PluginHookBeforeOutgoingToolsResult = {
+  // If returned, replaces the tools array passed to the provider. Order
+  // is preserved as the host receives it. Returning the original array
+  // unchanged (or omitting the field) leaves it untouched.
+  tools?: AnyAgentTool[];
+};
+```
+
+Hook name: `before_outgoing_tools`. Fires per turn, AFTER `applyToolPolicyPipeline`, BEFORE the provider adapter (`payload.tools = …`) is assembled.
+
+If multiple plugins subscribe, results compose left-to-right: each plugin sees the array as left by the previous plugin and can filter it further (the subscriber order itself is open question 2). Plugins MUST NOT add tools that weren't present in the input array (no synthesizing): the host wraps the result and asserts `setof(out.names) ⊆ setof(in.names)`. Returning any tool that was not in the input is a hook-level error (logged, original array used).
+
+## Why not extend `applyToolPolicyPipeline`?
+
+`ToolPolicyPipelineStep` is the natural place architecturally, but it's a synchronous `(policy, label)` shape consumed by static profile / agent / group resolution. Plugins would need:
+
+- a way to register pipeline steps (no current API);
+- their step to access per-session mutable state (current API only reads from agent.json / plexum.json / group config);
+- step ordering — when to slot a "session-cache" policy relative to profile / sender steps.
+
+Building all that out is wider than this proposal. A dedicated hook with `{agentId, sessionId, model, tools}` lets a plugin do whatever lookup it needs (file, memory, RPC) and return the trimmed array. If a future refactor folds it into the policy pipeline, the hook can become a thin adapter.
+
+## Concrete use case — `dynamic-cache-tools` family
+
+PaddedCell intends to register 4 host tools:
+
+| Tool | Role |
+|---|---|
+| `dynamic-list-tools` | browse — returns the full catalog `[{name, description, source}, …]` where `source ∈ {essentials, cached, available}` |
+| `dynamic-search-tools` | browse — substring search over name+description |
+| `dynamic-cache-tools` | commit — adds names to per-session whitelist; consumes prior browse via `previous-dynamic-id` |
+| `dynamic-evict-tools` | mutate — removes names from per-session whitelist |
+
+Per-session state: `~/.openclaw/agents/<id>/sessions/<sid>/dynamic-tools-cache.json` (atomic tmp+rename).
+
+Without `before_outgoing_tools`, PaddedCell cannot enforce the filter — `dynamic-cache-tools` would silently no-op for the model's actual visibility.
+
+The same design has been implemented end-to-end on the Plexum side (a sibling Go agent host that owns its own dispatch path): see `Plexum/docs/DESIGN.md` decision #37 + §5.6 "Tools-cache family". Plexum's `internal/agentloop/run.go` does the filter inline; that's the shape the openclaw hook needs to mirror.
+
+## Backwards compatibility
+
+- No existing hook is modified.
+- Plugins that don't subscribe see no behavior change.
+- Hosts that predate the hook pass tools through unchanged (PaddedCell's filter is a no-op until the host is upgraded).
+
+## Out of scope
+
+- Per-iteration filtering inside one turn. Tools are fixed for the turn (mirrors Plexum semantics; same-turn cache changes take effect on the next `Run`).
+- Adding tools (synthesizing). Hook is filter-only.
+- Cross-session shared cache (any plugin wanting that can read/write its own file in the hook).
+
+## Open questions for upstream maintainers
+
+1. Naming: `before_outgoing_tools` or `tools_resolved`? The first reads as "right before the wire", the second matches `before_model_call` / `model_call_started` naming.
+2. Multi-subscriber composition order — alphabetical by plugin id? Registration order? An explicit `priority: number` field?
+3. The proposed event carries `model` as a plain string. Should the hook also receive model capabilities, in case a plugin wants to vary filter behavior per model?
+
+Happy to draft the PR once direction is locked.
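
Review note: the composition and subset rules in the "Proposed hook" section could be enforced host-side roughly as sketched below. This is only an illustration under stated assumptions, not openclaw code. `AnyAgentTool` is reduced to a hypothetical `{ name }` stand-in, the handler type and `composeBeforeOutgoingTools` are invented names, handlers are synchronous for brevity, and "original array used" is read here as "the array that handler received", which the proposal does not pin down.

```typescript
// Hypothetical minimal stand-in for openclaw's AnyAgentTool; only `name` matters here.
type AnyAgentTool = { name: string };

type PluginHookBeforeOutgoingToolsEvent = {
  agentId: string;
  sessionId: string;
  model: string;
  tools: AnyAgentTool[];
};

type PluginHookBeforeOutgoingToolsResult = { tools?: AnyAgentTool[] };

// Hypothetical handler shape (assumption: synchronous; a real hook may be async).
type BeforeOutgoingToolsHandler = (
  event: PluginHookBeforeOutgoingToolsEvent,
) => PluginHookBeforeOutgoingToolsResult | undefined;

// Runs handlers in array order. Each handler sees the array left by the
// previous one. A result that names a tool missing from its input is
// logged and discarded, so a handler can only narrow the list.
function composeBeforeOutgoingTools(
  event: PluginHookBeforeOutgoingToolsEvent,
  handlers: BeforeOutgoingToolsHandler[],
  logError: (message: string) => void,
): AnyAgentTool[] {
  let current = event.tools;
  for (let i = 0; i < handlers.length; i++) {
    const result = handlers[i]({ ...event, tools: current });
    const next = result?.tools;
    if (next === undefined) continue; // field omitted: leave the array untouched
    const allowed = new Set(current.map((t) => t.name));
    const extra = next.filter((t) => !allowed.has(t.name)).map((t) => t.name);
    if (extra.length > 0) {
      logError(`before_outgoing_tools handler #${i} added tools: ${extra.join(", ")}`);
      continue; // assumption: fall back to the array this handler received
    }
    current = next;
  }
  return current;
}

// Usage: a per-session allowlist filter followed by a handler that tries to synthesize a tool.
const cached = new Set(["read_file", "dynamic-list-tools"]);
const allowlistFilter: BeforeOutgoingToolsHandler = (e) => ({
  tools: e.tools.filter((t) => cached.has(t.name)),
});
const synthesizer: BeforeOutgoingToolsHandler = (e) => ({
  tools: [...e.tools, { name: "invented_tool" }],
});

const errors: string[] = [];
const outgoing = composeBeforeOutgoingTools(
  {
    agentId: "a1",
    sessionId: "s1",
    model: "example-model",
    tools: [{ name: "read_file" }, { name: "write_file" }, { name: "dynamic-list-tools" }],
  },
  [allowlistFilter, synthesizer],
  (message) => errors.push(message),
);
```

In this run the first handler trims the list to the cached names and the second handler's result is rejected, so `outgoing` keeps only what the allowlist filter left. Checking names against the handler's own input (rather than the pre-hook snapshot) means a later plugin cannot restore a tool that an earlier plugin removed.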