Fix three install/bridge bugs that made every OpenClaw model call to the
bridge fail when driven by a non-bundled channel plugin (e.g. Fabric):
1. OpenClaw redacts secret-like keys before exposing pluginConfig to a
plugin, so config.bridgeApiKey was the literal __OPENCLAW_REDACTED__
sentinel. The bridge then validated Authorization against the
sentinel while the model provider sent the real key -> permanent
HTTP 401. Resolve the real shared secret from the raw on-disk config
(same pattern resolveAgent already uses); if still missing/redacted,
treat as no-auth on the loopback-only bridge instead of 401-locking.
2. install.mjs set the provider apiKey authoritatively but applied only
setIfMissing to the plugin bridgeApiKey, so a stale prior value desynced
the pair. Make bridgeApiKey authoritative too (they must match).
3. The provider had no timeoutSeconds; a full bridged agent turn far
exceeds OpenClaw's default model-fetch timeout, so OpenClaw aborted
mid-turn and no reply was ever delivered. Default to timeoutSeconds=600
while preserving any user override.
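For reference, the key resolution and the timeout default described in the three items above can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: the helper names (isUsableSecret, resolveBridgeApiKey, isAuthorized, withDefaultTimeout), the ProviderConfig shape, and the `Bearer` header scheme are all assumptions made for the example; only the sentinel string and the 600-second default come from the text.

```typescript
// Sentinel named in the commit message; everything else here is hypothetical.
const REDACTED = "__OPENCLAW_REDACTED__";

// True when a value can serve as a shared secret (non-empty, not the sentinel).
function isUsableSecret(value: unknown): value is string {
  return typeof value === "string" && value.length > 0 && value !== REDACTED;
}

// Prefer the value handed to the plugin, fall back to the raw on-disk value,
// and return undefined when neither is usable.
function resolveBridgeApiKey(
  pluginValue: unknown,
  rawConfigValue: unknown,
): string | undefined {
  if (isUsableSecret(pluginValue)) return pluginValue;
  if (isUsableSecret(rawConfigValue)) return rawConfigValue;
  return undefined;
}

// With no resolved key the loopback-only bridge runs in no-auth mode, so a
// redacted config can no longer lock every caller out with a 401.
// Assumption: the provider sends the key as "Bearer <key>".
function isAuthorized(
  authorizationHeader: string | undefined,
  expectedKey: string | undefined,
): boolean {
  if (expectedKey === undefined) return true;
  return authorizationHeader === `Bearer ${expectedKey}`;
}

// Hypothetical provider config shape used only for this sketch.
interface ProviderConfig {
  apiKey: string;
  timeoutSeconds?: number;
}

// Apply the 600-second default without clobbering a user-supplied value.
function withDefaultTimeout(provider: ProviderConfig): ProviderConfig {
  return { ...provider, timeoutSeconds: provider.timeoutSeconds ?? 600 };
}
```

The plain string comparison in isAuthorized keeps the sketch short; a real check might prefer a constant-time comparison.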
Verified live: bridge now returns 200 for the real key and a valid
OpenAI SSE completion; the fetch-timeout abort is gone.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bridge server lifecycle:
- Move createBridgeServer() out of register() into an api.on("gateway_start", ...)
handler. register() runs in every CLI subprocess that loads plugins
(e.g. `openclaw completion`, `openclaw doctor`); eagerly binding the
bridge HTTP listener there could keep those short-lived processes from
exiting whenever no gateway was already holding the port.
- Call server.unref() so the listener never pins the host's event loop,
even if startup somehow runs outside the gateway.
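The lifecycle change above can be sketched as follows. This is a rough illustration under stated assumptions: GatewayHook is a hypothetical stand-in for the plugin API's `api.on`, whose real signature is not shown here, and startBridgeServer is an illustrative name rather than the actual createBridgeServer(). Only the Node `http` calls (createServer, listen, unref) are real APIs.

```typescript
import { createServer, type Server } from "node:http";

// Hypothetical stand-in for the plugin API's event hook.
type GatewayHook = (event: "gateway_start", handler: () => void) => void;

// Bind a loopback-only listener that never keeps the process alive by itself.
function startBridgeServer(port: number): Server {
  const server = createServer((_req, res) => {
    res.writeHead(200, { "content-type": "text/plain" });
    res.end("ok");
  });

  server.on("error", (err: Error) => {
    // Another process already owns the port: log and carry on.
    if ((err as Error & { code?: string }).code === "EADDRINUSE") {
      console.warn(`bridge port ${port} already in use; not binding`);
      return;
    }
    throw err;
  });

  server.listen(port, "127.0.0.1");
  // unref() lets the host exit even while the listener is still open.
  server.unref();
  return server;
}

// register() stays cheap and synchronous: it only records the handler.
// The listener is bound later, when the gateway actually starts.
function register(on: GatewayHook, port: number): void {
  on("gateway_start", () => {
    startBridgeServer(port);
  });
}
```

With this shape, a short-lived CLI subprocess that loads the plugin only runs register() and never opens a socket.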
Plugin SDK convention update:
- Wrap default export with definePluginEntry({ id, name, description, register })
per the current openclaw plugin authoring contract.
- Switch imports from the deprecated root barrel "openclaw/plugin-sdk" to
focused "openclaw/plugin-sdk/core" / "openclaw/plugin-sdk/plugin-entry".
- Modernize openclaw.plugin.json: drop version/main, add activation.onStartup
so gateway_start fires for this plugin at boot, declare commandAliases
for the contractor-agents CLI command.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The bridge was keying claudeSessionId by agentId alone, so every Discord
channel, DM, and cron run for a single agent shared one Claude CLI
session. Two consequences in the wild:
- Cross-channel context bleed: an 8.7 MB session for `developer` mixed
references from channels 1474327736242798612 and 1498579994044010566
plus the operator DM, all in one --resume thread.
- `/new` had no effect on the CLI side. OpenClaw rotated its session
file but the bridge kept --resume-ing the same long-lived
claudeSessionId, eventually crossing the 1M model context (debug log
showed `prompt is too long: 1179616 tokens > 1000000 maximum`).
Changes:
* input-filter: extract `chat_id` from the Conversation-info
untrusted-metadata block (scanning all messages, since runtimeOnly
turns put it in the system prompt) and detect bare `/new`/`/reset`
via the BARE_SESSION_RESET_PROMPT_BASE marker. Add buildSessionKey
`${agentId}::${chatId}` and resolveDispatchPrompt fallback for the
empty user message that OpenClaw sends on bare resets.
* server: use the composite session key for getSession/putSession;
on bareSessionReset, removeSession before dispatching so the CLI
starts a fresh session; on a CLI result_error (typically
prompt_too_long) drop the entry too so the next turn doesn't
re-resume into the poisoned context.
* claude/sdk-adapter: surface CLI terminal errors via a new
`result_error` event (carries reason + sessionId) so the bridge
can react instead of just streaming the synthetic
"Prompt is too long" assistant text and silently re-using the
same session.
* index: make register() synchronous (OpenClaw rejects async
register with "plugin register must be synchronous"); replace the
pre-bind port probe with a server-level EADDRINUSE handler.
* .gitignore: ignore node_modules/ and dist/.
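The composite-key and session-drop behavior described in the changes above might look roughly like this. It is a simplified sketch: an in-memory Map stands in for getSession/putSession/removeSession, the Turn and CliEvent shapes are hypothetical, the success event name `result` is assumed, and the agent-only fallback for a missing chat_id is an assumption rather than something the commit states.

```typescript
// session key -> claudeSessionId (in-memory stand-in for the session store)
type SessionMap = Map<string, string>;

// Key format from the commit message: `${agentId}::${chatId}`.
// Assumption: fall back to the bare agentId when no chat_id was found.
function buildSessionKey(agentId: string, chatId?: string): string {
  return chatId ? `${agentId}::${chatId}` : agentId;
}

// Hypothetical per-turn input after the input filter has run.
interface Turn {
  agentId: string;
  chatId?: string;
  bareSessionReset: boolean;
}

// Hypothetical CLI events; only `result_error` is named in the commit.
type CliEvent =
  | { type: "result"; sessionId: string }
  | { type: "result_error"; reason: string; sessionId: string };

// Returns the session id to resume, or undefined to start a fresh session.
function sessionToResume(sessions: SessionMap, turn: Turn): string | undefined {
  const key = buildSessionKey(turn.agentId, turn.chatId);
  // A bare /new or /reset drops the entry before dispatching.
  if (turn.bareSessionReset) sessions.delete(key);
  return sessions.get(key);
}

// After the turn: remember the session on success, drop it on a terminal
// error so the next turn does not resume into the same oversized context.
function recordOutcome(sessions: SessionMap, turn: Turn, event: CliEvent): void {
  const key = buildSessionKey(turn.agentId, turn.chatId);
  if (event.type === "result_error") {
    sessions.delete(key);
  } else {
    sessions.set(key, event.sessionId);
  }
}
```

Because the key includes the chat id, two channels served by the same agent resolve to different entries and no longer share one resume thread.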
- Migrate src/ → plugin/ (plugin/core/, plugin/web/, plugin/commands/)
and src/mcp/ → services/ per OpenClaw plugin dev spec
- Add Gemini CLI backend (plugin/core/gemini/sdk-adapter.ts) with GEMINI.md
system-prompt injection
- Inject bootstrap as stateless system prompt on every turn instead of
first turn only: Claude via --system-prompt, Gemini via workspace/GEMINI.md;
eliminates isFirstTurn branch, keeps skills in sync with OpenClaw snapshots
- Fix session-map-store defensive parsing (sessions ?? []) to handle bare {}
reset files without crashing on .find()
- Add docs/TEST_FLOW.md with E2E test scenarios and expected outcomes
- Add docs/claude/BRIDGE_MODEL_FINDINGS.md with contractor-probe results
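The defensive-parsing fix in the list above (sessions ?? []) can be illustrated with a small sketch. The SessionEntry and SessionFile shapes and the function names are hypothetical; the point is only that a bare {} file yields an empty list instead of throwing on .find().

```typescript
// Hypothetical on-disk shape for this sketch.
interface SessionEntry {
  key: string;
  claudeSessionId: string;
}

interface SessionFile {
  sessions?: SessionEntry[];
}

// Parse the file contents and always return an array, even for a bare {}.
function parseSessionFile(raw: string): SessionEntry[] {
  const parsed = JSON.parse(raw) as SessionFile | null;
  const sessions = parsed?.sessions ?? [];
  // Guard against a non-array value left behind by a hand-edited file.
  return Array.isArray(sessions) ? sessions : [];
}

// Safe lookup: .find() always runs on an array.
function findSession(raw: string, key: string): SessionEntry | undefined {
  return parseSessionFile(raw).find((entry) => entry.key === key);
}
```

Malformed JSON still throws from JSON.parse in this sketch; only the missing-field case is handled.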
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>