refactor: restructure to plugin/ + services/ layout and add per-turn bootstrap injection
- Migrate src/ → plugin/ (plugin/core/, plugin/web/, plugin/commands/)
and src/mcp/ → services/ per OpenClaw plugin dev spec
- Add Gemini CLI backend (plugin/core/gemini/sdk-adapter.ts) with GEMINI.md
system-prompt injection
- Inject bootstrap as stateless system prompt on every turn instead of
first turn only: Claude via --system-prompt, Gemini via workspace/GEMINI.md;
eliminates isFirstTurn branch, keeps skills in sync with OpenClaw snapshots
- Fix session-map-store defensive parsing (sessions ?? []) to handle bare {}
reset files without crashing on .find()
- Add docs/TEST_FLOW.md with E2E test scenarios and expected outcomes
- Add docs/claude/BRIDGE_MODEL_FINDINGS.md with contractor-probe results
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
# Bridge Model Probe Findings

## Purpose

Document actual test results from running the `contractor-probe` test plugin against a live
OpenClaw gateway. Resolves the two critical unknowns identified in the earlier feasibility review.

Test setup: installed the `contractor-probe` plugin, which exposes an OpenAI-compatible HTTP server
on port 8799 and logs every raw request body to `/tmp/contractor-probe-requests.jsonl`.
Created a `probe-test` agent with `model: contractor-probe/contractor-probe-bridge`.
Sent two consecutive messages via `openclaw agent --channel qa-channel`.

---

## Finding 1 — Custom Model Registration Mechanism

**There is no plugin SDK `registerModelProvider` call.**

The actual mechanism used by dirigent (and confirmed working for contractor-probe) is:

### Step 1 — Add provider to `openclaw.json`

Under `models.providers`, add an entry pointing to a local OpenAI-compatible HTTP server:

```json
"contractor-probe": {
  "baseUrl": "http://127.0.0.1:8799/v1",
  "apiKey": "probe-local",
  "api": "openai-completions",
  "models": [{
    "id": "contractor-probe-bridge",
    "name": "Contractor Probe Bridge (test)",
    "reasoning": false,
    "input": ["text"],
    "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 },
    "contextWindow": 200000,
    "maxTokens": 4096
  }]
}
```

### Step 2 — Plugin starts a sidecar HTTP server

The plugin's `register()` function starts a Node.js HTTP server on `gateway_start` (protected
by a `globalThis` flag to prevent double-start on hot-reload). The server implements:

- `GET /v1/models` — model list
- `POST /v1/chat/completions` — model inference (must support streaming)
- `POST /v1/responses` — responses API variant (optional)

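As an illustration, a minimal sidecar with this shape might look like the following. The port and model id come from the config above; everything else (function names, response bodies) is an assumption for this sketch, not the plugin's actual code:

```typescript
// Sketch of an OpenAI-compatible sidecar (assumed shape, not the plugin's code).
import http from "node:http";

const MODEL_ID = "contractor-probe-bridge"; // must match the openclaw.json model id

// Pure helper so the /v1/models response is testable without sockets.
function modelsBody(): string {
  return JSON.stringify({
    object: "list",
    data: [{ id: MODEL_ID, object: "model", owned_by: "contractor-probe" }],
  });
}

function startSidecar(port = 8799): http.Server {
  const server = http.createServer((req, res) => {
    if (req.method === "GET" && req.url === "/v1/models") {
      res.writeHead(200, { "content-type": "application/json" });
      res.end(modelsBody());
      return;
    }
    if (req.method === "POST" && req.url === "/v1/chat/completions") {
      // A real implementation must stream SSE (see Finding 2); this stub
      // only emits the terminating sentinel.
      res.writeHead(200, { "content-type": "text/event-stream" });
      res.end("data: [DONE]\n\n");
      return;
    }
    res.writeHead(404);
    res.end();
  });
  return server.listen(port);
}
```

The double-start guard from Finding 4 would wrap the `startSidecar` call.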
### Step 3 — Agent uses `provider/model` as primary model

```json
{
  "id": "my-agent",
  "model": { "primary": "contractor-probe/contractor-probe-bridge" }
}
```

### What this means for ContractorAgent

The `contractor-claude-bridge` model should be registered the same way:

1. The install script writes a provider entry to `openclaw.json` pointing to the bridge sidecar port
2. The plugin starts the bridge sidecar on `gateway_start`
3. `openclaw contractor-agents add` sets the agent's primary model to `contractor-claude-bridge`

No plugin SDK model-registration API exists or is needed.

---

## Finding 2 — Exact Payload Sent to Custom Model

OpenClaw sends a standard OpenAI Chat Completions request to the sidecar on every turn.

### Endpoint and transport

```
POST /v1/chat/completions
Content-Type: application/json
stream: true   ← streaming is always requested; sidecar MUST emit SSE
```

### Message array structure

**Turn 1 (2 messages):**

| index | role | content |
|-------|------|---------|
| 0 | `system` | Full OpenClaw agent context (~28,000 chars) — rebuilt every turn |
| 1 | `user` | `[Sat 2026-04-11 08:32 GMT+1] hello from probe test` |

**Turn 2 (3 messages):**

| index | role | content |
|-------|------|---------|
| 0 | `system` | Same full context (~28,000 chars) |
| 1 | `user` | `[Sat 2026-04-11 08:32 GMT+1] hello from probe test` |
| 2 | `user` | `[Sat 2026-04-11 08:34 GMT+1] and this is the second message` |

Note: the probe sidecar did not emit proper SSE. As a result, turn 2 shows no assistant message
between the two user messages. Once the bridge sidecar returns well-formed SSE, OpenClaw should
include assistant turns in history. Needs follow-up verification with streaming.

### System prompt contents

The system prompt is assembled by OpenClaw from:

- OpenClaw base instructions (tool call style, scheduling rules, ACP guidance)
- Workspace context files (AGENTS.md, SOUL.md, TOOLS.md, IDENTITY.md, USER.md, HEARTBEAT.md, BOOTSTRAP.md)
- Skill definitions
- Tool guidance text

In the test, the total system prompt was 28,942 chars. It is rebuilt from scratch on every turn.

### User message format

```
[Day YYYY-MM-DD HH:MM TZ] <message text>
```

Example: `[Sat 2026-04-11 08:32 GMT+1] hello from probe test`

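The prefix is easy to parse if the bridge wants to separate it from the message text. A sketch, assuming the day is always three letters and the timezone contains no `]` (both hold in the observed examples); the function name is hypothetical:

```typescript
// Hypothetical helper: split the OpenClaw timestamp prefix from a user message.
const PREFIX_RE = /^\[([A-Za-z]{3} \d{4}-\d{2}-\d{2} \d{2}:\d{2} [^\]]+)\]\s*/;

function splitTimestamp(raw: string): { timestamp: string | null; text: string } {
  const m = raw.match(PREFIX_RE);
  if (!m) return { timestamp: null, text: raw }; // no prefix: pass through unchanged
  return { timestamp: m[1], text: raw.slice(m[0].length) };
}
```

Whether the bridge strips or forwards the prefix is a design choice; stripping keeps Claude's input closer to the user's literal words, forwarding preserves timing context.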
### Tool definitions

Full OpenAI function definitions are sent on every request as `tools: [...]`. In the test run,
37 tools were included (read, edit, exec, cron, message, sessions_spawn, dirigent tools, etc.).

### Other request fields

| field | observed value | notes |
|-------|----------------|-------|
| `model` | `contractor-probe-bridge` | model id as configured |
| `stream` | `true` | always; bridge must stream |
| `store` | `false` | |
| `max_completion_tokens` | `4096` | from provider model config |

### What this means for ContractorAgent

**The input filter is critical.** On every turn, OpenClaw sends:

- A large system prompt (28K+ chars) that repeats unchanged
- The full accumulated user message history
- No (or incomplete) assistant message history

The bridge model must NOT forward this verbatim to Claude Code. Instead:

1. Extract only the latest user message from the messages array (the last `user`-role entry)
2. Strip the OpenClaw system prompt entirely — Claude Code maintains its own live context
3. On the first turn: inject a one-time bootstrap block telling Claude it is operating as an
   OpenClaw contractor agent, with the workspace path and session key
4. On subsequent turns: forward only the latest user message text

This keeps Claude as the owner of its own conversational context and avoids dual-context drift.

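Steps 1 and 2 can be sketched as a small pure function. The message shape follows the standard OpenAI Chat Completions format observed in the probe logs; the function name is hypothetical:

```typescript
// Hypothetical input filter: reduce an OpenAI-style messages array to the
// single string the bridge forwards to Claude Code. The system prompt and
// older user turns are deliberately dropped.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

function latestUserMessage(messages: ChatMessage[]): string | null {
  // Walk backwards: the last user-role entry is the current turn's input.
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role === "user") return messages[i].content;
  }
  return null; // no user message found (should not happen in practice)
}
```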
**The bridge sidecar must support SSE streaming.** OpenClaw always sets `stream: true`. A
non-streaming response causes assistant turn data to be dropped from OpenClaw's session history.

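For illustration, one streamed chunk in the standard OpenAI chat-completions SSE shape could be built like this. This is a sketch of the wire format, not the plugin's code; the `id` value is a placeholder:

```typescript
// Build one OpenAI-style streaming chunk as an SSE "data:" line.
function sseChunk(model: string, deltaText: string): string {
  const chunk = {
    id: "chatcmpl-bridge", // placeholder id
    object: "chat.completion.chunk",
    created: Math.floor(Date.now() / 1000),
    model,
    choices: [{ index: 0, delta: { content: deltaText }, finish_reason: null }],
  };
  return `data: ${JSON.stringify(chunk)}\n\n`;
}

// The stream terminates with the standard sentinel line.
const SSE_DONE = "data: [DONE]\n\n";
```

The bridge would emit one such chunk per piece of Claude SDK output, then `SSE_DONE`.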
---
## Finding 3 — Claude Code Session Continuation Identifier

From Claude Code documentation research:

### Session ID format

UUIDs assigned at session creation. Example: `bc1a7617-0651-443d-a8f1-efeb2957b8c2`

### Session storage

```
~/.claude/projects/<encoded-cwd>/<session-id>.jsonl
```

`<encoded-cwd>` is the absolute working directory path with every non-alphanumeric character
replaced by `-`:

- `/home/user/project` → `-home-user-project`

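The encoding rule above is a one-liner; this sketch is inferred from the observed directory layout, not from a documented API:

```typescript
// Encode an absolute cwd the way Claude Code names its project directories:
// every non-alphanumeric character becomes "-".
function encodeCwd(absPath: string): string {
  return absPath.replace(/[^A-Za-z0-9]/g, "-");
}
```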
### CLI resumption

```bash
# Non-interactive mode with session resume
claude -p --resume <session-uuid> "next user message"
```

`-p` is print/non-interactive mode. The session UUID is passed as the resume argument.
The output includes the session ID so the caller can capture it for the next turn.

### SDK resumption (TypeScript)

```typescript
import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({
  prompt: "next user message",
  options: {
    resume: sessionId, // UUID from previous turn
    allowedTools: ["Read", "Edit", "Glob"],
  },
})) {
  if (message.type === "result") {
    const nextSessionId = message.session_id; // capture for next turn
  }
}
```

### Session enumeration

```typescript
import { listSessions, getSessionInfo } from "@anthropic-ai/claude-agent-sdk";

const sessions = await listSessions(); // keyed by session UUID
const info = await getSessionInfo(sessionId);
```

### What this means for ContractorAgent

The `SessionMapEntry.claudeSessionId` field should store the UUID returned by `message.session_id`
after each Claude turn. On the next turn, pass it as `options.resume`.

The session file path can be reconstructed from the session ID and workspace path if needed for
recovery or inspection, but direct SDK resumption is the primary path.

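A minimal sketch of that bookkeeping. `SessionMapEntry` and `claudeSessionId` are named in this document; the other fields and the in-memory store shape are assumptions for illustration (the real store persists to disk):

```typescript
// Hypothetical store shape: map an OpenClaw session key to its Claude session.
type SessionMapEntry = {
  openclawSessionKey: string;
  claudeSessionId: string | null; // UUID from message.session_id
  updatedAt: string;              // assumed field, for inspection
};

function recordClaudeSession(
  store: Map<string, SessionMapEntry>,
  sessionKey: string,
  claudeSessionId: string,
): SessionMapEntry {
  const entry: SessionMapEntry = {
    openclawSessionKey: sessionKey,
    claudeSessionId,
    updatedAt: new Date().toISOString(),
  };
  store.set(sessionKey, entry);
  return entry;
}
```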
---
## Finding 4 — Sidecar Port Conflict and `globalThis` Guard

During testing, the probe sidecar failed to start with `EADDRINUSE` when `openclaw agent --local`
was used alongside a running gateway, because both tried to spawn the server process.

This is exactly the hot-reload / double-start problem documented in LESSONS_LEARNED items 1, 3,
and 7. The fix for the bridge sidecar:

1. Check a lock file (e.g. `/tmp/contractor-bridge-sidecar.lock`) before starting
2. If the lock file exists and its PID is alive, skip the start
3. Protect the `startSidecar` call with a `globalThis` flag
4. Clean up the lock file on `gateway_stop`

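The four steps above can be sketched as follows. The lock path comes from the list; the flag name and helper functions are hypothetical:

```typescript
// Sketch of the lock-file + globalThis double-start guard (assumed shape).
import fs from "node:fs";

const LOCK_PATH = "/tmp/contractor-bridge-sidecar.lock";

// A PID is alive if signal 0 can be delivered to it (no signal is sent).
function isPidAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch {
    return false;
  }
}

function shouldStartSidecar(): boolean {
  // In-process guard: survives plugin hot-reload within one gateway process.
  if ((globalThis as any).__bridgeSidecarStarted) return false;
  // Cross-process guard: a second process (e.g. `openclaw agent --local`)
  // sees the live PID in the lock file and skips the start.
  if (fs.existsSync(LOCK_PATH)) {
    const pid = Number(fs.readFileSync(LOCK_PATH, "utf8").trim());
    if (Number.isFinite(pid) && isPidAlive(pid)) return false;
  }
  (globalThis as any).__bridgeSidecarStarted = true;
  fs.writeFileSync(LOCK_PATH, String(process.pid));
  return true;
}

// Call on gateway_stop.
function cleanupLock(): void {
  try { fs.unlinkSync(LOCK_PATH); } catch {}
}
```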
---
## Open Questions Resolved

| Question | Status | Answer |
|----------|--------|--------|
| Custom model registration API? | ✅ Resolved | `openclaw.json` provider config + OpenAI-compatible sidecar |
| Claude session continuation identifier? | ✅ Resolved | UUID via `message.session_id`, resume via `options.resume` |
| Does OpenClaw include assistant history? | ⚠️ Partial | Appears yes when the sidecar streams correctly; needs retest with SSE |
| Streaming required? | ✅ Resolved | Yes, `stream: true` is always sent; non-streaming drops assistant history |

---
## Immediate Next Steps (Updated)

1. **Add SSE streaming to the probe sidecar** — retest to confirm assistant messages appear in turn 3
2. **Build the real bridge sidecar** — implement SSE passthrough from Claude SDK output
3. **Implement the input filter** — extract the latest user message, strip the system prompt
4. **Implement the session map store** — persist the UUID → OpenClaw session key mapping
5. **Implement bootstrap injection** — first turn only; include workspace path and session key
6. **Add a lock file to the sidecar** — prevent double-start (LESSONS_LEARNED lesson 7)