# ContractorAgent E2E Test Flow

End-to-end test scenarios for the contractor-agent plugin. Each scenario describes the precondition, the action, and the expected observable outcome.

---

## Setup

### ST-1 — Clean install

**Precondition**: Plugin not installed; test agent workspaces do not exist.

**Steps**:
1. `node scripts/install.mjs --uninstall` (idempotent if already gone)
2. Remove test agents from `openclaw.json` if present
3. `node scripts/install.mjs --install`
4. `openclaw gateway restart`

**Expected**:
- `GET /health` → `{"ok":true,"service":"contractor-bridge"}`
- `GET /v1/models` → list includes `contractor-claude-bridge` and `contractor-gemini-bridge`
- OpenClaw logs: `[contractor-agent] plugin registered (bridge port: 18800)`

---

### ST-2 — Provision Claude contractor agent

**Command**: `openclaw contractor-agents add --agent-id claude-e2e --workspace /tmp/claude-e2e-workspace --contractor claude`

**Expected**:
- `openclaw.json` agents list contains `claude-e2e` with model `contractor-agent/contractor-claude-bridge`
- `/tmp/claude-e2e-workspace/.openclaw/contractor-agent/session-map.json` exists (empty sessions)
- Workspace files created by `openclaw agents add` are present (SOUL.md, IDENTITY.md, etc.)

---

### ST-3 — Provision Gemini contractor agent

**Command**: `openclaw contractor-agents add --agent-id gemini-e2e --workspace /tmp/gemini-e2e-workspace --contractor gemini`

**Expected**: same as ST-2 but model is `contractor-agent/contractor-gemini-bridge`.

---

## Core Bridge Behaviour

### CB-1 — First turn bootstraps persona (Claude)

**Precondition**: `claude-e2e` has no active session (session-map empty).

**Request**: POST `/v1/chat/completions`, model `contractor-agent/contractor-claude-bridge`, system message contains `Runtime: agent=claude-e2e | repo=/tmp/claude-e2e-workspace`, user message: `"Introduce yourself briefly."`

**Expected**:
- Response streams SSE chunks, terminates with `[DONE]`
- Response reflects the persona from `SOUL.md`/`IDENTITY.md`: the agent uses the name and tone defined in the workspace files (not a generic, dry "I'm Claude …" response)
- `session-map.json` now contains one entry with `contractor=claude`, `state=active`, and a non-empty `claudeSessionId`

**Mechanism**: bootstrap is injected only on the first turn; it embeds the content of SOUL.md, IDENTITY.md, USER.md, MEMORY.md inline so the agent adopts the persona immediately without needing to read files itself.

---

### CB-2 — Session resume retains context (Claude)

**Precondition**: CB-1 has run; session-map holds an active session.

**Request**: same headers, user message: `"What did I ask you in my first message?"`

**Expected**:
- Agent recalls the previous question without re-reading files
- `session-map.json` `lastActivityAt` updated; `claudeSessionId` unchanged

**Mechanism**: bridge detects existing active session, passes `--resume ` to Claude Code; bootstrap is NOT re-injected.

---

### CB-3 — MCP tool relay — contractor_echo (Claude)

**Precondition**: active session from CB-1; request includes `contractor_echo` tool definition.

**Request**: user message: `"Use the contractor_echo tool to echo: hello"`

**Expected**:
- Agent calls the `mcp__openclaw__contractor_echo` tool via the MCP proxy
- Bridge relays the call to `POST /mcp/execute` → OpenClaw plugin registry
- Response confirms the echo with a timestamp: `"Echo confirmed: hello at …"`

**Mechanism**: bridge writes an MCP config file pointing to `services/openclaw-mcp-server.mjs` before each `claude` invocation; the MCP server forwards tool calls to the bridge `/mcp/execute` endpoint which resolves them through the OpenClaw global plugin registry.

---

### CB-4 — First turn bootstraps persona (Gemini)

Same as CB-1 but model `contractor-agent/contractor-gemini-bridge`, agent `gemini-e2e`.

**Expected**: persona from `SOUL.md`/`IDENTITY.md` is reflected; session entry has `contractor=gemini`.

**Mechanism**: bootstrap is identical; Gemini CLI receives the full prompt via `-p`. MCP config is written to `workspace/.gemini/settings.json` (Gemini's project settings path) instead of an `--mcp-config` flag.

---

### CB-5 — Session resume (Gemini)

Same as CB-2 but for Gemini.

**Expected**: agent recalls prior context via `--resume `.

---

### CB-6 — MCP tool relay (Gemini)

Same as CB-3 but for Gemini. Gemini sees the tool as `mcp_openclaw_contractor_echo` (single-underscore FQN; server alias is `openclaw` with no underscores).

**Expected**: echo confirmed in response.

---

## Skill Invocation

### SK-1 — Agent reads and executes a skill script (Claude)

**Precondition**: fresh session (session-map cleared); skill `contractor-test-skill` is installed in `~/.openclaw/skills/contractor-test-skill/`.

System message includes:

```xml
contractor-test-skill
Test skill for verifying that a contractor agent can discover and invoke a workspace script…
/home/hzhang/.openclaw/skills/contractor-test-skill
```

**Request**: user message: `"Run the contractor test skill and show me the output."`

**Expected**:
- Agent reads `SKILL.md` at the given ``
- Expands `{baseDir}` → `/home/hzhang/.openclaw/skills/contractor-test-skill`
- Executes `scripts/test.sh` via Bash
- Response contains:
  ```
  === contractor-test-skill: PASSED ===
  Timestamp:
  ```

**Why first turn matters**: the bootstrap embeds the `` block in the system prompt sent to Claude on the first turn. Claude retains this in session memory for subsequent turns. If the session is already active when the skill-bearing request arrives, Claude won't know about the skill.

---

### SK-2 — Agent reads and executes a skill script (Gemini)

Same as SK-1 but for Gemini. Gemini reads SKILL.md and executes the script.

**Expected**: same `PASSED` output block.

---

## Skills Injection Timing

### Background — How OpenClaw injects skills

OpenClaw rebuilds the system prompt (including ``) on **every turn** and sends it to the model. The skill list comes from a **snapshot** cached in the session entry, refreshed under the following conditions:

| Trigger | Refresh? |
|---------|----------|
| First turn in a new session | ✅ Always |
| Skills directory file changed (file watcher detects version bump) | ✅ Yes |
| `openclaw gateway restart` (session entry survives) | ❌ No — old snapshot reused |
| `openclaw gateway restart` + session reset / new session | ✅ Yes — first-turn logic runs |
| Skill filter changes | ✅ Yes |

### Implication for the contractor bridge

The bridge receives the skills block on **every** incoming turn but currently only uses it in the **first-turn bootstrap**. Subsequent turns carry updated skills if OpenClaw refreshed the snapshot, but the bridge does not re-inject them into the running Claude/Gemini session.

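
The first-turn-only behaviour described above can be sketched as a small decision function. This is an illustrative sketch, not the bridge's actual code: the function name `buildContractorPrompt`, its parameters, and the returned shape are all hypothetical; only the `state=active` check mirrors the session-map fields mentioned in this document.

```javascript
// Hypothetical sketch of the first-turn-only bootstrap decision.
// None of these names are taken from the plugin source.
function buildContractorPrompt(sessionEntry, skillsBlock, userMessage) {
  // A session counts as resumable only if an entry exists and is marked active.
  const hasActiveSession = Boolean(sessionEntry && sessionEntry.state === "active");

  if (hasActiveSession) {
    // Resumed turn: the skills block arrived with the request,
    // but it is not forwarded to the running contractor session.
    return { resume: true, prompt: userMessage };
  }

  // First turn: the skills block is embedded in the bootstrap prompt.
  return { resume: false, prompt: `${skillsBlock}\n\n${userMessage}` };
}
```

Under these assumptions, a skills block that changes between turns affects the prompt only when no active session entry exists.
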
**Consequence**: if skills are added or removed while a contractor session is active, the agent won't see the change until the session is reset (session-map cleared) and a new bootstrap is sent. This is intentional for v1: contractor sessions are meant to be long-lived, and skill changes mid-session are uncommon. If needed, explicitly clearing the session map forces a new bootstrap on the next turn.

---

## Error Cases

### ER-1 — No active session, no workspace in system message

**Request**: system message has no `Runtime:` line.

**Expected**: the bridge logs a warning and falls back to the `/tmp` workspace; the session key is empty. The response may succeed, but the session is not persisted.

---

### ER-2 — Gemini CLI not installed

**Request**: model `contractor-agent/contractor-gemini-bridge`.

**Expected**: the `dispatchToGemini` spawn fails; the bridge streams a `[contractor-bridge dispatch failed: …]` error chunk, then `[DONE]`. The session is not persisted; if a prior session existed, it is marked `orphaned`.

---

### ER-3 — MCP tool not registered in plugin registry

**Request**: tool `unknown_tool` called via `/mcp/execute`.

**Expected**: `POST /mcp/execute` returns `{"error":"Tool 'unknown_tool' not registered…"}` with HTTP status 200. The agent receives the error text as the tool result and surfaces it in its reply.
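
A test harness could check the ER-3 outcome with a small predicate like the one below. This is only a sketch: the helper name `isUnregisteredToolResponse` is hypothetical, and it assumes the error text begins with `Tool '<name>' not registered`, as in the example response above; the rest of the message is elided in this document and is not checked.

```javascript
// Hypothetical E2E harness helper; not part of the plugin.
// Returns true when a /mcp/execute response looks like the ER-3 outcome:
// HTTP 200 with a JSON body whose "error" string names the unregistered tool.
function isUnregisteredToolResponse(status, bodyText, toolName) {
  if (status !== 200) return false;

  let body;
  try {
    body = JSON.parse(bodyText);
  } catch {
    return false; // Body is not valid JSON.
  }

  if (body === null || typeof body !== "object" || typeof body.error !== "string") {
    return false;
  }

  // Only the prefix is compared because the remainder of the message is elided.
  return body.error.startsWith(`Tool '${toolName}' not registered`);
}
```

The status and body text would come from whatever HTTP client the harness uses; the predicate itself makes no network calls.
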