Dirigent/docs/TURN-WAKEUP-PROBLEM.md

# Turn-Based Speaking: Wakeup Problem

## Context

WhisperGate implements turn-based speaking for Discord group channels where multiple AI agents coexist. Only one agent (the "current speaker") is allowed to respond at a time. Others are silenced via a no-reply model override.

## The Problem

When the current speaker responds with **NO_REPLY** (decides the message is not relevant to them), the turn advances to the next agent. However, **the next agent has no trigger to start speaking**.

### Why This Happens

1. A message arrives in the Discord channel
2. OpenClaw routes it to **all** agent sessions in that channel simultaneously
3. The WhisperGate plugin intercepts at `before_model_resolve`:
   - Current speaker → allowed to process
   - Everyone else → forced to no-reply model (message is "consumed" silently)
4. Current speaker processes the message and returns NO_REPLY
5. `message_sent` hook detects NO_REPLY → turn advances to next agent
6. **But the next agent already "consumed" the message in step 3** — their session processed it (as no-reply) and moved on
7. No new message exists to trigger the next agent
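
The sequence above can be reproduced in a toy model. The sketch below is not OpenClaw or WhisperGate code; `TurnModel` and its methods are hypothetical names, used only to show why the new current speaker has nothing left to process once the turn advances.

```typescript
// Toy model of steps 1-7. All names are hypothetical; this is not the
// OpenClaw or WhisperGate API.
type Outcome = "processed" | "consumed-as-no-reply";

class TurnModel {
  private index = 0;
  private agents: string[];
  // For each agent, what happened to each message id it received.
  readonly seen = new Map<string, Map<string, Outcome>>();

  constructor(agents: string[]) {
    this.agents = agents;
    for (const agent of agents) this.seen.set(agent, new Map());
  }

  get currentSpeaker(): string {
    return this.agents[this.index];
  }

  // Steps 2-3: the message reaches every session at once; only the
  // current speaker is allowed to process it normally.
  deliver(messageId: string): void {
    for (const agent of this.agents) {
      const outcome: Outcome =
        agent === this.currentSpeaker ? "processed" : "consumed-as-no-reply";
      this.seen.get(agent)!.set(messageId, outcome);
    }
  }

  // Step 5: the current speaker returned NO_REPLY, so the turn advances.
  advanceTurn(): void {
    this.index = (this.index + 1) % this.agents.length;
  }

  // Steps 6-7: has the (new) current speaker not yet seen this message?
  hasPendingWork(messageId: string): boolean {
    return !this.seen.get(this.currentSpeaker)!.has(messageId);
  }
}
```

With agents `["A", "B", "C"]`, delivering one message and then advancing the turn leaves `B` as the current speaker, but `hasPendingWork` returns `false` because `B` already consumed the message as a no-reply.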

### The Result

After a NO_REPLY, the next speaker sits idle until a **new** message arrives in the channel (from a human or another source). The original message that should have been passed to the next speaker is lost.

## When This Matters

- **Single-round conversation**: Human asks a question → Agent A says NO_REPLY → Agent B should answer but can't
- **Chain conversations**: Agent A defers → Agent B defers → Agent C should speak but never gets triggered

## When This Doesn't Matter

- **End-symbol responses**: When an agent actually speaks (ends with 🔚), the turn advances and the next agent will respond to the **next** message. This is fine.
- **Human-driven channels**: If humans keep sending messages, the dormant state resolves quickly.

## Possible Solutions

### 1. Synthetic Trigger Message (Plugin-Side)

After detecting NO_REPLY and advancing the turn, the plugin sends a **synthetic message** to the channel that triggers the next agent.

**Challenges:**
- The plugin SDK (`message_sent` hook) doesn't have an API to inject messages into agent sessions
- Sending a real Discord message (even an invisible one, such as a zero-width space) creates noise and may confuse other agents
- The synthetic message wouldn't contain the original user's context

### 2. Deferred Evaluation (Don't Block in before_model_resolve)

Instead of blocking non-speakers at `before_model_resolve`, let all agents receive the message but inject a "you are not the current speaker, reply NO_REPLY" instruction. The current speaker gets a normal prompt.

After the current speaker responds with NO_REPLY, the plugin would need to **re-trigger** the next agent's session with the same message.

**Challenges:**
- All agents still consume tokens for the NO_REPLY evaluation
- Re-triggering a session with an already-processed message requires OpenClaw internal APIs
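
A minimal sketch of the prompt-side half of this idea, assuming the plugin can alter the prompt each agent receives. The helper name and the instruction wording are assumptions, not part of the plugin SDK.

```typescript
// Hypothetical helper; the instruction wording is an assumption.
const NOT_YOUR_TURN =
  "You are not the current speaker for this turn. Reply with exactly NO_REPLY.";

// The current speaker keeps the normal prompt; everyone else gets the
// extra instruction appended.
function promptFor(
  agentId: string,
  currentSpeaker: string,
  basePrompt: string,
): string {
  return agentId === currentSpeaker
    ? basePrompt
    : `${basePrompt}\n\n${NOT_YOUR_TURN}`;
}
```

This covers only the instruction injection. The re-trigger step still depends on a mechanism that the challenges above describe as unavailable.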

### 3. Queue + Replay (Plugin-Side State)

The plugin stores the original message when it arrives. After NO_REPLY, it replays the message by injecting it into the next speaker's session.

**Challenges:**
- Requires access to session injection API (not available in current plugin SDK)
- Managing the message queue adds complexity
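
The plugin-side state could look roughly like the sketch below. Because the current SDK has no session-injection API, the injector is passed in as a plain function; `ReplayQueue`, `Inject`, and all method names are hypothetical.

```typescript
// Hypothetical: stands in for a session-injection API that the current
// plugin SDK does not provide.
type Inject = (agentId: string, message: string) => void;

class ReplayQueue {
  // channelId -> most recent message that has not been answered yet
  private pending = new Map<string, string>();
  private inject: Inject;

  constructor(inject: Inject) {
    this.inject = inject;
  }

  // Call when a message arrives in the channel.
  remember(channelId: string, message: string): void {
    this.pending.set(channelId, message);
  }

  // Call when a speaker actually replied; nothing is left to replay.
  clear(channelId: string): void {
    this.pending.delete(channelId);
  }

  // Call after NO_REPLY, once the turn has advanced to nextAgent.
  // The message stays queued so a chain of deferrals can keep replaying it.
  replayTo(channelId: string, nextAgent: string): boolean {
    const message = this.pending.get(channelId);
    if (message === undefined) return false;
    this.inject(nextAgent, message);
    return true;
  }
}
```

Keeping the message queued until `clear` is called supports the chain case (A defers, B defers, C speaks). A caller would still need to stop after one full cycle so that a channel where every agent defers does not loop forever.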

### 4. Gateway-Level Support (OpenClaw Core Change)

Add a plugin hook return value like `{ defer: true }` in `before_model_resolve` that tells OpenClaw: "don't process this message yet, but keep it pending." When the turn advances, the plugin could call `api.retrigger(sessionKey)` to replay the pending message.

**Challenges:**
- Requires changes to OpenClaw core, not just the plugin
- Needs design discussion with the OpenClaw team
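
To make the proposal concrete, here is a sketch of how a `defer` result and a `retrigger` call might interact. Neither exists in OpenClaw today; `FakeGateway`, `ResolveResult`, and the `model` field are invented for illustration only.

```typescript
// Entirely hypothetical: neither `defer` nor `retrigger` exists in
// OpenClaw today. FakeGateway only illustrates the proposed contract.
type ResolveResult = { defer: true } | { defer: false; model: string };
type ResolveHook = (sessionKey: string) => ResolveResult;

class FakeGateway {
  // sessionKey -> messages held back by a `defer` result
  private pending = new Map<string, string[]>();
  private hook: ResolveHook;
  readonly processed: Array<{ sessionKey: string; message: string }> = [];

  constructor(hook: ResolveHook) {
    this.hook = hook;
  }

  // Route one message to one session, consulting the plugin hook first.
  receive(sessionKey: string, message: string): void {
    const result = this.hook(sessionKey);
    if (result.defer) {
      const held = this.pending.get(sessionKey) ?? [];
      held.push(message);
      this.pending.set(sessionKey, held);
    } else {
      this.processed.push({ sessionKey, message });
    }
  }

  // Re-run the hook for every held message; returns how many were
  // re-evaluated. Messages that are deferred again go back to pending.
  retrigger(sessionKey: string): number {
    const held = this.pending.get(sessionKey) ?? [];
    this.pending.delete(sessionKey);
    for (const message of held) this.receive(sessionKey, message);
    return held.length;
  }
}
```

The design choice here is that `retrigger` re-runs the hook instead of bypassing it, so the plugin's turn state remains the single source of truth about who may speak.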

### 5. Bot-to-Bot Handoff via Discord Message

When the current speaker returns NO_REPLY, have **that bot** send a brief handoff message in the channel, e.g., "(next speaker's turn)", or add a reaction. This real Discord message triggers all agents, and the turn manager ensures only the next speaker responds.

**Challenges:**
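- Adds visible noise to the channel (a convention such as a specific emoji reaction could reduce it)
- A bot that produced NO_REPLY sends nothing through its normal reply path, so it cannot post the handoff itself
- A possible workaround is to send the handoff through the discord-control-api, as a different bot if necessary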

### 6. Timer-Based Retry (Pragmatic)

After advancing the turn, set a short timer (e.g., 2-3 seconds). If no new message has arrived, send a minimal trigger. This could be an internal "nudge" if the SDK supports it.

**Challenges:**
- Timing is fragile
- Still needs a mechanism to trigger the next agent
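
The timing logic on its own can be sketched as follows. The scheduler and the nudge callback are injected because the actual trigger mechanism is still unknown; `NudgeTimer`, `Schedule`, and the 2500 ms default are assumptions.

```typescript
// Hypothetical sketch: the scheduler and the nudge callback are injected,
// since the real trigger mechanism is still an open question.
type Schedule = (fn: () => void, delayMs: number) => void;

class NudgeTimer {
  // Bumped on every channel event; a timer fires only if it is unchanged.
  private generation = 0;
  private schedule: Schedule;
  private nudge: (agentId: string) => void;
  private delayMs: number;

  constructor(
    schedule: Schedule,
    nudge: (agentId: string) => void,
    delayMs = 2500,
  ) {
    this.schedule = schedule;
    this.nudge = nudge;
    this.delayMs = delayMs;
  }

  // Any real message in the channel invalidates outstanding timers.
  onMessage(): void {
    this.generation++;
  }

  // Call after NO_REPLY has advanced the turn to nextAgent.
  onTurnAdvanced(nextAgent: string): void {
    const armedAt = ++this.generation;
    this.schedule(() => {
      if (this.generation === armedAt) this.nudge(nextAgent);
    }, this.delayMs);
  }
}
```

In a real plugin the scheduler would presumably wrap `setTimeout`. Injecting it keeps the cancel-on-new-message logic testable without waiting on real timers.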

## Recommendation

**Solution 5 (Bot-to-Bot Handoff)** is the most pragmatic with current constraints. The implementation would be:

1. In the `message_sent` hook, detect NO_REPLY and advance the turn
2. Use the discord-control-api to send a short marker message (e.g., `[rotate]` or a specific emoji) to the channel from the **next speaker's bot account**
3. This real Discord message triggers OpenClaw to route it to all agents
4. The turn manager allows only the (now-current) next speaker to respond
5. The next speaker sees the original conversation context in their session history and responds appropriately

**Downside:** Adds a visible marker message to the channel. This could be mitigated by deleting the message immediately after delivery, or by using a reaction instead of a message.
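
The steps above could be wired together roughly as follows. This is a sketch under stated assumptions: `HandoffTransport` is a hypothetical stand-in for whatever the discord-control-api actually exposes, the marker text is arbitrary, and the function covers only the NO_REPLY path.

```typescript
// Hypothetical stand-in for whatever the discord-control-api exposes.
interface HandoffTransport {
  // Sends `content` to the channel as `botId`; returns the new message id.
  send(botId: string, channelId: string, content: string): string;
  remove(channelId: string, messageId: string): void;
}

const HANDOFF_MARKER = "[rotate]"; // arbitrary marker text

// Returns the index of the new current speaker, or null when no handoff
// is needed. A spoken reply advances the turn elsewhere and needs no wakeup.
function handoffAfterNoReply(
  reply: string,
  channelId: string,
  speakers: string[],
  currentIndex: number,
  transport: HandoffTransport,
  deleteAfterSend = false,
): number | null {
  if (reply.trim() !== "NO_REPLY" || speakers.length < 2) return null;

  const nextIndex = (currentIndex + 1) % speakers.length;
  // Send the marker from the next speaker's bot account.
  const messageId = transport.send(
    speakers[nextIndex],
    channelId,
    HANDOFF_MARKER,
  );
  // Assumption: removing the marker right away does not prevent it from
  // being routed to the agents first. This would need to be verified.
  if (deleteAfterSend) transport.remove(channelId, messageId);
  return nextIndex;
}
```

Whether OpenClaw routes a message sent from a bot's own account back to that bot's session is not established here and would need to be checked before relying on this flow.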

## Open Questions

1. Does the OpenClaw plugin SDK support injecting messages into sessions?
2. Can plugins access the Discord client to send messages directly?
3. Would an OpenClaw core `defer`/`retrigger` mechanism be feasible?
4. Is visible channel noise acceptable for the handoff message?