Dirigent/docs/TURN-WAKEUP-PROBLEM.md
zhi 476308d0df refactor: auto-managed turn order + dormant state + identity injection
Turn system redesign:
- Turn order auto-populated from config bindings (all bot accounts)
- No manual turnOrder config needed
- Humans (humanList) excluded from turn order automatically
- Dormant state: when all agents NO_REPLY in a cycle, currentSpeaker=null
- Reactivation: any new message wakes the system
  - Human message → start from first in order
  - Bot not in order → start from first
  - Bot in order → next after sender
- Skip already-NO_REPLY'd agents when advancing

Identity injection:
- Group chat prompts now include agent identity
- Format: '你是 {name}(Discord 账号: {accountId})' (in English: "You are {name} (Discord account: {accountId})")

Other:
- Remove turnOrder from ChannelPolicy (no longer configurable)
- Add TURN-WAKEUP-PROBLEM.md documenting the NO_REPLY wake-up challenge
- Update message_received to call onNewMessage with proper human detection
- Update message_sent to call onSpeakerDone with NO_REPLY tracking
2026-02-28 12:10:52 +00:00

Turn-Based Speaking: Wakeup Problem

Context

WhisperGate implements turn-based speaking for Discord group channels where multiple AI agents coexist. Only one agent (the "current speaker") is allowed to respond at a time. Others are silenced via a no-reply model override.

The Problem

When the current speaker responds with NO_REPLY (that is, it decides the message is not relevant to it), the turn advances to the next agent. However, nothing then triggers the next agent to start speaking.

Why This Happens

  1. A message arrives in the Discord channel
  2. OpenClaw routes it to all agent sessions in that channel simultaneously
  3. The WhisperGate plugin intercepts at before_model_resolve:
    • Current speaker → allowed to process
    • Everyone else → forced to no-reply model (message is "consumed" silently)
  4. Current speaker processes the message and returns NO_REPLY
  5. message_sent hook detects NO_REPLY → turn advances to next agent
  6. But the next agent already "consumed" the message in step 3 — their session processed it (as no-reply) and moved on
  7. No new message exists to trigger the next agent
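
The sequence above can be reproduced in a few lines. This is a minimal simulation rather than the plugin's code: `TurnState`, `routeToAll`, and `advanceTurn` are hypothetical names, and the before_model_resolve and message_sent hooks are modeled as plain functions.

```typescript
// Hypothetical model of the turn gate; names are illustrative, not the plugin's API.
type AgentId = string;
type Outcome = "processed" | "consumed";

interface TurnState {
  order: AgentId[];        // turn order (bot accounts only)
  current: AgentId | null; // null = dormant
}

// Models steps 2 and 3: the message reaches every agent at once. Only the
// current speaker processes it; everyone else consumes it silently.
function routeToAll(state: TurnState, _message: string): Map<AgentId, Outcome> {
  const outcome = new Map<AgentId, Outcome>();
  for (const agent of state.order) {
    outcome.set(agent, agent === state.current ? "processed" : "consumed");
  }
  return outcome;
}

// Models step 5: after NO_REPLY the turn moves to the next agent in order.
function advanceTurn(state: TurnState): void {
  if (state.current === null) return;
  const i = state.order.indexOf(state.current);
  state.current = state.order[(i + 1) % state.order.length];
}

const state: TurnState = { order: ["agentA", "agentB", "agentC"], current: "agentA" };
const outcome = routeToAll(state, "Can someone review this?");
advanceTurn(state); // agentA answered NO_REPLY

// agentB now holds the turn, but its copy of the message was already
// consumed in routeToAll, so nothing is left to trigger it.
const stuck = state.current === "agentB" && outcome.get("agentB") === "consumed";
```

In this model `stuck` ends up true, which is the situation described in steps 6 and 7.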

The Result

After a NO_REPLY, the next speaker sits idle until a new message arrives in the channel (from a human or another source). The original message that should have been passed to the next speaker is lost.

When This Matters

  • Single-round conversation: Human asks a question → Agent A says NO_REPLY → Agent B should answer but can't
  • Chain conversations: Agent A defers → Agent B defers → Agent C should speak but never gets triggered

When This Doesn't Matter

  • End-symbol responses: When an agent actually speaks (ends with 🔚), the turn advances and the next agent will respond to the next message. This is fine.
  • Human-driven channels: If humans keep sending messages, the dormant state resolves quickly.
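
The dormant state mentioned above follows the wake-up rules from the turn system redesign: a human message restarts from the first agent in order, a bot outside the order also restarts from the first, and a bot inside the order hands over to the agent after it. A minimal sketch of those rules follows. The `Turn` shape and the `declined` set are assumptions; only the hook names `onNewMessage` and `onSpeakerDone` come from the commit notes, and their real signatures may differ.

```typescript
// Sketch of the dormant/reactivation rules; the data shape is an assumption.
interface Turn {
  order: string[];        // bot accounts in turn order
  current: string | null; // null = dormant
  declined: Set<string>;  // agents that answered NO_REPLY in this cycle
}

// Any new message starts a fresh cycle and wakes a dormant channel.
// Assumption: an active channel keeps its current speaker.
function onNewMessage(turn: Turn, senderId: string, isHuman: boolean): void {
  turn.declined.clear();
  if (turn.current !== null || turn.order.length === 0) return;
  const i = turn.order.indexOf(senderId);
  turn.current =
    isHuman || i === -1
      ? turn.order[0]                            // human, or bot not in order
      : turn.order[(i + 1) % turn.order.length]; // bot in order: next after sender
}

// Called when the current speaker finishes, with or without NO_REPLY.
function onSpeakerDone(turn: Turn, agentId: string, wasNoReply: boolean): void {
  if (agentId !== turn.current) return;
  if (wasNoReply) turn.declined.add(agentId);
  else turn.declined.clear();
  const n = turn.order.length;
  const start = turn.order.indexOf(agentId);
  // Skip agents that already answered NO_REPLY in this cycle.
  for (let step = 1; step <= n; step++) {
    const candidate = turn.order[(start + step) % n];
    if (!turn.declined.has(candidate)) {
      turn.current = candidate;
      return;
    }
  }
  turn.current = null; // everyone declined: go dormant
}
```

Note that this bookkeeping only decides who may speak. It does not deliver the original message to the new speaker, which is the gap this document is about.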

Possible Solutions

1. Synthetic Trigger Message (Plugin-Side)

After detecting NO_REPLY and advancing the turn, the plugin sends a synthetic message to the channel that triggers the next agent.

Challenges:

  • The plugin SDK (message_sent hook) doesn't have an API to inject messages into agent sessions
  • Sending a real Discord message (even invisible like zero-width space) creates noise and may confuse other agents
  • The synthetic message wouldn't contain the original user's context

2. Deferred Evaluation (Don't Block in before_model_resolve)

Instead of blocking non-speakers at before_model_resolve, let all agents receive the message but inject a "you are not the current speaker, reply NO_REPLY" instruction. The current speaker gets a normal prompt.

After the current speaker responds with NO_REPLY, the plugin would need to re-trigger the next agent's session with the same message.

Challenges:

  • All agents still consume tokens for the NO_REPLY evaluation
  • Re-triggering a session with an already-processed message requires OpenClaw internal APIs

3. Queue + Replay (Plugin-Side State)

The plugin stores the original message when it arrives. After NO_REPLY, it replays the message by injecting it into the next speaker's session.

Challenges:

  • Requires access to session injection API (not available in current plugin SDK)
  • Managing the message queue adds complexity
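
The plugin-side bookkeeping for this option is small. The sketch below keeps the latest unanswered message per channel and replays it on demand. The `inject` callback stands in for the session injection API, which, as noted above, the current plugin SDK does not provide; `ReplayQueue` and its method names are hypothetical.

```typescript
// Sketch only: `inject` is a placeholder for a session injection API
// that is not available in the current plugin SDK.
type Inject = (agentId: string, text: string) => void;

class ReplayQueue {
  // Latest unanswered message per channel.
  private pending = new Map<string, string>();

  // Call when a message arrives in a channel.
  remember(channelId: string, text: string): void {
    this.pending.set(channelId, text);
  }

  // Call when an agent actually speaks: the message has been answered.
  clear(channelId: string): void {
    this.pending.delete(channelId);
  }

  // Call after NO_REPLY, once the turn has advanced to `nextAgent`.
  // Returns true if a stored message was replayed.
  replay(channelId: string, nextAgent: string, inject: Inject): boolean {
    const text = this.pending.get(channelId);
    if (text === undefined) return false;
    inject(nextAgent, text);
    return true;
  }
}
```

Keeping only the latest message per channel is a simplification. A real queue would need to decide what happens when several messages arrive before anyone answers.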

4. Gateway-Level Support (OpenClaw Core Change)

Add a plugin hook return value like { defer: true } in before_model_resolve that tells OpenClaw: "don't process this message yet, but keep it pending." When the turn advances, the plugin could call api.retrigger(sessionKey) to replay the pending message.

Challenges:

  • Requires changes to OpenClaw core, not just the plugin
  • Needs design discussion with the OpenClaw team
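
To make the proposal concrete for that discussion, the contract could look roughly like this. Everything here is hypothetical: neither the `{ defer: true }` return value nor `api.retrigger` exists in OpenClaw today, and `PendingStore` only models what the core would have to keep track of.

```typescript
// Hypothetical shapes for the proposed hook contract.
type ResolveResult = { defer: true } | { defer?: false; model?: string };

class PendingStore {
  // Messages held back per session until the plugin asks for a retrigger.
  private pending = new Map<string, string[]>();

  // Core side: called with the result of before_model_resolve.
  // Returns true if the message should be processed right away.
  handle(sessionKey: string, message: string, result: ResolveResult): boolean {
    if (result.defer) {
      const held = this.pending.get(sessionKey) ?? [];
      held.push(message);
      this.pending.set(sessionKey, held);
      return false;
    }
    return true;
  }

  // Plugin side: the proposed api.retrigger(sessionKey) would hand back the
  // held messages so the core can process them as if they had just arrived.
  retrigger(sessionKey: string): string[] {
    const held = this.pending.get(sessionKey) ?? [];
    this.pending.delete(sessionKey);
    return held;
  }
}
```

The open design question is mostly on the core side: how long a deferred message may stay pending and what happens to it if the turn never reaches that session.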

5. Bot-to-Bot Handoff via Discord Message

When the current speaker answers NO_REPLY, have that bot send a brief handoff message in the channel, e.g., "(轮到下一位)" ("next speaker's turn"), or add a reaction. This real Discord message triggers all agents, and the turn manager ensures only the next speaker responds.

Challenges:

  • Adds visible noise to the channel (a convention such as a specific emoji reaction could reduce it)
  • The bot that answered NO_REPLY cannot send the handoff itself because it was silenced; a possible workaround is to send it as a different bot through the discord-control-api

6. Timer-Based Retry (Pragmatic)

After advancing the turn, set a short timer (e.g., 2-3 seconds). If no new message has arrived, send a minimal trigger. This could be an internal "nudge" if the SDK supports it.

Challenges:

  • Timing is fragile
  • Still needs a mechanism to trigger the next agent
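
The timer part of this option can be sketched as follows. The `nudge` callback is a placeholder for whatever trigger mechanism ends up being available, and `NudgeTimer`, `Schedule`, and the 2500 ms default are illustrative assumptions. The scheduler is injectable so the logic can be exercised without real delays.

```typescript
// Sketch only: `nudge` is a placeholder for the actual trigger mechanism.
type Cancel = () => void;
type Schedule = (fn: () => void, ms: number) => Cancel;

// Default scheduler built on setTimeout.
const realSchedule: Schedule = (fn, ms) => {
  const handle = setTimeout(fn, ms);
  return () => clearTimeout(handle);
};

class NudgeTimer {
  private cancel: Cancel | null = null;
  private nudge: (agentId: string) => void;
  private delayMs: number;
  private schedule: Schedule;

  constructor(nudge: (agentId: string) => void, delayMs = 2500, schedule: Schedule = realSchedule) {
    this.nudge = nudge;
    this.delayMs = delayMs;
    this.schedule = schedule;
  }

  // Call after NO_REPLY has advanced the turn to `nextAgent`.
  arm(nextAgent: string): void {
    this.disarm(); // at most one pending nudge at a time
    this.cancel = this.schedule(() => {
      this.cancel = null;
      this.nudge(nextAgent);
    }, this.delayMs);
  }

  // Call when any new message arrives: the channel woke up on its own.
  disarm(): void {
    if (this.cancel) {
      this.cancel();
      this.cancel = null;
    }
  }
}
```

Calling `disarm()` on every incoming message keeps the nudge from firing when a human has already restarted the conversation, which addresses part of the timing fragility but not the missing trigger itself.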

Recommendation

Solution 5 (Bot-to-Bot Handoff) is the most pragmatic with current constraints. The implementation would be:

  1. The message_sent hook detects NO_REPLY and advances the turn
  2. The plugin uses the discord-control-api to send a short message (e.g., [轮转], meaning "rotate", or a specific emoji) from the next speaker's bot account in the channel
  3. This real Discord message triggers OpenClaw to route it to all agents
  4. The turn manager allows only the (now-current) next speaker to respond
  5. The next speaker sees the original conversation context in their session history and responds appropriately

Downside: this adds a visible "[轮转]" message. It could be mitigated by deleting the message immediately after delivery, or by using a reaction instead of a message.
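
A rough shape for the handoff step is shown below. The `ControlApi` interface is an assumption made for illustration: the real discord-control-api surface is not described here and may look different, and `handOff` and `hideMarker` are hypothetical names.

```typescript
// Hypothetical client interface; the real discord-control-api may differ.
interface ControlApi {
  // Sends `text` as the given bot account and resolves to the message id.
  sendAs(accountId: string, channelId: string, text: string): Promise<string>;
  deleteMessage(channelId: string, messageId: string): Promise<void>;
}

const HANDOFF_MARKER = "[轮转]";

// Call from the message_sent hook after NO_REPLY has advanced the turn.
// `nextAccountId` is null when every agent declined and the channel is dormant.
async function handOff(
  api: ControlApi,
  channelId: string,
  nextAccountId: string | null,
  hideMarker = false,
): Promise<boolean> {
  if (nextAccountId === null) return false; // nobody left to hand off to
  const messageId = await api.sendAs(nextAccountId, channelId, HANDOFF_MARKER);
  if (hideMarker) {
    // Optional mitigation for the visible noise. Whether deleting this early
    // still lets the message reach every agent is untested.
    await api.deleteMessage(channelId, messageId);
  }
  return true;
}
```

The `null` check matters because of the dormant state: once all agents have answered NO_REPLY, sending another handoff would only restart the cycle.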

Open Questions

  1. Does the OpenClaw plugin SDK support injecting messages into sessions?
  2. Can plugins access the Discord client to send messages directly?
  3. Would an OpenClaw core defer/retrigger mechanism be feasible?
  4. Is visible channel noise acceptable for the handoff message?