fix: channelId extraction, sender identification, and per-channel turn order #9

Merged
hzhang merged 18 commits from fix/turn-gate-and-handoff into main 2026-03-01 11:12:18 +00:00
Collaborator

Fixes moderator bot silence and turn deadlocks. See commit message for details.

Fixes moderator bot silence and turn deadlocks. See commit message for details.
zhi added 18 commits 2026-03-01 11:09:03 +00:00
1. Rename tool: whispergateway_tools → whispergate_tools

2. Turn-based speaking mechanism:
   - New turn-manager.ts maintains per-channel turn state
   - ChannelPolicy新增turnOrder字段配置发言顺序
   - before_model_resolve hook检查当前agent是否为发言人
   - 非当前发言人直接切换到no-reply模型
   - message_sent hook检测结束符或NO_REPLY时推进turn
   - message_received检测到human消息时重置turn

3. 注入提示词增强:
   - buildEndMarkerInstruction增加isGroupChat参数
   - 群聊时追加规则:与自己无关时主动回复NO_REPLY

4. Slash command支持:
   - /whispergate status - 查看频道策略
   - /whispergate turn-status - 查看轮流状态
   - /whispergate turn-advance - 手动推进轮流
   - /whispergate turn-reset - 重置轮流顺序
Turn system redesign:
- Turn order auto-populated from config bindings (all bot accounts)
- No manual turnOrder config needed
- Humans (humanList) excluded from turn order automatically
- Dormant state: when all agents NO_REPLY in a cycle, currentSpeaker=null
- Reactivation: any new message wakes the system
  - Human message → start from first in order
  - Bot not in order → start from first
  - Bot in order → next after sender
- Skip already-NO_REPLY'd agents when advancing

Identity injection:
- Group chat prompts now include agent identity
- Format: '你是 {name}(Discord 账号: {accountId})'

Other:
- Remove turnOrder from ChannelPolicy (no longer configurable)
- Add TURN-WAKEUP-PROBLEM.md documenting the NO_REPLY wake-up challenge
- Update message_received to call onNewMessage with proper human detection
- Update message_sent to call onSpeakerDone with NO_REPLY tracking
Add a dedicated moderator Discord bot that sends handoff messages when
the current speaker says NO_REPLY. This solves the wakeup problem.

Flow:
1. Agent A is current speaker, receives message
2. Agent A responds with NO_REPLY
3. Plugin detects NO_REPLY in message_sent hook, advances turn to Agent B
4. Plugin sends via moderator bot: '轮到(@AgentB)了,如果没有想说的请直接回复NO_REPLY'
5. This real Discord message triggers Agent B's session
6. Turn manager allows Agent B to respond

Implementation:
- moderatorBotToken config field for the moderator bot's Discord token
- userIdFromToken() extracts Discord user ID from bot token (base64)
- resolveDiscordUserId() maps accountId → Discord user ID via account tokens
- sendModeratorMessage() calls Discord REST API directly
- message_received ignores moderator bot messages (transparent to turn state)
- Moderator bot is NOT in the turn order
The plugin's openclaw.plugin.json has additionalProperties:false,
so any config field not in the schema causes gateway startup failure.
The packaging script didn't copy turn-manager.ts to dist, causing
'Cannot find module ./turn-manager.js' at gateway startup.
Use Node.js built-in WebSocket to maintain a minimal Discord Gateway
connection for the moderator bot, keeping it 'online' with a
'Watching Moderating' status. Handles heartbeat, reconnect, and resume.

Also fix package-plugin.mjs to include moderator-presence.ts in dist.
Turn order should be enforced for ALL messages, not just non-human ones.
Previously, human messages bypassed turn check because they go through
human_list_sender path with shouldUseNoReply=false. Now turn check
always runs when channel has turn state.
Root causes:
1. Multiple plugin subsystems each called startModeratorPresence,
   creating competing WebSocket connections to the same bot token.
   Discord only allows one connection per bot → 4008 rate limit →
   infinite reconnect loop (1000+ connects → token reset by Discord)

2. Invalid session (op 9) handler called scheduleReconnect, but the
   new connection would also get kicked → cascading reconnects

Fixes:
- Singleton guard: startModeratorPresence is a no-op if already started
- cleanup() nullifies old ws handlers before creating new connection
- Stale ws check: all callbacks verify they belong to current ws
- Exponential backoff with cap (max 60s) instead of fixed 2-5s delay
- heartbeat ACK tracking: detect zombie connections
- Non-recoverable codes (4004) properly stop all reconnection
- Change before_prompt_build hook to return systemPrompt instead of prependContext
- This ensures the end marker instruction is injected once per session as a system prompt, not repeatedly prepended to each user message
- Add moderatorBotToken to CONFIG.example.json for documentation
1. Turn check improvements:
   - Add debug logs for ctx.agentId, resolved accountId, turnOrder length
   - Fallback to ctx.accountId if resolveAccountId fails
   - Add resolveDiscordUserId debug logs for handoff troubleshooting

2. One-time prompt injection:
   - Add sessionInjected Set to track injected sessions
   - Use prependContext (not systemPrompt) but only inject once per session
   - Skip subsequent injections with debug log
This reverts commit 6a81f75fd0, reversing
changes made to 86fdc63802.
1. Fix channelId priority: use ctx.channelId > conv.chat_id > conv.channel_id
2. Add sessionAllowed state: track if session was allowed (true) or forced no-reply (false)
3. Add sessionInjected Set: implement one-time prependContext injection
4. Add before_message_write hook: detect NO_REPLY and handle turn advancement
   - forced no-reply: don't advance turn
   - allowed + NO_REPLY: advance turn + handoff
   - dormant state: don't trigger handoff
5. message_sent: continues to handle real messages with end symbols
- Remove async/await from before_message_write hook
- Use fire-and-forget for sendModeratorMessage (void ... .catch())
- OpenClaw's before_message_write is synchronous, not async
- Add sessionChannelId Map to track sessionKey -> channelId
- Save channelId in before_model_resolve when we have derived.channelId
- Fix message_sent to use sessionChannelId fallback when ctx.channelId is undefined
- Add debug logging to message_sent
- Add sessionAccountId Map to track sessionKey -> accountId
- Save accountId in before_model_resolve when resolving accountId
- Use sessionChannelId/sessionAccountId fallback in before_message_write
- Fix channelId extraction: ctx.channelId is platform name ('discord'), not
  the Discord channel snowflake. Now extracts from conversation_label field
  ('channel id:123456') and sessionKey fallback (':channel:123456').

- Fix extractDiscordChannelId: support 'discord:channel:xxx' format in
  addition to 'channel:xxx' for conversationId/event.to fields.

- Fix sender identification in message_received: event.from returns channel
  target, not sender ID. Now uses event.metadata.senderId for humanList
  matching so human messages correctly reset turn order.

- Fix per-channel turn order: was using all server-wide bot accounts from
  bindings, causing deadlock when turn landed on bots not in the channel.
  Now dynamically tracks which bot accounts are seen per channel via
  message_received and only includes those in turn order.

- Always save sessionChannelId/sessionAccountId mappings in before_model_resolve
  regardless of turn check result, so downstream hooks can use them.

- Add comprehensive debug logging to message_sent hook.
hzhang merged commit aeecb534b7 into main 2026-03-01 11:12:18 +00:00
Sign in to join this conversation.
No Reviewers
No Label
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: nav/Dirigent#9