Dirigent/DESIGN.md
hzhang b5196e972c feat: rewrite plugin as v2 with globalThis-based turn management
Complete rewrite of the Dirigent plugin turn management system to work
correctly with OpenClaw's VM-context-per-session architecture:

- All turn state stored on globalThis (persists across VM context hot-reloads)
- Hooks registered unconditionally on every api instance; event-level dedup
  (runId Set for agent_end, WeakSet for before_model_resolve) prevents
  double-processing
- Gateway lifecycle events (gateway_start/stop) guarded once via globalThis flag
- Shared initializingChannels lock prevents concurrent channel init across VM
  contexts in message_received and before_model_resolve
- New ChannelStore and IdentityRegistry replace old policy/session-state modules
- Added agent_end hook with tail-match polling for Discord delivery confirmation
- Added web control page, padded-cell auto-scan, discussion tool support
- Removed obsolete v1 modules: channel-resolver, channel-modes, discussion-service,
  session-state, turn-bootstrap, policy/store, rules, decision-input

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 22:41:25 +01:00


# Dirigent — Design Spec (v2)
## Overview
Dirigent is an OpenClaw plugin that orchestrates turn-based multi-agent conversations in Discord. It manages who speaks when, prevents out-of-turn responses, and coordinates structured discussions between agents.
**Optional integrations** (Dirigent must function fully without either):
- **padded-cell** — enables auto-registration of agent identities from `ego.json`
- **yonexus** — enables cross-instance multi-agent coordination (see §8)
---
## 1. Identity Registry
### Storage
A JSON file (path configurable via plugin config, default `~/.openclaw/dirigent-identity.json`).
Each entry:
```json
{
  "discordUserId": "123456789012345678",
  "agentId": "home-developer",
  "agentName": "Developer"
}
```
### Registration Methods
#### Manual — Tool
Agents call `dirigent-register` to add or update their own entry. `agentId` is auto-derived from the calling session; the agent only provides `discordUserId` and optionally `agentName`.
#### Manual — Control Page
The `/dirigent` control page exposes a table with inline add, edit, and delete.
#### Auto — padded-cell Integration
On gateway startup, if padded-cell is loaded, Dirigent reads `~/.openclaw/ego.json`.
**Detection**: check whether `ego.json`'s `columns` array contains `"discord-id"`. If not, treat padded-cell as absent and skip auto-registration entirely.
**ego.json structure** (padded-cell's `EgoData` format):
```json
{
  "columns": ["discord-id", "..."],
  "publicColumns": ["..."],
  "publicScope": {},
  "agentScope": {
    "home-developer": { "discord-id": "123456789012345678" },
    "home-researcher": { "discord-id": "987654321098765432" }
  }
}
```
**Scan logic**:
1. If `columns` does not include `"discord-id"`: skip entirely.
2. For each key in `agentScope`: key is the `agentId`.
3. Read `agentScope[agentId]["discord-id"]`. If present and non-empty: upsert into identity registry (existing entries preserved, new ones appended).
4. Agent name defaults to `agentId` if no dedicated name column exists.
The control page shows a **Re-scan padded-cell** button when padded-cell is detected.
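The scan logic above can be sketched as a pure function. This is a minimal illustration, not the plugin's actual code: the names `IdentityEntry`, `EgoData`, and `scanPaddedCell` are hypothetical, and the reading of "upsert" (an existing entry keeps its position and name, only its Discord ID is refreshed) is an assumption.

```typescript
// Shapes follow the JSON examples in this section; all names are hypothetical.
interface IdentityEntry {
  discordUserId: string;
  agentId: string;
  agentName: string;
}

interface EgoData {
  columns: string[];
  agentScope: { [agentId: string]: { [column: string]: string | undefined } };
}

// Returns a new registry array; the input array is not mutated.
function scanPaddedCell(ego: EgoData, registry: IdentityEntry[]): IdentityEntry[] {
  // Step 1: without a "discord-id" column, padded-cell is treated as absent.
  if (ego.columns.indexOf("discord-id") === -1) return registry.slice();

  const result = registry.slice();
  for (const agentId of Object.keys(ego.agentScope)) {
    // Steps 2-3: the key is the agentId; skip missing or blank IDs.
    const scope: { [column: string]: string | undefined } = ego.agentScope[agentId] || {};
    const discordUserId = (scope["discord-id"] || "").trim();
    if (discordUserId === "") continue;

    const existing = result.filter((e) => e.agentId === agentId)[0];
    if (existing) {
      // Assumption: an existing entry keeps its place and name; only the ID is refreshed.
      result[result.indexOf(existing)] = { discordUserId, agentId, agentName: existing.agentName };
    } else {
      // Step 4: with no dedicated name column, the name defaults to the agentId.
      result.push({ discordUserId, agentId, agentName: agentId });
    }
  }
  return result;
}
```

Returning a fresh array keeps the scan easy to call from both gateway startup and the re-scan button without sharing mutable state.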
---
## 2. Channel Modes
**Default**: any channel Dirigent has not seen before is treated as `none`.
| Mode | Description | How to set |
|------|-------------|------------|
| `none` | No special behavior. Turn-manager disabled. | Default · `/set-channel-mode none` · control page |
| `work` | Agent workspace channel. Turn-manager disabled. | `create-work-channel` tool only |
| `report` | Agents post via message tool only; not woken by incoming messages. | `create-report-channel` tool · `/set-channel-mode report` · control page |
| `discussion` | Structured agent discussion. | `create-discussion-channel` tool only |
| `chat` | Ongoing multi-agent chat. | `create-chat-channel` tool · `/set-channel-mode chat` · control page |
**Mode-change restrictions**:
- `work` and `discussion` are locked: they can only be set at channel creation by their respective tools. A channel in either mode cannot be switched to another mode, and no existing channel can be switched into either mode.
- `none`, `chat`, and `report` are freely switchable via `/set-channel-mode` or the control page.
### Mode → Turn-Manager State
| Mode | Agent Count / Status | Turn-Manager State |
|------|-------------|-------------------|
| `none` | any | `disabled` |
| `work` | any | `disabled` |
| `report` | any | `dead` |
| `discussion` | 1 | `disabled` |
| `discussion` | 2 | `normal` |
| `discussion` | 3+ | `shuffle` |
| `discussion` | concluded | `archived` |
| `chat` | 1 | `disabled` |
| `chat` | 2 | `normal` |
| `chat` | 3+ | `shuffle` |
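The table above maps directly onto a small lookup function. The sketch below is illustrative only: the function name and the `concluded` flag are hypothetical, and treating zero agents the same as one agent is an assumption, since the table starts at 1.

```typescript
type ChannelMode = "none" | "work" | "report" | "discussion" | "chat";
type TurnManagerState = "disabled" | "dead" | "normal" | "shuffle" | "archived";

// Hypothetical helper mirroring the Mode → Turn-Manager State table.
function turnManagerStateFor(
  mode: ChannelMode,
  agentCount: number,
  concluded: boolean = false
): TurnManagerState {
  if (mode === "none" || mode === "work") return "disabled";
  if (mode === "report") return "dead";
  // Only discussions can be concluded; a concluded discussion is sealed.
  if (mode === "discussion" && concluded) return "archived";
  // Assumption: zero agents behaves like a single agent.
  if (agentCount <= 1) return "disabled";
  return agentCount === 2 ? "normal" : "shuffle";
}
```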
---
## 3. Channel Creation Tools & Slash Commands
### Tools
#### `create-chat-channel`
Creates a new Discord channel in the caller's guild and sets its mode to `chat`.
| Parameter | Description |
|-----------|-------------|
| `name` | Channel name |
| `participants` | Discord user IDs to add (optional; moderator bot always added) |
#### `create-report-channel`
Creates a new Discord channel and sets its mode to `report`.
| Parameter | Description |
|-----------|-------------|
| `name` | Channel name |
| `members` | Discord user IDs to add (optional) |
#### `create-work-channel`
Creates a new Discord channel and sets its mode to `work`. Mode is permanently locked.
| Parameter | Description |
|-----------|-------------|
| `name` | Channel name |
| `members` | Additional Discord user IDs to add (optional) |
#### `create-discussion-channel`
See §5 for full details.
#### `dirigent-register`
Registers or updates the calling agent's identity entry.
| Parameter | Description |
|-----------|-------------|
| `discordUserId` | The agent's Discord user ID |
| `agentName` | Display name (optional; defaults to agentId) |
### Slash Command — `/set-channel-mode`
Available in any Discord channel where the moderator bot is present.
```
/set-channel-mode <mode>
```
- Allowed values: `none`, `chat`, `report`
- Rejected with error: `work`, `discussion` (locked to creation tools)
- If the channel is currently `work` or `discussion`: the command is rejected because the mode is locked
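The three rules for `/set-channel-mode` can be checked in a single function. This is a sketch under stated assumptions: `checkSetChannelMode` is a hypothetical name, the error strings are invented for illustration, and returning `null` for success is just one possible convention.

```typescript
type ChannelMode = "none" | "work" | "report" | "discussion" | "chat";

const SWITCHABLE_MODES: string[] = ["none", "chat", "report"];
const LOCKED_MODES: string[] = ["work", "discussion"];

// Returns null when the change is allowed, otherwise an error message.
function checkSetChannelMode(current: ChannelMode, requested: string): string | null {
  // A channel created as work or discussion can never change mode.
  if (LOCKED_MODES.indexOf(current) !== -1) {
    return "This channel's mode is locked (" + current + ").";
  }
  // work and discussion can only be set by their creation tools.
  if (LOCKED_MODES.indexOf(requested) !== -1) {
    return "Mode '" + requested + "' can only be set by its creation tool.";
  }
  if (SWITCHABLE_MODES.indexOf(requested) === -1) {
    return "Unknown mode '" + requested + "'.";
  }
  return null;
}
```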
---
## 4. Turn-Manager
### Per-Channel States
| State | Behavior |
|-------|----------|
| `disabled` | All turn-manager logic bypassed. Agents respond normally. |
| `dead` | Discord messages are not routed to any agent session. |
| `normal` | Speaker list rotates in fixed order. |
| `shuffle` | When the last speaker in the list completes their turn (the end of a cycle), the list is reshuffled. Constraint: the previous last speaker cannot become the new first speaker. |
| `archived` | Channel is sealed. No agent is woken. New Discord messages receive a moderator auto-reply: "This channel is archived and no longer active." |
### Speaker List Construction
For `discussion` and `chat` channels:
1. Moderator bot fetches all Discord channel members via Discord API.
2. Each member's Discord user ID is resolved via the identity registry. Members identified as agents are added to the speaker list.
3. At each **cycle boundary** (after the last speaker in the list completes their turn), the list is rebuilt:
   - Re-fetch current Discord channel members.
   - In `normal` mode: existing members retain relative order; new agents are appended.
   - In `shuffle` mode: the rebuilt list is reshuffled, with the constraint above.
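The cycle-boundary rebuild can be sketched as two small functions over agent IDs. Both function names are hypothetical, agent IDs are assumed to be unique within a list, and the repair step (swap the first slot with a random later slot) is one possible way to honour the constraint, not necessarily the one the plugin uses.

```typescript
// Normal mode: surviving agents keep their relative order; newcomers go last.
function rebuildNormal(previous: string[], current: string[]): string[] {
  const kept = previous.filter((id) => current.indexOf(id) !== -1);
  const added = current.filter((id) => previous.indexOf(id) === -1);
  return kept.concat(added);
}

// Shuffle mode: Fisher-Yates shuffle, then repair the one forbidden outcome.
// `random` is injectable so the behaviour can be tested deterministically.
function reshuffle(
  list: string[],
  previousLast: string | null,
  random: () => number = Math.random
): string[] {
  const out = list.slice();
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = out[i];
    out[i] = out[j];
    out[j] = tmp;
  }
  if (out.length > 1 && out[0] === previousLast) {
    // The previous last speaker landed first: swap it with a random later slot.
    const k = 1 + Math.floor(random() * (out.length - 1));
    const tmp = out[0];
    out[0] = out[k];
    out[k] = tmp;
  }
  return out;
}
```

For example, `rebuildNormal(["a", "b", "c"], ["c", "d", "a"])` drops the departed `b`, keeps `a` before `c`, and appends the newcomer `d`.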
### Turn Flow
#### `before_model_resolve`
1. Determine the active speaker for this channel (from turn-manager state).
2. Record the current channel's latest Discord message ID as an **anchor** (used later for delivery confirmation).
3. If the current agent is the active speaker: allow through with their configured model.
4. If not: route to `dirigent/no-reply` — response is suppressed.
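The four steps above amount to a short gate function. The sketch below is an assumption-laden illustration: `ChannelTurnState`, its fields, and `onBeforeModelResolve` are hypothetical names, and the real hook signature in OpenClaw is not shown in this document.

```typescript
// Hypothetical per-channel state; field names are illustrative only.
interface ChannelTurnState {
  activeSpeaker: string | null; // agentId of the current speaker, or null while dormant
  anchor: string | null;        // latest Discord message ID seen before this turn
}

const NO_REPLY_MODEL = "dirigent/no-reply";

// Returns the model the agent should be routed to for this turn.
function onBeforeModelResolve(
  state: ChannelTurnState,
  agentId: string,
  latestMessageId: string,
  configuredModel: string
): string {
  // Step 2: remember where this turn starts, for delivery confirmation later.
  state.anchor = latestMessageId;
  // Steps 3-4: only the active speaker keeps their configured model.
  return agentId === state.activeSpeaker ? configuredModel : NO_REPLY_MODEL;
}
```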
#### `agent_end`
1. Check if the agent that finished is the active speaker. If not: ignore.
2. Extract the final reply text from `event.messages`: find the last message with `role === "assistant"`, then concatenate the `text` field from all `{type: "text"}` parts in its `content` array.
3. Classify the turn:
   - **Empty turn**: text is `NO_REPLY`, `NO`, or empty/whitespace-only.
   - **Real turn**: anything else.
4. Record the result for dormant tracking.
**If empty turn**: advance the speaker pointer immediately — no Discord delivery to wait for.
**If real turn**: wait for Discord delivery confirmation before advancing.
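Steps 2 and 3 can be illustrated with two pure helpers. The message shape below is inferred from the description of `event.messages` and is an assumption, as are the function names; trimming before comparison and case-sensitive matching of `NO_REPLY` and `NO` are also assumptions.

```typescript
// Assumed shape of entries in event.messages, inferred from the prose above.
interface ContentPart {
  type: string;
  text?: string;
}
interface AgentMessage {
  role: string;
  content: ContentPart[];
}

// Concatenate the text parts of the last assistant message.
function extractFinalText(messages: AgentMessage[]): string {
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role === "assistant") {
      return messages[i].content
        .filter((part) => part.type === "text")
        .map((part) => part.text || "")
        .join("");
    }
  }
  return ""; // no assistant message at all counts as empty
}

// An empty turn is NO_REPLY, NO, or nothing but whitespace.
function isEmptyTurn(text: string): boolean {
  const trimmed = text.trim();
  return trimmed === "" || trimmed === "NO_REPLY" || trimmed === "NO";
}
```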
### Delivery Confirmation (Real Turns)
`agent_end` fires when OpenClaw has dispatched the message, not when Discord has delivered it. OpenClaw also splits long messages into multiple Discord messages — the next agent must not be triggered before the last fragment arrives.
**Tail-match polling**:
1. Take the last 40 characters of the final reply text as a **tail fingerprint**.
2. Poll `GET /channels/{channelId}/messages?limit=20` at a short interval, filtering to messages where:
   - `message.id > anchor` (only messages from this turn onward)
   - `message.author.id === agentDiscordUserId` (only from this agent's Discord account)
3. Take the most recent matching message. If its content ends with the tail fingerprint: match confirmed.
4. On match: advance the speaker pointer, then post `{schedule_identifier}` and immediately delete it.
**Interruption**: if any message from a non-current-speaker appears in the channel during the wait, cancel the tail-match and treat the event as a wake-from-dormant (see below).
**Timeout**: if no match within 15 seconds (configurable), log a warning and advance anyway to prevent a permanently stalled turn.
**Fingerprint length**: 40 characters (configurable). The author + anchor filters make false matches negligible at this length.
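The matching step of one poll can be sketched as a pure function over the fetched messages; the HTTP call, polling interval, interruption check, and timeout are left out. Names such as `tailMatchConfirmed` are hypothetical, and trimming both sides before comparing is an assumption. Message IDs are compared as decimal strings (shorter is smaller, equal lengths compare lexicographically) rather than as JavaScript numbers, because Discord snowflake IDs exceed the range a `number` can represent exactly.

```typescript
// Minimal subset of a Discord message object used by the check.
interface DiscordMessage {
  id: string;
  content: string;
  author: { id: string };
}

// Compare two decimal ID strings without converting to number.
function compareSnowflakes(a: string, b: string): number {
  if (a.length !== b.length) return a.length - b.length;
  return a < b ? -1 : a > b ? 1 : 0;
}

// True when the agent's newest post-anchor message ends with the tail fingerprint.
function tailMatchConfirmed(
  messages: DiscordMessage[],
  anchor: string,
  agentDiscordUserId: string,
  finalText: string,
  fingerprintLength: number = 40
): boolean {
  const tail = finalText.trim().slice(-fingerprintLength);
  if (tail === "") return false;

  let latest: DiscordMessage | null = null;
  for (const message of messages) {
    if (compareSnowflakes(message.id, anchor) <= 0) continue; // before this turn
    if (message.author.id !== agentDiscordUserId) continue;   // someone else
    if (latest === null || compareSnowflakes(message.id, latest.id) > 0) {
      latest = message;
    }
  }
  if (latest === null) return false;
  const content = latest.content.trim();
  return content.slice(-tail.length) === tail;
}
```

A caller would run this after each poll and stop on the first `true`, on an interrupting message, or on the timeout.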
### Dormant Stage
#### Definitions
- **Cycle**: one complete pass through the current speaker list from first to last.
- **Empty turn**: final reply text is `NO_REPLY`, `NO`, or empty/whitespace-only.
- **Cycle boundary**: the moment the last agent in the current list completes their turn.
#### Intent
Dormant stops the moderator from endlessly triggering agents when no one has anything to say. Entering dormant requires **unanimous** empty turns — any single real message is a veto and the cycle continues. When a new Discord message arrives (from a human or an agent via the message tool), it signals a new topic; the channel wakes and every agent gets another chance to respond.
#### Trigger
At each cycle boundary:
1. Re-fetch Discord channel members and build the new speaker list.
2. Check whether any new agents were added to the list.
3. Check whether **all agents who completed a turn in this cycle** sent empty turns.
Enter dormant **only if both hold**:
- All agents in the completed cycle sent empty turns.
- No new agents were added at this boundary.
If new agents joined: reset empty-turn tracking and start a fresh cycle. Do not enter dormant, even if all existing agents sent empty turns.
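The two conditions for entering dormant can be expressed as one predicate. This is a sketch with hypothetical names (`CycleResult`, `shouldEnterDormant`); treating a cycle in which nobody completed a turn as not unanimous is an assumption the text does not settle.

```typescript
// One record per agent that completed a turn in the cycle that just ended.
interface CycleResult {
  agentId: string;
  empty: boolean;
}

function shouldEnterDormant(
  completed: CycleResult[],
  previousList: string[],
  rebuiltList: string[]
): boolean {
  // Any agent present after the rebuild but not before suppresses dormant.
  const newAgents = rebuiltList.filter((id) => previousList.indexOf(id) === -1);
  if (newAgents.length > 0) return false;
  // Assumption: a cycle with no completed turns is not treated as unanimous.
  if (completed.length === 0) return false;
  // A single real turn vetoes dormant.
  return completed.every((result) => result.empty);
}
```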
#### Dormant Behavior
- `currentSpeaker` → `null`.
- Empty-turn history is cleared.
- Moderator stops posting `{schedule_identifier}`.
#### Wake from Dormant
- **Trigger**: any new Discord message in the channel (human or agent via message tool).
- `currentSpeaker` → first agent in the speaker list.
- Moderator posts `{schedule_identifier}` then deletes it.
- A new cycle begins. Agents that have nothing to say emit empty turns; if all pass again, the channel returns to dormant.
#### Edge Cases
| Scenario | Behavior |
|----------|----------|
| Agent leaves mid-cycle | Turn is skipped; agent removed at next cycle boundary. Dormant check counts only agents who completed a turn. |
| New agent joins mid-cycle | Not added until next cycle boundary. Does not affect current dormant check. |
| Shuffle mode | Reshuffle happens after the dormant check at cycle boundary. Dormant logic is identical to `normal`. |
| Shuffle + new agents | New agents appended before reshuffling. Since new agents were found, dormant is suppressed; full enlarged list starts a new shuffled cycle. |
---
## 5. Discussion Mode
### Creation — `create-discussion-channel`
Called by an agent (the **initiator**). `initiator` is auto-derived from the calling session.
| Parameter | Description |
|-----------|-------------|
| `callback-guild` | Guild ID of the initiator's current channel. Error if moderator bot lacks admin in this guild. |
| `callback-channel` | Channel ID of the initiator's current channel. Error if not a Discord group channel. |
| `discussion-guide` | Minimum context: topic, goals, completion criteria. |
| `participants` | List of Discord user IDs for participating agents. |
### Discussion Lifecycle
```
Agent calls create-discussion-channel
        │
        ▼
Moderator creates new private Discord channel, adds participants
        │
        ▼
Moderator posts discussion-guide into the channel → wakes participant agents
        │
        ▼
Turn-manager governs the discussion (normal / shuffle based on participant count)
        │
        ├─[dormant]──► Moderator posts reminder to initiator:
        │              "Discussion is idle. Please summarize and call discussion-complete."
        │
        ▼  initiator calls discussion-complete
Turn-manager state → archived
Moderator auto-replies to any new messages: "This discussion is closed."
Moderator posts summary file path to callback-channel
```
### `discussion-complete` Tool
| Parameter | Description |
|-----------|-------------|
| `discussion-channel` | Channel ID where the discussion took place |
| `summary` | File path to the summary (must be under `{workspace}/discussion-summary/`) |
Validation:
- Caller must be the initiator of the specified discussion channel. Otherwise: error.
- Summary file must exist at the given path.
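The validation rules can be sketched without touching the filesystem by passing the existence check in as a callback. Everything here is illustrative: the function names and error strings are hypothetical, paths are assumed to be absolute POSIX paths, and symlinks are not considered.

```typescript
// Collapse "." and ".." segments of an absolute POSIX-style path.
function normalizePosix(path: string): string {
  const out: string[] = [];
  for (const segment of path.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") out.pop();
    else out.push(segment);
  }
  return "/" + out.join("/");
}

// Returns null when the call is valid, otherwise an error message.
function checkDiscussionComplete(
  callerAgentId: string,
  initiatorAgentId: string,
  summaryPath: string,
  workspace: string,
  fileExists: (path: string) => boolean
): string | null {
  if (callerAgentId !== initiatorAgentId) {
    return "Only the initiator may complete this discussion.";
  }
  const root = normalizePosix(workspace + "/discussion-summary");
  const resolved = normalizePosix(summaryPath);
  // Normalising first prevents "../" from escaping the summary directory.
  if (resolved.indexOf(root + "/") !== 0) {
    return "Summary must be under {workspace}/discussion-summary/.";
  }
  if (!fileExists(resolved)) {
    return "Summary file does not exist.";
  }
  return null;
}
```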
---
## 6. Control Page — `/dirigent`
HTTP route registered on the OpenClaw gateway. Auth: `gateway` (requires the same Bearer token as the gateway API; returns 401 without it).
### Sections
#### Identity Registry
- Table: discord-user-id / agent-id / agent-name
- Inline add, edit, delete
- **Re-scan padded-cell** button (shown only when padded-cell is detected)
#### Guild & Channel Configuration
- Lists all Discord guilds where the moderator bot has admin permissions.
- For each guild: all private group channels.
- Per channel:
  - Current mode badge
  - Mode dropdown (`none | chat | report`) — hidden for `work` and `discussion` channels
  - `work` and `discussion` channels display mode as a read-only badge
  - Channels unknown to Dirigent display as `none`
  - Current turn-manager state and active speaker name (where applicable)
---
## 7. Migration from v1
| v1 Mechanic | v2 Replacement |
|-------------|----------------|
| End symbol (`🔚`) required in agent replies | Removed — agents no longer need end symbols |
| `before_message_write` drives turn advance | Replaced by `agent_end` hook |
| Moderator posts visible handoff message each turn | Moderator posts `{schedule_identifier}` then immediately deletes it |
| NO_REPLY detected from `before_message_write` content | Derived from last assistant message in `agent_end` `event.messages` |
| Turn advances immediately on agent response | Empty turns advance immediately; real turns wait for Discord delivery confirmation via tail-match polling |
---
## 8. Yonexus Compatibility (Future)
> Yonexus is a planned cross-instance WebSocket communication plugin (hub-and-spoke). Dirigent must work fully without it.
### Topology
```
Instance A (master)          Instance B (slave)           Instance C (slave)
┌──────────────┐             ┌──────────────┐             ┌──────────────┐
│   Dirigent   │◄──Yonexus──►│   Dirigent   │◄──Yonexus──►│   Dirigent   │
│  (authority) │             │    (relay)   │             │    (relay)   │
└──────────────┘             └──────────────┘             └──────────────┘
Authoritative state:
- Identity registry
- Channel modes & turn-manager states
- Speaker lists & turn pointers
- Discussion metadata
```
### Master / Slave Roles
**Master**:
- Holds all authoritative state.
- Serves read/write operations to slaves via Yonexus message rules.
- Executes all moderator bot actions (post/delete `{schedule_identifier}`, send discussion-guide, etc.).
**Slave**:
- No local state for shared channels.
- `before_model_resolve`: queries master to determine if this agent is the active speaker.
- `agent_end`: notifies master that the turn is complete (`agentId`, `channelId`, `isEmpty`).
- Master handles all speaker advancement and moderator actions.
### Message Rules (provisional)
```
dirigent::check-turn → { allowed: bool, currentSpeaker: string }
dirigent::turn-complete → { agentId, channelId, isEmpty }
dirigent::get-identity → identity registry entry for discordUserId
dirigent::get-channel-state → { mode, tmState, currentSpeaker }
```
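Since the rules above are provisional, the following is only a sketch of how their payloads might be typed on the Dirigent side. The interface names and the `isTurnComplete` guard are hypothetical, the field types for `mode` and `tmState` are assumed to be plain strings, and nothing here reflects an actual Yonexus API.

```typescript
// Payload shapes copied from the provisional rule list; names are hypothetical.
interface CheckTurnReply {
  allowed: boolean;
  currentSpeaker: string;
}
interface TurnComplete {
  agentId: string;
  channelId: string;
  isEmpty: boolean;
}
interface ChannelStateReply {
  mode: string;
  tmState: string;
  currentSpeaker: string;
}

const DIRIGENT_RULES: string[] = [
  "dirigent::check-turn",
  "dirigent::turn-complete",
  "dirigent::get-identity",
  "dirigent::get-channel-state",
];

// Runtime guard a master could apply before trusting a slave's notification.
function isTurnComplete(value: unknown): value is TurnComplete {
  if (typeof value !== "object" || value === null) return false;
  const v = value as { [key: string]: unknown };
  return (
    typeof v["agentId"] === "string" &&
    typeof v["channelId"] === "string" &&
    typeof v["isEmpty"] === "boolean"
  );
}
```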
### Constraints
- Without Yonexus: Dirigent runs in standalone mode with all state local.
- Role configured via plugin config: `dirigentRole: "master" | "slave"` (default: `"master"`).
- Slave instances skip all local state mutations.
- Identity registry, channel config, and control page are only meaningful on the master instance.