# Yonexus.Server — Project Plan

## 1. Goal

`Yonexus.Server` is the OpenClaw plugin that acts as the central communication hub in a Yonexus network.

This repository references `Yonexus.Protocol` as a submodule at `protocol/`.

It is responsible for:
- accepting WebSocket connections from clients
- maintaining the client registry and trust state
- handling pairing initiation and Discord DM notification
- verifying client authentication proofs
- tracking client liveness via heartbeat
- routing and dispatching application messages
- exposing a TypeScript API for server-side plugins and integrations

---

## 2. Configuration

```ts
interface YonexusServerConfig {
  followerIdentifiers: string[];
  notifyBotToken: string;
  adminUserId: string;
  listenHost?: string;
  listenPort: number;
  publicWsUrl?: string;
}
```

Field semantics:
- `followerIdentifiers`: allowlist of identifiers permitted to pair/connect
- `notifyBotToken`: Discord bot token used to DM the pairing code to the admin
- `adminUserId`: Discord user ID of the human administrator
- `listenHost`: local bind address (default: `0.0.0.0`)
- `listenPort`: local bind port (required)
- `publicWsUrl`: optional canonical external WebSocket URL advertised to clients

Validation:
- missing required fields must fail plugin initialization
- identifiers not in `followerIdentifiers` must be rejected at connection time
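
The validation rules above could be sketched as follows. This is a minimal illustration, not the plugin's actual loader: `validateConfig`, `requireString`, and `isIdentifierAllowed` are hypothetical names, and the port range check is an added assumption beyond "required".

```ts
interface YonexusServerConfig {
  followerIdentifiers: string[];
  notifyBotToken: string;
  adminUserId: string;
  listenHost?: string;
  listenPort: number;
  publicWsUrl?: string;
}

// Hypothetical helper: reads a required non-empty string field or throws.
function requireString(raw: Record<string, unknown>, field: string): string {
  const value = raw[field];
  if (typeof value !== "string" || value.length === 0) {
    throw new Error(`INVALID_CONFIG: "${field}" is required`);
  }
  return value;
}

// Hypothetical validator: throws on the first missing or invalid required field.
function validateConfig(raw: Record<string, unknown>): YonexusServerConfig {
  const ids = raw["followerIdentifiers"];
  if (!Array.isArray(ids) || ids.some((id) => typeof id !== "string")) {
    throw new Error('INVALID_CONFIG: "followerIdentifiers" must be a string array');
  }
  const port = raw["listenPort"];
  // Assumption: a valid TCP port is an integer in 1..65535.
  if (typeof port !== "number" || !Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error('INVALID_CONFIG: "listenPort" must be an integer in 1..65535');
  }
  const host = raw["listenHost"];
  const config: YonexusServerConfig = {
    followerIdentifiers: ids as string[],
    notifyBotToken: requireString(raw, "notifyBotToken"),
    adminUserId: requireString(raw, "adminUserId"),
    listenHost: typeof host === "string" ? host : "0.0.0.0", // documented default
    listenPort: port,
  };
  const url = raw["publicWsUrl"];
  if (typeof url === "string") config.publicWsUrl = url;
  return config;
}

// Connection-time allowlist check.
function isIdentifierAllowed(config: YonexusServerConfig, identifier: string): boolean {
  return config.followerIdentifiers.includes(identifier);
}
```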

---

## 3. Runtime Lifecycle

### 3.1 Startup

On OpenClaw gateway startup:
1. load and validate config
2. initialize persistent client registry
3. register builtin protocol handlers
4. register application rule registry
5. start WebSocket server on configured host/port
6. start heartbeat/status sweep timer

### 3.2 Shutdown

On shutdown:
1. close all client WebSocket connections gracefully
2. persist client registry state
3. stop sweep timers

---

## 4. Client Registry

### 4.1 Data Model

```ts
interface ClientRecord {
  identifier: string;
  publicKey?: string;
  secret?: string;
  pairingStatus: "unpaired" | "pending" | "paired" | "revoked";
  pairingCode?: string;
  pairingExpiresAt?: number;
  pairingNotifiedAt?: number;
  pairingNotifyStatus?: "pending" | "sent" | "failed";
  status: "online" | "offline" | "unstable";
  lastHeartbeatAt?: number;
  lastAuthenticatedAt?: number;
  recentNonces: Array<{ nonce: string; timestamp: number }>;
  recentHandshakeAttempts: number[];
  createdAt: number;
  updatedAt: number;
}
```

### 4.2 Persistence

The registry must survive server restarts:
- all trust-related fields must be persisted
- the rolling security windows (`recentNonces`, `recentHandshakeAttempts`) should either be reset on restart or persisted safely with the rest of the record
- the on-disk storage format should support future encryption-at-rest
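
One way to read these rules is sketched below, under the "reset on restart" option: only durable fields are written, and the rolling windows are recreated empty on load. The record shape is deliberately trimmed, and `serializeRegistry`, `restoreRegistry`, and the `version` envelope are assumptions for illustration (a versioned envelope is one possible way to leave room for a later encrypted format).

```ts
// Trimmed-down durable subset of ClientRecord, for illustration only.
interface PersistedClient {
  identifier: string;
  secret?: string;
  pairingStatus: "unpaired" | "pending" | "paired" | "revoked";
  updatedAt: number;
}

// In-memory record: durable fields plus the rolling security windows.
interface LiveClient extends PersistedClient {
  recentNonces: Array<{ nonce: string; timestamp: number }>;
  recentHandshakeAttempts: number[];
}

// Writes only trust-related fields; rolling windows are intentionally dropped.
function serializeRegistry(clients: Map<string, LiveClient>): string {
  const durable: PersistedClient[] = [];
  clients.forEach((c) => {
    durable.push({
      identifier: c.identifier,
      secret: c.secret,
      pairingStatus: c.pairingStatus,
      updatedAt: c.updatedAt,
    });
  });
  // Hypothetical envelope: the version field lets the format change later.
  return JSON.stringify({ version: 1, clients: durable });
}

// Rebuilds the in-memory registry with empty rolling windows.
function restoreRegistry(json: string): Map<string, LiveClient> {
  const parsed = JSON.parse(json) as { version: number; clients: PersistedClient[] };
  const clients = new Map<string, LiveClient>();
  for (const c of parsed.clients) {
    clients.set(c.identifier, { ...c, recentNonces: [], recentHandshakeAttempts: [] });
  }
  return clients;
}
```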

---

## 5. Pairing Flow

### 5.1 Entry Condition

Pairing starts when a client connects and:
- its identifier is in `followerIdentifiers`
- it has no valid `secret` stored

### 5.2 Step A — Generate Pairing Code

Server generates:
- a random `pairingCode`
- `expiresAt` (UTC unix seconds)
- `ttlSeconds`
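
As a rough sketch of Step A, the three values could be produced together. The plan does not fix a code length, alphabet, or default TTL, so `PAIRING_ALPHABET`, the 8-character length, the 300-second TTL, and the `createPairingOffer` name are all assumptions. `randomInt` comes from Node's `node:crypto` module and draws from a cryptographically secure source.

```ts
import { randomInt } from "node:crypto";

// Assumption: 32 unambiguous characters (no I, O, 0, 1) so codes are easy to retype.
const PAIRING_ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789";

interface PairingOffer {
  pairingCode: string;
  expiresAt: number; // UTC unix seconds
  ttlSeconds: number;
}

// Hypothetical helper: generates a random code and its expiry.
function createPairingOffer(ttlSeconds = 300, codeLength = 8, nowMs = Date.now()): PairingOffer {
  let pairingCode = "";
  for (let i = 0; i < codeLength; i++) {
    // randomInt(max) returns a uniformly distributed integer in [0, max).
    pairingCode += PAIRING_ALPHABET.charAt(randomInt(PAIRING_ALPHABET.length));
  }
  const nowSeconds = Math.floor(nowMs / 1000);
  return { pairingCode, expiresAt: nowSeconds + ttlSeconds, ttlSeconds };
}
```

Passing `nowMs` explicitly keeps the function easy to test; production code would normally rely on the default.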

### 5.3 Step B — Discord DM to Admin

Server must use `notifyBotToken` to DM `adminUserId`.

DM body must contain:
- `identifier`
- `pairingCode`
- `expiresAt` or TTL

DM delivery must succeed before the protocol continues.

### 5.4 Step C — Protocol Notification to Client

Server sends `hello_ack` with `nextAction: "pair_required"`.

Server then sends a `pair_request` builtin message containing:
- `identifier`
- `expiresAt`
- `ttlSeconds`
- `adminNotification: "sent" | "failed"`
- `codeDelivery: "out_of_band"`

### 5.5 Step D — Accept Confirmation

Server accepts a `pair_confirm` builtin message from the client containing the pairing code.

Validation:
- code must match the stored pending code
- current time must be before `pairingExpiresAt`
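
The two validation rules could be checked as in the sketch below. The result labels and the `checkPairConfirm` name are illustrative, not protocol-defined. Comparing the code with `timingSafeEqual` (from `node:crypto`) is an added hardening assumption; it requires equal-length inputs, hence the length check first.

```ts
import { timingSafeEqual } from "node:crypto";

// Illustrative result labels; the protocol's own reason strings may differ.
type ConfirmResult = "ok" | "no_pending_pairing" | "expired" | "mismatch";

interface PendingPairing {
  pairingStatus: "unpaired" | "pending" | "paired" | "revoked";
  pairingCode?: string;
  pairingExpiresAt?: number; // UTC unix seconds
}

// Hypothetical helper: validates a pair_confirm against the stored pending state.
function checkPairConfirm(record: PendingPairing, submittedCode: string, nowSeconds: number): ConfirmResult {
  if (record.pairingStatus !== "pending" || !record.pairingCode || record.pairingExpiresAt === undefined) {
    return "no_pending_pairing";
  }
  // Current time must be strictly before the expiry.
  if (nowSeconds >= record.pairingExpiresAt) return "expired";
  const expected = Buffer.from(record.pairingCode, "utf8");
  const actual = Buffer.from(submittedCode, "utf8");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  const matches = expected.length === actual.length && timingSafeEqual(expected, actual);
  return matches ? "ok" : "mismatch";
}
```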

### 5.6 Step E — Issue Secret

On successful confirmation:
- generate a random `secret`
- store `publicKey` and `secret`
- mark `pairingStatus` as `paired`
- send a `pair_success` builtin message to the client with the secret

On failure:
- send a `pair_failed` builtin message
- optionally allow a retry, or leave it to the client to reconnect

---

## 6. Authentication

### 6.1 Entry Condition

Authentication starts when a connected client sends a `hello` with `hasSecret: true`.

### 6.2 Proof Validation

Client sends `auth_request` containing:
- `identifier`
- `nonce` (24 random characters)
- `proofTimestamp` (UTC unix seconds)
- `signature` (signed proof payload)
- optionally a new `publicKey` if rotating

Server validates:
1. identifier is allowlisted and paired
2. public key matches stored key (if not rotating)
3. signature verifies correctly
4. the verified proof payload contains the correct `secret`
5. `abs(now - proofTimestamp) < 10` (seconds)
6. nonce is not in the recent nonce window
7. handshake attempts in the last 10s ≤ 10

### 6.3 Nonce Window

Store the last 10 nonces per client with their timestamps.

When a nonce is presented:
- if it matches any in the window, reject with `nonce_collision`
- otherwise, add the new nonce to the window
- trim the window to the most recent 10 entries
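
The window logic above is small enough to sketch directly. `acceptNonce` and `NONCE_WINDOW_SIZE` are illustrative names; the entry shape follows `recentNonces` in the data model.

```ts
interface NonceEntry {
  nonce: string;
  timestamp: number;
}

const NONCE_WINDOW_SIZE = 10;

// Returns false on a replayed nonce (caller rejects with nonce_collision);
// otherwise records the nonce, trims the window, and returns true.
function acceptNonce(window: NonceEntry[], nonce: string, timestamp: number): boolean {
  if (window.some((entry) => entry.nonce === nonce)) return false;
  window.push({ nonce, timestamp });
  if (window.length > NONCE_WINDOW_SIZE) {
    // Drop the oldest entries so only the most recent 10 remain.
    window.splice(0, window.length - NONCE_WINDOW_SIZE);
  }
  return true;
}
```

Because the window holds only 10 entries, a nonce that has been evicted is no longer caught here; in that case the replay is stopped only by the timestamp freshness check in 6.2.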

### 6.4 Handshake Rate Limit

Track recent handshake attempt timestamps per client.

If more than 10 attempts occur within the last 10 seconds:
- reject with `rate_limited`
- trigger `re_pair_required`
- mark pairing status as `revoked`
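
A sliding-window version of this check might look like the sketch below. It assumes attempt timestamps are stored in unix seconds (the data model does not state the unit), and `recordHandshakeAttempt` is a hypothetical name. Sending `re_pair_required` is left to the caller.

```ts
const RATE_WINDOW_SECONDS = 10;
const MAX_ATTEMPTS_IN_WINDOW = 10;

interface RateTracked {
  recentHandshakeAttempts: number[]; // assumption: unix seconds
  pairingStatus: "unpaired" | "pending" | "paired" | "revoked";
}

// Records one attempt and reports whether the client exceeded the limit.
function recordHandshakeAttempt(client: RateTracked, nowSeconds: number): "ok" | "rate_limited" {
  const cutoff = nowSeconds - RATE_WINDOW_SECONDS;
  // Keep only attempts that fall inside the last 10 seconds, then add this one.
  client.recentHandshakeAttempts = client.recentHandshakeAttempts.filter((t) => t > cutoff);
  client.recentHandshakeAttempts.push(nowSeconds);
  if (client.recentHandshakeAttempts.length > MAX_ATTEMPTS_IN_WINDOW) {
    client.pairingStatus = "revoked"; // trust must be re-established via pairing
    return "rate_limited";
  }
  return "ok";
}
```

Filtering on every attempt also keeps the stored array bounded, so no separate cleanup pass is needed.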

### 6.5 Success / Failure Responses

On success:
- send `auth_success` with `status: "online"`
- record `lastAuthenticatedAt`

On failure:
- send `auth_failed` with a reason
- if the reason indicates an unsafe condition (for example `rate_limited`, see 6.4), also send `re_pair_required`

---

## 7. Heartbeat and Liveness

### 7.1 Heartbeat Reception

Clients send `heartbeat` builtin messages every 5 minutes.

On receiving a heartbeat:
- update `lastHeartbeatAt`
- if the client was `offline` or `unstable`, transition it to `online`

### 7.2 Status Sweep

Server runs a periodic sweep (recommended: every 30–60s).

For each registered client:
- if no heartbeat for 7 min → mark `unstable`
- if no heartbeat for 11 min → send `disconnect_notice`, then close the socket and mark `offline`
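
The sweep could be structured as a pure status calculation plus a callback for the offline side effects, as sketched here. It assumes `lastHeartbeatAt` is in unix seconds (the data model does not state the unit); `nextStatus`, `sweep`, and `onOffline` are hypothetical names. The sweep only degrades status; recovery to `online` happens on heartbeat reception (7.1).

```ts
const UNSTABLE_AFTER_SECONDS = 7 * 60;
const OFFLINE_AFTER_SECONDS = 11 * 60;

type ClientStatus = "online" | "offline" | "unstable";

interface TrackedClient {
  status: ClientStatus;
  lastHeartbeatAt?: number; // assumption: unix seconds
}

// Computes the status a client should have at `nowSeconds`.
function nextStatus(client: TrackedClient, nowSeconds: number): ClientStatus {
  if (client.status === "offline" || client.lastHeartbeatAt === undefined) return client.status;
  const silence = nowSeconds - client.lastHeartbeatAt;
  if (silence >= OFFLINE_AFTER_SECONDS) return "offline";
  if (silence >= UNSTABLE_AFTER_SECONDS) return "unstable";
  return client.status;
}

// Applies status changes; `onOffline` is where the caller would send
// disconnect_notice and close the socket.
function sweep(clients: Map<string, TrackedClient>, nowSeconds: number, onOffline: (identifier: string) => void): void {
  clients.forEach((client, identifier) => {
    const status = nextStatus(client, nowSeconds);
    if (status === client.status) return;
    client.status = status;
    if (status === "offline") onOffline(identifier);
  });
}
```

A caller would typically run `sweep` from a `setInterval` timer at the chosen interval and clear that timer on shutdown.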

### 7.3 Status Transitions

Allowed transitions:
- `online` → `unstable` (7 min timeout)
- `unstable` → `online` (heartbeat received)
- `unstable` → `offline` (11 min timeout)
- `offline` → (removed from active registry or marked offline permanently)

---

## 8. Messaging and Rule Dispatch

### 8.1 Message Rewrite

When the server receives an application rule message from a client, it rewrites the message before rule dispatch, from:

```
${rule_identifier}::${message_content}
```

into:

```
${rule_identifier}::${sender_identifier}::${message_content}
```
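
A minimal sketch of this rewrite, assuming the message is split at the first `::` only so that `message_content` may itself contain `::`. The `rewriteInbound` name and the choice to throw on a missing separator are assumptions.

```ts
// Inserts the sender identifier after the rule identifier.
function rewriteInbound(raw: string, senderIdentifier: string): string {
  const separatorIndex = raw.indexOf("::");
  // Reject messages with no separator or an empty rule identifier.
  if (separatorIndex <= 0) {
    throw new Error("MALFORMED_MESSAGE: missing rule identifier");
  }
  const ruleIdentifier = raw.slice(0, separatorIndex);
  const messageContent = raw.slice(separatorIndex + 2);
  return `${ruleIdentifier}::${senderIdentifier}::${messageContent}`;
}
```

For example, `rewriteInbound("chat::hello", "node-a")` yields `chat::node-a::hello`.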

### 8.2 Rule Registry

Server maintains a registry of `(rule_identifier → processor)` pairs.

Dispatch algorithm:
1. parse the first `::` segment as `rule_identifier`
2. if `rule_identifier === "builtin"`, route to the builtin protocol handler
3. iterate registered rules in registration order
4. invoke the first exact match
5. if no match, ignore or log as unhandled

### 8.3 Processor Function Signature

```ts
type RuleProcessor = (message: string) => unknown;

declare function registerRule(rule: string, processor: RuleProcessor): void;
```

Validation:
- must reject `builtin`
- must reject a duplicate rule unless an explicit override mode is added later
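
Sections 8.2 and 8.3 together could be sketched as a small class. `RuleRegistry`, its `dispatch` method, and the `onBuiltin` callback are hypothetical; only `RuleProcessor` and the `registerRule` signature come from the plan. Errors are plain `Error` objects prefixed with the codes from section 10.

```ts
type RuleProcessor = (message: string) => unknown;

class RuleRegistry {
  // An array preserves registration order for first-match dispatch.
  private readonly rules: Array<{ rule: string; processor: RuleProcessor }> = [];

  registerRule(rule: string, processor: RuleProcessor): void {
    if (rule === "builtin") throw new Error("RESERVED_RULE: builtin");
    if (this.rules.some((entry) => entry.rule === rule)) {
      throw new Error(`RULE_ALREADY_REGISTERED: ${rule}`);
    }
    this.rules.push({ rule, processor });
  }

  // Returns true if the message was handled, false if malformed or unmatched.
  dispatch(message: string, onBuiltin: RuleProcessor): boolean {
    const separatorIndex = message.indexOf("::");
    if (separatorIndex <= 0) return false;
    const ruleIdentifier = message.slice(0, separatorIndex);
    if (ruleIdentifier === "builtin") {
      onBuiltin(message);
      return true;
    }
    const match = this.rules.find((entry) => entry.rule === ruleIdentifier);
    if (!match) return false; // caller may ignore or log as unhandled
    match.processor(message);
    return true;
  }
}
```

Since duplicates are rejected at registration, "first exact match" and "only match" coincide here; the ordered array matters only if an override mode is added later.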

### 8.4 API: sendMessageToClient

```ts
declare function sendMessageToClient(identifier: string, message: string): Promise<void>;
```

Constraints:
- identifier must be currently connected and authenticated
- message must already conform to `${rule_identifier}::${message_content}`
- throws if identifier is not online

---

## 9. WebSocket Server

### 9.1 Connection Accept

On new WebSocket connection:
1. read initial `hello` message
2. validate identifier is in allowlist
3. check if paired/authenticated or requires pairing
4. proceed accordingly

### 9.2 Connection Close

On client disconnect:
- mark client as offline immediately
- stop heartbeat tracking for that session
- keep persistent registry state intact

### 9.3 One Active Session Per Identifier

Recommended v1 policy:
- if a new authenticated connection appears for an already-authenticated identifier, terminate the old connection and accept the new one
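
The replacement policy could be kept in a small session table, sketched below. `SessionSocket` is a stand-in for whatever WebSocket object the server ends up using, and `SessionTable`, the close code `4000`, and the reason text are all assumptions (4000 to 4999 is the range RFC 6455 reserves for private use). `release` removes an entry only if the closing socket is still the current one, so a late close event from a replaced socket does not evict its successor.

```ts
// Minimal stand-in for a WebSocket: only the close method is needed here.
interface SessionSocket {
  close(code?: number, reason?: string): void;
}

class SessionTable {
  private readonly active = new Map<string, SessionSocket>();

  // Installs `socket` as the only session for `identifier`,
  // terminating any previous session first.
  activate(identifier: string, socket: SessionSocket): void {
    const previous = this.active.get(identifier);
    if (previous && previous !== socket) {
      previous.close(4000, "replaced by newer session"); // hypothetical code/reason
    }
    this.active.set(identifier, socket);
  }

  // Called from the socket's close handler; ignores stale sockets.
  release(identifier: string, socket: SessionSocket): void {
    if (this.active.get(identifier) === socket) this.active.delete(identifier);
  }

  get(identifier: string): SessionSocket | undefined {
    return this.active.get(identifier);
  }
}
```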

---

## 10. Error Handling

Structured errors required for at minimum:
- `INVALID_CONFIG` — missing required config fields
- `IDENTIFIER_NOT_ALLOWED` — identifier not in allowlist
- `PAIRING_NOTIFICATION_FAILED` — Discord DM send failed
- `PAIRING_EXPIRED` — pairing code expired
- `AUTH_FAILED` — proof verification failed
- `NONCE_COLLISION` — replay detected
- `RATE_LIMITED` — unsafe handshake rate
- `RE_PAIR_REQUIRED` — trust must be reset
- `CLIENT_OFFLINE` — attempted to send to offline client
- `RULE_ALREADY_REGISTERED` — duplicate rule registration
- `RESERVED_RULE` — attempted to register `builtin`
- `MALFORMED_MESSAGE` — malformed builtin/application message
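
One possible shape for these structured errors is a single error class carrying a typed code, as sketched below. The `YonexusError` class, the `YonexusErrorCode` type, and the `toWire` method are illustrative names; only the code strings come from the list above.

```ts
type YonexusErrorCode =
  | "INVALID_CONFIG"
  | "IDENTIFIER_NOT_ALLOWED"
  | "PAIRING_NOTIFICATION_FAILED"
  | "PAIRING_EXPIRED"
  | "AUTH_FAILED"
  | "NONCE_COLLISION"
  | "RATE_LIMITED"
  | "RE_PAIR_REQUIRED"
  | "CLIENT_OFFLINE"
  | "RULE_ALREADY_REGISTERED"
  | "RESERVED_RULE"
  | "MALFORMED_MESSAGE";

// Hypothetical error type: a machine-readable code plus a human-readable message.
class YonexusError extends Error {
  readonly code: YonexusErrorCode;

  constructor(code: YonexusErrorCode, message: string) {
    super(message);
    this.name = "YonexusError";
    this.code = code;
  }

  // Plain object suitable for logging or for embedding in a protocol reply.
  toWire(): { code: YonexusErrorCode; message: string } {
    return { code: this.code, message: this.message };
  }
}
```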

---

## 11. Implementation Phases

### Phase 0 — Skeleton
- plugin manifest and entry point
- config loading and validation
- basic OpenClaw hook registration
- minimal logging/error scaffolding

### Phase 1 — WebSocket Server
- WebSocket server startup
- connection accept / close lifecycle
- `hello` / `hello_ack` flow
- per-connection state tracking

### Phase 2 — Registry and Persistence
- in-memory client registry
- on-disk persistence (JSON or equivalent)
- restart recovery
- basic CRUD for client records

### Phase 3 — Pairing
- pairing code generation
- Discord DM via bot token
- `pair_request` / `pair_confirm` / `pair_success` / `pair_failed`
- pairing state transitions

### Phase 4 — Authentication
- `auth_request` verification
- signature verification
- nonce window tracking
- handshake rate limiting
- `re_pair_required` flow

### Phase 5 — Heartbeat and Status
- heartbeat receiver
- status sweep timer
- online / unstable / offline transitions
- `disconnect_notice` before socket close

### Phase 6 — Rule Dispatch and APIs
- rule registry
- message rewrite on inbound
- first-match dispatch
- `registerRule` API
- `sendMessageToClient` API

### Phase 7 — Hardening
- structured error definitions
- redacted logging for sensitive values
- integration test coverage
- failure-path coverage

---

## 12. Open Questions for Yonexus.Server

These should be resolved before or during implementation:

1. Which Discord library/module will be used to send the DM? (direct Discord API / discord.js / etc.)
2. Should the WebSocket server also expose an optional TLS listener?
3. Should the sweep timer interval be configurable or fixed?
4. What is the maximum supported number of concurrent connected clients?
5. Should server-side rule processors run in isolated contexts?
6. Should `sendMessageToClient` queue messages for briefly offline clients, or fail immediately?