diff --git a/README.md b/README.md
index c3bd2b0..d61b5dc 100644
--- a/README.md
+++ b/README.md
@@ -42,7 +42,7 @@ HarborForge.OpenclawPlugin/
 │ └───────────────────────────────────────────┘ │
 └─────────────────────────────────────────────────┘
 │
-▼ HTTP/WebSocket
+▼ HTTP
 ┌─────────────────────┐
 │ HarborForge Monitor │
 └─────────────────────┘
@@ -76,21 +76,26 @@ node scripts/install.mjs --verbose
 
 ## Configuration
 
-1. Register the server in HarborForge Monitor and obtain a `challengeUuid`
+1. Register the server in HarborForge Monitor and generate an `apiKey`
 
 2. Edit `~/.openclaw/openclaw.json`:
 
 ```json
 {
   "plugins": {
-    "harborforge-monitor": {
-      "enabled": true,
-      "backendUrl": "https://monitor.hangman-lab.top",
-      "identifier": "my-server-01",
-      "challengeUuid": "your-challenge-uuid-here",
-      "reportIntervalSec": 30,
-      "httpFallbackIntervalSec": 60,
-      "logLevel": "info"
+    "entries": {
+      "harborforge-monitor": {
+        "enabled": true,
+        "config": {
+          "enabled": true,
+          "backendUrl": "https://monitor.hangman-lab.top",
+          "identifier": "my-server-01",
+          "apiKey": "your-api-key-here",
+          "reportIntervalSec": 30,
+          "httpFallbackIntervalSec": 60,
+          "logLevel": "info"
+        }
+      }
     }
   }
 }
@@ -109,7 +114,7 @@ openclaw gateway restart
 | `enabled` | boolean | `true` | Whether the plugin is enabled |
 | `backendUrl` | string | `https://monitor.hangman-lab.top` | Monitor backend URL |
 | `identifier` | string | auto-detected hostname | Server identifier |
-| `challengeUuid` | string | required | Registration challenge UUID |
+| `apiKey` | string | required | Server API key generated by HarborForge Monitor |
 | `reportIntervalSec` | number | `30` | Report interval (seconds) |
 | `httpFallbackIntervalSec` | number | `60` | HTTP fallback interval (seconds) |
 | `logLevel` | string | `"info"` | Log level: debug/info/warn/error |
@@ -151,7 +156,7 @@ npm run build
 
 ```bash
 cd server
-HF_MONITOR_CHALLENGE_UUID=test-uuid \
+HF_MONITOR_API_KEY=test-api-key \
 HF_MONITOR_BACKEND_URL=http://localhost:8000 \
 HF_MONITOR_LOG_LEVEL=debug \
 node telemetry.mjs
diff --git a/docs/monitor-server-connector-plan.md b/docs/monitor-server-connector-plan.md
index 398971f..291a9ee 100644
--- a/docs/monitor-server-connector-plan.md
+++ b/docs/monitor-server-connector-plan.md
@@ -1,112 +1,47 @@
-# HarborForge OpenClaw Server Connector Plugin — Project Plan
+# Monitor Server Connector Plan
 
-## 1) Goal
-Provide a secure, lightweight plugin/agent that connects servers to HarborForge Monitor, streams telemetry in real time, and falls back to HTTP heartbeat when WebSocket is unavailable.
+## Current design
 
-## 2) Scope
-- **Handshake + auth** using backend-issued challenge + RSA-OAEP encrypted payload.
-- **WebSocket telemetry** to `/monitor/server/ws`.
-- **HTTP heartbeat** to `/monitor/server/heartbeat` as fallback.
-- **System metrics**: CPU/Mem/Disk/Swap/Uptime/OpenClaw version/Agents list.
-- **Retry & backoff**, offline handling, and minimal local state.
+The plugin uses:
 
-## 3) Non-Goals
-- No UI in the plugin.
-- No provider billing calls from plugin.
-- No multi-tenant auth beyond challenge + server identifier.
+- **HTTP heartbeat** to `/monitor/server/heartbeat-v2`
+- **API Key authentication** via `X-API-Key`
+- **Gateway lifecycle hooks**: `gateway_start` / `gateway_stop`
 
-## 4) Architecture
-```
-plugin/
-  config/ # load config & secrets
-  crypto/ # RSA-OAEP encrypt/decrypt helpers
-  collector/ # system + openclaw metrics
-  transport/ # ws + http heartbeat
-  state/ # retry/backoff, last sent, cache
-  main.ts|py # entry
-```
+## No longer used
 
-### 4.1 Config
-- `backend_url`
-- `identifier`
-- `challenge_uuid`
-- `report_interval_sec` (default: 20-30s)
-- `http_fallback_interval_sec` (default: 60s)
-- `log_level`
+The following design has been retired:
 
-### 4.2 Security
-- Fetch public key: `GET /monitor/public/server-public-key`
-- Encrypt payload with RSA-OAEP
-- Include `nonce` + `ts` (UTC) to prevent replay
-- **Challenge valid**: 10 minutes
-- **Offline threshold**: 7 minutes
+- challenge UUID
+- RSA public key fetch
+- encrypted handshake payload
+- WebSocket telemetry
 
-## 5) Communication Flow
-### 5.1 Handshake (WS)
-1. Plugin reads `identifier + challenge_uuid`.
-2. Fetch RSA public key.
-3. Encrypt payload: `{identifier, challenge_uuid, nonce, ts}`.
-4. Connect WS `/monitor/server/ws` and send `encrypted_payload`.
-5. On success: begin periodic telemetry push.
+## Runtime flow
 
-### 5.2 Fallback (HTTP)
-If WS fails:
-- POST telemetry to `/monitor/server/heartbeat` with same payload fields.
-- Retry with exponential backoff (cap 5–10 min).
+1. Gateway loads `harborforge-monitor`
+2. Plugin reads config from OpenClaw plugin config
+3. On `gateway_start`, plugin launches `server/telemetry.mjs`
+4. Sidecar collects:
+   - system metrics
+   - OpenClaw version
+   - plugin version
+   - configured agents
+5. Sidecar posts telemetry to backend with `X-API-Key`
 
-## 6) Telemetry Schema (example)
-```
+## Payload
+
+```json
 {
-  identifier: "vps.t1",
-  openclaw_version: "x.y.z",
-  cpu_pct: 12.5,
-  mem_pct: 41.2,
-  disk_pct: 62.0,
-  swap_pct: 0.0,
-  agents: [ { id: "a1", name: "agent", status: "running" } ],
-  last_seen_at: "2026-03-11T21:00:00Z"
+  "identifier": "vps.t1",
+  "openclaw_version": "OpenClaw 2026.3.13 (61d171a)",
+  "plugin_version": "0.1.0",
+  "agents": [],
+  "cpu_pct": 10.5,
+  "mem_pct": 52.1,
+  "disk_pct": 81.0,
+  "swap_pct": 0.0,
+  "load_avg": [0.12, 0.09, 0.03],
+  "uptime_seconds": 12345
 }
 ```
-
-## 7) Reliability
-- Automatic reconnect on WS drop
-- HTTP fallback if WS unavailable > 2 intervals
-- Exponential backoff on failures
-- Local cache for last successful payload
-
-## 8) Deployment Options
-- **Systemd service** (preferred for VPS)
-- **Docker container** (optional)
-- Single-binary build if using Go/Rust
-
-## 9) Milestones
-**M1 – POC (2–3 days)**
-- CLI config loader + HTTP heartbeat
-- See online + metrics in Monitor
-
-**M2 – WS realtime (2–3 days)**
-- Full handshake + WS streaming
-- Reconnect & fallback logic
-
-**M3 – Packaging (1–2 days)**
-- systemd unit + sample config
-- installation script
-
-**M4 – Hardening & Docs (1–2 days)**
-- logging, metrics, docs
-- troubleshooting guide
-
-## 10) Deliverables
-- Plugin source
-- Config template + systemd unit
-- Integration docs
-- Test script + example payloads
-
-## 11) Open Questions
-- Preferred language (Go/Python/Node/Rust)?
-- How to read OpenClaw agent list (API vs local state)?
-- Required log format / retention?
-
----
-
-**Next step:** confirm preferred runtime (Go/Python/Node) and I will scaffold the project structure + first heartbeat implementation.
diff --git a/server/telemetry.mjs b/server/telemetry.mjs
index 6929c33..be3a14b 100644
--- a/server/telemetry.mjs
+++ b/server/telemetry.mjs
@@ -4,7 +4,7 @@
  * Runs as separate process from Gateway.
  * Collects system metrics and OpenClaw status, sends to Monitor.
  */
-import { readFile, access } from 'fs/promises';
+import { readFile, access, readdir } from 'fs/promises';
 import { constants } from 'fs';
 import { exec } from 'child_process';
 import { promisify } from 'util';
@@ -191,10 +191,24 @@ async function getOpenclawAgents() {
     try {
       await access(agentConfigPath, constants.R_OK);
       const data = JSON.parse(await readFile(agentConfigPath, 'utf8'));
-      return data.agents || [];
+      if (Array.isArray(data.agents) && data.agents.length > 0) {
+        return data.agents;
+      }
     } catch {
-      return [];
+      // fall through to directory-based discovery
     }
+
+    const agentsDir = `${CONFIG.openclawPath}/agents`;
+    await access(agentsDir, constants.R_OK);
+    const entries = await readdir(agentsDir, { withFileTypes: true });
+    return entries
+      .filter((entry) => entry.isDirectory())
+      .filter((entry) => entry.name !== 'main')
+      .map((entry) => ({
+        id: entry.name,
+        name: entry.name,
+        status: 'configured',
+      }));
   } catch {
     return [];
   }
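
Note on step 5 of the runtime flow ("Sidecar posts telemetry to backend with `X-API-Key`"): the patch does not show the sending code, so the following is only a minimal sketch of what that request might look like. The endpoint path `/monitor/server/heartbeat-v2` and the `X-API-Key` header come from the updated plan; the helper names `buildHeartbeatRequest` and `sendHeartbeat`, the `POST` method, the JSON body, and the `Content-Type` header are assumptions made for illustration.

```javascript
// Hypothetical helper, not taken from the patch: builds the heartbeat request.
// Only the endpoint path and the X-API-Key header are confirmed by the plan;
// the POST method, JSON body, and Content-Type header are assumptions.
function buildHeartbeatRequest(config, payload) {
  // Strip trailing slashes so the joined URL has exactly one separator.
  const base = config.backendUrl.replace(/\/+$/, '');
  return {
    url: `${base}/monitor/server/heartbeat-v2`,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'X-API-Key': config.apiKey,
      },
      body: JSON.stringify({ identifier: config.identifier, ...payload }),
    },
  };
}

// Hypothetical sender using the global fetch available in Node.js 18+.
// Throws on a non-2xx response so the caller can decide how to retry.
async function sendHeartbeat(config, payload) {
  const { url, options } = buildHeartbeatRequest(config, payload);
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`heartbeat failed: HTTP ${res.status}`);
  }
  return res.status;
}
```

Keeping request construction separate from the network call makes the header and URL handling checkable without a running Monitor backend.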