Files
hzhang 6e3ad669f8 feat(monitor): active push loop replacing standalone monitor
Adds a periodic POST loop to <backend>/monitor/server/heartbeat so
HF plugin can take over the standalone harborforge-monitor daemon's
job — same X-API-Key header, same flat telemetry shape (cpu_pct /
mem_pct / disk_pct / swap_pct / load_avg / uptime_seconds /
plugin_version / agents[]). HF backend stays unchanged.

Config: monitor_push_enabled (default false; opt-in to avoid surprise
heartbeats from existing deployments), monitor_push_interval_seconds
(default 30), reuses apiKey for the X-API-Key header. Lift the
container's HF_MONITER_API_KEY into config.apiKey, flip
monitor_push_enabled true, then docker rm -f the container — DB
last_seen_at keeps advancing under the plugin's loop.

Collector grew swap + cpu sampling (two reads of /proc/stat over a
1-second window when SampleCPU=true). Bridge endpoint stays cheap
(SampleCPU=false on demand); push loop is the only caller paying the
sampling cost.

E2E in sim: monitor_push_enabled=true + apiKey from injected
MonitoredServer row → server_states.last_seen_at advances exactly
every interval_seconds (10s configured, 10s observed). cpu/mem/disk/
swap_pct all populate correctly.
2026-06-03 13:04:51 +01:00

145 lines
5.8 KiB
Markdown

# HarborForge.PlexumPlugin
Plexum-side equivalent of [HarborForge.OpenclawPlugin](https://git.hangman-lab.top/zhi/HarborForge.OpenclawPlugin):
exposes Plexum-side telemetry to the HarborForge Monitor bridge,
drives the HarborForge Calendar scheduler, and gives agents a tool
surface for the same calendar lifecycle actions OpenClaw agents had.
Part of the [HarborForge](../README.md) platform; tracked as a git
submodule of the HarborForge umbrella repo.
- Plugin id: `harbor-forge` (matches the OpenClaw counterpart so the
backend's per-plugin schemas don't fork)
- Plugin version: `0.1.0`
- Activation: `eager` — Monitor bridge + Calendar scheduler must be
running before any agent turn fires
- Plexum SDK version: requires `Plexum-sdk-go` with `HostAPI.WakeAgent`
(commit 216cf21 or later)
## What it does
- **Monitor push loop** — when `monitor_push_enabled: true`, posts a
flat telemetry payload (cpu/mem/disk/swap/load + per-agent state)
to `<backendUrl>/monitor/server/heartbeat` every
`monitor_push_interval_seconds`. This replaces the standalone
`harborforge-monitor` daemon — the plugin's lifecycle (gateway
start/stop) bounds the loop, so a separate supervisor isn't needed.
Use the same `apiKey` value the standalone monitor's
`HF_MONITER_API_KEY` carried.
- **Monitor bridge** (optional) — HTTP server on
`127.0.0.1:<monitor_port>` that responds to `/telemetry` with a
Snapshot. Useful when the standalone monitor is still present and
you want it to enrich its push payload from the plugin's view of
agents. Disable by setting `monitor_port: 0`.
- **Calendar scheduler** — heartbeats `<backendUrl>/calendar/agent/
heartbeat` every interval, receives any TimeSlots due to fire, and
dispatches them through `HostAPI.WakeAgent` (state-aware queue
with depth-1 replace-newest)
- **9 harborforge_* tools** mirroring the OpenClaw plugin's surface
| Tool | Use |
|---|---|
| `harborforge_status` | resolved config + Monitor bridge health + Calendar status + telemetry snapshot |
| `harborforge_telemetry` | fresh system + agent metrics |
| `harborforge_monitor_telemetry` | last bridge query timing + last snapshot served |
| `harborforge_calendar_status` | active slot(s) + history + heartbeat clock |
| `harborforge_calendar_complete` | mark active slot completed (+optional summary) |
| `harborforge_calendar_abort` | mark active slot aborted (+optional reason) |
| `harborforge_calendar_pause` | pause active slot (non-terminal) |
| `harborforge_calendar_resume` | resume a paused slot |
| `harborforge_restart_status` | backend restart-pending flag + last poll time |
## Install
```bash
git clone --recurse-submodules https://git.hangman-lab.top/zhi/HarborForge.PlexumPlugin
cd HarborForge.PlexumPlugin
bash scripts/install.sh # or: make install
```
Then in `~/.plexum/plexum.json`:
```json
{
"plugins": {
"allow": [
".",
"harbor-forge"
]
}
}
```
And configure at `~/.plexum/plugins/harbor-forge/config.json`:
```json
{
"backendUrl": "https://hf-api.hangman-lab.top",
"identifier": "server-t3",
"apiKey": "g1_xxx",
"monitor_push_enabled": true,
"monitor_push_interval_seconds": 30,
"monitor_port": 0,
"calendar_enabled": true,
"calendar_heartbeat_interval_seconds": 30
}
```
Replacing the standalone `harborforge-monitor` container: lift the
container's `HF_MONITER_API_KEY` into `apiKey`, set
`monitor_push_enabled: true`, then `docker rm -f harborforge-monitor`
once you've confirmed the plugin's pushes are landing (the backend's
`server_states.last_seen_at` should keep advancing without the
container running).
Restart the host (`systemctl --user restart plexum`) and verify:
```bash
plexum plugin-list | grep harbor
curl -s http://127.0.0.1:9100/health
curl -s http://127.0.0.1:9100/telemetry | jq .agents
```
## How calendar wake works
When the backend returns a `slot_to_fire` in a heartbeat response:
1. Scheduler builds the message from `slot.wake_options.override_message`
or falls back to `slot.prompt`
2. `host.WakeAgent({agent_id, message, source: "calendar:slot-<id>"})`
3. Plexum host-side `wake.Manager`:
- if agent's sm-state is `idle` → runs the turn synchronously in a
goroutine against the agent's `wake` session
- else → enqueues (depth 1; new wake replaces any pending one)
- drains automatically when the running turn returns
4. The `source` tag lands on the turn's faithful event so retros can
tell which slot caused which turn
The agent uses `harborforge_calendar_complete` / `_abort` / `_pause` /
`_resume` mid-turn to push status back to the backend.
## Layout
```
HarborForge.PlexumPlugin/
├── manifest.json # plugin manifest (eager, 9 tools)
├── go.mod # → Plexum-sdk-go (replace ../)
├── cmd/plexum-harborforge-plugin/ # main entry (Serve + Init)
├── internal/config/ # config.json schema + Resolve
├── internal/telemetry/ # /proc-based snapshot collector
├── internal/monitor/ # HTTP bridge for HF.Monitor
├── internal/calendar/ # types + backend client + scheduler
├── internal/tools/ # 9 tool implementations
└── scripts/install.sh # build + drop into ~/.plexum/plugins
```
## Differences vs OpenClaw equivalent
| OpenClaw plugin | Plexum plugin |
|---|---|
| `api.registerTool(factory)` runtime | `ToolPlugin.CallTool` + manifest contract |
| `api.spawn({agentId, task})` | `HostAPI.WakeAgent({agent_id, message, source})` (state-aware queue) |
| `api.getAgentStatus()` | `HostAPI.ReadAgentState(ctx, agent_id)` |
| `--install-monitor` / `--install-cli` flags | n/a — Monitor + `hf` CLI deploy separately (e.g. via HangmanLab.Server.T3 docker compose) |
| TS source compiled by `tsc` | static Go binary built per-platform |