9 Commits

Author SHA1 Message Date
a1b4d347d9 fix(hf-plugin): wrap tool returns in MCP {content:[...]} shape
OpenClaw's Codex tool dispatcher (thread-lifecycle:255) expects every
tool execute() to return { content: [...] } and calls result.content.reduce()
to compute total text length. All 9 harborforge_* tools returned bare
objects ({ running, processing, currentSlot, ... }) that have no
.content field, so calling .reduce on undefined threw, and the agent saw the
cryptic 'Cannot read properties of undefined (reading reduce)' on
every call. This silently blocked every calendar slot transition on
prod for hours: agents could call harborforge_calendar_complete but
it always errored, so slots never moved out of not_started.

Fix is at the registerTool boundary: api.registerTool is wrapped once
to coerce every tool's execute return through ensureMcpContentShape.
Tools that already return the correct shape are unchanged. No per-tool
edits needed.
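
A minimal sketch of the failure and the coercion, assuming a dispatcher that sums text lengths the way the message describes. `totalTextLength` is a hypothetical stand-in for the host code, not OpenClaw's actual implementation; `ensureMcpContentShape` follows the helper this commit adds.

```typescript
type TextPart = { type: 'text'; text: string };
type McpResult = { content: TextPart[] };

// Hypothetical stand-in for the host dispatcher: it assumes `content` exists.
function totalTextLength(result: any): number {
  return result.content.reduce((sum: number, part: TextPart) => sum + part.text.length, 0);
}

// Pass through results that already carry a `content` array; wrap anything else.
function ensureMcpContentShape(result: unknown): McpResult {
  if (
    result && typeof result === 'object' &&
    Array.isArray((result as { content?: unknown }).content)
  ) {
    return result as McpResult;
  }
  const text = typeof result === 'string' ? result : JSON.stringify(result, null, 2);
  return { content: [{ type: 'text', text }] };
}

const bare = { running: true, currentSlot: null };

let threw = false;
try {
  totalTextLength(bare); // bare.content is undefined, so .reduce throws a TypeError
} catch (err) {
  threw = err instanceof TypeError;
}

// After coercion the same dispatcher call succeeds.
const wrapped = ensureMcpContentShape(bare);
const safeLength = totalTextLength(wrapped);
```

One caveat with this shape of helper: `JSON.stringify(undefined)` returns `undefined`, so a tool whose execute() returns nothing would likely need an explicit fallback string.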

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 08:48:05 +01:00
h z
2a2a298d15 Merge pull request 'fix: wakeup message says 'continue in same session', not 'only reply WAKEUP_OK'' (#9) from fix/wakeup-message-no-ack-only into main 2026-05-21 10:05:34 +00:00
hanghang zhang
102809dc2a fix(plugin): wakeup message says 'continue in same session', not 'only reply WAKEUP_OK'
End-to-end testing showed the old wakeup text trapped agents in an ack-only loop:

> "You have due slots. Follow the `hf-wakeup` workflow of skill
>  `hf-hangman-lab` to proceed. Only reply `WAKEUP_OK` in this session."

The two clauses contradicted each other — "follow the workflow" vs
"only reply WAKEUP_OK". MiniMax-M2.5 prioritised the literal "only"
and never proceeded past the ack; the scheduler then re-woke every 30s
because the slot stayed `not_started`, and the agent kept re-acking
forever (verified: 3 consecutive WAKEUP_OK-only replies across slot 7).

Rewrites the wakeup message to be explicit:
  - first line MUST be `WAKEUP_OK` (the ack token the plugin looks for)
  - then continue IN THE SAME session: drive calendar_status → task
    fetch → sub-workflow → calendar_complete/abort
  - flags the loop trap so the agent knows what to avoid
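
A small sketch of the reply contract described above. How the plugin actually detects the ack is not shown in this PR, so `startsWithAck` and `isAckOnly` are hypothetical helpers that only illustrate the difference between "acknowledged and continued" and "acknowledged and stopped".

```typescript
const ACK_TOKEN = 'WAKEUP_OK';

// True when the first non-empty line of the reply is exactly the ack token.
function startsWithAck(reply: string): boolean {
  const firstLine = reply
    .split('\n')
    .map((line) => line.trim())
    .find((line) => line.length > 0);
  return firstLine === ACK_TOKEN;
}

// True when the reply is nothing but the ack, i.e. the loop trap.
function isAckOnly(reply: string): boolean {
  return reply.trim() === ACK_TOKEN;
}

const trapped = 'WAKEUP_OK';
const healthy = 'WAKEUP_OK\nCalling harborforge_calendar_status for the due slot.';
```

Both replies satisfy the ack check; only `healthy` continues past it, which is the behaviour the rewritten message asks for.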

Bumps version 0.3.3 → 0.3.4.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 11:05:28 +01:00
h z
065b0d3da3 Merge pull request 'fix: lift calendarScheduler to module scope (multi-register singleton)' (#8) from fix/scheduler-module-singleton into main 2026-05-21 09:54:55 +00:00
hanghang zhang
afb8b25558 fix(plugin): lift calendarScheduler to module scope (multi-register singleton)
Trying the prior multi-agent-handle fix in dind-t2 surfaced a second bug
that PR #7 didn't reach: `harborforge_calendar_status` still returned
`Calendar scheduler not running` even though the gateway log showed the
scheduler had started 30+ seconds before the agent's call.

## Root cause

`register()` is invoked once per agent — `grep -c "HarborForge plugin
registered" /tmp/gw-stdout.log` reports 5 for a 5-agent claw. Every
invocation creates its own `let calendarScheduler` closure binding. But
`gateway_start` fires once and we only call `startCalendarScheduler()`
through that single hook, so exactly one of the five closures sees the
handle and the other four keep their bindings at `null`.

The host's tool router picks one of the five duplicate
`harborforge_calendar_status` registrations to dispatch to. Most of the
time it lands on one of the four "null" closures, which is why the agent
saw `Calendar scheduler not running` on every wakeup.

## Fix

Lift `let calendarScheduler` out of `register()` and into module scope.
All five register-call closures now reference the same binding; once the
single `gateway_start` initialises it, every tool sees it.

`startCalendarScheduler()` now early-returns when `calendarScheduler` is
already set, so duplicate `gateway_start` firings (if the host ever does
that) don't double-install intervals.
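
The root cause and fix can be reproduced in a few lines. This is a simplified sketch, not the plugin's code: `registerWithClosure` and `registerWithModuleScope` are hypothetical names, and the scheduler is reduced to a plain object.

```typescript
type Scheduler = { startedAt: number };

// Before: each register() call owns a private binding.
function registerWithClosure() {
  let scheduler: Scheduler | null = null;
  return {
    start: () => { scheduler = { startedAt: Date.now() }; },
    isRunning: () => scheduler !== null,
  };
}

// After: one module-scope binding shared by every register() call.
let sharedScheduler: Scheduler | null = null;
function registerWithModuleScope() {
  return {
    start: () => {
      if (sharedScheduler) return; // idempotent: ignore duplicate starts
      sharedScheduler = { startedAt: Date.now() };
    },
    isRunning: () => sharedScheduler !== null,
  };
}

// Five registrations, but the start hook runs through only one of them.
const firstClosure = registerWithClosure();
const closures = [firstClosure, ...Array.from({ length: 4 }, () => registerWithClosure())];
firstClosure.start();
const closureRunning = closures.filter((c) => c.isRunning()).length;

const firstShared = registerWithModuleScope();
const shared = [firstShared, ...Array.from({ length: 4 }, () => registerWithModuleScope())];
firstShared.start();
const sharedRunning = shared.filter((c) => c.isRunning()).length;
```

With per-closure bindings only one of the five registrations reports a running scheduler; with the module-scope binding all five do.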

Bumps version 0.3.2 → 0.3.3.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 10:54:36 +01:00
h z
98e663a19b Merge pull request 'fix: real per-agent slot handle for multi-agent calendar tools' (#7) from fix/multi-agent-scheduler-handle into main 2026-05-21 09:39:51 +00:00
hanghang zhang
d5cea9a44d fix(plugin): real per-agent slot handle for multi-agent calendar tools
In multi-agent sync mode every harborforge_calendar_* tool was returning
`calendarScheduler.<method> is not a function`. The cause: index.ts replaced
`calendarScheduler` (typed `CalendarScheduler | null`) with a `{ stop() }`
stub right after wiring the runSync/runCheck intervals, so `isRunning()`,
`getCurrentSlot()`, `completeCurrentSlot()`, `abortCurrentSlot()`,
`pauseCurrentSlot()`, `resumeCurrentSlot()`, `getState()`,
`isRestartPending()` and `getStateFilePath()` all blew up at call time.

Replaces the stub with a `MultiAgentSchedulerHandle` that:
  - tracks the last slot dispatched per agent (recorded by `wakeAgent`)
  - exposes status/complete/abort/pause/resume taking the calling agentId
  - resolves the implicit "current slot" via woken-cursor first then a
    cache scan over not_started/deferred/ongoing slots
  - PATCHes via `bridge.updateSlotAs(agentId, …)` so audit headers reflect
    the real caller (bridge constructor agentId is 'unused' in multi-agent)
  - mirrors the legacy `isRunning/isProcessing/getState/...` surface so
    the single-agent fallback (`CalendarScheduler`) keeps working unchanged

Each calendar tool factory now takes `OpenClawPluginToolContext`, reads
`ctx.agentId`, and dispatches through the handle. Single-agent path
(when `calendarScheduler` is a real `CalendarScheduler`) is preserved
behind `instanceof` checks.
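
A rough sketch of the "woken-cursor first, then cache scan" resolution, under stated assumptions: `SlotCursor`, the `Slot` shape, and the first-match scan order are illustrative guesses, not the real `MultiAgentSchedulerHandle`.

```typescript
interface Slot {
  slot_id: number;
  agentId: string;
  status: string;
}

// Statuses the cache scan considers, taken from the commit message.
const SCANNED_STATUSES = new Set(['not_started', 'deferred', 'ongoing']);

class SlotCursor {
  private lastWoken = new Map<string, Slot>();
  private cache: Slot[];

  constructor(cache: Slot[]) {
    this.cache = cache;
  }

  // Called when an agent is woken for a slot.
  recordWake(agentId: string, slot: Slot): void {
    this.lastWoken.set(agentId, slot);
  }

  // Prefer the slot the agent was last woken for; otherwise scan the cache.
  resolveCurrentSlot(agentId: string): Slot | null {
    const woken = this.lastWoken.get(agentId);
    if (woken) return woken;
    const match = this.cache.find(
      (slot) => slot.agentId === agentId && SCANNED_STATUSES.has(slot.status),
    );
    return match ?? null;
  }
}

// 'finished' is a made-up terminal status used only to show it being skipped.
const doneSlot: Slot = { slot_id: 1, agentId: 'recruiter', status: 'finished' };
const deferredSlot: Slot = { slot_id: 2, agentId: 'recruiter', status: 'deferred' };
const managerSlot: Slot = { slot_id: 3, agentId: 'manager', status: 'not_started' };

const cursor = new SlotCursor([doneSlot, deferredSlot, managerSlot]);
const beforeWake = cursor.resolveCurrentSlot('recruiter'); // falls back to the cache scan
cursor.recordWake('manager', managerSlot);
const afterWake = cursor.resolveCurrentSlot('manager');    // served from the cursor
```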

Drops the dead `trackSessionCompletion` poll loop (only definition, no
caller) which referenced the removed `completeCurrentSlot`. Bumps
plugin version 0.2.0 → 0.3.2.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 10:38:57 +01:00
h z
f627845543 Merge pull request 'fix: wake dedupe + inline slot context + complete contracts.tools' (#6) from fix/wake-dedupe-and-contracts into main 2026-05-20 14:48:06 +00:00
hanghang zhang
b878fa2a41 fix: wake dedupe + inline slot context + complete contracts.tools
Three issues made HF→agent wakeup unusable in practice, surfaced by a
DinD sim end-to-end test (recruiter agent + slot for a "recruit manager" task):

1. **Plugin re-woke the same slot every 30s.** The inline runCheck only
   destructured agentId from scheduleCache.getAgentsWithDueSlots() and
   dropped the slots array, then called wakeAgent without recording the
   wake. The simplified inline scheduler also never PATCHes slot status
   server-side from not_started→ongoing, so the next 30s check sees the
   slot still due and wakes again. After 4 wakes the agent's wakeup
   session was full of WAKEUP_OK noise.

   Fix: keep slots in runCheck, add an in-memory wakedSlotKeys set
   keyed by (agentId, slotId|virtual_id|scheduled_at). Dedupe on this
   set; clear it inside the sync interval (fresh wake budget per sync).
   Server-side slot transition still TODO (requires re-introducing the
   CalendarScheduler class path or PATCH /calendar/slots/.../agent-update
   here); the dedupe at least stops the wake spam.

2. **Wakeup message had no slot context.** The wakeup body just said
   'follow hf-wakeup workflow' with no slot id/event_data/task_code.
   The agent then had to call harborforge_calendar_status to learn
   anything — which itself is broken in the simplified scheduler (it
   queries a CalendarScheduler instance that never gets created).

   Fix: pass dueSlots into wakeAgent and inline the highest-priority
   slot's {slot_id, scheduled_at, priority, slot_type, event_data} as
   a JSON block in the wakeup message. The agent reads
   event_data.task_code directly and routes via workflow_lookup without any
   round-trip. Per PLG-CAL-001 docs in hf-hangman-lab SKILL.md, this
   is the documented contract; we are bringing the message in line.

3. **contracts.tools listed 5 of the 9 registered tools.** Manifest had
   harborforge_status/telemetry/monitor_telemetry/calendar_status/
   calendar_complete. Code also registers calendar_abort, calendar_pause,
   calendar_resume, harborforge_restart_status. With the new OpenClaw
   plugin host enforcement (same gotcha that bit Meridian — see
   zhi/Meridian#2), undeclared tools are silently dropped from the
   agent's tool list, so abort/pause/resume cannot be called by the
   agent. plugin doctor was emitting:
   'plugin tool is undeclared (harbor-forge): harborforge_calendar_abort'
   for each missing tool.

   Fix: add the 4 missing tool names to contracts.tools.
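
The wake dedupe from item 1 can be sketched as follows. This assumes the key falls back from `slot_id` to `virtual_id` to `scheduled_at`, which is one reading of the message; `takeUnwokenSlots` and `resetWakeBudget` are hypothetical names.

```typescript
interface DueSlot {
  slot_id?: number;
  virtual_id?: string;
  scheduled_at: string;
}

// Build a stable key from whichever identifier the slot carries.
function slotKey(agentId: string, slot: DueSlot): string {
  const id = slot.slot_id ?? slot.virtual_id ?? slot.scheduled_at;
  return `${agentId}:${id}`;
}

const wakedSlotKeys = new Set<string>();

// Return only the slots not yet woken in this sync window, and record them.
function takeUnwokenSlots(agentId: string, slots: DueSlot[]): DueSlot[] {
  const fresh = slots.filter((slot) => !wakedSlotKeys.has(slotKey(agentId, slot)));
  for (const slot of fresh) wakedSlotKeys.add(slotKey(agentId, slot));
  return fresh;
}

// Called from the sync interval: every slot gets a fresh wake budget.
function resetWakeBudget(): void {
  wakedSlotKeys.clear();
}

const due: DueSlot[] = [{ slot_id: 7, scheduled_at: '2026-05-20T09:00:00Z' }];
const firstCheck = takeUnwokenSlots('recruiter', due);  // slot 7 is new
const secondCheck = takeUnwokenSlots('recruiter', due); // suppressed as a duplicate
resetWakeBudget();
const afterSync = takeUnwokenSlots('recruiter', due);   // eligible again after sync
```

Because the set lives in memory and is cleared on every sync, this only bounds wake frequency; it does not replace the server-side status transition the message lists as TODO.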

Also use api.config as the primary config source in wakeAgent (current
public API), falling back to runtime.config.loadConfig() for older
hosts — same pattern as the Meridian fix.
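
A sketch of that fallback order. `ConfigSources` is a hypothetical slice of the plugin API that keeps only the two sources named above; the real host types may differ.

```typescript
type HostConfig = Record<string, unknown>;

// Hypothetical slice of the plugin API; only the two config sources matter here.
interface ConfigSources {
  config?: HostConfig;
  runtime?: { config?: { loadConfig?: () => HostConfig } };
}

// Prefer the current public field, then fall back to the older loader.
function resolveHostConfig(api: ConfigSources): HostConfig | null {
  if (api.config) return api.config;
  return api.runtime?.config?.loadConfig?.() ?? null;
}

const newHost: ConfigSources = { config: { source: 'api.config' } };
const oldHost: ConfigSources = {
  runtime: { config: { loadConfig: () => ({ source: 'loadConfig' }) } },
};
```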

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 12:02:25 +01:00
2 changed files with 87 additions and 13 deletions


@@ -25,6 +25,25 @@ import {
   CalendarScheduler,
 } from './calendar/index.js';
 
+// ---------------------------------------------------------------------------
+// Module-scope calendar scheduler singleton.
+//
+// `register()` is called multiple times per gateway boot — once per agent
+// (we see 5 `HarborForge plugin registered` lines for 5 agents on dind-t2).
+// `gateway_start` only fires once, so before this lift the
+// `startCalendarScheduler()` setup ran inside ONE closure while four other
+// closures kept their own `calendarScheduler = null`. Whichever of the five
+// tool registrations the gateway picked at call time was effectively a coin
+// flip, and four times out of five `harborforge_calendar_status` returned
+// `Calendar scheduler not running` even though the scheduler was active.
+//
+// Keeping the singleton at module scope removes the per-`register()` shadow:
+// the scheduler is started once, every closure reads the same binding, and
+// `startCalendarScheduler()` is idempotent so duplicate `gateway_start`
+// firings are harmless.
+// ---------------------------------------------------------------------------
+let calendarScheduler: MultiAgentSchedulerHandle | CalendarScheduler | null = null;
+
 interface PluginAPI {
   logger: {
     info: (...args: any[]) => void;
@@ -54,6 +73,30 @@ interface PluginAPI {
   getAgentStatus?: () => Promise<{ status: string } | null>;
 }
 
+/**
+ * Coerce a tool execute() return value into the MCP `{ content: [...] }`
+ * shape that the openclaw Codex tool dispatcher requires.
+ *
+ * Background: openclaw's `convertToolContents()` does `result.content.reduce(...)`
+ * to compute total text length before flattening. Every HF tool here returned a
+ * bare object (`{ running, processing, currentSlot, ... }`) which has no
+ * `.content` field, so `undefined.reduce` threw and every call to
+ * `harborforge_*` from a Codex-harness agent surfaced as the cryptic
+ * `Cannot read properties of undefined (reading 'reduce')`. The fix is to
+ * wrap every tool's execute return; doing it at the `registerTool` boundary
+ * keeps each tool body unchanged.
+ */
+function ensureMcpContentShape(result: unknown): { content: Array<{ type: 'text'; text: string }> } {
+  if (
+    result && typeof result === 'object' &&
+    Array.isArray((result as { content?: unknown }).content)
+  ) {
+    return result as { content: Array<{ type: 'text'; text: string }> };
+  }
+  const text = typeof result === 'string' ? result : JSON.stringify(result, null, 2);
+  return { content: [{ type: 'text', text }] };
+}
+
 function register(api: PluginAPI): void {
   const logger = api.logger || {
     info: (...args: any[]) => console.log('[HarborForge]', ...args),
@@ -62,6 +105,22 @@ function register(api: PluginAPI): void {
     warn: (...args: any[]) => console.warn('[HarborForge]', ...args),
   };
 
+  // Wrap api.registerTool so every tool's execute() return is coerced into
+  // the MCP `{ content: [...] }` shape openclaw expects. See
+  // `ensureMcpContentShape` above.
+  const _origRegisterTool = api.registerTool.bind(api);
+  api.registerTool = (factory: (ctx: any) => any) => {
+    _origRegisterTool((ctx: any) => {
+      const def = factory(ctx);
+      if (!def || typeof def.execute !== 'function') return def;
+      const origExecute = def.execute;
+      return {
+        ...def,
+        execute: async (...args: any[]) => ensureMcpContentShape(await origExecute(...args)),
+      };
+    });
+  };
+
   function resolveConfig() {
     return getPluginConfig(api);
   }
@@ -108,7 +167,7 @@ function register(api: PluginAPI): void {
       },
       openclaw: {
         version: api.runtime?.version || api.version || 'unknown',
-        pluginVersion: '0.3.2', // Bumped for PLG-CAL-004
+        pluginVersion: '0.3.4', // Bumped for PLG-CAL-004
       },
       timestamp: new Date().toISOString(),
     };
@@ -117,13 +176,9 @@ function register(api: PluginAPI): void {
   // Periodic metadata push interval handle
   let metaPushInterval: ReturnType<typeof setInterval> | null = null;
 
-  // Calendar scheduler instance.
-  //
-  // In multi-agent sync mode (the only path today) this is a
-  // {@link MultiAgentSchedulerHandle}. The legacy `CalendarScheduler` type
-  // is retained in the union for compatibility with the typed-only single-
-  // agent path that may be reintroduced later.
-  let calendarScheduler: MultiAgentSchedulerHandle | CalendarScheduler | null = null;
+  // (calendarScheduler is module-scope — see top of file for the why.
+  // Tools and lifecycle hooks all reference the same binding so the
+  // multi-register/single-start mismatch can't shadow them again.)
 
   /**
    * Push OpenClaw metadata to the Monitor bridge.
@@ -147,7 +202,7 @@ function register(api: PluginAPI): void {
       const meta: OpenClawMeta = {
         version: api.runtime?.version || api.version || 'unknown',
-        plugin_version: '0.3.2',
+        plugin_version: '0.3.4',
         agents: agentNames.map(name => ({ name })),
       };
@@ -252,10 +307,21 @@ function register(api: PluginAPI): void {
         )}\n\`\`\``;
       }
 
+      // First-line ack `WAKEUP_OK` is the plugin's ack-receipt token; the
+      // agent MUST then continue in the same session and drive the
+      // `hf-wakeup` workflow to completion (calendar_status → task fetch →
+      // sub-workflow → calendar_complete/abort). Without that continuation
+      // the scheduler keeps re-waking every 30s because the slot stays
+      // `not_started` forever.
       const wakeupMessage =
-        `You have due slots. Follow the \`hf-wakeup\` workflow of skill ` +
-        `\`hf-hangman-lab\` to proceed. Only reply \`WAKEUP_OK\` in this ` +
-        `session.${slotBlock}`;
+        `You have due slots. **First line of your reply MUST be exactly ` +
+        `\`WAKEUP_OK\`** so the plugin records the ack. Then, **in this ` +
+        `same session**, drive the \`hf-wakeup\` workflow of skill ` +
+        `\`hf-hangman-lab\` to completion — read slot context, call the ` +
+        `harborforge_calendar_* tools, route to the right sub-workflow, ` +
+        `and finish with harborforge_calendar_complete or abort. Do NOT ` +
+        `stop after the ack — the scheduler will re-wake you every 30s ` +
+        `until the slot transitions out of \`not_started\`.${slotBlock}`;
 
       const result = await dispatchInboundMessageWithDispatcher({
         ctx: {
@@ -291,8 +357,16 @@ function register(api: PluginAPI): void {
   /**
    * Initialize and start the calendar scheduler.
+   *
+   * Idempotent — `gateway_start` may fire once per `register()` invocation
+   * (the host calls `register` per agent), and we only want one set of
+   * sync/check intervals across the whole process.
    */
   function startCalendarScheduler(): void {
+    if (calendarScheduler) {
+      logger.info('Calendar scheduler already started, skipping duplicate gateway_start');
+      return;
+    }
     const live = resolveConfig();
 
     // Create bridge client (claw-instance level, not per-agent)


@@ -1,6 +1,6 @@
 {
   "name": "harbor-forge-plugin",
-  "version": "0.3.2",
+  "version": "0.3.4",
   "description": "OpenClaw plugin for HarborForge monitor bridge and CLI integration",
   "type": "module",
   "main": "dist/index.js",