4 Commits

Author SHA1 Message Date
h z
98e663a19b Merge pull request 'fix: real per-agent slot handle for multi-agent calendar tools' (#7) from fix/multi-agent-scheduler-handle into main 2026-05-21 09:39:51 +00:00
hanghang zhang
d5cea9a44d fix(plugin): real per-agent slot handle for multi-agent calendar tools
In multi-agent sync mode every harborforge_calendar_* tool was returning
`calendarScheduler.<method> is not a function`. The cause: index.ts replaced
`calendarScheduler` (typed `CalendarScheduler | null`) with a `{ stop() }`
stub right after wiring the runSync/runCheck intervals, so `isRunning()`,
`getCurrentSlot()`, `completeCurrentSlot()`, `abortCurrentSlot()`,
`pauseCurrentSlot()`, `resumeCurrentSlot()`, `getState()`,
`isRestartPending()` and `getStateFilePath()` all blew up at call time.

Replaces the stub with a `MultiAgentSchedulerHandle` that:
  - tracks the last slot dispatched per agent (recorded by `wakeAgent`)
  - exposes status/complete/abort/pause/resume taking the calling agentId
  - resolves the implicit "current slot" via woken-cursor first then a
    cache scan over not_started/deferred/ongoing slots
  - PATCHes via `bridge.updateSlotAs(agentId, …)` so audit headers reflect
    the real caller (bridge constructor agentId is 'unused' in multi-agent)
  - mirrors the legacy `isRunning/isProcessing/getState/...` surface so
    the single-agent fallback (`CalendarScheduler`) keeps working unchanged

Each calendar tool factory now takes `OpenClawPluginToolContext`, reads
`ctx.agentId`, and dispatches through the handle. Single-agent path
(when `calendarScheduler` is a real `CalendarScheduler`) is preserved
behind `instanceof` checks.
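
A minimal sketch of the handle described above, for orientation only. Only `MultiAgentSchedulerHandle`, `getCurrentSlot`, `completeCurrentSlot`, `bridge.updateSlotAs` and the `not_started`/`deferred`/`ongoing` statuses come from this message; the `Slot` and `Bridge` shapes, the `recordWake` name, the `finished` target status and the cache accessor are assumptions for illustration, not the plugin's actual code.

```typescript
// Hypothetical types: the real slot and bridge shapes are not shown in this commit.
type SlotStatus = 'not_started' | 'deferred' | 'ongoing' | 'finished' | 'aborted';

interface Slot {
  slotId: string;
  agentId: string;
  status: SlotStatus;
}

interface Bridge {
  // Assumed signature: PATCH a slot on behalf of the calling agent.
  updateSlotAs(agentId: string, slotId: string, patch: { status: SlotStatus }): Promise<void>;
}

class MultiAgentSchedulerHandle {
  // Last slot dispatched per agent (the "woken cursor").
  private readonly lastWoken = new Map<string, string>();
  private readonly bridge: Bridge;
  private readonly cachedSlots: () => Slot[];

  constructor(bridge: Bridge, cachedSlots: () => Slot[]) {
    this.bridge = bridge;
    this.cachedSlots = cachedSlots;
  }

  // Called from the wake path when a slot is dispatched to an agent.
  recordWake(agentId: string, slotId: string): void {
    this.lastWoken.set(agentId, slotId);
  }

  // Resolve the implicit "current slot": woken cursor first, then a cache scan
  // over the agent's not_started/deferred/ongoing slots.
  getCurrentSlot(agentId: string): Slot | null {
    const open: SlotStatus[] = ['not_started', 'deferred', 'ongoing'];
    const mine = this.cachedSlots().filter(
      (s) => s.agentId === agentId && open.includes(s.status),
    );
    const wokenId = this.lastWoken.get(agentId);
    return mine.find((s) => s.slotId === wokenId) ?? mine[0] ?? null;
  }

  // PATCH as the calling agent so audit headers reflect the real caller.
  async completeCurrentSlot(agentId: string): Promise<Slot | null> {
    const slot = this.getCurrentSlot(agentId);
    if (!slot) return null;
    await this.bridge.updateSlotAs(agentId, slot.slotId, { status: 'finished' });
    slot.status = 'finished';
    this.lastWoken.delete(agentId);
    return slot;
  }
}
```

A tool factory would then read `ctx.agentId` and call, for example, `handle.getCurrentSlot(ctx.agentId)` rather than relying on a process-wide "current slot".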

Drops the dead `trackSessionCompletion` poll loop (only definition, no
caller) which referenced the removed `completeCurrentSlot`. Bumps
plugin version 0.2.0 → 0.3.2.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 10:38:57 +01:00
h z
f627845543 Merge pull request 'fix: wake dedupe + inline slot context + complete contracts.tools' (#6) from fix/wake-dedupe-and-contracts into main 2026-05-20 14:48:06 +00:00
hanghang zhang
b878fa2a41 fix: wake dedupe + inline slot context + complete contracts.tools
Three issues made HF→agent wakeup unusable in practice, surfaced by a
DinD sim end-to-end test (recruiter agent + slot for a recruitment manager task):

1. **Plugin re-woke the same slot every 30s.** The inline runCheck only
   destructured agentId from scheduleCache.getAgentsWithDueSlots() and
   dropped the slots array, then called wakeAgent without recording the
   wake. The simplified inline scheduler also never PATCHes slot status
   server-side from not_started→ongoing, so the next 30s check sees the
   slot still due and wakes again. After 4 wakes the agent's wakeup
   session was full of WAKEUP_OK noise.

   Fix: keep slots in runCheck, add an in-memory wakedSlotKeys set
   keyed by (agentId, slotId|virtual_id|scheduled_at). Dedupe on this
   set; clear it inside the sync interval (fresh wake budget per sync).
   Server-side slot transition still TODO (requires re-introducing the
   CalendarScheduler class path or PATCH /calendar/slots/.../agent-update
   here); the dedupe at least stops the wake spam.

2. **Wakeup message had no slot context.** The wakeup body just said
   'follow hf-wakeup workflow' with no slot id/event_data/task_code.
   The agent then had to call harborforge_calendar_status to learn
   anything — which itself is broken in the simplified scheduler (it
   queries a CalendarScheduler instance that never gets created).

   Fix: pass dueSlots into wakeAgent and inline the highest-priority
   slot's {slot_id, scheduled_at, priority, slot_type, event_data} as
   a JSON block in the wakeup message. The agent reads
   event_data.task_code directly and routes via workflow_lookup without
   any round-trip. Per PLG-CAL-001 docs in hf-hangman-lab SKILL.md, this
   is the documented contract; we are bringing the message in line.

3. **contracts.tools listed 5 of the 9 registered tools.** Manifest had
   harborforge_status/telemetry/monitor_telemetry/calendar_status/
   calendar_complete. Code also registers calendar_abort, calendar_pause,
   calendar_resume, harborforge_restart_status. With the new OpenClaw
   plugin host enforcement (same gotcha that bit Meridian — see
   zhi/Meridian#2), undeclared tools are silently dropped from the
   agent's tool list, so abort/pause/resume cannot be called by the
   agent. plugin doctor was emitting:
   'plugin tool is undeclared (harbor-forge): harborforge_calendar_abort'
   for each missing tool.

   Fix: add the 4 missing tool names to contracts.tools.
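
The first two fixes above can be sketched together as follows. This is an illustration under stated assumptions, not the code in index.ts: the field names, `wakedSlotKeys` and `runCheck` follow this message, while the `DueSlot`/`AgentDue` types, the helper names, the `wake` callback and the "larger `priority` wins" ordering are hypothetical.

```typescript
// Hypothetical shapes: field names follow the commit message; everything
// else (types, helper names, priority ordering) is an assumption.
interface DueSlot {
  slot_id?: number;
  virtual_id?: string;
  scheduled_at: string;
  priority: number;
  slot_type: string;
  event_data?: Record<string, unknown>;
}

interface AgentDue {
  agentId: string;
  slots: DueSlot[];
}

// In-memory record of wakes already dispatched in the current sync window.
const wakedSlotKeys = new Set<string>();

// Key a wake by (agentId, slot_id | virtual_id | scheduled_at).
function wakeKey(agentId: string, slot: DueSlot): string {
  return `${agentId}:${slot.slot_id ?? slot.virtual_id ?? slot.scheduled_at}`;
}

// Inline the slot fields the agent needs as a fenced JSON block.
function buildSlotBlock(slot: DueSlot): string {
  const { slot_id, scheduled_at, priority, slot_type, event_data } = slot;
  const fence = '`'.repeat(3);
  const json = JSON.stringify(
    { slot_id, scheduled_at, priority, slot_type, event_data },
    null,
    2,
  );
  return `\n\n${fence}json\n${json}\n${fence}`;
}

// One check pass: wake an agent only for due slots not already woken.
function runCheck(due: AgentDue[], wake: (agentId: string, message: string) => void): void {
  for (const { agentId, slots } of due) {
    const fresh = slots.filter((s) => !wakedSlotKeys.has(wakeKey(agentId, s)));
    if (fresh.length === 0) continue; // every due slot was already woken
    // Assumption: a larger `priority` value is more urgent.
    const top = fresh.reduce((best, s) => (s.priority > best.priority ? s : best));
    for (const s of fresh) wakedSlotKeys.add(wakeKey(agentId, s));
    wake(agentId, `You have due slots.${buildSlotBlock(top)}`);
  }
}

// The sync interval clears the set, granting a fresh wake budget per sync.
function onSync(): void {
  wakedSlotKeys.clear();
}
```

Because the set lives in memory only, a process restart also resets the wake budget; the server-side `not_started`→`ongoing` transition noted as TODO above would make the dedupe durable.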

Also use api.config as the primary config source in wakeAgent (current
public API), falling back to runtime.config.loadConfig() for older
hosts — same pattern as the Meridian fix.
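
As a rough sketch of that fallback: only `api.config` and `runtime.config.loadConfig()` are named in this message. The `HostApi` shape, the `resolveHostConfig` helper name and the assumption that `loadConfig()` is synchronous are hypothetical.

```typescript
// Hypothetical host shape; the real OpenClaw plugin API may differ.
interface HostApi {
  config?: Record<string, unknown>;
  runtime?: { config?: { loadConfig?: () => Record<string, unknown> } };
}

function resolveHostConfig(api: HostApi): Record<string, unknown> | undefined {
  // Prefer the current public API.
  if (api.config) return api.config;
  // Older hosts: fall back to the runtime loader when it exists
  // (assumed synchronous here).
  return api.runtime?.config?.loadConfig?.();
}
```

Returning `undefined` when neither source exists leaves the caller to decide whether a missing config should be treated as an error or skipped.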

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 12:02:25 +01:00
2 changed files with 13 additions and 176 deletions

@@ -25,25 +25,6 @@ import {
CalendarScheduler,
} from './calendar/index.js';
-// ---------------------------------------------------------------------------
-// Module-scope calendar scheduler singleton.
-//
-// `register()` is called multiple times per gateway boot — once per agent
-// (we see 5 `HarborForge plugin registered` lines for 5 agents on dind-t2).
-// `gateway_start` only fires once, so before this lift the
-// `startCalendarScheduler()` setup ran inside ONE closure while four other
-// closures kept their own `calendarScheduler = null`. Whichever of the five
-// tool registrations the gateway picked at call time was effectively a coin
-// flip, and four times out of five `harborforge_calendar_status` returned
-// `Calendar scheduler not running` even though the scheduler was active.
-//
-// Keeping the singleton at module scope removes the per-`register()` shadow:
-// the scheduler is started once, every closure reads the same binding, and
-// `startCalendarScheduler()` is idempotent so duplicate `gateway_start`
-// firings are harmless.
-// ---------------------------------------------------------------------------
-let calendarScheduler: MultiAgentSchedulerHandle | CalendarScheduler | null = null;
interface PluginAPI {
logger: {
info: (...args: any[]) => void;
@@ -73,30 +54,6 @@ interface PluginAPI {
getAgentStatus?: () => Promise<{ status: string } | null>;
}
-/**
-* Coerce a tool execute() return value into the MCP `{ content: [...] }`
-* shape that the openclaw Codex tool dispatcher requires.
-*
-* Background: openclaw's `convertToolContents()` does `result.content.reduce(...)`
-* to compute total text length before flattening. Every HF tool here returned a
-* bare object (`{ running, processing, currentSlot, ... }`) which has no
-* `.content` field, so `undefined.reduce` threw and every call to
-* `harborforge_*` from a Codex-harness agent surfaced as the cryptic
-* `Cannot read properties of undefined (reading 'reduce')`. The fix is to
-* wrap every tool's execute return; doing it at the `registerTool` boundary
-* keeps each tool body unchanged.
-*/
-function ensureMcpContentShape(result: unknown): { content: Array<{ type: 'text'; text: string }> } {
-if (
-result && typeof result === 'object' &&
-Array.isArray((result as { content?: unknown }).content)
-) {
-return result as { content: Array<{ type: 'text'; text: string }> };
-}
-const text = typeof result === 'string' ? result : JSON.stringify(result, null, 2);
-return { content: [{ type: 'text', text }] };
-}
function register(api: PluginAPI): void {
const logger = api.logger || {
info: (...args: any[]) => console.log('[HarborForge]', ...args),
@@ -105,22 +62,6 @@ function register(api: PluginAPI): void {
warn: (...args: any[]) => console.warn('[HarborForge]', ...args),
};
-// Wrap api.registerTool so every tool's execute() return is coerced into
-// the MCP `{ content: [...] }` shape openclaw expects. See
-// `ensureMcpContentShape` above.
-const _origRegisterTool = api.registerTool.bind(api);
-api.registerTool = (factory: (ctx: any) => any) => {
-_origRegisterTool((ctx: any) => {
-const def = factory(ctx);
-if (!def || typeof def.execute !== 'function') return def;
-const origExecute = def.execute;
-return {
-...def,
-execute: async (...args: any[]) => ensureMcpContentShape(await origExecute(...args)),
-};
-});
-};
function resolveConfig() {
return getPluginConfig(api);
}
@@ -167,7 +108,7 @@ function register(api: PluginAPI): void {
},
openclaw: {
version: api.runtime?.version || api.version || 'unknown',
-pluginVersion: '0.3.4', // Bumped for PLG-CAL-004
+pluginVersion: '0.3.2', // Bumped for PLG-CAL-004
},
timestamp: new Date().toISOString(),
};
@@ -176,9 +117,13 @@ function register(api: PluginAPI): void {
// Periodic metadata push interval handle
let metaPushInterval: ReturnType<typeof setInterval> | null = null;
-// (calendarScheduler is module-scope — see top of file for the why.
-// Tools and lifecycle hooks all reference the same binding so the
-// multi-register/single-start mismatch can't shadow them again.)
+// Calendar scheduler instance.
+//
+// In multi-agent sync mode (the only path today) this is a
+// {@link MultiAgentSchedulerHandle}. The legacy `CalendarScheduler` type
+// is retained in the union for compatibility with the typed-only single-
+// agent path that may be reintroduced later.
+let calendarScheduler: MultiAgentSchedulerHandle | CalendarScheduler | null = null;
/**
* Push OpenClaw metadata to the Monitor bridge.
@@ -202,7 +147,7 @@ function register(api: PluginAPI): void {
const meta: OpenClawMeta = {
version: api.runtime?.version || api.version || 'unknown',
-plugin_version: '0.3.4',
+plugin_version: '0.3.2',
agents: agentNames.map(name => ({ name })),
};
@@ -307,22 +252,10 @@ function register(api: PluginAPI): void {
)}\n\`\`\``;
}
-// The wakeup dispatcher's `deliver` callback below only logs the
-// reply text — it does NOT inspect any ack token. The earlier
-// `WAKEUP_OK` first-line-ack convention was prompt-only theatre;
-// nothing in this plugin or in openclaw acted on it. The only
-// thing that ends a wake cycle is the slot transitioning out of
-// `not_started`, which happens when the agent calls
-// `harborforge_calendar_complete` or `harborforge_calendar_abort`.
-// Tell the agent that plainly instead of asking for a fake ack.
const wakeupMessage =
-`You have due slots. Drive the \`hf-wakeup\` workflow of skill ` +
-`\`hf-hangman-lab\` to completion in this session — read slot ` +
-`context, call the harborforge_calendar_* tools, route to the ` +
-`right sub-workflow, and finish with harborforge_calendar_complete ` +
-`or harborforge_calendar_abort. The scheduler keeps re-waking you ` +
-`every 30s until the slot transitions out of \`not_started\`, so ` +
-`partial work or silence just produces another wake.${slotBlock}`;
+`You have due slots. Follow the \`hf-wakeup\` workflow of skill ` +
+`\`hf-hangman-lab\` to proceed. Only reply \`WAKEUP_OK\` in this ` +
+`session.${slotBlock}`;
const result = await dispatchInboundMessageWithDispatcher({
ctx: {
@@ -358,16 +291,8 @@ function register(api: PluginAPI): void {
/**
* Initialize and start the calendar scheduler.
*
-* Idempotent — `gateway_start` may fire once per `register()` invocation
-* (the host calls `register` per agent), and we only want one set of
-* sync/check intervals across the whole process.
*/
function startCalendarScheduler(): void {
-if (calendarScheduler) {
-logger.info('Calendar scheduler already started, skipping duplicate gateway_start');
-return;
-}
const live = resolveConfig();
// Create bridge client (claw-instance level, not per-agent)
@@ -397,94 +322,6 @@ function register(api: PluginAPI): void {
}
}
-// Cross-plugin exposure: agent status lookup for other plugins
-// (currently Fabric.OpenclawPlugin uses this to skip delivering
-// `announce` channel messages to busy agents — see DIALECTIC-V2
-// design doc, Phase 1). Backed by calendarBridge.getAgentStatus
-// with a small TTL cache to avoid hammering the HF backend.
-type HfStatus = 'idle' | 'on_call' | 'busy' | 'exhausted' | 'offline';
-const HF_STATUS_CACHE_TTL_MS = 30_000;
-const hfStatusCache = new Map<string, { status: HfStatus; at: number }>();
-const _G = globalThis as Record<string, unknown>;
-_G['__hfAgentStatus'] = {
-async get(agentId: string): Promise<HfStatus | undefined> {
-if (!agentId) return undefined;
-const cached = hfStatusCache.get(agentId);
-if (cached && Date.now() - cached.at < HF_STATUS_CACHE_TTL_MS) {
-return cached.status;
-}
-try {
-const status = await calendarBridge.getAgentStatus(agentId);
-if (status) {
-const typed = status as HfStatus;
-hfStatusCache.set(agentId, { status: typed, at: Date.now() });
-return typed;
-}
-} catch {
-/* fall through to cached-or-undefined */
-}
-return cached?.status;
-},
-/**
-* Approximate "does agent have an on_call slot covering [from, to]?"
-* for cross-plugin pre-check use (currently:
-* Dialectic.OpenclawPlugin's signup HF coverage).
-*
-* v1 honest scope: we only have today's slots in scheduleCache
-* (synced from /calendar/sync which is today-only). Returns:
-* - true iff window is same-day AND some cached on_call slot
-* starts <= from AND ends >= to
-* - false iff window is same-day AND no such slot
-* - undefined for cross-day windows OR cache empty for this
-* agent (caller treats undefined as "I don't know" — see
-* Dialectic plugin's hf-precheck.ts which degrades to
-* "skipped" gracefully)
-*
-* Phase TBD: when HF backend ships a `/calendar/slots?agent&from&to`
-* endpoint, swap this to call it for arbitrary windows. Until then,
-* same-day-only coverage gates ~all debates created by analyze-intel
-* (which schedules <2h windows) without needing a backend change.
-*/
-async hasOnCallCovering(
-agentId: string,
-fromIso: string,
-toIso: string,
-): Promise<boolean | undefined> {
-if (!agentId || !fromIso || !toIso) return undefined;
-const from = new Date(fromIso);
-const to = new Date(toIso);
-if (isNaN(from.getTime()) || isNaN(to.getTime())) return undefined;
-if (!(from < to)) return undefined;
-// Cross-day → cache only has today; can't decide.
-const fromDate = from.toISOString().slice(0, 10);
-const toDate = to.toISOString().slice(0, 10);
-if (fromDate !== toDate) return undefined;
-// Cache's cachedDate must match our window's date.
-const cacheStatus = scheduleCache.getStatus();
-if (cacheStatus.cachedDate !== fromDate) return undefined;
-const slots = scheduleCache.getAgentSlots(agentId);
-if (slots.length === 0) return undefined; // cache empty for this agent — can't decide
-for (const s of slots) {
-if (s.slot_type !== 'on_call') continue;
-// status: ignore aborted/cancelled, accept not_started / ongoing / finished
-if (s.status === 'aborted' || s.status === 'cancelled') continue;
-const startStr = s.scheduled_at;
-if (typeof startStr !== 'string') continue;
-// scheduled_at can be HH:MM:SS (cache-relative date) or full ISO
-const start =
-/^\d{2}:\d{2}(:\d{2})?$/.test(startStr)
-? new Date(`${fromDate}T${startStr}Z`)
-: new Date(startStr);
-if (isNaN(start.getTime())) continue;
-const dur = typeof s.estimated_duration === 'number' ? s.estimated_duration : 0;
-const end = new Date(start.getTime() + dur * 60_000);
-if (start <= from && end >= to) return true;
-}
-return false;
-},
-};
// Track wakes already dispatched for a slot in the current sync
// window — the simplified inline scheduler does not PATCH slot
// status server-side, so without dedupe the check loop re-wakes

@@ -1,6 +1,6 @@
{
"name": "harbor-forge-plugin",
-"version": "0.3.4",
+"version": "0.3.2",
"description": "OpenClaw plugin for HarborForge monitor bridge and CLI integration",
"type": "module",
"main": "dist/index.js",