17 Commits

Author SHA1 Message Date
h z
06ccd3564e Merge pull request 'refactor(install): clone HarborForge.Cli to /tmp instead of fixed path' (#12) from refactor/install-cli-clone-from-repo into main 2026-05-29 07:52:40 +00:00
2977ab369e refactor(install): clone HarborForge.Cli to /tmp instead of fixed path
`installCli()` used to look for the CLI source at a fixed path relative
to the plugin checkout: either `./HarborForge.Cli` or `../HarborForge.Cli`.
That breaks any install layout where the plugin lives on its own — the
script just logs "Skipping CLI installation" and returns. Same anti-
pattern installManagedMonitor had already fixed for the monitor binary.

Mirror the monitor flow:

  1. git clone --depth 1 --branch <cliBranch> CLI_REPO_URL → /tmp/<dir>
  2. go build -ldflags Version=<date>+<branch>-<sha> -o $openclaw/bin/hf
  3. chmod 755 + delete tmp dir on success or failure

Adds `--cli-branch <name>` (default: main) for parity with --monitor-branch.

Also stamps the binary with a real version string (was 'dev' before this
patch) so `hf version` is informative for debugging.
2026-05-29 08:52:14 +01:00
zhi
c8998c6b0d Merge pull request 'perf(meta-push): use cached api.config instead of deprecated loadConfig() — kills ~25% chronic baseline CPU' (#11) from fix/meta-push-use-cached-api-config into main 2026-05-27 08:25:34 +00:00
686f2c7cb0 perf(meta-push): use cached api.config instead of deprecated loadConfig()
`pushMetaToMonitor` and `resolveAgentId` were both calling
`api.runtime?.config?.loadConfig?.()` to read the agent list. That
deprecated path (openclaw warns at gateway start:
"plugin runtime config.loadConfig() is deprecated; use config.current()")
synchronously rebuilds the full plugin-metadata snapshot — realpathSync
walks every plugin's package.json + manifest + source up the directory
tree, hashWatchedFiles fingerprints every watched plugin file, and
discoverInDirectory re-scans every `dist/extensions/<plugin>` (~100 of
them on prod t2). Each rebuild costs ~6-7s of gateway CPU.

`pushMetaToMonitor` fires every `reportIntervalSec` (default 30s)
from `hooks/gateway-start.js`. With 100 plugins that put the gateway
into a chronic ~22-30% CPU baseline even with zero agent activity. V8
profile 2026-05-27 08:14:00 60s window (0 turns, 2 metadata pushes
during): lstat 44.2%, statSync(buildInstalledManifestRegistryIndexKey)
6.9%, hashWatchedFiles via memo key 1.7%, all routed through
`readPersistedInstalledPluginIndexInstallRecordsSync` -> per-plugin
`discoverInDirectory`.

Switching to `(api as any).config ?? api.runtime?.config?.loadConfig?.()`
reads from the snapshot cache the gateway already maintains — the same
pattern already used elsewhere in this file (e.g. the calendar wakeAgent
dispatcher at line 284). Same change applied to `resolveAgentId` (only
runs once at start, but same anti-pattern).

This is a plugin-side perf workaround. The underlying openclaw bug is
that `loadConfig()` rebuilds the snapshot rather than returning the
cached one — a chronic 'all sync cache validity checks pay the full
discovery cost' design issue worth pushing upstream separately (the
walks per-call cost we measured here is unrelated to and amplifies any
agent-turn-triggered walk path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 09:17:39 +01:00
h z
81d40ae63d fix(wakeup): drop WAKEUP_OK ack-token theatre (#10) 2026-05-26 08:13:42 +00:00
65a3fb8d2d fix(wakeup): drop WAKEUP_OK ack-token theatre from wakeup message
The wakeup dispatcher's `deliver` callback only does
`logger.info(reply.slice(0,100))` — no token detection, no scheduler
state change. The "first line of your reply MUST be exactly WAKEUP_OK
so the plugin records the ack" instruction was prompt theatre that
nothing in this plugin (or in openclaw) acted on. Confirmed by
reading openclaw/dist/plugin-sdk/src/auto-reply/tokens.d.ts which
declares HEARTBEAT_OK and SILENT_REPLY tokens but nothing for wakeup.

Symptom in the wild: agents would replay WAKEUP_OK every turn for no
gain — costing model budget on a no-op token — and the workflow doc
(`ClawSkills/workflows/hf-wakeup/flow.md`) carried a wandering
appendix explaining the ack "doesn't actually do anything anyway".

Rewrite the wakeup message to tell the agent the truth: drive the
hf-wakeup workflow to completion; the scheduler keeps re-waking
every 30s until the slot transitions out of `not_started` via
harborforge_calendar_complete or _abort. No ack token expected.

ClawSkills companion change (lyn/ClawSkills d0109f3) removes
WAKEUP_OK from skills/hf-hangman-lab/SKILL.md and
workflows/hf-wakeup/flow.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 09:10:47 +01:00
c2d00c18a7 feat(hf-plugin): __hfAgentStatus.hasOnCallCovering(agentId, from, to)
Cross-plugin accessor for "does agent have on_call slot covering this
window?" — first consumer is Dialectic.OpenclawPlugin signup pre-check
(its hf-precheck.ts has been degrading to "skipped" since Phase 3
ship pending this).

v1 honest scope: same-day windows only (scheduleCache is today-only
from /calendar/sync). Cross-day or empty-cache windows return undefined
which the caller treats as "skipped" (Dialectic backend stores
pre_validated:false as audit signal — same as before, just now we
actually validate when we can).

Logic: for each cached slot where slot_type=on_call AND status not in
{aborted,cancelled}, parse scheduled_at (HH:MM:SS or full ISO) and
estimated_duration to compute end; return true iff start<=from AND
end>=to. Returns false (not undefined) when cache has slots for the
agent on this date but none covers — that means "actually no coverage"
vs "I dont know".

Pairs with Dialectic.OpenclawPlugin/src/hf-precheck.ts which already
calls hf.hasOnCallCovering and handles all 3 return shapes.

No backend change required.
2026-05-23 14:58:37 +01:00
709f7e09ab feat(hf-plugin): expose globalThis.__hfAgentStatus.get(agentId)
Cross-plugin agent-status accessor for use by Fabric.OpenclawPlugin's
presence-sync loop (and any future plugin needing 'is agent X busy
right now'). Backed by CalendarBridgeClient.getAgentStatus() with a
30s in-memory TTL cache to avoid hammering the HF backend.

Returns one of 'idle' | 'on_call' | 'busy' | 'exhausted' | 'offline'
or undefined when the agent isn't known to HF. Cache miss + bridge
failure returns the last cached value (stale-data better than no
data for delivery-decision use cases).

Part of DIALECTIC-V2 Phase 1 (Fabric announce channel + busy-discard).
See /home/hzhang/arch/DIALECTIC-V2-DESIGN.md sections 7+8.
2026-05-23 11:31:27 +01:00
1c4cf773e5 fix(hf-plugin): wrap tool returns in MCP {content:[...]} shape
OpenClaw's Codex tool dispatcher (thread-lifecycle:255) expects every
tool execute() to return { content: [...] } and calls result.content.reduce()
to compute total text length. All 9 harborforge_* tools returned bare
objects ({ running, processing, currentSlot, ... }) which has no
.content field — so .reduce of undefined threw, and the agent saw the
cryptic 'Cannot read properties of undefined (reading reduce)' on
every call. This silently blocked every calendar slot transition on
prod for hours: agents could call harborforge_calendar_complete but
it always errored, so slots never moved out of not_started.

Fix is at the registerTool boundary: api.registerTool is wrapped once
to coerce every tool's execute return through ensureMcpContentShape.
Tools that already return the correct shape are unchanged. No per-tool
edits needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 08:48:05 +01:00
h z
ba420e858a Merge pull request 'fix: wakeup message says 'continue in same session', not 'only reply WAKEUP_OK'' (#9) from fix/wakeup-message-no-ack-only into main 2026-05-21 10:05:34 +00:00
e42b6b04ee fix(plugin): wakeup message says 'continue in same session', not 'only reply WAKEUP_OK'
E2e showed the old wakeup text trapped agents in an ack-only loop:

> "You have due slots. Follow the `hf-wakeup` workflow of skill
>  `hf-hangman-lab` to proceed. Only reply `WAKEUP_OK` in this session."

The two clauses contradicted each other — "follow the workflow" vs
"only reply WAKEUP_OK". MiniMax-M2.5 prioritised the literal "only"
and never proceeded past the ack; the scheduler then re-woke every 30s
because the slot stayed `not_started`, and the agent kept re-acking
forever (verified: 3 consecutive WAKEUP_OK-only replies across slot 7).

Rewrites the wakeup message to be explicit:
  - first line MUST be `WAKEUP_OK` (the ack token the plugin looks for)
  - then continue IN THE SAME session: drive calendar_status → task
    fetch → sub-workflow → calendar_complete/abort
  - flags the loop trap so the agent knows what to avoid

Bumps version 0.3.3 → 0.3.4.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 11:05:28 +01:00
h z
610f5961d1 Merge pull request 'fix: lift calendarScheduler to module scope (multi-register singleton)' (#8) from fix/scheduler-module-singleton into main 2026-05-21 09:54:55 +00:00
428fc254ee fix(plugin): lift calendarScheduler to module scope (multi-register singleton)
Trying the prior multi-agent-handle fix in dind-t2 surfaced a second bug
that PR #7 didn't reach: `harborforge_calendar_status` still returned
`Calendar scheduler not running` even though the gateway log showed the
scheduler had started 30+ seconds before the agent's call.

## Root cause

`register()` is invoked once per agent — `grep -c "HarborForge plugin
registered" /tmp/gw-stdout.log` reports 5 for a 5-agent claw. Every
invocation creates its own `let calendarScheduler` closure binding. But
`gateway_start` fires once and we only call `startCalendarScheduler()`
through that single hook, so exactly one of the five closures sees the
handle and the other four keep their bindings at `null`.

The host's tool router picks one of the five duplicate
`harborforge_calendar_status` registrations to dispatch to — most of the
time it's one of the four "null" closures, which is why every wakeup the
agent saw `Calendar scheduler not running`.

## Fix

Lift `let calendarScheduler` out of `register()` and into module scope.
All five register-call closures now reference the same binding; once the
single `gateway_start` initialises it, every tool sees it.

`startCalendarScheduler()` now early-returns when `calendarScheduler` is
already set, so duplicate `gateway_start` firings (if the host ever does
that) don't double-install intervals.

Bumps version 0.3.2 → 0.3.3.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 10:54:36 +01:00
h z
ff9929ad31 Merge pull request 'fix: real per-agent slot handle for multi-agent calendar tools' (#7) from fix/multi-agent-scheduler-handle into main 2026-05-21 09:39:51 +00:00
07a07b6876 fix(plugin): real per-agent slot handle for multi-agent calendar tools
In multi-agent sync mode every harborforge_calendar_* tool was returning
`calendarScheduler.<method> is not a function`. The cause: index.ts replaced
`calendarScheduler` (typed `CalendarScheduler | null`) with a `{ stop() }`
stub right after wiring the runSync/runCheck intervals, so `isRunning()`,
`getCurrentSlot()`, `completeCurrentSlot()`, `abortCurrentSlot()`,
`pauseCurrentSlot()`, `resumeCurrentSlot()`, `getState()`,
`isRestartPending()` and `getStateFilePath()` all blew up at call time.

Replaces the stub with a `MultiAgentSchedulerHandle` that:
  - tracks the last slot dispatched per agent (recorded by `wakeAgent`)
  - exposes status/complete/abort/pause/resume taking the calling agentId
  - resolves the implicit "current slot" via woken-cursor first then a
    cache scan over not_started/deferred/ongoing slots
  - PATCHes via `bridge.updateSlotAs(agentId, …)` so audit headers reflect
    the real caller (bridge constructor agentId is 'unused' in multi-agent)
  - mirrors the legacy `isRunning/isProcessing/getState/...` surface so
    the single-agent fallback (`CalendarScheduler`) keeps working unchanged

Each calendar tool factory now takes `OpenClawPluginToolContext`, reads
`ctx.agentId`, and dispatches through the handle. Single-agent path
(when `calendarScheduler` is a real `CalendarScheduler`) is preserved
behind `instanceof` checks.

Drops the dead `trackSessionCompletion` poll loop (only definition, no
caller) which referenced the removed `completeCurrentSlot`. Bumps
plugin version 0.2.0 → 0.3.2.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 10:38:57 +01:00
h z
a003416e56 Merge pull request 'fix: wake dedupe + inline slot context + complete contracts.tools' (#6) from fix/wake-dedupe-and-contracts into main 2026-05-20 14:48:06 +00:00
9ba591795b fix: wake dedupe + inline slot context + complete contracts.tools
Three issues making HF→agent wakeup unusable in practice, surfaced by
DinD sim end-to-end test (recruiter agent + slot for 招募 manager task):

1. **Plugin re-woke the same slot every 30s.** The inline runCheck only
   destructured agentId from scheduleCache.getAgentsWithDueSlots() and
   dropped the slots array, then called wakeAgent without recording the
   wake. The simplified inline scheduler also never PATCHes slot status
   server-side from not_started→ongoing, so the next 30s check sees the
   slot still due and wakes again. After 4 wakes the agent's wakeup
   session was full of WAKEUP_OK noise.

   Fix: keep slots in runCheck, add an in-memory wakedSlotKeys set
   keyed by (agentId, slotId|virtual_id|scheduled_at). Dedupe on this
   set; clear it inside the sync interval (fresh wake budget per sync).
   Server-side slot transition still TODO (requires re-introducing the
   CalendarScheduler class path or PATCH /calendar/slots/.../agent-update
   here); the dedupe at least stops the wake spam.

2. **Wakeup message had no slot context.** The wakeup body just said
   'follow hf-wakeup workflow' with no slot id/event_data/task_code.
   The agent then had to call harborforge_calendar_status to learn
   anything — which itself is broken in the simplified scheduler (it
   queries a CalendarScheduler instance that never gets created).

   Fix: pass dueSlots into wakeAgent and inline the highest-priority
   slot's {slot_id, scheduled_at, priority, slot_type, event_data} as
   a JSON block in the wakeup message. The agent reads event_data.
   task_code directly and routes via workflow_lookup without any
   round-trip. Per PLG-CAL-001 docs in hf-hangman-lab SKILL.md, this
   is the documented contract; we are bringing the message in line.

3. **contracts.tools listed 5 of the 9 registered tools.** Manifest had
   harborforge_status/telemetry/monitor_telemetry/calendar_status/
   calendar_complete. Code also registers calendar_abort, calendar_pause,
   calendar_resume, harborforge_restart_status. With the new OpenClaw
   plugin host enforcement (same gotcha that bit Meridian — see
   zhi/Meridian#2), undeclared tools are silently dropped from the
   agent's tool list, so abort/pause/resume cannot be called by the
   agent. plugin doctor was emitting:
   'plugin tool is undeclared (harbor-forge): harborforge_calendar_abort'
   for each missing tool.

   Fix: add the 4 missing tool names to contracts.tools.

Also use api.config as the primary config source in wakeAgent (current
public API), falling back to runtime.config.loadConfig() for older
hosts — same pattern as the Meridian fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 12:02:25 +01:00
2 changed files with 199 additions and 41 deletions

View File

@@ -73,6 +73,30 @@ interface PluginAPI {
getAgentStatus?: () => Promise<{ status: string } | null>;
}
/**
* Coerce a tool execute() return value into the MCP `{ content: [...] }`
* shape that the openclaw Codex tool dispatcher requires.
*
* Background: openclaw's `convertToolContents()` does `result.content.reduce(...)`
* to compute total text length before flattening. Every HF tool here returned a
* bare object (`{ running, processing, currentSlot, ... }`) which has no
* `.content` field, so `undefined.reduce` threw and every call to
* `harborforge_*` from a Codex-harness agent surfaced as the cryptic
* `Cannot read properties of undefined (reading 'reduce')`. The fix is to
* wrap every tool's execute return; doing it at the `registerTool` boundary
* keeps each tool body unchanged.
*/
function ensureMcpContentShape(result: unknown): { content: Array<{ type: 'text'; text: string }> } {
if (
result && typeof result === 'object' &&
Array.isArray((result as { content?: unknown }).content)
) {
return result as { content: Array<{ type: 'text'; text: string }> };
}
const text = typeof result === 'string' ? result : JSON.stringify(result, null, 2);
return { content: [{ type: 'text', text }] };
}
function register(api: PluginAPI): void {
const logger = api.logger || {
info: (...args: any[]) => console.log('[HarborForge]', ...args),
@@ -81,6 +105,22 @@ function register(api: PluginAPI): void {
warn: (...args: any[]) => console.warn('[HarborForge]', ...args),
};
// Wrap api.registerTool so every tool's execute() return is coerced into
// the MCP `{ content: [...] }` shape openclaw expects. See
// `ensureMcpContentShape` above.
const _origRegisterTool = api.registerTool.bind(api);
api.registerTool = (factory: (ctx: any) => any) => {
_origRegisterTool((ctx: any) => {
const def = factory(ctx);
if (!def || typeof def.execute !== 'function') return def;
const origExecute = def.execute;
return {
...def,
execute: async (...args: any[]) => ensureMcpContentShape(await origExecute(...args)),
};
});
};
function resolveConfig() {
return getPluginConfig(api);
}
@@ -88,7 +128,9 @@ function register(api: PluginAPI): void {
/** Resolve agent ID from env, config, or fallback. */
function resolveAgentId(): string {
if (process.env.AGENT_ID) return process.env.AGENT_ID;
const cfg = api.runtime?.config?.loadConfig?.();
// Read from cached `api.config` first — see pushMetaToMonitor for why
// the deprecated `api.runtime?.config?.loadConfig?.()` path is heavy.
const cfg = (api as any).config ?? api.runtime?.config?.loadConfig?.();
return cfg?.agents?.list?.[0]?.id ?? cfg?.agents?.defaults?.id ?? 'unknown';
}
@@ -144,6 +186,25 @@ function register(api: PluginAPI): void {
* Push OpenClaw metadata to the Monitor bridge.
* This enriches Monitor heartbeats with OpenClaw version/plugin/agent info.
* Failures are non-fatal — Monitor continues to work without this data.
*
* IMPORTANT — read config from the cached `api.config` surface, NOT from
* the deprecated `api.runtime?.config?.loadConfig?.()` path. The
* deprecated path triggers a full plugin-metadata-snapshot rebuild on
* every call: realpathSync walks every plugin's package.json + manifest
* + source paths (lstats up the directory tree), `hashWatchedFiles`
* fingerprints all watched plugin files, and `discoverInDirectory`
* re-scans every `dist/extensions/<plugin>` dir. On t2 with ~100 plugins
* each rebuild costs ~6-7s of CPU; with this push firing every 30s
* (default reportIntervalSec) the chronic baseline was ~22-25% gateway
* CPU even with zero agent activity (V8 profile 2026-05-27 08:14:00 60s:
* lstat 44.2%, statSync 6.9%, hashWatchedFiles via memo key 1.7%, all
* routed through readPersistedInstalledPluginIndexInstallRecordsSync ->
* discoverInDirectory). Switching to `api.config` reads from the
* already-loaded snapshot cache; the elsewhere-in-this-file pattern was
* already `api.config ?? api.runtime?.config?.loadConfig?.()`.
*
* Same fix is applied to `resolveAgentId` below — that's read once at
* gateway start so the impact is smaller, but it's the same anti-pattern.
*/
async function pushMetaToMonitor() {
const bridgeClient = getBridgeClient();
@@ -151,7 +212,7 @@ function register(api: PluginAPI): void {
let agentNames: string[] = [];
try {
const cfg = api.runtime?.config?.loadConfig?.();
const cfg = (api as any).config ?? api.runtime?.config?.loadConfig?.();
const agentsList = cfg?.agents?.list;
if (Array.isArray(agentsList)) {
agentNames = agentsList
@@ -267,21 +328,22 @@ function register(api: PluginAPI): void {
)}\n\`\`\``;
}
// First-line ack `WAKEUP_OK` is the plugin's ack-receipt token; the
// agent MUST then continue in the same session and drive the
// `hf-wakeup` workflow to completion (calendar_status → task fetch →
// sub-workflow → calendar_complete/abort). Without that continuation
// the scheduler keeps re-waking every 30s because the slot stays
// `not_started` forever.
// The wakeup dispatcher's `deliver` callback below only logs the
// reply text — it does NOT inspect any ack token. The earlier
// `WAKEUP_OK` first-line-ack convention was prompt-only theatre;
// nothing in this plugin or in openclaw acted on it. The only
// thing that ends a wake cycle is the slot transitioning out of
// `not_started`, which happens when the agent calls
// `harborforge_calendar_complete` or `harborforge_calendar_abort`.
// Tell the agent that plainly instead of asking for a fake ack.
const wakeupMessage =
`You have due slots. **First line of your reply MUST be exactly ` +
`\`WAKEUP_OK\`** so the plugin records the ack. Then, **in this ` +
`same session**, drive the \`hf-wakeup\` workflow of skill ` +
`\`hf-hangman-lab\` to completion — read slot context, call the ` +
`harborforge_calendar_* tools, route to the right sub-workflow, ` +
`and finish with harborforge_calendar_complete or abort. Do NOT ` +
`stop after the ack — the scheduler will re-wake you every 30s ` +
`until the slot transitions out of \`not_started\`.${slotBlock}`;
`You have due slots. Drive the \`hf-wakeup\` workflow of skill ` +
`\`hf-hangman-lab\` to completion in this session — read slot ` +
`context, call the harborforge_calendar_* tools, route to the ` +
`right sub-workflow, and finish with harborforge_calendar_complete ` +
`or harborforge_calendar_abort. The scheduler keeps re-waking you ` +
`every 30s until the slot transitions out of \`not_started\`, so ` +
`partial work or silence just produces another wake.${slotBlock}`;
const result = await dispatchInboundMessageWithDispatcher({
ctx: {
@@ -356,6 +418,94 @@ function register(api: PluginAPI): void {
}
}
// Cross-plugin exposure: agent status lookup for other plugins
// (currently Fabric.OpenclawPlugin uses this to skip delivering
// `announce` channel messages to busy agents — see DIALECTIC-V2
// design doc, Phase 1). Backed by calendarBridge.getAgentStatus
// with a small TTL cache to avoid hammering the HF backend.
type HfStatus = 'idle' | 'on_call' | 'busy' | 'exhausted' | 'offline';
const HF_STATUS_CACHE_TTL_MS = 30_000;
const hfStatusCache = new Map<string, { status: HfStatus; at: number }>();
const _G = globalThis as Record<string, unknown>;
_G['__hfAgentStatus'] = {
async get(agentId: string): Promise<HfStatus | undefined> {
if (!agentId) return undefined;
const cached = hfStatusCache.get(agentId);
if (cached && Date.now() - cached.at < HF_STATUS_CACHE_TTL_MS) {
return cached.status;
}
try {
const status = await calendarBridge.getAgentStatus(agentId);
if (status) {
const typed = status as HfStatus;
hfStatusCache.set(agentId, { status: typed, at: Date.now() });
return typed;
}
} catch {
/* fall through to cached-or-undefined */
}
return cached?.status;
},
/**
* Approximate "does agent have an on_call slot covering [from, to]?"
* for cross-plugin pre-check use (currently:
* Dialectic.OpenclawPlugin's signup HF coverage).
*
* v1 honest scope: we only have today's slots in scheduleCache
* (synced from /calendar/sync which is today-only). Returns:
* - true iff window is same-day AND some cached on_call slot
* starts <= from AND ends >= to
* - false iff window is same-day AND no such slot
* - undefined for cross-day windows OR cache empty for this
* agent (caller treats undefined as "I don't know" — see
* Dialectic plugin's hf-precheck.ts which degrades to
* "skipped" gracefully)
*
* Phase TBD: when HF backend ships a `/calendar/slots?agent&from&to`
* endpoint, swap this to call it for arbitrary windows. Until then,
* same-day-only coverage gates ~all debates created by analyze-intel
* (which schedules <2h windows) without needing a backend change.
*/
async hasOnCallCovering(
agentId: string,
fromIso: string,
toIso: string,
): Promise<boolean | undefined> {
if (!agentId || !fromIso || !toIso) return undefined;
const from = new Date(fromIso);
const to = new Date(toIso);
if (isNaN(from.getTime()) || isNaN(to.getTime())) return undefined;
if (!(from < to)) return undefined;
// Cross-day → cache only has today; can't decide.
const fromDate = from.toISOString().slice(0, 10);
const toDate = to.toISOString().slice(0, 10);
if (fromDate !== toDate) return undefined;
// Cache's cachedDate must match our window's date.
const cacheStatus = scheduleCache.getStatus();
if (cacheStatus.cachedDate !== fromDate) return undefined;
const slots = scheduleCache.getAgentSlots(agentId);
if (slots.length === 0) return undefined; // cache empty for this agent — can't decide
for (const s of slots) {
if (s.slot_type !== 'on_call') continue;
// status: ignore aborted/cancelled, accept not_started / ongoing / finished
if (s.status === 'aborted' || s.status === 'cancelled') continue;
const startStr = s.scheduled_at;
if (typeof startStr !== 'string') continue;
// scheduled_at can be HH:MM:SS (cache-relative date) or full ISO
const start =
/^\d{2}:\d{2}(:\d{2})?$/.test(startStr)
? new Date(`${fromDate}T${startStr}Z`)
: new Date(startStr);
if (isNaN(start.getTime())) continue;
const dur = typeof s.estimated_duration === 'number' ? s.estimated_duration : 0;
const end = new Date(start.getTime() + dur * 60_000);
if (start <= from && end >= to) return true;
}
return false;
},
};
// Track wakes already dispatched for a slot in the current sync
// window — the simplified inline scheduler does not PATCH slot
// status server-side, so without dedupe the check loop re-wakes

View File

@@ -31,6 +31,7 @@ const OLD_PLUGIN_NAME = 'harborforge-monitor';
const PLUGIN_SRC_DIR = join(__dirname, 'plugin');
const SKILLS_SRC_DIR = join(__dirname, 'skills');
const MONITOR_REPO_URL = 'https://git.hangman-lab.top/zhi/HarborForge.Monitor.git';
const CLI_REPO_URL = 'https://git.hangman-lab.top/zhi/HarborForge.Cli.git';
const args = process.argv.slice(2);
const options = {
@@ -43,6 +44,7 @@ const options = {
installCli: args.includes('--install-cli'),
installMonitor: 'no',
monitorBranch: 'main',
cliBranch: 'main',
};
const profileIdx = args.indexOf('--openclaw-profile-path');
@@ -60,6 +62,11 @@ if (monitorBranchIdx !== -1 && args[monitorBranchIdx + 1]) {
options.monitorBranch = String(args[monitorBranchIdx + 1]);
}
const cliBranchIdx = args.indexOf('--cli-branch');
if (cliBranchIdx !== -1 && args[cliBranchIdx + 1]) {
options.cliBranch = String(args[cliBranchIdx + 1]);
}
function resolveOpenclawPath() {
if (options.openclawProfilePath) return options.openclawProfilePath;
if (process.env.OPENCLAW_PATH) return resolve(process.env.OPENCLAW_PATH);
@@ -316,39 +323,40 @@ async function installCli() {
if (!options.installCli) return;
const totalSteps = 6;
logStep(5, totalSteps, 'Building and installing hf CLI...');
const openclawPath = resolveOpenclawPath();
const binDir = join(openclawPath, 'bin');
mkdirSync(binDir, { recursive: true });
// Find CLI source — look for HarborForge.Cli relative to project root
const projectRoot = resolve(__dirname, '..');
const cliDir = join(projectRoot, 'HarborForge.Cli');
if (!existsSync(cliDir)) {
// Try parent directory (monorepo layout)
const monoCliDir = resolve(projectRoot, '..', 'HarborForge.Cli');
if (!existsSync(monoCliDir)) {
logErr(`Cannot find HarborForge.Cli at ${cliDir} or ${monoCliDir}`);
logWarn('Skipping CLI installation');
return;
}
}
const effectiveCliDir = existsSync(cliDir)
? cliDir
: resolve(projectRoot, '..', 'HarborForge.Cli');
log(` Building hf from ${effectiveCliDir}...`, 'blue');
// Clone CLI repo to /tmp, build there, copy artifact out. Mirrors
// installManagedMonitor so the install never depends on a checked-out
// sibling repo at a fixed path.
const tmpDir = join('/tmp', `harborforge-cli-${Date.now()}`);
const hfBinary = join(binDir, 'hf');
try {
const hfBinary = join(binDir, 'hf');
exec(`go build -o ${hfBinary} ./cmd/hf`, { cwd: effectiveCliDir, silent: !options.verbose });
log(` Cloning ${CLI_REPO_URL} (branch ${options.cliBranch}) → ${tmpDir}...`, 'blue');
exec(`git clone --branch ${shellEscape(options.cliBranch)} --depth 1 ${shellEscape(CLI_REPO_URL)} ${shellEscape(tmpDir)}`, { silent: !options.verbose });
// Stamp the binary with the version string the prod CLI surfaces in
// `hf version`. Fall back to a date-only label if rev-parse fails for
// any reason (shallow clone shouldn't, but be defensive).
let versionLabel = `${new Date().toISOString().slice(0, 10)}+install`;
try {
const sha = exec(`git rev-parse --short HEAD`, { cwd: tmpDir, silent: true }).trim();
if (sha) versionLabel = `${new Date().toISOString().slice(0, 10)}+${options.cliBranch}-${sha}`;
} catch { /* keep fallback */ }
log(` Building hf (version=${versionLabel})...`, 'blue');
const ldflags = `-X git.hangman-lab.top/zhi/HarborForge.Cli/internal/commands.Version=${versionLabel}`;
exec(`go build -ldflags ${shellEscape(ldflags)} -o ${shellEscape(hfBinary)} ./cmd/hf`, { cwd: tmpDir, silent: !options.verbose });
chmodSync(hfBinary, 0o755);
logOk(`hf binary → ${hfBinary}`);
logOk(`hf binary → ${hfBinary} (branch hint: ${options.cliBranch})`);
} catch (err) {
logErr(`Failed to build hf CLI: ${err.message}`);
logWarn('CLI installation failed, plugin still installed');
} finally {
rmSync(tmpDir, { recursive: true, force: true });
}
}