Fabric.OpenclawPlugin

Author	SHA1	Message	Date
hzhang	20e55849eb	fix(channel): add describeAccount so health-monitor sees real configured. openclaw's `channelManager.getRuntimeSnapshot()` — called every minute by the channel-health-monitor — runs accounts through `applyDescribedAccountFields(next, plugin.config.describeAccount?.(...))`. When the callback is missing it defaults `configured: true`. Fabric never defined it, so every health-monitor cycle: snapshot = { enabled: true, configured: true, running: false }. For fabric's synthetic 'default' account (returned by `listFabricAccountIds` when `channels.fabric.accounts` is empty — the prod shape, where per-agent api-keys live in `~/.openclaw/fabric-identity.json` and the channel framework never runs `startAccount` so `running` stays false): isManagedAccount({enabled:true, configured:true}) === true -> not-running -> 'stopped' -> restart every ~10 min, logging '[fabric:default] health-monitor: restarting (reason: stopped)'. The restart is a no-op (fabric's `gateway.startAccount` is absent so `startChannelInternal` returns early), but the log is loud and operators chasing real outages keep wasting time on it. Mirror `isConfigured` from describeAccount so the snapshot truthfully reports configured:false for any account without a fabricApiKey. The fabric plugin still self-manages real agents via `gateway_start` -> `FabricInbound.start()`; the framework just no longer thinks 'default' is something it should restart. Verified in sim (this patch alone, no debug instrumentation): gateway up 8+ minutes, 0 restart events; pre-patch sim with same config restarted at 5min mark; evaluateChannelHealth snapshot for both 'default' and 'recruiter' accountId reads configured:false (instrumented with temporary console.log in channel-health-policy, since reverted). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-26 16:48:53 +01:00
hzhang	7dc70522d1	fix(inbound): refresh socket.io auth on (re)connect via callback Backend issues short-lived guildAccessToken (TTL=900s). The previous `auth: { token: tok }` shape captured the JWT once in connectAgent's closure: after socket.io's auto-reconnect the backend kept getting the same expired JWT and silently rejected the handshake at the application layer (RealtimeGateway logs 'socket rejected: <id>'). The client's 'connect' event still fired (TCP succeeded) so the plugin happily ran the channel-resync, emitted join_channel into the void, and logged 'joined N channel(s)' while the backend was actually broadcasting message.created to a room with zero subscribers. End-user symptom: DMs/group messages to agents silently dropped 15 min after gateway start, with no error anywhere on the agent side. Switch to the callback form, which socket.io re-evaluates on every (re)connect — same call site we already use for the HTTP path via freshGuildToken/tokenCache. Verified in sim (commit `2acb084` + this patch): 1. Connect new DM channel + post msg -> dispatch + reply ✓ 2. `docker restart fabric-backend-guild` to force socket disconnect 3. Plugin reconnects automatically and logs 'fabric: agent recruiter joined 12 channel(s) on sim-guild-1' ✓ (without the fix this reconnect was silently rejected; sim used to log 'WARN socket rejected: <id>' on the guild backend) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-26 13:50:24 +01:00
hzhang	9419d270e5	fix(presence-sync): tick mutex so setInterval overlap can't spawn parallel ticks The presence-sync tick iterates accounts serially with await on each agent-login + PUT round-trip — a single tick can easily run 20+s when there are several accounts. setInterval(intervalMs) does NOT wait for the previous tick to finish, so on a busy gateway the next tick fires on top of a still-running one and two parallel iterations each PUT the same agentId within ~10 ms. That tipped the guild backend's first-time-insert race (separate fix in nav/Fabric.Backend.Guild) into 500s on prod (caught in t2 gateway 2026-05-25 23:23:35Z; 6 of 6 agents showed paired log lines 4-10 ms apart for the same agent → idle). Fix: a simple `inflight` boolean. tick() returns immediately if already running; the next interval beat catches up. lastStatus !== bridge.get gating already means status changes catch the next tick anyway, so skipping a beat costs nothing the next beat won't fix. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-26 02:25:08 +01:00
hzhang	a87de27cff	fix(presence-sync): use /api prefix + Bearer guildAccessToken (not x-api-key). Two layered bugs in the presence-sync loop, both causing every PUT to fail forever in prod: 1. Missing /api prefix. URL was `${guildBaseUrl}/agents/<id>/presence` but the guild backend sets a global prefix 'api' in main.ts `setGlobalPrefix('api')`. Every other REST call in this plugin (channel.ts channels list, fabric-client.ts postMessage, canvas) already prepends /api/ — only presence-sync missed it. Returned 404 "Cannot PUT /agents/...". 2. Wrong auth scheme. Plugin sent `x-api-key: <fabricApiKey>`, but the endpoint sits behind the global APP_GUARD = ApiKeyGuard, which actually expects `Authorization: Bearer <guildAccessToken>` (despite its name — confusing naming on the backend side). With /api added, error became 401 "missing bearer token". Confirmed by `docker exec fabric-backend-guild grep APP_GUARD /app/dist/app.module.js` and manual curl: Bearer guild token → 200 OK. Fix: - presence-sync.ts: do agent-login on demand to obtain a fresh guildAccessToken, cache it per-agent for 13 min (under the 15-min JWT TTL), use it as Bearer for the PUT. 401 response invalidates the cache so the next tick re-logs-in. Pushes are gated on status changes (rare), so the login overhead is negligible. - inbound.ts: firstGuildEndpointByAgent → firstGuildByAgent storing both endpoint and nodeId (presence-sync needs nodeId to pick the right token out of guildAccessTokens[]). - index.ts: pass FabricClient to PresenceSync constructor. Verified in sim: after restart, gateway log shows `fabric: presence-sync recruiter → idle` (200 OK), zero failed PUTs, where previously it would log a 404 every ~5s per agent. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-25 23:54:38 +01:00
hzhang	b8e0e424fa	fix(inbound): route fabric DM channels as peer.kind='direct' / ChatType='direct' Inbound was hardcoding `peer: { kind: 'group' }` and `ChatType: 'group'` for every fabric channel regardless of xType. As a result: - sessionKey for a DM was `agent:<id>:fabric:group:<chan>` instead of `agent:<id>:fabric:direct:<chan>` - ctx.ChatType='group' caused user-prompt metadata to render `is_group_chat: true` on a DM - openclaw's `isDirectMessage()` check (ChatType==='direct') returned false, so DM-specific prompt and turn behavior never engaged Caught by recruiter test in session 40c51de2: the model's thinking trace acknowledged "fabric DM channel" (from the ClawPrompts chat-injector hook) but the surrounding user-prompt metadata contradicted it with `is_group_chat: true`, and the model reasoned its way out of running `workflow_start`. Fix factors a small helper `fabricPeerRoutingForXType` (and a cache- backed `fabricPeerRoutingForChannel` for outbound) in channel.ts that maps: - 'dm' → { peerKind: 'direct', chatType: 'direct' } - rest → { peerKind: 'group', chatType: 'group' } (no change) Inbound uses m.xType directly (live, authoritative). Outbound has no xType in its call signature, so it consults the channel-meta cache populated by inbound (same `getChannelType` already exposed via __fabric). Cache miss falls back to 'group' — the pre-fix default, no regression. The proactive-DM-without-prior-inbound edge case still routes that one outbound as 'group'; the next round agrees on 'direct'. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-25 14:26:42 +01:00
hzhang	c5429129d9	feat(channel-meta): expose globalThis.__fabric.getChannelType for narrow gating Inbound `message.created` already carries `xType` (dm / triage / group / broadcast / etc.) — record it in a per-channel cache so other plugins can answer "is this channel a DM?" without poking the Center API. New module src/channel-meta.ts: - in-memory Map<channelId, xType> - lazily loaded from ~/.openclaw/fabric-channel-meta.json on first access (so first-ever DM after a fresh gateway start still hits cache from the previous run) - debounced 250ms flush on dirty; force-flush on gateway_stop - recordChannelType(channelId, xType): called from inbound - getChannelType(channelId): null if unknown — caller MUST treat null as "don't know", NOT as "assume DM" (would re-introduce the false- positive on group channels we're trying to eliminate) Wiring: - inbound.ts socket.on('message.created'): records xType BEFORE the self-author / dedup gates (channel type is observer-agnostic) - index.ts: installs globalThis.__fabric = { getChannelType } on registerFull(); flushes on gateway_stop Consumer: ClawPrompts' fabric-chat-injector will start gating its prompt injection on getChannelType(channelId) === 'dm' (companion PR on ClawPrompts). Removes the phase-1 "any fabric channel" false-positive. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-25 11:28:36 +01:00
hzhang	0e36457d8f	fix(tools): execute receives (callId, args), not (args) — pre-existing bug OpenClaw plugin-sdk's registerTool execute signature is: execute: async (_id: string, params) => { ... } Fabric tools were calling it as `(p) => { ... }`, so `p` held the call id (a string) and the real args were silently dropped onto the floor. Every tool that read a required field from `p` failed with the field surfacing as undefined. fabric-guild-list (just added) appeared to work because all its properties are optional — `p.nameFilter` and `p.purposeFilter` both being undefined produced empty filter needles, which let the unfiltered guild list through. The real bug surfaced the moment fabric-channel-list (required: guildNodeId) was invoked: the ctxGuild helper saw `undefined` and reported `agent not a member of guild undefined`. Compare dialectic plugin's tools.ts which has always used the correct `async (_id: string, params) => {...}` shape and worked end-to-end. Aligning the fabric signature to match. Verified end-to-end on sim: - fabric-guild-list returns 1 guild with the purpose set via the new `cli node set-purpose` - fabric-channel-list returns 3 channels including a now-populated `purpose` field on each row - fabric-channel-set-purpose successfully patches a channel and the subsequent fabric-channel-list shows the new purpose	2026-05-23 19:35:38 +01:00
hzhang	5ff464a055	feat(plugin): fabric-guild-list + fabric-channel-set-purpose tools + purpose on existing tools Adds two agent-facing tools that close the discoverability loop: - fabric-guild-list — enumerates guilds the agent belongs to with name + purpose + status (no api calls beyond the existing agentLogin response). Optional nameFilter/purposeFilter for narrowing. - fabric-channel-set-purpose — PATCH /api/channels/:id { purpose } so agents can backfill or update an existing channel's purpose. Extends existing tools: - fabric-channel-list now returns purpose on each row. - create-{chat,work,report,discussion}-channel accept optional purpose. FabricClient + FabricSession type changes carry the new field through. Manifest contracts.tools updated (jiti loader needs both manifest entry and onStartup activation to register). Lets workflows that previously needed hardcoded channel ids instead say 'find a guild whose purpose mentions debate, then a channel of x_type announce whose purpose covers public debate broadcasts.'	2026-05-23 19:22:10 +01:00
hzhang	a060ff98a2	feat(inbound): listen for backend-pushed channel.joined/left events Companion to nav/Fabric.Backend.Guild#<TBD> which adds the server-side emitToUser broadcast on channel membership changes. Before, the inbound only learned about new channels via the 60s polling resync (worst-case 60s lag). Now the backend tells us directly so sub/unsub is realtime. socket.on('channel.joined', evt) → join the socket.io room for evt.channelId and add to the local 'joined' set. socket.on('channel.left', evt) → leave + remove from 'joined'. Both events are idempotent (`if (joined.has(id))` / `if (!joined.has(id))`) so duplicate emits from server are safe. Polling resync still runs every 60s as a safety net for transient socket drops between emit and reconnect, partial server failures, etc. When backend lacks this support (older deployments), nothing breaks — the event simply never fires and polling carries the load as before. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 08:08:33 +01:00
hzhang	d1d5ad10ca	fix: dynamically sync inbound channel subscriptions The fabric inbound previously called `joinAll()` once on socket.io `connect` — it fetched the agent's channel list via `GET /api/channels?guildId=...` and emitted `join_channel` for each. Any channel the agent joined after connect (e.g. a fresh DM created by another user that includes this agent) was unreachable until the gateway restarted: the socket was never subscribed to that room, so backend `message.created` push events never arrived. Backend doesn't emit a user-scoped `channel.joined` event we could piggy-back on (only `message.created`), so the fix is to poll. Every 60s the agent's channel list is re-fetched and diffed against a local `joined` set: - new channel ids → `socket.emit('join_channel', {channelId})` + add - ids in `joined` but absent from the fresh list → `leave_channel` emit + remove (best-effort; cleans subs if the agent is removed from a channel) Re-uses `freshGuildToken()` so the resync fetch survives token expiry (15-min TTL). Initial `connect` resets the local `joined` set since the server forgets prior room subscriptions on reconnect. Timers are tracked in `channelSyncTimers` and cleared in `stop()` alongside socket disconnect. Verified against prod server.t2 scenario: hzhang creates DM channel including agent 'manager' → without this fix, manager only sees the message after a gateway restart; with this fix, manager receives the message within at most 60s (next resync tick). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 07:45:59 +01:00
hzhang	92945b777d	feat(fabric): dm channels deliver any non-self message (no wakeup gate) inbound: FabricMessage gains xType; the wakeup gate is bypassed when xType==='dm' (self messages are already filtered upstream), so a 1:1 dm always reaches the model regardless of wakeup metadata. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 09:18:20 +01:00
hzhang	8774cfd7cc	feat(fabric): coalesce a split agent turn into ONE message (deterministic) OpenClaw delivers an agent turn whose blocks are text -> thinking/tool -> text via multiple inbound deliver() calls (a non-text block is a delivery boundary), so one turn became N Fabric messages. Fix: buffer deliver() segments per channel (src/coalesce.ts) and flush them as ONE postMessage at a deterministic boundary — the finally after dispatchInboundReplyWithBase() resolves, which provably runs only after every deliver() of the turn (verified: deliver,deliver -> dispatch returned -> flush). No hooks, no timers, no idle guessing. The agent_end hook was rejected: it fires BEFORE deliver(). gateway_stop flushes any leftover; a long safety timeout is a leak-guard only. channels.fabric.coalesce=false restores raw per-segment posting. Verified on local openclaw + Fabric with a fake text/thinking/text model: single trigger -> exactly one merged message. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 22:15:46 +01:00
hzhang	ab126825ef	feat(security): commandsSyncKey is a required channel-config field (Guild C-2) The slash-command sync secret now comes from channels.fabric.commandsSyncKey (configSchema marks it required) and is no longer read from FABRIC_COMMANDS_SYNC_KEY env. command-sync resolves it from config and threads it into client.syncCommands; when absent, sync is skipped with a clear warning. README updated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 18:44:25 +01:00
hzhang	bb63a57384	feat(security): send x-commands-sync-key when configured (Guild C-2) syncCommands attaches the FABRIC_COMMANDS_SYNC_KEY header when the operator sets it, so the guild can restrict slash-command catalog writes to this plugin. No-op / backward compatible when unset. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 17:47:14 +01:00
hzhang	c03562046d	feat(plugin): sync OpenClaw slash-command catalog to Fabric - command-sync.ts: buildFabricCommandSpecs(cfg) reads OpenClaw native command specs via openclaw/plugin-sdk/native-command-registry (listNativeCommandSpecsForConfig + findCommandByNativeName), resolves dynamic arg choices to a static snapshot (resolveCommandArgChoices) — same data Discord registers as slash commands. - syncFabricCommands(): on gateway_start, after inbound starts, PUT the catalog to each connected guild (FabricClient.syncCommands -> PUT /api/commands; idempotent, one per guild). - Fabric stays a TEXT-command surface (no nativeCommands capability): execution still flows as a /<cmd> message into OpenClaw's command system; this catalog only drives frontend autocomplete. Verified: 41 specs built (args/choices incl. dynamic), synced to test-guild1, GET /api/commands round-trips count=41. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 16:06:22 +01:00
hzhang	fac6debfa5	feat(plugin): fabric-channel tool (members / join / leave) One tool, three actions backed by FabricClient channelMembers (GET /channels/:id/members -> [{userId,bypass}]), joinChannel, and new leaveChannel (POST /channels/:id/leave). Verified: client-level smoke against the running guild — members initial=[tester], after join echo2 present, after leave echo2 gone. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 15:33:47 +01:00
hzhang	aaabb0ddb0	feat(plugin): fabric-canvas tool; fabric-register env=AGENT_ID only - bin/fabric-register.mjs: only AGENT_ID is read from the environment; --api-key is flag-only (no FABRIC_API_KEY); dropped FABRIC_CENTER_API_BASE / FABRIC_IDENTITY_FILE / OPENCLAW_PATH env fallbacks (flags + sensible defaults; --center still falls back to openclaw.json). - New fabric-canvas tool (one tool, four actions): read / share / update / close the channel's single pinned canvas. Backed by FabricClient get/share/update/removeCanvas (GET/PUT/PATCH/DELETE; empty 2xx body -> null). update/close are sharer-only server-side. - README updated. Verified: client-level smoke against the running guild — read(empty→null) → share(v1) → read → update(v2) → close(→null) all pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 15:28:13 +01:00
hzhang	26c12533fb	refactor(plugin): fabric-register is a script, not a tool Binding an agent's Fabric API key was an OpenClaw tool; make it a self-contained Node script installed to ~/.openclaw/bin/fabric-register instead. - bin/fabric-register.mjs: no plugin deps; AGENT_ID env wins, else --agent-id required; --api-key validated via POST /auth/agent/login; on success upserts ~/.openclaw/fabric-identity.json (format matches IdentityRegistry). Flags/env for center, identity-file, openclaw-path. - install.mjs: copy the script to ~/.openclaw/bin (chmod 0755) on install, remove on uninstall; Next-steps updated. - tools.ts: drop the fabric-register tool; ctxGuild error now points to the script / static accounts config. - README updated. Verified: missing-id -> exit 2; --agent-id and AGENT_ID both bind and write a valid identity file; bad key -> 401, no write. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 13:12:48 +01:00
hzhang	892db9f9be	feat(plugin): record non-wakeup messages as session history (no model) Previously a non-wakeup message returned immediately and was fully discarded — the agent kept zero record of it, so when later woken in a discuss/work channel it replied without the conversation context. Now non-wakeup messages are ingested into the agent's OpenClaw session via recordInboundSession (createIfMissing) WITHOUT dispatch: the real model is not invoked and nothing is sent back to Fabric. This is correct for the turn engine — only the woken speaker emits a normal message or /no-reply; non-woken agents stay silent — while still giving the agent full channel context whenever it IS woken. Verified live: report-channel (all recipients wakeup=false) message logs 'recorded (no wakeup, history only)' with 0 dispatch/deliver/ posted; wakeup path unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 10:42:36 +01:00
hzhang	fc7efd0227	fix(plugin): force automatic source-reply delivery (fixes no-reply) OpenClaw defaults group-chat replies to sourceReplyDeliveryMode 'message_tool_only', which suppresses auto-delivery of the agent's text reply (it expects the agent to call a message tool). With ChatType 'group', the Fabric plugin's deliver callback was therefore NEVER invoked — the agent ran but no reply ever returned to Fabric. Fabric already gates when an agent speaks via the per-recipient wakeup flag, so once a turn is dispatched the reply must always flow back. Pass replyOptions.sourceReplyDeliveryMode='automatic' so OpenClaw delivers the agent's reply through regardless of the group default (source-reply-delivery-mode honors a truthy requested mode). Verified live end-to-end: human posts -> wakeup -> agent runs -> 'fabric: deliver' + 'fabric: posted reply' -> agent message appears in the Fabric channel. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 10:02:47 +01:00
hzhang	d79a04b8a3	fix(plugin): refresh guild token per dispatch (fixes attachment 401) Guild access tokens are short-lived (~15 min); the inbound socket survives via socket.io reconnect but the token captured at connect time goes stale, so attachment downloads (and reply posts) start 401ing on long-lived agents. Re-login with the agent's Fabric API key on a short TTL and use the fresh token for fetch + post. Verified live: 'fabric: fetched 1 attachment(s)' now succeeds where it previously logged 'attachment fetch 401'. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 00:03:31 +01:00
hzhang	cc655ffcc3	fix(plugin): pass only local MediaPaths (drop SSRF-blocked MediaUrls) Live round-trip test showed openclaw's SSRF guard blocking the localhost guild file URL passed via MediaUrls. We already download the bytes with the agent's guild token, so MediaUrls is redundant and noisy — provide only local MediaPaths/MediaTypes. Verified: plugin logs 'fetched N attachment(s)' and the SSRF WARN is gone. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 21:50:34 +01:00
hzhang	42228e0a23	chore(plugin): rebuild dist (file delivery) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 20:17:28 +01:00
hzhang	f59c693186	fix: never split replies into multiple messages (Fabric has no length limit) Unlike Discord, Fabric has no message-length cap. Single-chunk chunker (text -> [text]), textChunkLimit=MAX_SAFE_INTEGER, capabilities blockStreaming=false, replyOptions.disableBlockStreaming=true -> every agent reply delivered as exactly one Fabric message. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 18:30:25 +01:00
hzhang	9cb262367e	feat: working v1 — full Fabric<->openclaw round-trip verified Real channel-turn dispatch (resolveAgentRoute + finalizeInboundContext + dispatchInboundReplyWithBase), wakeup->drop/dispatch, messaging target grammar (fabric:<id>) + outbound.sendText, tools use execute/parameters. Verified live: human msg in Fabric -> wakeup -> openclaw agent runs -> reply posted back into the Fabric channel as the agent. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 18:24:35 +01:00
hzhang	ece77ff2c7	feat: loadable openclaw channel plugin v1 (agent=account) Rewritten against the real openclaw v2026.5.7 plugin SDK (generic third-party channel path): createChannelPluginBase + createChatChannelPlugin with required capabilities, minimal ChannelSetupAdapter, agent=account config resolution, attached outbound -> Fabric POST, inbound socket per account -> runtime.channel.turn (wakeup->admission). Compat notes mark SDK-coupled seams for future openclaw upgrades. Verified: builds clean, installs, 'openclaw channels list' -> Fabric installed/configured/enabled. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 17:54:32 +01:00
hzhang	7fed6d07f6	build: PaddedCell-style install.mjs + SDK-aligned packaging install.mjs (--install/--build-only/--uninstall/--openclaw-profile-path), tsconfig outDir dist/fabric, package.json openclaw file dep + main. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 17:30:37 +01:00
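
Commit d1d5ad10ca describes the 60s resync as a diff between the freshly fetched channel list and a local `joined` set. A minimal sketch of that diff follows. The names `ResyncPlan`, `planResync`, and `applyResync` are hypothetical and not taken from the plugin source, and `emit` merely stands in for `socket.emit`; only the `join_channel` / `leave_channel` event names and the `{channelId}` payload come from the commit message.

```typescript
// Hypothetical helper types and names, for illustration only.
interface ResyncPlan {
  toJoin: string[];  // ids in the fresh list that are not yet joined
  toLeave: string[]; // ids still in `joined` but absent from the fresh list
}

// Pure diff: compares the local `joined` set with the freshly fetched ids.
function planResync(joined: ReadonlySet<string>, fresh: readonly string[]): ResyncPlan {
  const freshSet = new Set(fresh);
  const toJoin = Array.from(freshSet).filter((id) => !joined.has(id));
  const toLeave = Array.from(joined).filter((id) => !freshSet.has(id));
  return { toJoin, toLeave };
}

// Applies a plan: emits the events and keeps the local set in step.
// `emit` is an assumed stand-in for socket.emit.
function applyResync(
  joined: Set<string>,
  plan: ResyncPlan,
  emit: (event: string, payload: { channelId: string }) => void,
): void {
  for (const channelId of plan.toJoin) {
    emit("join_channel", { channelId });
    joined.add(channelId);
  }
  for (const channelId of plan.toLeave) {
    emit("leave_channel", { channelId });
    joined.delete(channelId);
  }
}
```

Keeping the diff pure makes it easy to check in isolation. The same `joined` set could also serve the idempotent `channel.joined` / `channel.left` handlers described in commit a060ff98a2, though that wiring is not shown here.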

27 Commits
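
Commit 9419d270e5 fixes overlapping presence-sync ticks with a simple `inflight` boolean. The sketch below shows one way such a guard can look. The wrapper name `guardTick` and its boolean return value are assumptions made for illustration; only the `inflight` flag and the "return immediately if already running" behavior come from the commit message.

```typescript
// Hypothetical wrapper: returns a tick function that refuses to overlap itself.
// Resolves to true when the work ran, false when the beat was skipped.
function guardTick(work: () => Promise<void>): () => Promise<boolean> {
  let inflight = false;
  return async (): Promise<boolean> => {
    if (inflight) return false; // previous tick still running: skip this beat
    inflight = true;
    try {
      await work();
      return true;
    } finally {
      inflight = false; // always release, even if work() throws
    }
  };
}

// Assumed usage: setInterval does not wait for the previous tick, so the
// guard is what prevents two iterations from running in parallel.
//   const tick = guardTick(syncAllAccounts); // syncAllAccounts is hypothetical
//   setInterval(() => { void tick(); }, intervalMs);
```

Releasing the flag in `finally` matters: without it, a single failed tick would leave `inflight` stuck at `true` and silently stop all later beats.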