Two layered bugs in the presence-sync loop, both causing every PUT to
fail forever in prod:
1. **Missing /api prefix.** URL was `${guildBaseUrl}/agents/<id>/presence`
but the guild backend sets a global prefix 'api' in main.ts
`setGlobalPrefix('api')`. Every other REST call in this plugin
(channel.ts channels list, fabric-client.ts postMessage, canvas)
already prepends /api/ — only presence-sync missed it. Returned 404
"Cannot PUT /agents/...".
2. **Wrong auth scheme.** Plugin sent `x-api-key: <fabricApiKey>`, but
the endpoint sits behind the global APP_GUARD = ApiKeyGuard, which
actually expects `Authorization: Bearer <guildAccessToken>` (despite
its name — confusing naming on the backend side). With /api added,
error became 401 "missing bearer token". Confirmed by `docker exec
fabric-backend-guild grep APP_GUARD /app/dist/app.module.js` and
manual curl: Bearer guild token → 200 OK.
**Fix**
- presence-sync.ts: do agent-login on demand to obtain a fresh
guildAccessToken, cache it per-agent for 13 min (under the 15-min
JWT TTL), use it as Bearer for the PUT. 401 response invalidates
the cache so the next tick re-logs-in. Pushes are gated on status
changes (rare), so the login overhead is negligible.
- inbound.ts: firstGuildEndpointByAgent → firstGuildByAgent storing
both endpoint and nodeId (presence-sync needs nodeId to pick the
right token out of guildAccessTokens[]).
- index.ts: pass FabricClient to PresenceSync constructor.
**Verified in sim**
After restart, gateway log shows `fabric: presence-sync recruiter →
idle` (200 OK), zero failed PUTs, where previously it would log a 404
every ~5s per agent.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
6.4 KiB
6.4 KiB