fix: dynamically sync inbound channel subscriptions #1
The fabric inbound previously called `joinAll()` once on socket.io `connect`: it fetched the agent's channel list via `GET /api/channels?guildId=...` and emitted `join_channel` for each. Any channel the agent joined after connect (e.g. a fresh DM created by another user that includes this agent) was unreachable until the gateway restarted: the socket was never subscribed to that room, so backend `message.created` push events never arrived.

Real-world failure (prod server.t2): hzhang created a DM channel with agent `manager` shortly after gateway restart. Manager's inbound was already connected with 0 channels subscribed; the new DM's messages went into a backend room nobody on this socket was listening to. Restart of the gateway picked up the new channel; without that, messages were silently lost.

## Fix

Backend doesn't emit a user-scoped `channel.joined` event we could piggy-back on (`grep emit /app/dist/realtime/realtime.gateway.js` only shows `message.created`), so this is a poll-based reconciliation:

- `channelSyncTimers` field + 60s `setInterval` per `(agent, guild)` socket
- `syncChannels(kind)`: re-fetches `/api/channels?guildId=...` using a fresh guild token (`freshGuildToken`, survives 15-min token TTL)
- `joined: Set<string>`
  - `current - joined` → `socket.emit('join_channel', {channelId})` + add
  - `joined - current` → `socket.emit('leave_channel', {channelId})` + remove
- `connect`: `joined.clear()` + run `syncChannels('initial')` (the server forgets subscriptions on reconnect)
- `stop()`: clears all `channelSyncTimers` alongside socket disconnect

Logs distinguish initial (`joined N channel(s)`) vs delta (`channel resync ... +N -M`).

## Trade-off

60s upper bound on detect-new-channel latency. A user-scoped backend `channel.joined` event would let us do this push-based with zero latency, but that requires backend work (separate change in Fabric backend-guild).

🤖 Generated with Claude Code
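
The join/leave reconciliation described above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the code from this change: `diffChannels`, `applyDelta`, `ChannelDelta`, and the `Emitter` interface are hypothetical names introduced here; only the `join_channel` / `leave_channel` events and the local `joined` set come from the description.

```typescript
// Hypothetical sketch of the diff step; helper names are illustrative only.
interface ChannelDelta {
  toJoin: string[];  // ids in the fresh list but not yet subscribed
  toLeave: string[]; // ids subscribed locally but absent from the fresh list
}

function diffChannels(
  current: Iterable<string>,
  joined: ReadonlySet<string>,
): ChannelDelta {
  const currentSet = new Set<string>(current);
  const toJoin = Array.from(currentSet).filter((id) => !joined.has(id));
  const toLeave = Array.from(joined).filter((id) => !currentSet.has(id));
  return { toJoin, toLeave };
}

// Minimal emitter shape so the sketch does not depend on socket.io typings.
interface Emitter {
  emit(event: string, payload: { channelId: string }): void;
}

// Emits the join/leave events and keeps the local `joined` set in step.
function applyDelta(
  socket: Emitter,
  joined: Set<string>,
  delta: ChannelDelta,
): void {
  for (const channelId of delta.toJoin) {
    socket.emit('join_channel', { channelId });
    joined.add(channelId);
  }
  for (const channelId of delta.toLeave) {
    socket.emit('leave_channel', { channelId });
    joined.delete(channelId);
  }
}
```

Keeping the diff as a pure function separate from the emit step makes the reconciliation easy to unit-test without a live socket.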
The fabric inbound previously called `joinAll()` once on socket.io `connect`: it fetched the agent's channel list via `GET /api/channels?guildId=...` and emitted `join_channel` for each. Any channel the agent joined *after* connect (e.g. a fresh DM created by another user that includes this agent) was unreachable until the gateway restarted: the socket was never subscribed to that room, so backend `message.created` push events never arrived.

Backend doesn't emit a user-scoped `channel.joined` event we could piggy-back on (only `message.created`), so the fix is to poll. Every 60s the agent's channel list is re-fetched and diffed against a local `joined` set:

- new channel ids → `socket.emit('join_channel', {channelId})` + add
- ids in `joined` but absent from the fresh list → `leave_channel` emit + remove (best-effort; cleans subs if the agent is removed from a channel)

Re-uses `freshGuildToken()` so the resync fetch survives token expiry (15-min TTL). Initial `connect` resets the local `joined` set since the server forgets prior room subscriptions on reconnect. Timers are tracked in `channelSyncTimers` and cleared in `stop()` alongside socket disconnect.

Verified against prod server.t2 scenario: hzhang creates DM channel including agent 'manager' → without this fix, manager only sees the message after a gateway restart; with this fix, manager receives the message within at most 60s (next resync tick).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
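
For illustration, the timer lifecycle in the commit message (reset on `connect`, periodic resync, cleanup in `stop()`) might look something like the sketch below. It is an assumption-laden outline, not the actual implementation: `ChannelSyncScheduler`, `onConnect`, `SyncFn`, `activeTimers`, and the `'resync'` kind are hypothetical; only `channelSyncTimers`, `stop()`, the `'initial'` kind, and the 60s interval are taken from the text.

```typescript
// Hypothetical sketch of the per-(agent, guild) resync timer lifecycle.
type SyncKind = 'initial' | 'resync'; // 'resync' is an assumed label
type SyncFn = (kind: SyncKind) => Promise<void>;

class ChannelSyncScheduler {
  private readonly channelSyncTimers = new Map<string, ReturnType<typeof setInterval>>();
  private readonly intervalMs: number;

  constructor(intervalMs: number = 60_000) {
    this.intervalMs = intervalMs;
  }

  // Number of live timers; exposed only to make the sketch easy to check.
  get activeTimers(): number {
    return this.channelSyncTimers.size;
  }

  // Intended to be called from the socket's `connect` handler.
  async onConnect(key: string, joined: Set<string>, sync: SyncFn): Promise<void> {
    joined.clear(); // the server forgets room subscriptions on reconnect
    this.clearTimer(key); // avoid stacking timers across reconnects
    const timer = setInterval(() => {
      sync('resync').catch((err) => {
        console.error(`channel resync failed for ${key}`, err);
      });
    }, this.intervalMs);
    this.channelSyncTimers.set(key, timer);
    await sync('initial');
  }

  // Clears every resync timer; pair with the socket disconnect.
  stop(): void {
    for (const timer of this.channelSyncTimers.values()) {
      clearInterval(timer);
    }
    this.channelSyncTimers.clear();
  }

  private clearTimer(key: string): void {
    const existing = this.channelSyncTimers.get(key);
    if (existing !== undefined) {
      clearInterval(existing);
      this.channelSyncTimers.delete(key);
    }
  }
}
```

Clearing any existing timer for the same key before scheduling a new one is a design choice of this sketch, so that repeated reconnects do not leave multiple intervals running for one socket.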