perf(meta-push): use cached api.config instead of deprecated loadConfig() — kills ~25% chronic baseline CPU #11
Why
The t2 gateway sustains a 22-30% CPU baseline even with zero agent activity, zero turns, and zero queued messages. V8 profile, 2026-05-27 08:14:00 (60s window, 0 session turns, 2 metadata pushes during the window):
Tracked back to pushMetaToMonitor in this plugin: it called api.runtime?.config?.loadConfig?.() to read the agent name list. That deprecated path (the warning "plugin runtime config.loadConfig() is deprecated; use config.current()" is emitted on every gateway start) synchronously rebuilds the full plugin-metadata snapshot on every call: realpathSync walks every plugin's package.json, manifest, and source up the directory tree; hashWatchedFiles fingerprints every watched plugin file; and discoverInDirectory re-scans every dist/extensions/<plugin> (~100 of them on prod t2). Each rebuild costs ~6-7s of CPU.
pushMetaToMonitor fires every reportIntervalSec (default 30s) from hooks/gateway-start.js. One ~7s walk over ~100 plugins every 30s is 7 / 30 ≈ 23% sustained CPU, which matches the measured baseline almost exactly.
What
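A minimal sketch of the cached-first read pattern this change adopts. The interface shapes, the agentNames field, and the helper names readConfig and listAgentNames are hypothetical, made up for illustration; the real plugin API types are not shown in this PR (the actual code casts with (api as any).config).

```typescript
// Hypothetical, simplified shapes. The real host API types are not part of this PR.
interface GatewayConfig {
  agentNames?: string[]; // assumed field, for illustration only
}

interface PluginApi {
  // Cached snapshot the gateway maintains internally (may be absent on older hosts).
  config?: GatewayConfig;
  runtime?: {
    config?: {
      // Deprecated: rebuilds the plugin-metadata snapshot on every call.
      loadConfig?: () => GatewayConfig;
    };
  };
}

// Prefer the cached snapshot. The deprecated loader is only evaluated when
// api.config is null or undefined, because ?? short-circuits.
function readConfig(api: PluginApi): GatewayConfig | undefined {
  return api.config ?? api.runtime?.config?.loadConfig?.();
}

// Returns the agent name list, or an empty list if no config is available.
function listAgentNames(api: PluginApi): string[] {
  return readConfig(api)?.agentNames ?? [];
}
```

Because the nullish-coalescing operator short-circuits, a host that exposes api.config never pays for the rebuild, while an older host still gets a config through the fallback.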
pushMetaToMonitor (and resolveAgentId, which has the same anti-pattern but only runs once at gateway-start) now read from (api as any).config ?? api.runtime?.config?.loadConfig?.(), so the cached api.config is preferred. The cached api.config is the snapshot the gateway maintains internally; the deprecated path remains only as a fallback for older host versions. Other code in this same file (line 284, the calendar wakeAgent dispatcher) already uses this pattern; pushMetaToMonitor was just the only place that hadn't been updated.
Verification plan
Deploy to prod t2, wait 2-3 minutes for the deprecated path to stop firing, then take a fresh 60s V8 profile during a zero-turn window. Expected: the lstat share drops from ~44% to near 0%, and the gateway baseline returns to ~99% idle. Will validate before merging.
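One way to compare the before and after profiles is to compute each function's share of samples directly from the .cpuprofile JSON. The sketch below assumes the profile follows the Chrome DevTools Protocol Profiler.Profile layout (a nodes array whose entries carry callFrame.functionName and hitCount). The helper name selfTimeShare is hypothetical, and the share is counted in samples, which only approximates CPU time if the sampling interval is roughly uniform.

```typescript
// Minimal subset of the .cpuprofile node layout; other fields are ignored here.
interface ProfileNode {
  id: number;
  callFrame: { functionName: string };
  hitCount?: number; // number of samples where this node was the top frame
}

interface CpuProfile {
  nodes: ProfileNode[];
}

// Fraction (0..1) of all samples whose top frame matches `pattern`.
// Pass a regex without the g flag, since test() on a global regex is stateful.
function selfTimeShare(profile: CpuProfile, pattern: RegExp): number {
  let total = 0;
  let matched = 0;
  for (const node of profile.nodes) {
    const hits = node.hitCount ?? 0;
    total += hits;
    if (pattern.test(node.callFrame.functionName)) {
      matched += hits;
    }
  }
  return total === 0 ? 0 : matched / total;
}
```

For example, after loading the profile with JSON.parse, selfTimeShare(profile, /lstat/) gives the lstat share and selfTimeShare(profile, /^\(idle\)$/) gives the idle share, assuming V8 reports idle time under a node named (idle).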
Upstream issue (separate)
The underlying openclaw bug is that loadConfig() rebuilds the snapshot rather than returning the cached one. Even the api.config path has its own cache-validity check that walks hashWatchedFiles (~800 statx per call) every time loadPluginMetadataSnapshot runs, so the "fast" path is still O(N) in the number of watched files. That is a chronic overhead on every agent turn (~13.5% lstat in a single-turn local repro). Worth pushing upstream, separately from this plugin-side fix, which only removes the every-30s cost that is paid even with no agent activity.
🤖 Generated with Claude Code