hzhang 49bcde41ec init: OpenClaw Perf Cache — fs.{stat,lstat,realpath}{,Sync} TTL memo for plugin-tree paths
Wraps the global fs functions with a 1s TTL memo, scoped via path
whitelist to plugin-discovery paths only. Workaround for upstream
openclaw issue #86791: `loadPluginMetadataSnapshot()`'s cache-validity
check re-runs `hashWatchedFiles` on every lookup, which walks every
plugin's package.json + manifest + source via realpathSync ->
ancestor lstat chain. On prod t2 with ~100 plugins, one cache-check
pass is ~6 400 lstat + ~400 stat (~6-7s CPU per call). Fires on every
agent turn, every loadConfig() call, every channel routing decision.

This plugin doesn't fix the upstream design; it just absorbs the
repeated stats within a 1s window so the same paths aren't re-statted
6× per second during a discovery walk.

Verified on prod t2 (2026-05-27):
  - Cache hit ratio: 92.1-98.2% (stable across windows)
  - Idle baseline (0 turn, 0 push): 0.6-3.7% CPU (was 25%+ pre-fix)
  - Per-turn cost: notably reduced; previously 100% sustained per turn

Path whitelist:
  - /openclaw/dist/extensions/
  - /.openclaw/plugins/
  - /node_modules/@openclaw/
  - /openclaw/plugin-sdk/

All other paths pass through to original fs functions unchanged.

Manifest requires `activation.onStartup: true` so openclaw register()s
the plugin even though it exposes no tools/contracts (otherwise jiti
caches the module without ever calling register).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 10:17:54 +01:00

OpenClaw Perf Cache

A 200-line OpenClaw plugin that wraps fs.statSync / fs.lstatSync / fs.realpathSync (and their fs.promises async siblings) with a 1-second TTL memo, scoped to plugin-tree paths only, as a workaround for an upstream openclaw performance bug.

Why this exists

loadPluginMetadataSnapshot() in openclaw's dist/plugin-metadata-snapshot-*.js keeps a memo of the resolved plugin registry, but its cache-validity check runs hashWatchedFiles(memo.watchedFiles) on every lookup. That call re-fingerprints every plugin's package.json + openclaw.plugin.json + source + setupSource paths via realpathSync → ancestor lstat chain.

On a prod gateway with ~100 installed plugins (the bundled dist/extensions/* set), one cache-check pass is roughly:

100 plugins × 4 watched-files × 2 realpath/file × ~8 lstat/realpath
  ≈ 6 400 lstat + ~400 stat per call
  ≈ 6-7 s CPU

The check fires from many call sites — every agent turn (tool middleware loader), every loadConfig() call, every channel routing decision — turning what should be a cheap snapshot hit into a sustained CPU drain.
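The estimate above can be reproduced as a quick back-of-envelope calculation. This is only a sketch of the arithmetic: the variable names are illustrative, and the "one stat per watched file" line is an assumption chosen because it matches the quoted ~400 stat figure.

```typescript
// Rough per-check syscall estimate, using the approximate figures quoted above.
const pluginCount = 100;
const watchedFilesPerPlugin = 4; // package.json, openclaw.plugin.json, source, setupSource
const realpathsPerFile = 2;
const lstatsPerRealpath = 8; // roughly one lstat per ancestor path component

const realpathCalls = pluginCount * watchedFilesPerPlugin * realpathsPerFile;
const lstatCalls = realpathCalls * lstatsPerRealpath;
// Assumption: one stat per watched file, which matches the ~400 figure.
const statCalls = pluginCount * watchedFilesPerPlugin;

console.log(`${lstatCalls} lstat + ${statCalls} stat per cache-validity check`);
```

Because the check runs on every lookup, these numbers multiply by the number of lookups per second, which is what turns a cheap snapshot hit into sustained CPU load.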

The same hot path is reported upstream, including in openclaw issue #86791.

What this plugin does

On register() it patches the global fs functions:

Wrapped               Pass-through when
fs.statSync           path does NOT match a plugin-tree needle
fs.lstatSync          path does NOT match a plugin-tree needle
fs.realpathSync       path does NOT match a plugin-tree needle
fs.promises.stat      path does NOT match a plugin-tree needle
fs.promises.lstat     path does NOT match a plugin-tree needle
fs.promises.realpath  path does NOT match a plugin-tree needle

Plugin-tree needles (substring match — any one matches):

  • /openclaw/dist/extensions/
  • /.openclaw/plugins/
  • /node_modules/@openclaw/
  • /openclaw/plugin-sdk/
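The substring match over these needles could look roughly like the sketch below. The names `PLUGIN_TREE_NEEDLES` and `isPluginTreePath` are hypothetical, and treating non-string paths (Buffer, URL) as pass-through is an assumption, not something the plugin source is known to do.

```typescript
// The four needles listed above; a path qualifies if it contains any one of them.
const PLUGIN_TREE_NEEDLES: readonly string[] = [
  "/openclaw/dist/extensions/",
  "/.openclaw/plugins/",
  "/node_modules/@openclaw/",
  "/openclaw/plugin-sdk/",
];

function isPluginTreePath(p: unknown): boolean {
  // Assumption: only string paths are inspected; anything else passes through.
  if (typeof p !== "string") return false;
  return PLUGIN_TREE_NEEDLES.some((needle) => p.includes(needle));
}
```

Note that each needle ends with a slash, so a path naming the directory itself without a trailing slash does not match under this sketch; only entries inside the directory do.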

Matched calls get a 1 000 ms TTL memo keyed by (fn-name, path, JSON(opts)). Errors are cached too: a call that failed rethrows the same error on subsequent calls within the window.
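A minimal sketch of such a memo for the synchronous case, assuming a single shared Map and an injectable clock so the TTL can be tested. The names `memoize`, `MemoEntry` and `TTL_MS` are illustrative and not taken from the plugin source.

```typescript
type MemoEntry =
  | { expires: number; ok: true; value: unknown }
  | { expires: number; ok: false; error: unknown };

const TTL_MS = 1000;
const memoCache = new Map<string, MemoEntry>();

function memoize<A extends unknown[], R>(
  fnName: string,
  fn: (path: string, ...rest: A) => R,
  now: () => number = Date.now,
): (path: string, ...rest: A) => R {
  return (path, ...rest) => {
    // Key on function name, path, and trailing args so option variants stay separate.
    const key = JSON.stringify([fnName, path, rest]);
    const t = now();
    const hit = memoCache.get(key);
    if (hit && hit.expires > t) {
      if (hit.ok) return hit.value as R;
      throw hit.error; // cached failure: rethrow the same error object
    }
    try {
      const value = fn(path, ...rest);
      memoCache.set(key, { expires: t + TTL_MS, ok: true, value });
      return value;
    } catch (error) {
      memoCache.set(key, { expires: t + TTL_MS, ok: false, error });
      throw error;
    }
  };
}
```

Under these assumptions, wrapping a function once and calling it twice with the same path inside one second runs the underlying function only once. The async variants would need to cache the promise or its settled result instead, which this sketch does not cover.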

Counters are logged once a minute:

[perf-cache] last 60s: hits=812 misses=27 (hit-ratio 96.8%) passthrough=1493 errors=0 cache_size=804

passthrough = calls that bypassed the memo because the path is not a plugin-tree path. That count is essentially the "rest of the system", and this plugin should leave it mostly unchanged.
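For reading the log line, here is a sketch of how such a summary could be formatted. It assumes the hit ratio is hits / (hits + misses), which is consistent with the sample line above; the `Counters` shape and `formatCounters` name are hypothetical.

```typescript
interface Counters {
  hits: number;
  misses: number;
  passthrough: number;
  errors: number;
}

function formatCounters(c: Counters, cacheSize: number): string {
  const lookups = c.hits + c.misses;
  // Assumption: passthrough calls are excluded from the ratio.
  const ratio = lookups === 0 ? "n/a" : `${((100 * c.hits) / lookups).toFixed(1)}%`;
  return (
    `[perf-cache] last 60s: hits=${c.hits} misses=${c.misses} (hit-ratio ${ratio}) ` +
    `passthrough=${c.passthrough} errors=${c.errors} cache_size=${cacheSize}`
  );
}
```

With hits=812 and misses=27 this yields 812 / 839, which rounds to the 96.8% shown in the sample.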

Safety notes

  • Pass-through for non-plugin paths. Business code (logs, session files, skills/, secrets/, anything outside the whitelist) sees the unmodified fs. Only plugin-discovery paths are intercepted.
  • 1 s TTL. A cached entry is served for at most 1 s, so a manifest change (mtime resolution is millisecond-level) becomes visible at most ~1 s later. Dev-loop impact is negligible.
  • Bounded memory. cache.clear() fires when entries > 4 000.
  • Idempotent. Module re-import (jiti reload) is a no-op via a sentinel flag on globalThis.
  • Argument-aware. Cache key includes a JSON of trailing args so statSync(p) and statSync(p, { bigint: true }) don't collide.
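The idempotency and bounded-memory points above can be sketched as follows. The sentinel key name, `installOnce` and `boundedSet` are hypothetical, and clearing before the insert (rather than after) is an assumption about ordering.

```typescript
const SENTINEL_KEY = "__openclawPerfCacheInstalled"; // hypothetical flag name
const MAX_ENTRIES = 4000;

// Runs `install` only the first time; later calls (e.g. after a module reload) are no-ops.
function installOnce(install: () => void): boolean {
  const g = globalThis as unknown as Record<string, unknown>;
  if (g[SENTINEL_KEY]) return false;
  g[SENTINEL_KEY] = true;
  install();
  return true;
}

// Wholesale clear once the map has grown past the cap: crude, but keeps memory bounded.
function boundedSet<K, V>(cache: Map<K, V>, key: K, value: V): void {
  if (cache.size > MAX_ENTRIES) cache.clear();
  cache.set(key, value);
}
```

Clearing everything rather than evicting the oldest entries is a simple design: with a 1 s TTL most entries are stale by the time the cap is reached, so little useful state is lost.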

Install

git clone https://git.hangman-lab.top/hzhang/OpenclawPerfCache.git
cd OpenclawPerfCache
npm --prefix plugin install
node scripts/install.mjs --install
systemctl --user restart openclaw-gateway

Update (rebuild + recopy, no config touch)

node scripts/install.mjs --update
systemctl --user restart openclaw-gateway

Uninstall

node scripts/install.mjs --uninstall
systemctl --user restart openclaw-gateway

If openclaw ever fixes the upstream cache-validity-check, this plugin can be uninstalled with no consequence.
