init: OpenClaw Perf Cache — fs.{stat,lstat,realpath}{,Sync} TTL memo for plugin-tree paths

Wraps the global fs functions with a 1s TTL memo, scoped via path
whitelist to plugin-discovery paths only. Workaround for upstream
openclaw issue #86791: `loadPluginMetadataSnapshot()`'s cache-validity
check re-runs `hashWatchedFiles` on every lookup, which walks every
plugin's package.json + manifest + source via realpathSync ->
ancestor lstat chain. On prod t2 with ~100 plugins, one cache-check
pass is ~6 400 lstat + ~400 stat (~6-7s CPU per call). Fires on every
agent turn, every loadConfig() call, every channel routing decision.

This plugin doesn't fix the upstream design; it just absorbs the
repeated stats within a 1s window so the same paths aren't re-statted
6× per second during a discovery walk.

Verified on prod t2 (2026-05-27):
  - Cache hit ratio: 92.1-98.2% (stable across windows)
  - Idle baseline (0 turn, 0 push): 0.6-3.7% CPU (was 25%+ pre-fix)
  - Per-turn cost: notably reduced; previously 100% sustained per turn

Path whitelist:
  - /openclaw/dist/extensions/
  - /.openclaw/plugins/
  - /node_modules/@openclaw/
  - /openclaw/plugin-sdk/

All other paths pass through to original fs functions unchanged.

Manifest requires `activation.onStartup: true` so openclaw register()s
the plugin even though it exposes no tools/contracts (otherwise jiti
caches the module without ever calling register).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
h z
2026-05-27 10:17:54 +01:00
commit 49bcde41ec
7 changed files with 543 additions and 0 deletions

README.md Normal file

@@ -0,0 +1,108 @@
# OpenClaw Perf Cache
A 200-line OpenClaw plugin that wraps `fs.statSync` / `fs.lstatSync` /
`fs.realpathSync` (and their `fs.promises` async siblings) with a 1-second
TTL memo, **scoped to plugin-tree paths only**, as a workaround for an
upstream openclaw performance bug.
## Why this exists
`loadPluginMetadataSnapshot()` in openclaw's `dist/plugin-metadata-snapshot-*.js`
keeps a memo of the resolved plugin registry, but its cache-validity check
runs `hashWatchedFiles(memo.watchedFiles)` on **every lookup**. That call
re-fingerprints every plugin's `package.json` + `openclaw.plugin.json` +
source + setupSource paths via `realpathSync` → ancestor `lstat` chain.
On a prod gateway with ~100 installed plugins (the bundled
`dist/extensions/*` set), one cache-check pass is roughly:
```
100 plugins × 4 watched-files × 2 realpath/file × ~8 lstat/realpath
≈ 6 400 lstat + ~400 stat per call
≈ 6-7 s CPU
```
The check fires from many call sites — every agent turn (tool middleware
loader), every `loadConfig()` call, every channel routing decision —
turning what should be a cheap snapshot hit into a sustained CPU drain.
The same hot path shows up in these upstream tickets:
- [#86791 — repeated lstat/realpathSync in InstalledPluginIndex fingerprinting (memoization missing)](https://github.com/openclaw/openclaw/issues/86791) — **open, P2**, exact same call chain (`lstat and realpathSync under resolvePackageJsonPath -> buildInstalledManifestRegistryIndexKey -> resolveInstalledManifestRegistryIndexFingerprint`); two linked PRs (#86797, #86850) in progress. Once that lands, this plugin becomes unnecessary.
- [#67040 — persist plugin discovery cache + defer plugin loading](https://github.com/openclaw/openclaw/issues/67040) (closed as *not planned*)
- [#75297 — gateway event-loop saturation, very slow sessions.list after 2026.4.23](https://github.com/openclaw/openclaw/issues/75297) (workaround: rollback to 2026.4.23)
- [#28587 — plugin runtime eagerly loads channel SDKs causing sustained high CPU on startup](https://github.com/openclaw/openclaw/issues/28587) (closed by PR #28620, but only fixed the startup path, not the per-turn cost)
## What this plugin does
On `register()` it patches the global fs functions:
| Wrapped | Pass-through when |
|---|---|
| `fs.statSync` | path does NOT match a plugin-tree needle |
| `fs.lstatSync` | path does NOT match a plugin-tree needle |
| `fs.realpathSync` | path does NOT match a plugin-tree needle |
| `fs.promises.stat` | path does NOT match a plugin-tree needle |
| `fs.promises.lstat` | path does NOT match a plugin-tree needle |
| `fs.promises.realpath` | path does NOT match a plugin-tree needle |
Plugin-tree needles (substring match — any one matches):
- `/openclaw/dist/extensions/`
- `/.openclaw/plugins/`
- `/node_modules/@openclaw/`
- `/openclaw/plugin-sdk/`
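
The needle check can be sketched as follows. This is a minimal illustration, not the plugin's actual source: the names `NEEDLES` and `isPluginTreePath` are hypothetical, and the sketch assumes that only string paths are considered (Buffer and URL paths are treated as pass-through).

```javascript
// Needle list copied from the README; the helper name is hypothetical.
const NEEDLES = [
  '/openclaw/dist/extensions/',
  '/.openclaw/plugins/',
  '/node_modules/@openclaw/',
  '/openclaw/plugin-sdk/',
];

// Returns true when `p` contains any needle as a substring.
function isPluginTreePath(p) {
  // Assumption: non-string paths (Buffer, URL) are never memoized.
  if (typeof p !== 'string') return false;
  return NEEDLES.some((needle) => p.includes(needle));
}
```

A plain substring test keeps the per-call overhead on pass-through paths to a handful of `includes()` scans, which matters because most fs traffic never touches the plugin tree.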
Matched calls get a 1 000 ms TTL memo keyed by `(fn-name, path, JSON(opts))`.
A call that throws is cached too: reads of the same key within the window rethrow the same error.
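
A rough sketch of such a memo for the sync variants, assuming a plain `Map` as the store. The factory name `memoizeSync`, the entry shape, and the injectable `now` clock are illustrative choices, not taken from the plugin:

```javascript
const TTL_MS = 1000; // the 1 000 ms window

// Hypothetical factory: wraps a sync function `fn` under the label `name`.
// `now` is injectable so the expiry logic can be exercised without waiting.
function memoizeSync(name, fn, cache = new Map(), now = Date.now) {
  return function memoized(path, ...rest) {
    // Key is (fn-name, path, JSON of trailing args).
    const key = `${name}\0${path}\0${JSON.stringify(rest)}`;
    const hit = cache.get(key);
    if (hit && now() - hit.at < TTL_MS) {
      if (hit.threw) throw hit.error; // replay the cached failure
      return hit.value;
    }
    const entry = { at: now(), threw: false, value: undefined, error: undefined };
    try {
      entry.value = fn(path, ...rest);
    } catch (err) {
      entry.threw = true;
      entry.error = err;
    }
    cache.set(key, entry);
    if (entry.threw) throw entry.error;
    return entry.value;
  };
}
```

Caching failures matters here because a realpath walk may probe paths that do not exist; without it, every `ENOENT` would hit the disk again on each pass.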
Counters are logged once a minute:
```
[perf-cache] last 60s: hits=812 misses=27 (hit-ratio 96.8%) passthrough=1493 errors=0 cache_size=804
```
`passthrough` counts calls that bypassed the memo because the path is
outside the plugin tree. It reflects the rest of the system's fs traffic,
which this plugin should leave essentially unchanged.
## Safety notes
- **Pass-through for non-plugin paths.** Business code (logs, session files,
skills/, secrets/, anything outside the whitelist) sees the unmodified
`fs`. Only plugin-discovery paths are intercepted.
- **1 s TTL.** Cached results expire after 1 s, so a manifest change (mtime
has millisecond resolution) becomes visible at most ~1 s later. Dev-loop
impact is negligible.
- **Bounded memory.** The whole cache is dropped via `cache.clear()` once it holds more than 4 000 entries.
- **Idempotent.** Module re-import (jiti reload) is a no-op via a sentinel
flag on `globalThis`.
- **Argument-aware.** Cache key includes a JSON of trailing args so
`statSync(p)` and `statSync(p, { bigint: true })` don't collide.
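
The idempotency and memory-bound notes above could be implemented roughly like this. It is a sketch under stated assumptions: the sentinel key name and both helper names are hypothetical, and the plugin's real flag may be stored differently.

```javascript
const MAX_ENTRIES = 4000;
// Symbol.for() uses the global symbol registry, so a re-imported module
// copy resolves to the same key. The key name itself is an assumption.
const SENTINEL = Symbol.for('openclaw-perf-cache.installed');

// Runs `install` at most once per process, even if this module is re-evaluated.
function installOnce(install) {
  if (globalThis[SENTINEL]) return false; // already patched: no-op
  globalThis[SENTINEL] = true;
  install();
  return true;
}

// Drops everything once the cache exceeds the cap, then stores the new entry.
function boundedSet(cache, key, value) {
  if (cache.size > MAX_ENTRIES) cache.clear();
  cache.set(key, value);
}
```

Clearing the whole map is cruder than LRU eviction, but with a 1 s TTL most entries are stale by the time the cap is reached, so the cost is roughly one extra round of misses.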
## Install
```bash
git clone https://git.hangman-lab.top/hzhang/OpenclawPerfCache.git
cd OpenclawPerfCache
npm --prefix plugin install
node scripts/install.mjs --install
systemctl --user restart openclaw-gateway
```
## Update (rebuild + recopy, no config touch)
```bash
node scripts/install.mjs --update
systemctl --user restart openclaw-gateway
```
## Uninstall
```bash
node scripts/install.mjs --uninstall
systemctl --user restart openclaw-gateway
```
Once openclaw fixes the upstream cache-validity check, this plugin can be
uninstalled with no side effects.