Wraps the global fs functions with a 1s TTL memo, scoped via path whitelist
to plugin-discovery paths only.

Workaround for upstream openclaw issue #86791: `loadPluginMetadataSnapshot()`'s
cache-validity check re-runs `hashWatchedFiles` on every lookup, which walks
every plugin's package.json + manifest + source via realpathSync -> ancestor
lstat chain. On prod t2 with ~100 plugins, one cache-check pass is ~6 400 lstat
+ ~400 stat (~6-7s CPU per call). Fires on every agent turn, every
loadConfig() call, every channel routing decision.

This plugin doesn't fix the upstream design; it just absorbs the repeated
stats within a 1s window so the same paths aren't re-statted 6× per second
during a discovery walk.

Verified on prod t2 (2026-05-27):

- Cache hit ratio: 92.1-98.2% (stable across windows)
- Idle baseline (0 turn, 0 push): 0.6-3.7% CPU (was 25%+ pre-fix)
- Per-turn cost: notably reduced; previously 100% sustained per turn

Path whitelist:

- /openclaw/dist/extensions/
- /.openclaw/plugins/
- /node_modules/@openclaw/
- /openclaw/plugin-sdk/

All other paths pass through to original fs functions unchanged.

Manifest requires `activation.onStartup: true` so openclaw register()s the
plugin even though it exposes no tools/contracts (otherwise jiti caches the
module without ever calling register).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# OpenClaw Perf Cache

A 200-line OpenClaw plugin that wraps `fs.statSync` / `fs.lstatSync` /
`fs.realpathSync` (and their `fs.promises` async siblings) with a 1-second
TTL memo, **scoped to plugin-tree paths only**, as a workaround for an
upstream openclaw performance bug.

## Why this exists

`loadPluginMetadataSnapshot()` in openclaw's `dist/plugin-metadata-snapshot-*.js`
keeps a memo of the resolved plugin registry, but its cache-validity check
runs `hashWatchedFiles(memo.watchedFiles)` on **every lookup**. That call
re-fingerprints every plugin's `package.json` + `openclaw.plugin.json` +
source + setupSource paths via `realpathSync` → ancestor `lstat` chain.

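The pattern described above can be sketched in isolation. This is a hypothetical stand-in, not openclaw's code: `makeSnapshotLoader`, `statFile` and `buildSnapshot` are names invented for illustration, and the fingerprint is deliberately simplistic. It only shows why a memo whose validity check re-hashes every watched file pays the full stat cost even on a hit.

```javascript
// Hypothetical stand-in for the upstream pattern; none of these names are
// openclaw's real internals.
function makeSnapshotLoader(watchedFiles, statFile, buildSnapshot) {
  let memo = null;
  let statCalls = 0;

  // Fingerprint every watched file. The cost grows with the file count.
  function hashWatched() {
    return watchedFiles
      .map((file) => {
        statCalls += 1;
        return `${file}:${statFile(file)}`;
      })
      .join("|");
  }

  function load() {
    const fingerprint = hashWatched(); // runs on EVERY lookup, hit or miss
    if (memo !== null && memo.fingerprint === fingerprint) {
      return memo.value; // a "hit" that still paid for a full re-hash
    }
    memo = { fingerprint, value: buildSnapshot() };
    return memo.value;
  }

  return { load, getStatCalls: () => statCalls };
}

// Usage: three lookups over four unchanged files still cost 12 stat calls.
const demoLoader = makeSnapshotLoader(
  ["a/package.json", "a/openclaw.plugin.json", "a/index.js", "a/setup.js"],
  () => 1, // fake mtime that never changes
  () => ({ plugins: ["a"] }),
);
demoLoader.load();
demoLoader.load();
demoLoader.load();
console.log(demoLoader.getStatCalls()); // 12
```

The snapshot itself is built only once here; the repeated cost is entirely in the validity check, which is the part this plugin absorbs.
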
On a prod gateway with ~100 installed plugins (the bundled
`dist/extensions/*` set), one cache-check pass is roughly:

```
100 plugins × 4 watched-files × 2 realpath/file × ~8 lstat/realpath
  ≈ 6 400 lstat + ~400 stat per call
  ≈ 6–7 s CPU
```

The check fires from many call sites (every agent turn via the tool
middleware loader, every `loadConfig()` call, every channel routing
decision), turning what should be a cheap snapshot hit into a sustained
CPU drain.

The same hot path is reported in these upstream tickets:

- [#86791: repeated lstat/realpathSync in InstalledPluginIndex fingerprinting (memoization missing)](https://github.com/openclaw/openclaw/issues/86791): **open, P2**, exact same call chain (`lstat and realpathSync under resolvePackageJsonPath -> buildInstalledManifestRegistryIndexKey -> resolveInstalledManifestRegistryIndexFingerprint`); two linked PRs (#86797, #86850) are in progress. Once they land, this plugin becomes unnecessary.
- [#67040: persist plugin discovery cache + defer plugin loading](https://github.com/openclaw/openclaw/issues/67040) (closed as *not planned*)
- [#75297: gateway event-loop saturation, very slow sessions.list after 2026.4.23](https://github.com/openclaw/openclaw/issues/75297) (workaround: roll back to 2026.4.23)
- [#28587: plugin runtime eagerly loads channel SDKs causing sustained high CPU on startup](https://github.com/openclaw/openclaw/issues/28587) (closed by PR #28620, which fixed only the startup path, not the per-turn cost)

## What this plugin does

On `register()` it patches the global fs functions:

| Wrapped | Pass-through when |
|---|---|
| `fs.statSync` | path does NOT match a plugin-tree needle |
| `fs.lstatSync` | path does NOT match a plugin-tree needle |
| `fs.realpathSync` | path does NOT match a plugin-tree needle |
| `fs.promises.stat` | path does NOT match a plugin-tree needle |
| `fs.promises.lstat` | path does NOT match a plugin-tree needle |
| `fs.promises.realpath` | path does NOT match a plugin-tree needle |

Plugin-tree needles (substring match; any one is enough):

- `/openclaw/dist/extensions/`
- `/.openclaw/plugins/`
- `/node_modules/@openclaw/`
- `/openclaw/plugin-sdk/`

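A minimal sketch of the substring check, assuming the needles are applied to the raw path string as passed in. The function name `isPluginTreePath`, the handling of non-string paths, and the example paths are assumptions made for illustration, not taken from the plugin source.

```javascript
// Needles copied from the list above; the function name is hypothetical.
const PLUGIN_TREE_NEEDLES = [
  "/openclaw/dist/extensions/",
  "/.openclaw/plugins/",
  "/node_modules/@openclaw/",
  "/openclaw/plugin-sdk/",
];

function isPluginTreePath(p) {
  // fs also accepts Buffers and URLs; this sketch lets those pass through.
  if (typeof p !== "string") return false;
  return PLUGIN_TREE_NEEDLES.some((needle) => p.includes(needle));
}

// Example paths are made up for the demonstration.
console.log(isPluginTreePath("/opt/openclaw/dist/extensions/foo/package.json")); // true
console.log(isPluginTreePath("/var/log/openclaw/gateway.log")); // false
```

Because every needle ends in a slash, a directory path written without its trailing slash (for example `/home/u/.openclaw/plugins`) would not match in this sketch and would pass through.
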
Matched calls get a 1 000 ms TTL memo keyed by `(fn-name, path, JSON(opts))`.
Errors are cached as well: a call that threw rethrows the same error on
subsequent reads within the window.

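The memo described above could look roughly like the following for the synchronous functions. This is a sketch under assumptions, not the plugin's actual code: `memoizeSync`, the injected `shouldCache` predicate and the injected `now` clock are hypothetical, and a fake stat function stands in for the real `fs` so the example has no side effects.

```javascript
const TTL_MS = 1000;

// Wrap a synchronous fs-style function with a TTL memo.
// `shouldCache` and `now` are injected (assumptions of this sketch) so it can
// be exercised without patching the real fs or waiting on the real clock.
function memoizeSync(fnName, original, shouldCache, now = Date.now) {
  const cache = new Map();
  return function wrapped(path, ...rest) {
    if (!shouldCache(path)) return original(path, ...rest); // pass-through

    // Key on (fn-name, path, JSON(opts)) so differing options never collide.
    const key = JSON.stringify([fnName, path, rest]);
    const entry = cache.get(key);
    if (entry !== undefined && now() - entry.at < TTL_MS) {
      if (entry.threw) throw entry.error; // replay the cached failure
      return entry.value;
    }
    try {
      const value = original(path, ...rest);
      cache.set(key, { at: now(), threw: false, value });
      return value;
    } catch (error) {
      cache.set(key, { at: now(), threw: true, error });
      throw error;
    }
  };
}

// Usage with a fake stat function and a fake clock.
let fakeClock = 0;
let realCalls = 0;
const fakeStat = (p) => {
  realCalls += 1;
  return { path: p, call: realCalls };
};
const cachedStat = memoizeSync(
  "statSync",
  fakeStat,
  (p) => typeof p === "string" && p.includes("/.openclaw/plugins/"),
  () => fakeClock,
);

const manifest = "/home/u/.openclaw/plugins/demo/openclaw.plugin.json";
cachedStat(manifest);
cachedStat(manifest); // served from the memo
console.log(realCalls); // 1
fakeClock += TTL_MS; // the entry has now expired
cachedStat(manifest);
console.log(realCalls); // 2
```

An async variant for the `fs.promises` functions would follow the same shape, storing the settled result (or rejection) instead of a synchronous return value.
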
Counters are logged once a minute:

```
[perf-cache] last 60s: hits=812 misses=27 (hit-ratio 96.8%) passthrough=1493 errors=0 cache_size=804
```

`passthrough` counts calls that bypassed the memo because the path wasn't a
plugin-tree path. That count is essentially the "rest of the system" and
should be mostly unaffected by this plugin.

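As an illustration of how such a line could be produced, here is a small sketch. It assumes the hit ratio is `hits / (hits + misses)`, which is consistent with the sample line (812 / 839 ≈ 96.8%) but is not confirmed by the source; `formatCounters` and `startReporter` are hypothetical names.

```javascript
// Format one report line. The ratio definition is an assumption that happens
// to reproduce the sample output above.
function formatCounters(c) {
  const lookups = c.hits + c.misses;
  const ratio =
    lookups === 0 ? "n/a" : `${((c.hits / lookups) * 100).toFixed(1)}%`;
  return (
    `[perf-cache] last 60s: hits=${c.hits} misses=${c.misses} ` +
    `(hit-ratio ${ratio}) passthrough=${c.passthrough} errors=${c.errors} ` +
    `cache_size=${c.cacheSize}`
  );
}

// Log once a minute and reset the window. Not started in this sketch.
function startReporter(counters, getCacheSize, log = console.log) {
  const timer = setInterval(() => {
    log(formatCounters({ ...counters, cacheSize: getCacheSize() }));
    counters.hits = counters.misses = counters.passthrough = counters.errors = 0;
  }, 60_000);
  timer.unref(); // do not keep the process alive just for logging
  return timer;
}

const sampleLine = formatCounters({
  hits: 812,
  misses: 27,
  passthrough: 1493,
  errors: 0,
  cacheSize: 804,
});
console.log(sampleLine);
```
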
## Safety notes

- **Pass-through for non-plugin paths.** Business code (logs, session files,
  skills/, secrets/, anything outside the whitelist) sees the unmodified
  `fs`. Only plugin-discovery paths are intercepted.
- **1 s TTL.** Plugin manifest mtime resolution is millisecond-level, but a
  cached stat is served until its entry expires, so a manifest change
  becomes visible at most ~1 s later. Dev-loop impact is negligible.
- **Bounded memory.** `cache.clear()` fires when the cache holds more than
  4 000 entries.
- **Idempotent.** Module re-import (jiti reload) is a no-op via a sentinel
  flag on `globalThis`.
- **Argument-aware.** The cache key includes a JSON encoding of the trailing
  args, so `statSync(p)` and `statSync(p, { bigint: true })` don't collide.

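Two of the notes above, the `globalThis` sentinel and the size bound, can be sketched as follows. The sentinel key, the helper names `installOnce` and `boundedSet`, and the choice to check the size before inserting are assumptions for illustration; only the 4 000-entry threshold and the use of `cache.clear()` come from the notes.

```javascript
// Hypothetical sentinel key. Symbol.for uses the global symbol registry, so a
// re-evaluated copy of the module resolves to the same key.
const SENTINEL = Symbol.for("openclaw-perf-cache.installed");
const MAX_ENTRIES = 4000;

function installOnce(install) {
  if (globalThis[SENTINEL]) return false; // already patched: no-op
  globalThis[SENTINEL] = true;
  install();
  return true;
}

function boundedSet(cache, key, value) {
  // Crude but cheap: drop everything once the cap has been exceeded.
  if (cache.size > MAX_ENTRIES) cache.clear();
  cache.set(key, value);
}

let installs = 0;
const first = installOnce(() => { installs += 1; });
const second = installOnce(() => { installs += 1; }); // simulated re-import
console.log(first, second, installs); // true false 1
```

Clearing the whole map is less precise than evicting expired entries one by one, but with a 1 s TTL the discarded entries would have expired almost immediately anyway.
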
## Install

```bash
git clone https://git.hangman-lab.top/hzhang/OpenclawPerfCache.git
cd OpenclawPerfCache
npm --prefix plugin install
node scripts/install.mjs --install
systemctl --user restart openclaw-gateway
```

## Update (rebuild + recopy, no config touch)

```bash
node scripts/install.mjs --update
systemctl --user restart openclaw-gateway
```

## Uninstall

```bash
node scripts/install.mjs --uninstall
systemctl --user restart openclaw-gateway
```

If openclaw ever fixes the upstream cache-validity check, this plugin can
be uninstalled with no consequence.