init: OpenClaw Perf Cache — fs.{stat,lstat,realpath}{,Sync} TTL memo for plugin-tree paths
Wraps the global fs functions with a 1s TTL memo, scoped via path whitelist to plugin-discovery paths only.

Workaround for upstream openclaw issue #86791: `loadPluginMetadataSnapshot()`'s cache-validity check re-runs `hashWatchedFiles` on every lookup, which walks every plugin's package.json + manifest + source via realpathSync -> ancestor lstat chain. On prod t2 with ~100 plugins, one cache-check pass is ~6 400 lstat + ~400 stat (~6-7s CPU per call). Fires on every agent turn, every loadConfig() call, every channel routing decision.

This plugin doesn't fix the upstream design; it just absorbs the repeated stats within a 1s window so the same paths aren't re-statted 6× per second during a discovery walk.

Verified on prod t2 (2026-05-27):
- Cache hit ratio: 92.1-98.2% (stable across windows)
- Idle baseline (0 turn, 0 push): 0.6-3.7% CPU (was 25%+ pre-fix)
- Per-turn cost: notably reduced; previously 100% sustained per turn

Path whitelist:
- /openclaw/dist/extensions/
- /.openclaw/plugins/
- /node_modules/@openclaw/
- /openclaw/plugin-sdk/

All other paths pass through to original fs functions unchanged.

Manifest requires `activation.onStartup: true` so openclaw register()s the plugin even though it exposes no tools/contracts (otherwise jiti caches the module without ever calling register).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
108
README.md
Normal file
@@ -0,0 +1,108 @@
# OpenClaw Perf Cache

A 200-line OpenClaw plugin that wraps `fs.statSync` / `fs.lstatSync` /
`fs.realpathSync` (and their `fs.promises` async siblings) with a 1-second
TTL memo, **scoped to plugin-tree paths only**, as a workaround for an
upstream openclaw performance bug.

## Why this exists

`loadPluginMetadataSnapshot()` in openclaw's `dist/plugin-metadata-snapshot-*.js`
keeps a memo of the resolved plugin registry, but its cache-validity check
runs `hashWatchedFiles(memo.watchedFiles)` on **every lookup**. That call
re-fingerprints every plugin's `package.json` + `openclaw.plugin.json` +
source + setupSource paths via `realpathSync` → ancestor `lstat` chain.

On a prod gateway with ~100 installed plugins (the bundled
`dist/extensions/*` set), one cache-check pass is roughly:

```
100 plugins × 4 watched-files × 2 realpath/file × ~8 lstat/realpath
≈ 6 400 lstat + ~400 stat per call
≈ 6–7 s CPU
```
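
The estimate above can be reproduced with a few lines of arithmetic. This is only a back-of-envelope sketch using the README's own round numbers; the variable names are illustrative, and the "one `stat` per watched file" factor is an assumption that happens to match the ~400 figure rather than something taken from the openclaw source.

```javascript
// Back-of-envelope syscall count for one cache-validity check.
const plugins = 100;
const watchedFilesPerPlugin = 4; // package.json, openclaw.plugin.json, source, setupSource
const realpathsPerFile = 2;
const lstatsPerRealpath = 8;     // approximate ancestor-chain depth

const lstatCalls =
  plugins * watchedFilesPerPlugin * realpathsPerFile * lstatsPerRealpath;

// Assumption: one stat per watched file.
const statCalls = plugins * watchedFilesPerPlugin;

console.log({ lstatCalls, statCalls }); // { lstatCalls: 6400, statCalls: 400 }
```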

The check fires from many call sites: every agent turn (tool middleware
loader), every `loadConfig()` call, and every channel routing decision.
This turns what should be a cheap snapshot hit into a sustained CPU drain.

The same hot path is observed in these upstream tickets:

- [#86791: repeated lstat/realpathSync in InstalledPluginIndex fingerprinting (memoization missing)](https://github.com/openclaw/openclaw/issues/86791): **open, P2**, exact same call chain (`lstat and realpathSync under resolvePackageJsonPath -> buildInstalledManifestRegistryIndexKey -> resolveInstalledManifestRegistryIndexFingerprint`); two linked PRs (#86797, #86850) are in progress. Once a fix lands, this plugin becomes unnecessary.
- [#67040: persist plugin discovery cache + defer plugin loading](https://github.com/openclaw/openclaw/issues/67040) (closed as *not planned*)
- [#75297: gateway event-loop saturation, very slow sessions.list after 2026.4.23](https://github.com/openclaw/openclaw/issues/75297) (workaround: roll back to 2026.4.23)
- [#28587: plugin runtime eagerly loads channel SDKs causing sustained high CPU on startup](https://github.com/openclaw/openclaw/issues/28587) (closed by PR #28620, which fixed only the startup path, not the per-turn cost)

## What this plugin does

On `register()` it patches the global fs functions:

| Wrapped | Pass-through when |
|---|---|
| `fs.statSync` | path does NOT match a plugin-tree needle |
| `fs.lstatSync` | path does NOT match a plugin-tree needle |
| `fs.realpathSync` | path does NOT match a plugin-tree needle |
| `fs.promises.stat` | path does NOT match a plugin-tree needle |
| `fs.promises.lstat` | path does NOT match a plugin-tree needle |
| `fs.promises.realpath` | path does NOT match a plugin-tree needle |
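
A minimal sketch of how such a patch could be installed is shown below. It is not the plugin's actual source: `installFsPatch`, `wrapOne`, `shouldCache` and `memoize` are hypothetical names, and the fs object is passed in as a parameter so the example stays self-contained. Sending non-string paths (`Buffer`, `URL`) straight through is also an assumption of this sketch.

```javascript
// Sketch only: all function names here are hypothetical.
const SYNC_FNS = ["statSync", "lstatSync", "realpathSync"];
const ASYNC_FNS = ["stat", "lstat", "realpath"];

function wrapOne(target, name, shouldCache, memoize) {
  const original = target[name];
  const cached = memoize(name, original.bind(target));
  function wrapped(path, ...rest) {
    // Pass-through: non-string paths and paths outside the plugin tree
    // go straight to the original function.
    if (typeof path !== "string" || !shouldCache(path)) {
      return original.call(target, path, ...rest);
    }
    return cached(path, ...rest);
  }
  // Keep extra properties such as `fs.realpathSync.native`.
  target[name] = Object.assign(wrapped, original);
}

function installFsPatch(fsLike, shouldCache, memoize) {
  for (const name of SYNC_FNS) wrapOne(fsLike, name, shouldCache, memoize);
  for (const name of ASYNC_FNS) wrapOne(fsLike.promises, name, shouldCache, memoize);
}
```

In a real plugin the first argument would presumably be Node's own `fs` module object, with `shouldCache` being the needle check and `memoize` the TTL memo described in this section.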

Plugin-tree needles (substring match; any one matches):

- `/openclaw/dist/extensions/`
- `/.openclaw/plugins/`
- `/node_modules/@openclaw/`
- `/openclaw/plugin-sdk/`
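
The needle check can be sketched as a plain substring test. The function name `isPluginTreePath` and the example paths are hypothetical; only the four needle strings come from the list above.

```javascript
// The four needles listed in this README.
const PLUGIN_TREE_NEEDLES = [
  "/openclaw/dist/extensions/",
  "/.openclaw/plugins/",
  "/node_modules/@openclaw/",
  "/openclaw/plugin-sdk/",
];

// Hypothetical helper: true when any needle occurs anywhere in the path.
function isPluginTreePath(path) {
  return (
    typeof path === "string" &&
    PLUGIN_TREE_NEEDLES.some((needle) => path.includes(needle))
  );
}

console.log(isPluginTreePath("/opt/openclaw/dist/extensions/foo/package.json")); // true
console.log(isPluginTreePath("/var/log/openclaw/gateway.log"));                  // false
```

Because every needle ends in `/`, a plain substring test like this matches files below those directories but not the bare directory path itself (for example `/home/u/.openclaw/plugins` without a trailing slash), and it would not match backslash-separated Windows paths.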

Matched calls get a 1 000 ms TTL memo keyed by `(fn-name, path, JSON(opts))`.
Errors are cached too: a call that threw rethrows the same error on
subsequent reads within the window.
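
One way such a memo might look is sketched below, under stated assumptions: `memoizeWithTtl`, the injectable `now` clock, and the NUL-separated key layout are all illustrative choices, not the plugin's real code. Only the 1 000 ms TTL, the key ingredients, and the error replay come from the description above.

```javascript
// Sketch only: a TTL memo that caches both results and thrown errors.
function memoizeWithTtl(name, fn, { ttlMs = 1000, now = Date.now, cache = new Map() } = {}) {
  return function memoized(path, ...rest) {
    // Key = function name + path + JSON of the trailing args.
    const key = `${name}\u0000${path}\u0000${JSON.stringify(rest)}`;
    const t = now();
    const hit = cache.get(key);
    if (hit !== undefined && t - hit.at < ttlMs) {
      if (hit.threw) throw hit.error; // replay the cached error
      return hit.value;
    }
    try {
      const value = fn(path, ...rest);
      cache.set(key, { at: t, threw: false, value });
      return value;
    } catch (error) {
      cache.set(key, { at: t, threw: true, error });
      throw error;
    }
  };
}
```

If the wrapped function returns a promise, as the `fs.promises` variants do, this sketch caches the promise object itself, so a rejection is replayed within the window as well.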

Counters are logged once a minute:

```
[perf-cache] last 60s: hits=812 misses=27 (hit-ratio 96.8%) passthrough=1493 errors=0 cache_size=804
```
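
The hit ratio in that line is consistent with `hits / (hits + misses)`, with pass-through calls excluded: 812 / 839 ≈ 96.8%. A small formatter reproducing the sample line could look like this; `formatCounters` and its field names are hypothetical, and the `n/a` fallback for an empty window is an assumption.

```javascript
// Sketch only: builds a log line in the same shape as the sample above.
function formatCounters({ hits, misses, passthrough, errors, cacheSize }) {
  const lookups = hits + misses; // pass-through calls are not part of the ratio
  const ratio =
    lookups === 0 ? "n/a" : `${((hits / lookups) * 100).toFixed(1)}%`;
  return (
    `[perf-cache] last 60s: hits=${hits} misses=${misses} ` +
    `(hit-ratio ${ratio}) passthrough=${passthrough} ` +
    `errors=${errors} cache_size=${cacheSize}`
  );
}

console.log(
  formatCounters({ hits: 812, misses: 27, passthrough: 1493, errors: 0, cacheSize: 804 })
);
```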

`passthrough` counts calls that bypassed the memo because the path wasn't a
plugin-tree path. That count is essentially "the rest of the system" and
should be mostly unaffected by this plugin.

## Safety notes

- **Pass-through for non-plugin paths.** Business code (logs, session files,
  skills/, secrets/, anything outside the whitelist) sees the unmodified
  `fs`. Only plugin-discovery paths are intercepted.
- **1 s TTL.** Plugin manifest mtime resolution is millisecond-level, so the
  memo is the only source of staleness: a manifest change becomes visible at
  most ~1 s later. Dev-loop impact is negligible.
- **Bounded memory.** `cache.clear()` fires when the cache exceeds 4 000
  entries.
- **Idempotent.** Module re-import (jiti reload) is a no-op via a sentinel
  flag on `globalThis`.
- **Argument-aware.** The cache key includes the JSON-serialized trailing
  args, so `statSync(p)` and `statSync(p, { bigint: true })` don't collide.
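
The bounded-memory and idempotency notes above can be sketched as follows. This is an illustration under assumptions, not the plugin's code: `setBounded`, `installOnce`, and the `Symbol.for` key name are hypothetical, and whether the real plugin checks the size before or after inserting is not stated in this README.

```javascript
const MAX_ENTRIES = 4000;
// Symbol.for uses the global symbol registry, so a re-evaluated module
// gets the same symbol back. The key name is made up for this sketch.
const SENTINEL = Symbol.for("openclaw-perf-cache.installed");

// Whole-cache reset instead of per-entry eviction: simple, and with a
// short TTL the entries would expire soon anyway.
function setBounded(cache, key, value) {
  if (cache.size > MAX_ENTRIES) cache.clear();
  cache.set(key, value);
}

// Runs `install` at most once per host object (globalThis by default).
function installOnce(install, host = globalThis) {
  if (host[SENTINEL]) return false; // already patched: no-op
  host[SENTINEL] = true;
  install();
  return true;
}
```

Keeping the flag on `globalThis` rather than in module scope is what would make a jiti reload harmless: module-level state is recreated on re-import, while the global flag survives.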

## Install

```bash
git clone https://git.hangman-lab.top/hzhang/OpenclawPerfCache.git
cd OpenclawPerfCache
npm --prefix plugin install
node scripts/install.mjs --install
systemctl --user restart openclaw-gateway
```

## Update (rebuild + recopy, no config touch)

```bash
node scripts/install.mjs --update
systemctl --user restart openclaw-gateway
```

## Uninstall

```bash
node scripts/install.mjs --uninstall
systemctl --user restart openclaw-gateway
```

If openclaw ever fixes the upstream cache-validity check, this plugin can
be uninstalled with no consequence.