OpenclawPerfCache/README.md
hzhang dfb581f5f0 docs: note upstream PRs #86797/#86850 share the same WeakMap strategy, narrower scope
Both PRs use WeakMap<index-object, fingerprint> to memoize
`resolveInstalledManifestRegistryIndexFingerprint`. Different layer
than this plugin's fs-level TTL — theirs only fires when the same
InstalledPluginIndex instance is passed twice; ours fires for every
stat/realpath on a plugin-tree path regardless of caller.

discoverInDirectory walks, the realpathSync ancestor-lstat chain, and
the loadConfig()-via-pushMetaToMonitor path don't go through the
fingerprint function, so this plugin keeps catching that traffic even
once their PR lands. Reevaluate the need then.

Both upstream PRs are blocked on review (type-check failure + need for
real-behavior profile evidence). Our prod 92-98% hit-ratio + 25%→<1%
baseline is exactly the evidence shape they're asking for, but
upstream involvement is intentionally out of scope per project decision.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 10:37:11 +01:00

# OpenClaw Perf Cache
A 200-line OpenClaw plugin that wraps `fs.statSync` / `fs.lstatSync` /
`fs.realpathSync` (and their `fs.promises` async siblings) with a 1-second
TTL memo, **scoped to plugin-tree paths only**, as a workaround for an
upstream openclaw performance bug.
## Why this exists
`loadPluginMetadataSnapshot()` in openclaw's `dist/plugin-metadata-snapshot-*.js`
keeps a memo of the resolved plugin registry, but its cache-validity check
runs `hashWatchedFiles(memo.watchedFiles)` on **every lookup**. That call
re-fingerprints every plugin's `package.json` + `openclaw.plugin.json` +
source + setupSource paths via `realpathSync` → ancestor `lstat` chain.
On a prod gateway with ~100 installed plugins (the bundled
`dist/extensions/*` set), one cache-check pass is roughly:
```
100 plugins × 4 watched-files × 2 realpath/file × ~8 lstat/realpath
≈ 6 400 lstat + ~400 stat per call
≈ 67 s CPU
```
The check fires from many call sites — every agent turn (tool middleware
loader), every `loadConfig()` call, every channel routing decision —
turning what should be a cheap snapshot hit into a sustained CPU drain.
The same hot path is reported in these upstream tickets:
- [#86791 — repeated lstat/realpathSync in InstalledPluginIndex fingerprinting (memoization missing)](https://github.com/openclaw/openclaw/issues/86791) — **open, P2**, exact same call chain (`lstat and realpathSync under resolvePackageJsonPath -> buildInstalledManifestRegistryIndexKey -> resolveInstalledManifestRegistryIndexFingerprint`). Two linked PRs (#86797, #86850) both try the same WeakMap-keyed-by-index-object memo strategy; both are blocked on review (type-check + real-behavior-proof). They only cover `resolveInstalledManifestRegistryIndexFingerprint` — they don't help with `discoverInDirectory` walks or with `realpathSync`'s ancestor-`lstat` chain. Even after they merge, this plugin's fs-layer TTL still catches stat traffic the WeakMap doesn't see; reevaluate then.
- [#67040 — persist plugin discovery cache + defer plugin loading](https://github.com/openclaw/openclaw/issues/67040) (closed as *not planned*)
- [#75297 — gateway event-loop saturation, very slow sessions.list after 2026.4.23](https://github.com/openclaw/openclaw/issues/75297) (workaround: rollback to 2026.4.23)
- [#28587 — plugin runtime eagerly loads channel SDKs causing sustained high CPU on startup](https://github.com/openclaw/openclaw/issues/28587) (closed by PR #28620, but only fixed the startup path, not the per-turn cost)
## What this plugin does
On `register()` it patches the global fs functions:
| Wrapped | Pass-through when |
|---|---|
| `fs.statSync` | path does NOT match a plugin-tree needle |
| `fs.lstatSync` | path does NOT match a plugin-tree needle |
| `fs.realpathSync` | path does NOT match a plugin-tree needle |
| `fs.promises.stat` | path does NOT match a plugin-tree needle |
| `fs.promises.lstat` | path does NOT match a plugin-tree needle |
| `fs.promises.realpath` | path does NOT match a plugin-tree needle |
Plugin-tree needles (substring match — any one matches):
- `/openclaw/dist/extensions/`
- `/.openclaw/plugins/`
- `/node_modules/@openclaw/`
- `/openclaw/plugin-sdk/`
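
The substring match above can be sketched as a small predicate. This is a minimal illustration, not the plugin's actual source: the needle strings are copied from the list above, while the names `PLUGIN_TREE_NEEDLES` and `isPluginTreePath` and the decision to skip non-string paths are assumptions.

```javascript
// Needles copied from the README; the identifiers are hypothetical.
const PLUGIN_TREE_NEEDLES = [
  "/openclaw/dist/extensions/",
  "/.openclaw/plugins/",
  "/node_modules/@openclaw/",
  "/openclaw/plugin-sdk/",
];

// Returns true when the path contains at least one needle.
function isPluginTreePath(p) {
  // fs functions also accept Buffer and URL paths. Assumption: this sketch
  // only considers plain strings and lets everything else pass through.
  if (typeof p !== "string") return false;
  return PLUGIN_TREE_NEEDLES.some((needle) => p.includes(needle));
}
```

A path such as `/home/u/.openclaw/plugins/x/openclaw.plugin.json` matches, while `/var/log/openclaw/gateway.log` does not and would go straight to the unmodified `fs` function.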
Matched calls get a 1 000 ms TTL memo keyed by `(fn-name, path, JSON(opts))`.
Errors are cached as well: a call that threw rethrows the same error on subsequent reads within the window.
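
One way to picture the memo is the wrapper below. It is a sketch under stated assumptions rather than the plugin's code: `memoizeWithTtl`, the `shouldCache` callback, the injectable `now` clock, and the NUL-separated key layout are all hypothetical. Only the 1 000 ms TTL, the `(fn-name, path, JSON(opts))` key, and the cached-error behaviour come from the text above.

```javascript
const TTL_MS = 1000;

// Wrap a synchronous fs-style function with a TTL memo.
// `shouldCache(path)` decides whether a call is eligible; `now` is
// injectable so the sketch can be exercised without real timers.
function memoizeWithTtl(fnName, fn, shouldCache, cache = new Map(), now = Date.now) {
  return function wrapped(path, ...rest) {
    // Non-matching paths go straight to the original function.
    if (!shouldCache(path)) return fn.call(this, path, ...rest);

    // Key on function name, path, and a JSON of the trailing arguments.
    const key = `${fnName}\0${path}\0${JSON.stringify(rest)}`;
    const entry = cache.get(key);
    if (entry && now() - entry.at < TTL_MS) {
      if (entry.ok) return entry.value;
      throw entry.error; // replay the cached failure
    }

    try {
      const value = fn.call(this, path, ...rest);
      cache.set(key, { ok: true, value, at: now() });
      return value;
    } catch (error) {
      cache.set(key, { ok: false, error, at: now() });
      throw error;
    }
  };
}
```

In this sketch a repeated call within the window returns the identical result object, so callers that mutate a returned `Stats` object would see each other's changes; whether the real plugin copies results is not stated here.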
Counters are logged once a minute:
```
[perf-cache] last 60s: hits=812 misses=27 (hit-ratio 96.8%) passthrough=1493 errors=0 cache_size=804
```
`passthrough` counts calls that bypassed the memo because the path was not a
plugin-tree path. That count is essentially the "rest of the system" and
should be mostly unaffected by this plugin.
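
The sample line is consistent with a hit ratio computed as `hits / (hits + misses)`, with `passthrough` excluded: 812 / (812 + 27) ≈ 96.8%. A hypothetical formatter that reproduces the line from a counters object (the function name and field names are assumptions) might look like this:

```javascript
// Hypothetical formatter; only the output shape is taken from the README.
function formatCounters({ hits, misses, passthrough, errors, cacheSize }) {
  const lookups = hits + misses; // passthrough calls never reach the memo
  const ratio = lookups === 0 ? "n/a" : `${((hits / lookups) * 100).toFixed(1)}%`;
  return (
    `[perf-cache] last 60s: hits=${hits} misses=${misses} ` +
    `(hit-ratio ${ratio}) passthrough=${passthrough} ` +
    `errors=${errors} cache_size=${cacheSize}`
  );
}
```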
## Safety notes
- **Pass-through for non-plugin paths.** Business code (logs, session files,
skills/, secrets/, anything outside the whitelist) sees the unmodified
`fs`. Only plugin-discovery paths are intercepted.
- **1 s TTL.** Plugin manifest mtimes have millisecond resolution, so the
memo is the only added staleness: a manifest change becomes visible at most
~1 s later. Dev-loop impact is negligible.
- **Bounded memory.** The whole cache is dropped via `cache.clear()` once it holds more than 4 000 entries.
- **Idempotent.** Module re-import (jiti reload) is a no-op via a sentinel
flag on `globalThis`.
- **Argument-aware.** Cache key includes a JSON of trailing args so
`statSync(p)` and `statSync(p, { bigint: true })` don't collide.
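
The idempotency and memory-bound notes above can be illustrated with two small helpers. This is a sketch only: the sentinel name, `installOnce`, and `boundedSet` are hypothetical, and checking the size before inserting is an assumption about when the clear happens.

```javascript
// Hypothetical sentinel key; Symbol.for() returns the same symbol even if
// the module is evaluated more than once (for example after a reload).
const INSTALLED = Symbol.for("openclaw-perf-cache.installed");
const MAX_ENTRIES = 4000;

// Run `install` at most once per `target` (globalThis by default).
function installOnce(install, target = globalThis) {
  if (target[INSTALLED]) return false; // already patched, do nothing
  install();
  target[INSTALLED] = true;
  return true;
}

// Insert into the cache, dropping everything once it has grown past the bound.
function boundedSet(cache, key, value) {
  if (cache.size > MAX_ENTRIES) cache.clear();
  cache.set(key, value);
}
```

Clearing the whole map is cruder than LRU eviction, but with a 1 s TTL most entries are stale by the time the bound is hit, so the cost of a full reset is small.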
## Install
```bash
git clone https://git.hangman-lab.top/hzhang/OpenclawPerfCache.git
cd OpenclawPerfCache
npm --prefix plugin install
node scripts/install.mjs --install
systemctl --user restart openclaw-gateway
```
## Update (rebuild + recopy, no config touch)
```bash
node scripts/install.mjs --update
systemctl --user restart openclaw-gateway
```
## Uninstall
```bash
node scripts/install.mjs --uninstall
systemctl --user restart openclaw-gateway
```
If openclaw ever fixes the upstream cache-validity check, this plugin can
be uninstalled with no side effects.