OpenclawPerfCache/README.md
hzhang 49bcde41ec init: OpenClaw Perf Cache — fs.{stat,lstat,realpath}{,Sync} TTL memo for plugin-tree paths
Wraps the global fs functions with a 1s TTL memo, scoped via path
whitelist to plugin-discovery paths only. Workaround for upstream
openclaw issue #86791: `loadPluginMetadataSnapshot()`'s cache-validity
check re-runs `hashWatchedFiles` on every lookup, which walks every
plugin's package.json + manifest + source via realpathSync ->
ancestor lstat chain. On prod t2 with ~100 plugins, one cache-check
pass is ~6 400 lstat + ~400 stat (~6-7s CPU per call). Fires on every
agent turn, every loadConfig() call, every channel routing decision.

This plugin doesn't fix the upstream design; it just absorbs the
repeated stats within a 1s window so the same paths aren't re-statted
6× per second during a discovery walk.

Verified on prod t2 (2026-05-27):
  - Cache hit ratio: 92.1-98.2% (stable across windows)
  - Idle baseline (0 turn, 0 push): 0.6-3.7% CPU (was 25%+ pre-fix)
  - Per-turn cost: notably reduced; previously 100% sustained per turn

Path whitelist:
  - /openclaw/dist/extensions/
  - /.openclaw/plugins/
  - /node_modules/@openclaw/
  - /openclaw/plugin-sdk/

All other paths pass through to original fs functions unchanged.

Manifest requires `activation.onStartup: true` so openclaw register()s
the plugin even though it exposes no tools/contracts (otherwise jiti
caches the module without ever calling register).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 10:17:54 +01:00

# OpenClaw Perf Cache
A 200-line OpenClaw plugin that wraps `fs.statSync` / `fs.lstatSync` /
`fs.realpathSync` (and their `fs.promises` async siblings) with a 1-second
TTL memo, **scoped to plugin-tree paths only**, as a workaround for an
upstream openclaw performance bug.
## Why this exists
`loadPluginMetadataSnapshot()` in openclaw's `dist/plugin-metadata-snapshot-*.js`
keeps a memo of the resolved plugin registry, but its cache-validity check
runs `hashWatchedFiles(memo.watchedFiles)` on **every lookup**. That call
re-fingerprints every plugin's `package.json` + `openclaw.plugin.json` +
source + setupSource paths via `realpathSync` → ancestor `lstat` chain.
On a prod gateway with ~100 installed plugins (the bundled
`dist/extensions/*` set), one cache-check pass is roughly:
```
100 plugins × 4 watched-files × 2 realpath/file × ~8 lstat/realpath
≈ 6 400 lstat + ~400 stat per call
≈ 6-7 s CPU
```
The check fires from many call sites — every agent turn (tool middleware
loader), every `loadConfig()` call, every channel routing decision —
turning what should be a cheap snapshot hit into a sustained CPU drain.
The same hot path is reported in these upstream tickets:
- [#86791 — repeated lstat/realpathSync in InstalledPluginIndex fingerprinting (memoization missing)](https://github.com/openclaw/openclaw/issues/86791) — **open, P2**, exact same call chain (`lstat and realpathSync under resolvePackageJsonPath -> buildInstalledManifestRegistryIndexKey -> resolveInstalledManifestRegistryIndexFingerprint`); two linked PRs (#86797, #86850) in progress. Once that lands, this plugin becomes unnecessary.
- [#67040 — persist plugin discovery cache + defer plugin loading](https://github.com/openclaw/openclaw/issues/67040) (closed as *not planned*)
- [#75297 — gateway event-loop saturation, very slow sessions.list after 2026.4.23](https://github.com/openclaw/openclaw/issues/75297) (workaround: rollback to 2026.4.23)
- [#28587 — plugin runtime eagerly loads channel SDKs causing sustained high CPU on startup](https://github.com/openclaw/openclaw/issues/28587) (closed by PR #28620, but only fixed the startup path, not the per-turn cost)
## What this plugin does
On `register()` it patches the global fs functions:
| Wrapped | Pass-through when |
|---|---|
| `fs.statSync` | path does NOT match a plugin-tree needle |
| `fs.lstatSync` | path does NOT match a plugin-tree needle |
| `fs.realpathSync` | path does NOT match a plugin-tree needle |
| `fs.promises.stat` | path does NOT match a plugin-tree needle |
| `fs.promises.lstat` | path does NOT match a plugin-tree needle |
| `fs.promises.realpath` | path does NOT match a plugin-tree needle |
Plugin-tree needles (substring match — any one matches):
- `/openclaw/dist/extensions/`
- `/.openclaw/plugins/`
- `/node_modules/@openclaw/`
- `/openclaw/plugin-sdk/`
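
The substring check above can be sketched as follows. This is a minimal illustration, not the plugin's actual source: `PLUGIN_TREE_NEEDLES` and `isPluginTreePath` are hypothetical names, and treating non-string paths (Buffer, URL) as non-matching is an assumption of this sketch.

```javascript
// Needle strings are the ones listed above; the identifiers are hypothetical.
const PLUGIN_TREE_NEEDLES = [
  "/openclaw/dist/extensions/",
  "/.openclaw/plugins/",
  "/node_modules/@openclaw/",
  "/openclaw/plugin-sdk/",
];

// Returns true when the path contains any needle as a substring.
// Assumption: non-string paths never match, so they would pass through uncached.
function isPluginTreePath(p) {
  if (typeof p !== "string") return false;
  return PLUGIN_TREE_NEEDLES.some((needle) => p.includes(needle));
}

console.log(isPluginTreePath("/opt/openclaw/dist/extensions/foo/package.json")); // true
console.log(isPluginTreePath("/var/log/openclaw/gateway.log")); // false
```

Because the match is a plain substring test, any path that happens to contain one of these segments is treated as a plugin-tree path, wherever it lives on disk.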
Matched calls get a 1 000 ms TTL memo keyed by `(fn-name, path, JSON(opts))`.
Errors are cached too: a call that threw rethrows the same error on subsequent calls within the window.
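
A TTL memo with error replay might look roughly like the sketch below. It is an illustration under stated assumptions, not the plugin's code: `memoizeWithTtl`, the key layout, and the injectable `now` clock are all hypothetical, and the example wraps a counting stand-in instead of the real `fs.statSync`.

```javascript
// Sketch only: names and key layout are hypothetical, not the plugin's API.
function memoizeWithTtl(fnName, fn, { ttlMs = 1000, now = Date.now } = {}) {
  const cache = new Map();
  // Key mirrors the (fn-name, path, JSON(opts)) shape described above.
  const keyOf = (path, opts) => `${fnName}\0${path}\0${JSON.stringify(opts ?? null)}`;

  return function memoized(path, opts) {
    const key = keyOf(path, opts);
    const entry = cache.get(key);
    if (entry && now() - entry.at < ttlMs) {
      if (entry.threw) throw entry.error; // replay the cached error
      return entry.value;
    }
    try {
      const value = fn(path, opts);
      cache.set(key, { at: now(), threw: false, value });
      return value;
    } catch (error) {
      cache.set(key, { at: now(), threw: true, error });
      throw error;
    }
  };
}

// Usage with a fake clock and a counting stand-in for fs.statSync.
let clock = 0;
let calls = 0;
const fakeStat = (p) => { calls += 1; return { path: p, call: calls }; };
const cachedStat = memoizeWithTtl("statSync", fakeStat, { now: () => clock });

cachedStat("/openclaw/plugin-sdk/index.js"); // miss: underlying function runs
clock = 500;
cachedStat("/openclaw/plugin-sdk/index.js"); // hit: served from the memo
clock = 1500;
cachedStat("/openclaw/plugin-sdk/index.js"); // entry expired: runs again
```

Injecting the clock keeps the expiry logic testable without sleeping; a real wrapper would presumably just use `Date.now`.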
Counters are logged once a minute:
```
[perf-cache] last 60s: hits=812 misses=27 (hit-ratio 96.8%) passthrough=1493 errors=0 cache_size=804
```
`passthrough` counts calls that bypassed the memo because the path was not a
plugin-tree path. It is effectively the "rest of the system" and should be
essentially unaffected by this plugin.
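
For reference, the hit ratio in the sample line is consistent with `hits / (hits + misses)`, with pass-through calls excluded. The formatter below is a hypothetical sketch that reproduces the sample line under that assumption; `formatStats` and its field names are not taken from the plugin.

```javascript
// Hypothetical formatter; assumes hit-ratio = hits / (hits + misses).
function formatStats({ hits, misses, passthrough, errors, cacheSize }) {
  const lookups = hits + misses;
  const ratio = lookups === 0 ? 0 : (hits / lookups) * 100;
  return `[perf-cache] last 60s: hits=${hits} misses=${misses} ` +
    `(hit-ratio ${ratio.toFixed(1)}%) passthrough=${passthrough} ` +
    `errors=${errors} cache_size=${cacheSize}`;
}

const line = formatStats({ hits: 812, misses: 27, passthrough: 1493, errors: 0, cacheSize: 804 });
console.log(line);
```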
## Safety notes
- **Pass-through for non-plugin paths.** Business code (logs, session files,
skills/, secrets/, anything outside the whitelist) sees the unmodified
`fs`. Only plugin-discovery paths are intercepted.
- **1 s TTL.** Plugin manifest mtimes have millisecond resolution, so the
only added staleness is the TTL itself: a manifest change becomes visible
at most ~1 s later. Dev-loop impact is negligible.
- **Bounded memory.** `cache.clear()` fires when entries > 4 000.
- **Idempotent.** Module re-import (jiti reload) is a no-op via a sentinel
flag on `globalThis`.
- **Argument-aware.** Cache key includes a JSON of trailing args so
`statSync(p)` and `statSync(p, { bigint: true })` don't collide.
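
The idempotence, bounded-memory, and argument-aware notes above can be illustrated together. The sketch below is hedged: the sentinel key, `installOnce`, and the fake `fs`-like object are hypothetical, the TTL and whitelist are omitted for brevity, and only the 4 000-entry limit comes from the notes above.

```javascript
// Sketch only: identifiers are assumptions, not the plugin's source.
const SENTINEL = Symbol.for("openclaw-perf-cache.installed"); // hypothetical key
const MAX_ENTRIES = 4000; // limit taken from the "Bounded memory" note

function installOnce(target, fnName, cache = new Map()) {
  if (globalThis[SENTINEL]) return false; // already patched: re-import is a no-op
  globalThis[SENTINEL] = true;

  const original = target[fnName];
  target[fnName] = function patched(path, ...rest) {
    // Trailing args are part of the key, so differing options never collide.
    const key = `${fnName}\0${path}\0${JSON.stringify(rest)}`;
    if (cache.has(key)) return cache.get(key);
    const value = original.call(this, path, ...rest);
    if (cache.size > MAX_ENTRIES) cache.clear(); // crude bound: drop everything
    cache.set(key, value);
    return value;
  };
  return true;
}

// Usage against a stand-in object instead of the real fs module.
const fakeFs = { statSync: (p, opts) => ({ p, bigint: Boolean(opts && opts.bigint) }) };
const first = installOnce(fakeFs, "statSync");  // patches the function
const second = installOnce(fakeFs, "statSync"); // sentinel set: nothing happens
console.log(first, second);
```

Clearing the whole map is cruder than LRU eviction, but with a 1 s TTL the entries would be refilled within a single discovery pass anyway.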
## Install
```bash
git clone https://git.hangman-lab.top/hzhang/OpenclawPerfCache.git
cd OpenclawPerfCache
npm --prefix plugin install
node scripts/install.mjs --install
systemctl --user restart openclaw-gateway
```
## Update (rebuild + recopy, no config touch)
```bash
node scripts/install.mjs --update
systemctl --user restart openclaw-gateway
```
## Uninstall
```bash
node scripts/install.mjs --uninstall
systemctl --user restart openclaw-gateway
```
If openclaw ever fixes the upstream cache-validity check, this plugin can
be uninstalled without consequence.