OpenClaw Health Monitor

OpenClaw Cross-Platform Health Monitor — diagnose and fix gateway issues. Background health polling, multi-channel alerts (WhatsApp/Telegram), event loop monitoring, and automated diagnostics. Works on Windows, Linux, and macOS.

Install

openclaw plugins install clawhub:@jordan-thirkle/openclaw-winhealth

Privacy & Security

This plugin monitors your gateway's operational health. As of v1.4.0, no data leaves your machine by default.

  • Monitoring probes are read-only — they check http://127.0.0.1:18789 without modifying gateway state
  • Diagnostic bundles are local files created only when you run winhealth_diagnostics or openclaw gateway diagnostics export. These may contain system metadata — review before sharing
  • Log tail extraction (winhealth_diagnostics --include_logs true) reads recent gateway log messages. Logs may contain file paths, identifiers, or operational metadata — use with caution
  • External alerts (WhatsApp/Telegram) are off by default (alertChannel: "none"). You must opt in explicitly, and you should understand what alert payloads contain before doing so
  • Alert payloads contain only: severity level, metric value, and recommended action. No API keys, conversations, or configuration data
  • Gateway token is read from your environment for local health probes only — never logged, persisted, or transmitted

Read the full disclosure: SECURITY.md | SkillSpector Audit


Why This Exists

OpenClaw gateways can experience performance regressions across all platforms. After extensive debugging of the 2026.5.22 performance regression (event loop blocking, CLI tool slowness, prewarm bottlenecks), I built this to automatically detect and diagnose these issues across Windows, Linux, and macOS.

This is the first system health monitoring tool on ClawHub. It ships with 27 automated tests that run in a CI/CD pipeline.

Features

🔍 Health Checks

  • Gateway health snapshot — event loop (p99, max, utilization), channel status, memory
  • Windows Scheduled Task — state, last result, last run time
  • Prewarm detection — identifies 2026.5.22+ provider auth prewarm blocking (30-79s stalls)
  • Stuck subagent detection — finds background subagents blocking gateway restart
  • CLI vs HTTP delta — the key discovery: CLI tool can be 20-30x slower than HTTP endpoint on Windows
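
The CLI vs HTTP delta above is simple arithmetic: divide the CLI response time by the HTTP response time for the same health query. The sketch below is an illustration only. The function name, the 20x cutoff (taken from the low end of the 20-30x range quoted above), and the sample timings are assumptions, not the plugin's actual code or measurements.

```typescript
// Hypothetical helper: compares two latency measurements for the same health query.
interface LatencyComparison {
  cliMs: number;
  httpMs: number;
  factor: number; // how many times slower the CLI was than HTTP
  suspicious: boolean; // true when the factor reaches the cutoff
}

function compareLatency(
  cliMs: number,
  httpMs: number,
  suspiciousFactor = 20, // assumed cutoff, not a documented plugin setting
): LatencyComparison {
  if (!(httpMs > 0) || !(cliMs >= 0)) {
    throw new RangeError("cliMs must be non-negative and httpMs must be positive");
  }
  const factor = cliMs / httpMs;
  return { cliMs, httpMs, factor, suspicious: factor >= suspiciousFactor };
}

// Made-up measurements: 150 ms over HTTP, 3600 ms through the CLI.
const sampleComparison = compareLatency(3600, 150);
console.log(
  `${sampleComparison.factor.toFixed(1)}x slower, suspicious=${sampleComparison.suspicious}`,
);
```

With these made-up timings the factor is 24, which falls inside the range described above.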

🚨 Alerts

  • WhatsApp and Telegram alerts when thresholds breach (off by default — requires explicit opt-in)
  • Configurable thresholds (event loop p99, memory RSS)
  • Alert management (list, dismiss, clear)
  • Optional auto-diagnose on alerts (off by default — see SECURITY.md)

🩺 Diagnostics

  • Full diagnostic bundle export (openclaw gateway diagnostics export) — review output before sharing
  • Recent log tail extraction (disabled by default — enable with include_logs: true)
  • Channel health probe
  • Gateway status summary

📊 Background Monitoring

  • Periodic health polling (configurable, default 5 minutes)
  • Automatic alert generation on degradation
  • Non-blocking — uses gateway_start lifecycle hook
  • Zero-dependency core (uses OpenClaw SDK only)

Installation

Prerequisites

  • OpenClaw ≥ 2026.5.0
  • Node.js ≥ 22.19
  • Windows 10/11, Linux, or macOS

Install the Plugin

openclaw plugins install clawhub:@jordan-thirkle/openclaw-winhealth

Install the Skill

openclaw skills install windows-health-monitor

Restart the gateway to load the plugin:

openclaw gateway restart

Post-Install Verification

After installation and restart, confirm the plugin is active:

# 1. Verify the plugin is registered
openclaw plugins inspect winhealth --runtime --json

# 2. Check the gateway log for startup
openclaw gateway logs --tail 20 | grep "winhealth"

# Expected output: "winhealth: started, polling every 5m"
# Expected output: "winhealth: health check passed" (after ~60s initial delay)

# 3. Run a manual health check
openclaw run winhealth_check --json

# Expected output: JSON with eventLoop, channels, agents, etc.

Uninstall

# Remove the plugin
openclaw plugins remove winhealth

# Remove the skill
openclaw skills remove windows-health-monitor

# Restart the gateway
openclaw gateway restart

Configuration

Add to your openclaw.json:

// Minimal config — local monitoring only, no external transmission:
{
  "plugins": {
    "allow": ["winhealth"],
    "entries": {
      "winhealth": {
        "enabled": true,
        "config": {
          "pollIntervalMinutes": 5,
          "eventLoopThresholdMs": 5000,
          "alertChannel": "none",
          "alertTarget": "",
          "checkPrewarm": true,
          "checkWindowsTask": true,
          "checkBackgroundSubagents": true
        }
      }
    }
  }
}

See SECURITY.md before enabling external alert channels.

Config Reference

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable or disable background monitoring |
| pollIntervalMinutes | integer | 5 | Minutes between health checks (1-60) |
| eventLoopThresholdMs | integer | 5000 | Event loop p99 threshold for alerts, in ms (500-30000) |
| alertChannel | string | "none" | Alert channel: "whatsapp", "telegram", or "none". Off by default; see SECURITY.md |
| alertTarget | string | "" | Target for alerts (phone number or user ID). Only used when alertChannel is not "none" |
| checkPrewarm | boolean | true | Check for provider auth prewarm blocking |
| checkWindowsTask | boolean | true | Check Windows Scheduled Task health |
| checkBackgroundSubagents | boolean | true | Check for stuck background subagents |
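
The ranges and defaults above can be checked before a config is written to openclaw.json. The following is a minimal sketch, assuming the field names and limits from the table. The validateConfig function is hypothetical and is not part of the plugin or the OpenClaw SDK. The rule that alertTarget must be non-empty once a channel is enabled is also an assumption.

```typescript
// Hypothetical validator: field names, defaults, and ranges come from the
// Config Reference table; the function itself is an illustration.
type AlertChannel = "none" | "whatsapp" | "telegram";

interface WinHealthConfig {
  enabled: boolean;
  pollIntervalMinutes: number;
  eventLoopThresholdMs: number;
  alertChannel: AlertChannel;
  alertTarget: string;
  checkPrewarm: boolean;
  checkWindowsTask: boolean;
  checkBackgroundSubagents: boolean;
}

// Defaults as documented in the table.
const DEFAULT_CONFIG: WinHealthConfig = {
  enabled: true,
  pollIntervalMinutes: 5,
  eventLoopThresholdMs: 5000,
  alertChannel: "none",
  alertTarget: "",
  checkPrewarm: true,
  checkWindowsTask: true,
  checkBackgroundSubagents: true,
};

function validateConfig(
  partial: Partial<WinHealthConfig>,
): { config: WinHealthConfig; errors: string[] } {
  // Fill in any missing fields with the documented defaults.
  const config: WinHealthConfig = { ...DEFAULT_CONFIG, ...partial };
  const errors: string[] = [];

  const isIntegerInRange = (n: number, min: number, max: number): boolean =>
    Number.isInteger(n) && n >= min && n <= max;

  if (!isIntegerInRange(config.pollIntervalMinutes, 1, 60)) {
    errors.push("pollIntervalMinutes must be an integer from 1 to 60");
  }
  if (!isIntegerInRange(config.eventLoopThresholdMs, 500, 30000)) {
    errors.push("eventLoopThresholdMs must be an integer from 500 to 30000");
  }
  // Runtime check, since values parsed from JSON are not type-checked.
  if (!["none", "whatsapp", "telegram"].includes(config.alertChannel)) {
    errors.push('alertChannel must be "none", "whatsapp", or "telegram"');
  }
  // Assumption: an enabled channel needs a non-empty target.
  if (config.alertChannel !== "none" && config.alertTarget.trim() === "") {
    errors.push('alertTarget is required when alertChannel is not "none"');
  }
  return { config, errors };
}
```

Calling validateConfig({}) returns the documented defaults with an empty error list, while validateConfig({ pollIntervalMinutes: 0 }) reports the out-of-range interval.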

Usage

Agent Tools

Once the plugin is loaded, agents can use three tools:

| Tool | Purpose |
| --- | --- |
| winhealth_check | Quick health snapshot: event loop, channels, Windows task, prewarm, alerts |
| winhealth_diagnostics | Full diagnostic bundle: export, logs, status, channels |
| winhealth_alerts | Manage alerts: list, dismiss, clear |

Ask your agent:

"Run a winhealth_check and tell me if anything is wrong."

"Run winhealth_diagnostics and summarize the findings."

"Show me active winhealth alerts."

Manual CLI

The skill also provides manual diagnostic commands:

# Quick health snapshot
openclaw health --verbose --json

# Channel status
openclaw channels status --probe

# Windows task (PowerShell, Windows only)
Get-ScheduledTask -TaskName "OpenClaw Gateway"

# Full diagnostic export
openclaw gateway diagnostics export

Web Dashboard

The plugin includes a live health dashboard with radial gauges, metrics cards, and alert history.

Prerequisites:

  • The canvas plugin must be installed and enabled in your OpenClaw config
  • Canvas host root configured (e.g., "host": { "root": "~/.openclaw/workspace/canvas" })

Setup:

# Clone the repo to get the dashboard files
git clone https://github.com/jordan-thirkle/openclaw-winhealth.git
# Copy the WinHealth dashboard to your canvas host root
cp openclaw-winhealth/dashboard/index.html ~/.openclaw/workspace/canvas/winhealth/index.html
# Copy the Command Center dashboard (optional)
cp openclaw-winhealth/dashboard/command.html ~/.openclaw/workspace/canvas/command/index.html

Access:

  • WinHealth Dashboard: http://127.0.0.1:18789/__openclaw__/canvas/winhealth/
  • Command Center: http://127.0.0.1:18789/__openclaw__/canvas/command/

The dashboard uses sessionStorage by default — your gateway token is cleared when you close the tab. Enable "Remember token" to persist it across sessions. See SECURITY.md for dashboard security details.
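
The token handling described above could look roughly like the sketch below. This is not the dashboard's actual code. The storage key, the function names, and the use of localStorage for the "Remember token" case are assumptions. The stores are passed in as parameters so the logic can run outside a browser.

```typescript
// Minimal subset of the Web Storage interface that this sketch needs.
interface TokenStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Hypothetical storage key; the real dashboard may use a different one.
const TOKEN_KEY = "winhealth.gatewayToken";

// Save the token in exactly one store so no stale copy is left behind.
function saveToken(
  token: string,
  remember: boolean,
  session: TokenStore,
  persistent: TokenStore,
): void {
  const target = remember ? persistent : session;
  const other = remember ? session : persistent;
  other.removeItem(TOKEN_KEY);
  target.setItem(TOKEN_KEY, token);
}

// Prefer the tab-scoped token, then fall back to the remembered one.
function loadToken(session: TokenStore, persistent: TokenStore): string | null {
  return session.getItem(TOKEN_KEY) ?? persistent.getItem(TOKEN_KEY);
}

// Remove the token from both stores, for example on an explicit sign-out.
function clearToken(session: TokenStore, persistent: TokenStore): void {
  session.removeItem(TOKEN_KEY);
  persistent.removeItem(TOKEN_KEY);
}
```

In a browser, the call would be saveToken(token, remember, sessionStorage, localStorage). Keeping the default on sessionStorage means the token disappears with the tab unless the user opts in.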

Alert Examples

Event Loop Degradation

⚠️ OpenClaw Health Alert
[CRITICAL] Event loop degraded: p99=8500ms (threshold 5000ms)

Consider: OPENCLAW_SKIP_PROVIDER_AUTH_PREWARM=1

Stuck Subagents

⚠️ OpenClaw Health Alert
[CRITICAL] 4 background subagent(s) blocking gateway restart

Prewarm Detection

⚠️ OpenClaw Health Alert
[WARNING] Provider auth prewarm slow: 68000ms. Consider OPENCLAW_SKIP_PROVIDER_AUTH_PREWARM=1
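
The alert texts above share one layout: a title line, a severity-tagged message that carries the metric value, and an optional recommendation. The sketch below shows one way to render that layout. The HealthAlert type and formatAlert function are hypothetical names for illustration, and the plugin's real payload structure may differ.

```typescript
type Severity = "WARNING" | "CRITICAL";

// Hypothetical alert shape: severity, a message carrying the metric value,
// and an optional recommended action. No tokens or configuration data.
interface HealthAlert {
  severity: Severity;
  message: string;
  recommendation?: string;
}

function formatAlert(alert: HealthAlert): string {
  const lines: string[] = [
    "⚠️ OpenClaw Health Alert",
    `[${alert.severity}] ${alert.message}`,
  ];
  if (alert.recommendation) {
    // Blank line before the recommendation, as in the event loop example.
    lines.push("", `Consider: ${alert.recommendation}`);
  }
  return lines.join("\n");
}

// Rebuilds the event loop degradation example from this section.
const eventLoopExample: HealthAlert = {
  severity: "CRITICAL",
  message: "Event loop degraded: p99=8500ms (threshold 5000ms)",
  recommendation: "OPENCLAW_SKIP_PROVIDER_AUTH_PREWARM=1",
};
console.log(formatAlert(eventLoopExample));
```

Keeping the payload to these three fields matches the privacy note that alerts carry only a severity level, a metric value, and a recommended action.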

Known Issues This Detects

| Issue | Detection |
| --- | --- |
| 2026.5.22 prewarm blocking | Log check for "provider auth state pre-warmed in Xms eventLoopMax=Yms" |
| CLI tool slowness | Health via HTTP vs CLI response time delta |
| Stuck background subagents | Log check for restart.*deferred.*background task.*active |
| WhatsApp reconnection storm | Channel health probe + connection age |
| Scheduled Task stall | Get-ScheduledTask state check |
| Memory pressure | RSS threshold monitoring |
| Event loop saturation | p99 delay + utilization monitoring |
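
The two log-based checks above can be sketched as regular expression scans over recent log lines. The patterns follow the wording in the table. Everything else here is an assumption for illustration: the scanLogLines function, the result shape, the tolerance for decimal millisecond values, and the sample log lines in the usage note.

```typescript
// Pattern for the prewarm log message quoted in the table above.
// Assumption: durations are plain numbers, optionally with decimals.
const PREWARM_PATTERN =
  /provider auth state pre-warmed in (\d+(?:\.\d+)?)ms eventLoopMax=(\d+(?:\.\d+)?)ms/;

// Pattern for deferred restarts, taken from the table above.
const STUCK_RESTART_PATTERN = /restart.*deferred.*background task.*active/;

interface PrewarmEvent {
  prewarmMs: number;
  eventLoopMaxMs: number;
}

interface LogFindings {
  prewarmEvents: PrewarmEvent[];
  stuckRestartLines: string[];
}

// Hypothetical scanner: collects prewarm timings and deferred-restart lines.
function scanLogLines(lines: string[]): LogFindings {
  const findings: LogFindings = { prewarmEvents: [], stuckRestartLines: [] };
  for (const line of lines) {
    const match = PREWARM_PATTERN.exec(line);
    if (match) {
      findings.prewarmEvents.push({
        prewarmMs: Number(match[1]),
        eventLoopMaxMs: Number(match[2]),
      });
    }
    if (STUCK_RESTART_PATTERN.test(line)) {
      findings.stuckRestartLines.push(line);
    }
  }
  return findings;
}
```

As a usage example with made-up log lines, scanLogLines(["provider auth state pre-warmed in 68000ms eventLoopMax=71234ms", "restart deferred: 4 background task(s) still active"]) would return one prewarm event of 68000 ms and one stuck-restart line.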

Architecture

Gateway Startup
  │
  └─ gateway_start hook
       │
       ├─ Initial health check (60s grace)
       └─ setInterval (configurable, default 5m)
            │
            ├─ HTTP health probe (127.0.0.1:18789/health)
            ├─ Windows task check (Get-ScheduledTask)
            ├─ Prewarm detection (log grep)
            ├─ Stuck subagent detection (log grep)
            │
            ├─ Threshold evaluation
            └─ Alert routing (WhatsApp / Telegram)
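
The threshold evaluation step in the diagram can be pictured as a pure function from a health snapshot to a list of findings, which the routing step then forwards. The sketch below is an assumption-heavy illustration. The snapshot fields, the memory and prewarm thresholds, and the severity chosen for memory pressure are hypothetical. Only the event loop threshold (eventLoopThresholdMs) is a documented config field, and the other severities mirror the Alert Examples.

```typescript
// Hypothetical snapshot produced by one polling cycle.
interface HealthSnapshot {
  eventLoopP99Ms: number;
  rssMb: number;
  stuckSubagents: number;
  prewarmMs?: number; // present only when a prewarm log line was found
}

// Only eventLoopP99Ms maps to a documented setting; the rest are assumed.
interface Thresholds {
  eventLoopP99Ms: number;
  rssMb: number;
  prewarmMs: number;
}

interface Finding {
  severity: "WARNING" | "CRITICAL";
  check: string;
  detail: string;
}

function evaluateSnapshot(snapshot: HealthSnapshot, limits: Thresholds): Finding[] {
  const findings: Finding[] = [];
  if (snapshot.eventLoopP99Ms > limits.eventLoopP99Ms) {
    findings.push({
      severity: "CRITICAL",
      check: "event-loop",
      detail: `p99=${snapshot.eventLoopP99Ms}ms (threshold ${limits.eventLoopP99Ms}ms)`,
    });
  }
  if (snapshot.stuckSubagents > 0) {
    findings.push({
      severity: "CRITICAL",
      check: "subagents",
      detail: `${snapshot.stuckSubagents} background subagent(s) blocking gateway restart`,
    });
  }
  if (snapshot.prewarmMs !== undefined && snapshot.prewarmMs > limits.prewarmMs) {
    findings.push({
      severity: "WARNING",
      check: "prewarm",
      detail: `provider auth prewarm slow: ${snapshot.prewarmMs}ms`,
    });
  }
  if (snapshot.rssMb > limits.rssMb) {
    // Assumed severity; the source does not state one for memory pressure.
    findings.push({
      severity: "WARNING",
      check: "memory",
      detail: `RSS ${snapshot.rssMb}MB (threshold ${limits.rssMb}MB)`,
    });
  }
  return findings;
}
```

Keeping evaluation free of I/O makes it easy to test on its own; an empty result means there is nothing to route.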

Development

git clone https://github.com/jordan-thirkle/openclaw-winhealth.git
cd openclaw-winhealth
npm install

# Test locally
openclaw plugins install .
openclaw plugins inspect winhealth --runtime --json

Publish to ClawHub

# Dry run
npm run publish:clawhub:dry

# Publish
npm run publish:clawhub

Contributing

Issues and PRs welcome. Before submitting:

  1. Test on Windows 10/11 native
  2. Run openclaw plugins inspect winhealth --runtime --json
  3. Include reproduction steps for any issues

License

MIT © Jordan Thirkle