Standardized benchmarks for OpenClaw forks running on real hardware constraints. Each test runs in a Docker container matching the device's CPU and memory profile.
| # | Device | Fork | Score |
|---|---|---|---|
A repeatable, containerized pipeline that tests every fork under real hardware constraints.
The fork is cloned into a Docker container whose CPU and RAM limits match the target device.
Dependencies are installed with the fork's native toolchain: Go, Rust, Python, TypeScript, or C.
The entry point is detected, cold start is timed, peak memory is tracked via cgroup, and disk usage is measured.
Results are combined into a 0–100 composite score weighted across four dimensions.
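As a rough illustration of the first step, the sketch below builds a `docker run` command with CPU and memory limits taken from a device profile. The `DeviceProfile` class, the `docker_run_command` helper, the example device values, and the image name are all hypothetical and are not taken from the benchmark's own code; only the `--cpus`, `--memory`, and `--memory-swap` flags are standard Docker options.

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceProfile:
    """Hypothetical description of a target device's resource limits."""
    name: str
    cpus: float       # number of CPU cores the container may use
    memory_mb: int    # RAM limit in megabytes


def docker_run_command(profile: DeviceProfile, image: str, command: list[str]) -> list[str]:
    """Build a `docker run` argument list constrained to the device profile."""
    return [
        "docker", "run", "--rm",
        f"--cpus={profile.cpus}",
        f"--memory={profile.memory_mb}m",
        # Setting swap equal to memory prevents the container from using swap.
        f"--memory-swap={profile.memory_mb}m",
        image,
        *command,
    ]


if __name__ == "__main__":
    # Example values only; real device profiles would come from the benchmark config.
    profile = DeviceProfile(name="example-board", cpus=1.0, memory_mb=512)
    cmd = docker_run_command(profile, "example/fork-bench:latest", ["./run.sh"])
    print(shlex.join(cmd))
```

Returning the command as a list rather than a string keeps it safe to pass to `subprocess.run` without shell quoting issues.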
Capabilities tested: messaging, browser, code execution, memory, files, search, MCP, tool use.
Cold start time (clone + install + startup): under 5 s earns full marks.
Disk footprint after install: under 20 MB earns full marks.
5 pts for a successful dependency install, 5 pts for a successful startup.
Capabilities are detected via static source analysis and runtime module probing. Each passed test contributes 5 pts to the capabilities score.
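The scoring rules above could be combined roughly as in the following sketch. Only some of the numbers come from the page: 5 pts per passed capability test, 5 pts each for install and startup, and the full-marks thresholds of 5 s and 20 MB. Everything else is an assumption made for illustration: the 25-point maximum for each of speed and size, the linear fall-off past the threshold, and the points where the score reaches zero (60 s and 500 MB). The real pipeline may weight and interpolate differently.

```python
CAPABILITY_TESTS = (
    "messaging", "browser", "code_exec", "memory",
    "files", "search", "mcp", "tool_use",
)
POINTS_PER_CAPABILITY = 5      # from the page
INSTALL_POINTS = 5             # from the page
STARTUP_POINTS = 5             # from the page
SPEED_FULL_MARKS_S = 5.0       # from the page: under 5 s = full marks
SIZE_FULL_MARKS_MB = 20.0      # from the page: under 20 MB = full marks

# ASSUMPTIONS (not stated on the page): weights and zero points.
SPEED_MAX_POINTS = 25.0
SIZE_MAX_POINTS = 25.0
SPEED_ZERO_AT_S = 60.0
SIZE_ZERO_AT_MB = 500.0


def threshold_score(value: float, full_below: float, zero_at: float, max_points: float) -> float:
    """Full marks under the threshold, then an assumed linear fall-off to zero."""
    if value < full_below:
        return max_points
    if value >= zero_at:
        return 0.0
    return max_points * (zero_at - value) / (zero_at - full_below)


def capability_score(passed: set[str]) -> int:
    """5 pts for each known capability test that passed."""
    return POINTS_PER_CAPABILITY * sum(1 for name in CAPABILITY_TESTS if name in passed)


def composite_score(passed: set[str], installed: bool, started: bool,
                    cold_start_s: float, disk_mb: float) -> float:
    """Sum the four dimensions into a 0-100 score under the assumptions above."""
    build = (INSTALL_POINTS if installed else 0) + (STARTUP_POINTS if started else 0)
    speed = threshold_score(cold_start_s, SPEED_FULL_MARKS_S, SPEED_ZERO_AT_S, SPEED_MAX_POINTS)
    size = threshold_score(disk_mb, SIZE_FULL_MARKS_MB, SIZE_ZERO_AT_MB, SIZE_MAX_POINTS)
    return capability_score(passed) + build + speed + size


if __name__ == "__main__":
    print(composite_score({"messaging", "files"}, True, True, 3.2, 14.0))
```

With eight capability tests at 5 pts each, capabilities account for up to 40 points and build for 10, which is why this sketch splits the remaining 50 evenly between speed and size.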
Run benchmarks locally or contribute improvements