Cost control for AI coding agents.
Budget guardrails, runaway loop detection, process wrappers, and output compression.
Built on the iii-engine. Four interfaces. Zero external dependencies.
Why · Quick Start · Budget · Runaway · Guard · Compression · Interfaces · Architecture
Notice. Subject is running four concurrent agents. Projected monthly spend exceeds threshold by 182%. Recommending immediate containment.
One developer. Six agents running in parallel. Claude Code in one terminal, Cursor open in the IDE, Codex piped into a script, Copilot silently billing by the token. The invoice lands at the end of the month and it is already too late.
Rimuru is the control plane between you and that bill. It discovers every agent on your machine, tracks spend in real time, enforces hard caps before the write hits state, detects runaway loops before they finish burning, wraps processes with a kill switch, and compresses tool output so context windows stop bleeding tokens. One engine. Four interfaces. Zero external dependencies -- no Postgres, no Redis, no Docker.
Everything ships as iii-engine primitives (Worker / Function / Trigger). State lives in the engine's in-memory KV under scoped namespaces. The CLI talks to the engine via trigger(). The Web UI, TUI, and Desktop app all share the same function surface.
System initiated. Acquiring agents. Preparing dashboard.
curl -fsSL https://raw.githubusercontent.com/rohitg00/rimuru/main/install.sh | bashInstalls the iii engine if missing. Drops rimuru-worker, rimuru, rimuru-tui into ~/.local/bin, copies the iii config to ~/.config/rimuru/config.yaml, and creates the durable state directory at ~/.local/share/rimuru/. Takes about thirty seconds on a warm cache.
iii --config ~/.config/rimuru/config.yaml # start iii with durable state
rimuru-worker # start the worker
open http://localhost:3100Rimuru stores cost records, budget counters, guard history, and session data under ~/.local/share/rimuru/ via iii-engine's file-backed KV. The shipped config flushes dirty state every 250 ms (save_interval_ms: 250), so restart-survival is bounded at roughly a quarter second of the most recent writes — iii-engine doesn't currently flush on shutdown, so anything written in the last flush window can be lost if the process is killed. For everyday use that's negligible; if you're running experiments where every last cost row matters, stop iii cleanly and give it a second before restart. Running bare iii (without --config) or iii --use-default-config falls back to the in-memory store, which iii itself warns against — everything you record disappears on shutdown.
Override the data directory with RIMURU_DATA_DIR when running the installer (e.g. RIMURU_DATA_DIR=/var/lib/rimuru ./install.sh); the installer rewrites the config in place so iii honors the override.
Detect your agents. See what you are spending. Set a cap.
rimuru agents detect
rimuru costs summary
rimuru config set budget_monthly 50
rimuru config set budget_action block
rimuru budget statusOther platforms: pre-built binaries, Docker image, or build from source.
Scanning filesystem. Six tools detected. Session histories indexed.
Rimuru auto-discovers agents from their local config directories. No token swap, no auth dance, no proxy -- it reads session history the way each tool writes it.
![]() Claude Code ~/.claude/
|
![]() Cursor ~/Library/Application Support/Cursor/
|
![]() GitHub Copilot VS Code extension |
![]() Codex ~/.config/codex/
|
![]() Goose ~/.config/goose/
|
![]() OpenCode ~/.opencode/
|
Model pricing is maintained for 8 models across 5 providers. Six agent adapters ship in-tree. Cost records are idempotent -- re-syncing the same session overwrites rather than duplicates.
Unique Skill acquired.
Budget Enginebinds cost records to configured thresholds. Monthly, daily, session, and per-agent-daily caps will be enforced before write.
Four cap levels. One enforcement path. Every cost record runs through rimuru.budget.check before it touches state. When caps are exceeded and budget_action = block, the recording is rejected and the error propagates back.
rimuru config set budget_monthly 50
rimuru config set budget_daily 5
rimuru config set budget_session 2
rimuru config set budget_daily_agent 10
rimuru config set budget_action block| Field | Default | What it caps |
|---|---|---|
budget_monthly |
0.0 |
Total spend for the current calendar month |
budget_daily |
0.0 |
Total spend for today |
budget_session |
0.0 |
Spend attributed to a single session_id |
budget_daily_agent |
0.0 |
Per-agent spend for today |
budget_alert_threshold |
0.8 |
Warning fires at this fraction of any cap |
budget_action |
alert |
alert (log + hook), warn (log only), or block (reject the record) |
Status projects end-of-month spend from the current daily rate against the actual days in this month.
rimuru budget status
# monthly_limit: $50.00 monthly_spent: $18.42 monthly_remaining: $31.58
# daily_limit: $5.00 daily_spent: $1.37 daily_remaining: $3.63
# burn_rate: $1.84/day projected_monthly: $55.20 // over capThreshold crossings fire the budget.warning hook. Cap breaches fire budget.exceeded. Alerts are persisted with millisecond + UUID keys so burst alerts never collide.
Fail-open on service outage. If the budget check itself is unreachable, cost recording logs a warning and proceeds. Only a successful "exceeded + block" response halts the write.
Analytical Skill engaged. Scanning last ten turns. Pattern recognition active. Severity score: 0.82. Recommending intervention.
Four detection patterns run over the last N turns of a session. Severity scoring tells you how stuck it is. Token accounting tells you how much was wasted.
| Pattern | Trigger | Severity |
|---|---|---|
repeated_calls |
Same tool name repeated > runaway_repeat_threshold times |
min(1, n/10) |
repeated_errors |
Identical content_type + role signature > threshold times |
min(1, n/8) |
token_explosion |
Last 3 turns > ratio × baseline input tokens (baseline excludes last 3) | min(1, (ratio-1)/4) |
oscillation |
Two tools alternating > 4 times | min(1, n/8) |
rimuru.runaway.analyze '{"session_id": "...", "window": 15}'
rimuru.runaway.scan '{}'
rimuru.runaway.configure '{"repeat_threshold": 4, "token_explosion_ratio": 2.5}'The token-explosion baseline excludes the last 3 turns so spike samples can never inflate the average they are compared against. Invalid thresholds (zero window, ratio ≤ 1.0, non-boolean auto_scan_enabled) are rejected at write time.
Bind Skill activated. Target process wrapped. Polling cost every five seconds. On threshold breach, issuing kill signal.
Wrap any agent process in a cost limit. The wrapper polls cost totals every five seconds from the guard start time and either warns or kills the child when the limit is crossed.
rimuru guard start --limit 5.00 --action kill -- claude
rimuru guard start --limit 2.00 --action warn -- cursor --cli
rimuru guard status
rimuru guard history- Spawn first, register second. If
Command::spawn()fails, nothing gets written to KV. The register payload includes the real PID. - Scoped cost query.
rimuru.costs.summaryis called withsince = started_at-- global totals from yesterday never trip today's guard. - Stderr diagnostics. The wrapper banner, warnings, and summary go to
stderrso they do not corrupt the wrapped process'sstdout. - Atomic completion. History is written before the active guard is deleted. A half-failed write leaves the ledger consistent.
- Typed flags.
--actionis aclap::ValueEnum(killorwarn).--limitrejects NaN, infinity, and non-positive values at parse time.
Transmutation Skill engaged. Tool response exceeds threshold. Selecting compression strategy. Auto mode:
JsonPaths. Estimated token savings: 73%.
The MCP proxy compresses tool responses over 2000 tokens before they reach the agent. Auto inspects the payload shape and routes to the strategy that fits.
| Strategy | Picked when | What it does |
|---|---|---|
Auto |
default -- inspects content | Routes to one of the strategies below |
JsonPaths |
object or array with > 50 keys or > 5K tokens | Shallow keys kept; deep nodes → {__depth, __keys}; long arrays → head + tail + truncation marker |
ErrorsOnly |
string with error / warning / fail / panic lines | Keeps matched lines plus 3 lines of prior context |
TreeView |
string that looks like a file listing | Paths rendered as an indented tree |
Summarize |
plain text | First 10 + last 5 lines, removes the middle |
Truncate |
fallback | Char-safe truncation at ~max_tokens × 3 chars |
All string truncation uses char_indices().nth() -- multi-byte characters cannot split mid-codepoint. Max-char arithmetic uses saturating_mul so extreme max_tokens never wraps. If a smart strategy does not drop below the cap, the fallback is Truncate.
rimuru mcp stats
# github::issues_list calls=42 saved=18,924 tokens compressed=12
# filesystem::read_file calls=117 saved=61,302 tokens compressed=48Four manifestation forms registered. Select the one that matches your current task.
Every function is reachable from four front-ends. Pick the one that fits.
| Form | Binary | What it is |
|---|---|---|
| Web UI | embedded in rimuru-worker on :3100 |
13 pages -- Dashboard, Agents, Sessions, Costs, Models, Advisor, Context, MCP Proxy, City, Plugins, Hooks, Settings, Terminal. Single-file React build, ~1 MB, served directly by the worker. |
| CLI | rimuru |
13 command groups, --format table|json|yaml, direct iii trigger calls (no HTTP middleman). |
| TUI | rimuru-tui |
Ratatui 0.29. 12 tabs, 15 Tensura-themed color schemes (Rimuru Slime, Great Sage, Predator, Veldora, Shion, Milim, Diablo, and more). Keyboard nav: Tab, j/k, 1-9, /, t, ?. |
| Desktop | rimuru-desktop |
Tauri v2 native app with embedded worker. 46 IPC commands across 10 modules, system tray, global shortcut, window state persistence. |
Detecting hardware. CPU + RAM + GPU acquired. Mapping API models to local equivalents. Projected monthly savings computed.
The advisor detects CPU, RAM, and GPU (Metal / CUDA / ROCm) and maps every API model you use to a local equivalent from a 50+ model catalog. Fit is scored Perfect, Good, Marginal, or Too Tight, with tok/s estimates and projected monthly savings.
rimuru models advisor
# Claude Sonnet 4 -> Qwen2.5-14B (Q4_K_M) fit: Perfect savings: $34/mo
# Claude Opus 4.5 -> Llama 3.3 70B (Q4_0) fit: Marginal savings: $128/mo
# GPT-4o -> Qwen2.5-32B (Q5_K_M) fit: Good savings: $41/mo iii Engine (WS :49134)
│
┌────────────────────┼────────────────────┐
│ │ │
rimuru-worker rimuru-cli rimuru-desktop
(core + iii-http) (iii trigger) (Tauri v2 + embedded worker)
│
┌─────┴─────┐
│ │
Web UI rimuru-tui
(:3100) (HTTP client)
| Crate | Binary | What it is |
|---|---|---|
rimuru-core |
rimuru-worker |
iii worker with 60+ registered functions, 45+ HTTP endpoints via iii-http, budget / runaway / guard / compression / advisor modules, agent adapters, hook dispatcher, embedded web UI |
rimuru-cli |
rimuru |
Clap CLI. Connects via register_worker, invokes functions through iii.trigger(). No HTTP hop |
rimuru-tui |
rimuru-tui |
Ratatui 0.29 terminal UI. HTTP client against the worker |
rimuru-desktop |
rimuru-desktop |
Tauri v2 app with the worker embedded. 46 IPC commands forwarding to the in-process engine |
State lives in the engine's in-memory KV under scoped namespaces: agents, sessions, cost_records, cost_daily, cost_agent, budgets, budget_alerts, guards, guard_history, mcp_servers, mcp_metrics, hooks, plugins, config, context_breakdowns.
Two call paths. Overlapping but not identical surfaces.
- HTTP via
iii-httpon:3111-- registered from the central route table incrates/rimuru-core/src/triggers/api.rs. Payloads are normalized byextract_input()so path params, query params, and bodies merge into oneValue. - Direct trigger via the iii WebSocket --
iii.trigger(TriggerRequest { function_id: "rimuru.budget.check", ... }). This is what the CLI uses to skip the HTTP hop.
The HTTP layer now covers everything including the v0.4.0 guardrails: agents, sessions, costs, budget, runaway, guard, context, models, advisor, metrics, health, MCP proxy, hooks, plugins, and config. The CLI prefers direct triggers for latency; external clients (Web UI, curl, scripts) use the HTTP routes.
Function namespaces:
rimuru.agents.* list, get, create, connect, disconnect, detect (plus update/delete/status/sync via trigger)
rimuru.sessions.* list, get, active, history (plus cleanup via trigger)
rimuru.costs.* summary, daily, by_agent, record (plus daily_rollup via trigger)
rimuru.budget.* check, status, set, alerts
rimuru.runaway.* analyze, scan, configure
rimuru.guard.* list, register, complete, history
rimuru.hardware.* get, detect
rimuru.models.* list, get, sync
rimuru.advisor.* assess, catalog
rimuru.metrics.* current, history
rimuru.context.* breakdown, breakdown_by_session, utilization, waste
rimuru.mcp.proxy.* connect, tools, call, search, stats, disconnect
rimuru.hooks.* register, dispatch
rimuru.plugins.* install, uninstall, start, stop
rimuru.config.* get, set
rimuru.health.* check
Full HTTP endpoint list with methods and paths is in docs/api.md.
git clone https://github.com/rohitg00/rimuru.git
cd rimuru
cd ui && npm ci && npm run build && cd .. # single-file production UI
cargo build --release # all four crates
cargo fmt --check
cargo clippy --all-targets -- -D warnings
cargo test --workspaceBinaries land in target/release/: rimuru-worker, rimuru, rimuru-tui, rimuru-desktop.
The README SVG design system (Tempest UI) lives in scripts/generate-readme-tags.py. Regenerate tags from Python -- don't hand-edit the SVGs under docs/assets/tags/.
python3 scripts/generate-readme-tags.pyPull requests welcome. Keep the no-emoji rule in commits, README, and section headers.
Apache License 2.0. See LICENSE.
Built on iii-engine (Worker / Function / Trigger primitives). Terminal UI powered by Ratatui. Desktop app powered by Tauri. The name and "Unique Skill" framing are a nod to That Time I Got Reincarnated as a Slime by Fuse.
Tempest System // v0.4.0 Benimaru // made by Rohit Ghumare







