
Running Claude Code across a multi-node Mac Mini cluster without execution logs is like flying without a black box. When something breaks at 2 AM, "I checked bash history" is not a postmortem — it's a guess. This post shows how a single wrapper script and two jq queries turn raw CLI invocations into a queryable audit trail with real anomaly detection.
This is for anyone running Claude Code in automation, across multiple machines, or with CI pipelines where flag combinations matter for security and reliability.
The Problem: Flags Are Intent, and You're Not Recording It
Most CLI audit setups capture the command name. Almost none capture the full flag combination. That gap is where incidents live.
The --dangerously-skip-permissions flag means something very different from --output-format json. One is a deliberate override; the other is routine formatting. If you're logging both the same way — or not logging either — you've lost the signal.
On the cluster that triggered this writeup, a node was firing --dangerously-skip-permissions seventeen times overnight. The model flag --model claude-opus-4-5 appeared in three of those runs. No one had authorized either. There was no record to check. The only reason it surfaced at all was a spike in API token consumption the next morning.
Before the fix, tracing any anomalous run averaged 40 minutes — digging through ~/.bash_history, correlating timestamps manually, comparing across four machines. That's not debugging. That's archaeology.
The first thing I tried was script command session recording and ~/.bash_history with HISTTIMEFORMAT. Both approaches broke down immediately: script output is unstructured text, bash_history doesn't survive across sessions reliably, and neither gives you per-flag granularity across hosts in a parseable format.
What I actually needed was structured, append-only, machine-readable logs — one JSON line per invocation.
The Fix: One Wrapper Script
The core insight is that exec claude "$@" passes all arguments through unchanged, so the wrapper is completely transparent to the caller. You record first, then hand off.
#!/usr/bin/env bash
# ~/bin/claude-log.sh — flag audit wrapper
LOG_FILE="$HOME/.claude/audit.jsonl"
TIMESTAMP=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
HOSTNAME=$(hostname)
ENTRY=$(printf '{"ts":"%s","host":"%s","args":%s}\n' \
"$TIMESTAMP" \
"$HOSTNAME" \
"$(printf '%s\n' "$@" | jq -R . | jq -sc .)")
echo "$ENTRY" >> "$LOG_FILE"
exec claude "$@"
Wire it in permanently:
alias claude='bash ~/bin/claude-log.sh'
Add that alias to ~/.bashrc or ~/.zshrc on every node. Now every claude call appends one JSON line before execution:
{"ts":"2026-05-08T02:14:33Z","host":"mac-mini-4","args":["--dangerously-skip-permissions","--model","claude-opus-4-5","--print","run task"]}
Verify it's working:
tail -f ~/.claude/audit.jsonl | jq .
Expected output on first run:
{
"ts": "2026-05-08T09:01:02Z",
"host": "mac-mini-1",
"args": [
"--output-format",
"json",
"--print",
"summarize logs"
]
}
Clean, structured, one line per run. The jq -R . | jq -sc . pipeline handles argument quoting and special characters correctly — no escaping bugs even with prompts that contain quotes or newlines.
Anomaly Detection Queries
Once the log accumulates a few days of data, patterns become obvious. Here are the queries I run daily.
Dangerous flag frequency — last 24 hours:
cat ~/.claude/audit.jsonl | \
jq -r 'select(.ts >= "2026-05-07") | .args[]' | \
grep -E 'dangerously|skip' | \
sort | uniq -c | sort -rn | head -10
Sample output from the incident:
17 --dangerously-skip-permissions
3 --dangerously-skip-permissions --model claude-opus-4-5
Seventeen hits. Three with a high-cost model. Both numbers were zero the day before. That's your alert condition.
Per-host invocation volume:
jq -r '.host' ~/.claude/audit.jsonl | sort | uniq -c | sort -rn
42 mac-mini-1
38 mac-mini-2
17 mac-mini-3
94 mac-mini-4
Mac Mini 4 running more than twice the cluster average at 2 AM? That's worth a second look.
Flag combination frequency — full cross-tab:
jq -r '[.host, (.args | join(" "))] | @tsv' ~/.claude/audit.jsonl | \
sort | uniq -c | sort -rn | head -20
This gives you a ranked list of exact (host, flag-combo) pairs. After roughly 94 entries, pattern detection accuracy climbs sharply — enough distinct invocations to separate signal from noise.
Variations and Gotchas
Multi-node sync. On the cluster, I rsync each node's audit.jsonl to a central machine every 15 minutes:
# crontab on each node
*/15 * * * * rsync -az ~/.claude/audit.jsonl central:/var/log/claude/$(hostname).jsonl
Then merge for cross-host queries:
cat /var/log/claude/*.jsonl | jq -s 'sort_by(.ts)[]' > /tmp/cluster-audit.jsonl
Docker containers. Mount the log directory as a volume so logs survive container restarts:
volumes:
- ~/.claude:/root/.claude
Without this, every container restart resets your audit trail to zero.
Log rotation. audit.jsonl will grow. Add a weekly rotation:
# logrotate config: /etc/logrotate.d/claude-audit
~/.claude/audit.jsonl {
weekly
rotate 12
compress
missingok
notifempty
}
The jq version gotcha. On older macOS systems, jq -sc behaves differently. If you see malformed JSON in the args array, check your jq version:
jq --version # need 1.6+
brew install jq # if needed
Prompt injection in args. If your prompts contain newlines or special JSON characters, the jq -R . | jq -sc . pipeline handles them safely. Do not use echo "$@" | python3 -c "import json,sys; ..." as a substitute — it will break on prompts with quotes.
| Approach | Structured | Multi-host | Flag granularity | Query speed |
|---|---|---|---|---|
~/.bash_history |
No | No | Args only | Manual |
script session |
No | No | Full session | Unqueryable |
| syslog | Partial | Yes | Command only | Slow |
audit.jsonl (this) |
Yes | Yes (with rsync) | Full flag array | Sub-second |
Before and after, measured:
| Metric | Before | After |
|---|---|---|
| Trace time for anomalous run | ~40 min | <90 sec |
| Cross-host correlation | Manual, error-prone | Single jq query |
| Flag-level visibility | None | Full array |
| Alert latency | Next morning | Real-time (with tail -f) |
Closing
A terminal without execution logs is a black box. A wrapper script and two jq queries flip that to a fully auditable system in under ten minutes. The value multiplies with every additional node — four machines means four times the blind spots if you skip this, and four times the signal if you don't.
Start with audit.jsonl running. Add the anomaly query as a cron job. Wire the high-frequency dangerous-flag query to a notification channel of your choice. The 94-entry threshold before pattern detection gets reliable is roughly two or three days on a moderately active cluster — so the sooner you start, the sooner the data starts working for you.
Next step: feed the structured log into a lightweight SQLite table and run window-function queries for time-series anomaly scoring.
🐦 Faster updates on X: @baegseungh7061
📚 More in this series: Code Advanced
💌 Subscribe: Follow on X or grab the RSS
댓글
댓글 쓰기