
If you're using Claude Code's pre/post hooks as simple before-and-after scripts, you're leaving most of their power on the table. Wire a message queue between them and a single file-write event becomes a non-blocking trigger for linting, tests, Slack notifications, and logs — all running concurrently while Claude moves on to the next task.
This is the event bus pattern applied to a local AI coding agent. Here's how I built it, what broke first, and the benchmark numbers from my Mac Mini cluster.
The Problem: Synchronous Hooks Block Claude
The default hook setup is synchronous. Claude hits a Write tool call, your PostToolUse hook fires, and Claude waits for it to return before continuing. That's fine for a quick echo or a file rename. It gets painful as soon as you add anything real.
My original PostToolUse script called ruff for linting and posted a Slack message. Both are fast individually. Together, on a warm Mac Mini M4, one file save cost 2.1 seconds of blocking time. Chain five file writes in one Claude session and you've added more than ten seconds of dead time where Claude is just waiting on your infrastructure.
The instinct is to optimize the scripts. That's the wrong instinct.
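Claude waits for the hook command to exit before it continues, so every downstream task you put inside the hook adds its full runtime to that wait. Faster scripts only shrink the constant. The blocking still grows linearly with the number of tasks.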
The Fix: Hooks Publish, Consumers Process
The core insight: hooks should do one thing — append an event to a queue. A separate long-running consumer process handles everything downstream. Claude only blocks for the time it takes to write a JSON file to /tmp.
Here's the wiring in .claude/settings.json:
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Write",
        "hooks": [
          {
            "type": "command",
            "command": "python3 /opt/hooks/pre_publish.py"
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Write",
        "hooks": [
          {
            "type": "command",
            "command": "python3 /opt/hooks/post_publish.py"
          }
        ]
      }
    ]
  }
}
The post_publish.py script reads the hook payload from stdin, builds a minimal event object, and drops it as a timestamped JSON file into /tmp/hook_queue/. That's it — no linting, no HTTP calls, no subprocess chains.
# post_publish.py: write event to queue, nothing else
import json, sys, pathlib, time

queue_dir = pathlib.Path('/tmp/hook_queue')
queue_dir.mkdir(parents=True, exist_ok=True)

payload = json.loads(sys.stdin.read())
event = {
    'ts': time.time(),
    'tool': payload.get('tool_name'),
    'file': payload.get('tool_input', {}).get('file_path', ''),
}

# Write under a .tmp name, then rename, so the consumer's *.json glob
# never picks up a half-written file.
tmp = queue_dir / f"{event['ts']}.tmp"
tmp.write_text(json.dumps(event))
tmp.rename(queue_dir / f"{event['ts']}.json")
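To check the publish step without running Claude, the same logic can be wrapped in a function and fed a hand-written payload. This is a minimal sketch: the `publish` helper and the sample payload are hypothetical, and the only payload fields assumed are the `tool_name` and `tool_input.file_path` that the script above already reads.

```python
import json
import pathlib
import tempfile
import time


def publish(payload: dict, queue_dir: pathlib.Path) -> pathlib.Path:
    """Write one event file for a hook payload and return its path."""
    queue_dir.mkdir(parents=True, exist_ok=True)
    event = {
        "ts": time.time(),
        "tool": payload.get("tool_name"),
        "file": payload.get("tool_input", {}).get("file_path", ""),
    }
    # Write under a temporary name, then rename, so a polling consumer
    # never sees a partially written .json file.
    tmp = queue_dir / f"{event['ts']}.tmp"
    final = queue_dir / f"{event['ts']}.json"
    tmp.write_text(json.dumps(event))
    tmp.rename(final)
    return final


# Hypothetical payload, shaped like the fields the hook script reads.
sample = {"tool_name": "Write", "tool_input": {"file_path": "src/app.py"}}
demo_dir = pathlib.Path(tempfile.mkdtemp()) / "hook_queue"
written = publish(sample, demo_dir)
print(written.name, json.loads(written.read_text()))
```

Running it against a temporary directory keeps the experiment away from the live /tmp/hook_queue.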
The consumer runs as a daemon and polls the queue directory every 500ms:
# consumer.py: poll the queue, run downstream tasks
import json, pathlib, subprocess, time

queue_dir = pathlib.Path('/tmp/hook_queue')
queue_dir.mkdir(parents=True, exist_ok=True)

while True:
    # Filenames are timestamps, so sorted() yields the oldest event first.
    for f in sorted(queue_dir.glob('*.json')):
        event = json.loads(f.read_text())
        if event.get('file'):
            subprocess.run(['ruff', 'check', event['file']])
            # add more consumers here: pytest, curl, etc.
        f.unlink()  # remove the event whether or not it named a file
    time.sleep(0.5)
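The loop above runs its tasks one after another, and a missing binary would crash the daemon. If several tasks live in one consumer, one possible variant runs them in parallel and turns failures into exit codes. This is a sketch, not the author's consumer: `run_task`, `handle_event`, and the stand-in commands are hypothetical, and real entries would be things like `['ruff', 'check']`.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_task(cmd: list) -> int:
    """Run one downstream command; return its exit code, or -1 if it could not run."""
    try:
        return subprocess.run(cmd, capture_output=True, timeout=60).returncode
    except (OSError, subprocess.TimeoutExpired):
        # A missing binary or a hung task must not take the consumer down.
        return -1


def handle_event(event: dict, tasks: dict) -> dict:
    """Run every task for one event in parallel; collect exit codes by task name."""
    with ThreadPoolExecutor(max_workers=len(tasks) or 1) as pool:
        futures = {
            name: pool.submit(run_task, cmd + [event["file"]])
            for name, cmd in tasks.items()
        }
        return {name: fut.result() for name, fut in futures.items()}


# Stand-in commands so the sketch runs anywhere Python does.
demo_tasks = {
    "ok": [sys.executable, "-c", "import sys; sys.exit(0)"],
    "fails": [sys.executable, "-c", "import sys; sys.exit(3)"],
    "missing": ["no-such-binary-for-this-demo"],
}
results = handle_event({"file": "src/app.py"}, demo_tasks)
print(results)  # {'ok': 0, 'fails': 3, 'missing': -1}
```

Threads are enough here because each task spends its time waiting on a child process rather than running Python code.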
After moving the work out of the hook, Claude's blocking time dropped from 2.1 seconds to under 0.08 seconds per write, roughly a 26× reduction, measured on the same Mac Mini M4 with n8n 2.8.4 as the orchestration layer. Subjectively, the difference in responsiveness is night and day.
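To reproduce this kind of number on your own machine, one option is to time the hook command directly, since its wall-clock runtime is what Claude waits on. The sketch below is not how the figures above were measured: `time_hook` is a hypothetical helper, and the stand-in command should be swapped for your real hook, for example `['python3', '/opt/hooks/post_publish.py']`.

```python
import json
import statistics
import subprocess
import sys
import time


def time_hook(cmd: list, payload: dict, runs: int = 5) -> float:
    """Return the median wall-clock seconds the hook command takes to exit."""
    data = json.dumps(payload).encode()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # The payload goes in on stdin, the same way the hook script reads it.
        subprocess.run(cmd, input=data, check=True, capture_output=True)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Stand-in hook that only reads stdin, so the sketch runs without /opt/hooks.
stand_in_hook = [sys.executable, "-c", "import sys; sys.stdin.read()"]
sample_payload = {"tool_name": "Write", "tool_input": {"file_path": "src/app.py"}}
median_s = time_hook(stand_in_hook, sample_payload)
print(f"median blocking time: {median_s:.3f}s")
```

The stand-in mostly measures interpreter startup, which is also the floor for any Python hook.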
Idempotency and Ordering: Where Async Pipelines Break
The first thing that fails in any async pipeline is duplicate processing. If Claude saves the same file twice in quick succession — which happens during multi-step refactors — your post hook fires twice. Without deduplication, the linter runs twice and Slack gets two identical alerts.
The fix is a dedup key per event. I use (file_path + floor(timestamp, 1s)) — any two events for the same file within the same second collapse to one.
# consumer.py with dedup
import json, pathlib, subprocess, time

queue_dir = pathlib.Path('/tmp/hook_queue')
processed = set()  # dedup keys already handled by this process

while True:
    for f in sorted(queue_dir.glob('*.json')):
        event = json.loads(f.read_text())
        # Same file within the same clock second collapses to one key.
        key = f"{event.get('file')}:{int(event.get('ts', 0))}"
        if key not in processed and event.get('file'):
            subprocess.run(['ruff', 'check', event['file']])
            processed.add(key)
        f.unlink()  # duplicates are dropped from the queue too
    time.sleep(0.5)
On a four-node Mac Mini cluster where file change events were hitting 30+ per second, this brought duplicate Slack alerts down to zero. Keep the processed set bounded — evict entries older than 60 seconds if memory is a concern.
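The 60-second eviction mentioned above could look something like the following sketch. `RecentKeys` and `dedup_key` are hypothetical names, and the key format is the same file-plus-whole-second key the consumer uses.

```python
import time


class RecentKeys:
    """Remember dedup keys for `ttl` seconds, then forget them."""

    def __init__(self, ttl: float = 60.0):
        self.ttl = ttl
        self._seen = {}  # key -> time it was first seen

    def first_time(self, key: str, now=None) -> bool:
        """Return True if `key` has not been seen within the last `ttl` seconds."""
        now = time.time() if now is None else now
        # Drop expired entries so the table cannot grow without bound.
        self._seen = {k: t for k, t in self._seen.items() if now - t < self.ttl}
        if key in self._seen:
            return False
        self._seen[key] = now
        return True


def dedup_key(event: dict) -> str:
    """Same key as the consumer: file path plus the whole-second timestamp."""
    return f"{event.get('file')}:{int(event.get('ts', 0))}"


seen = RecentKeys(ttl=60)
first = seen.first_time(dedup_key({"file": "a.py", "ts": 100.2}), now=100.2)
repeat = seen.first_time(dedup_key({"file": "a.py", "ts": 100.9}), now=100.9)
next_second = seen.first_time(dedup_key({"file": "a.py", "ts": 101.1}), now=101.1)
after_ttl = seen.first_time("a.py:100", now=200.0)
print(first, repeat, next_second, after_ttl)  # True False True True
```

The third call shows a property of the key itself: two saves 0.2 seconds apart still count as distinct when they straddle a second boundary. The eviction pass is linear in the table size, which is fine at these volumes.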
Ordering is a separate concern. The consumer sorts the glob results lexicographically, and since the filenames are epoch timestamps with the same number of integer digits, this gives you natural FIFO pickup order. If you switch to Redis Streams later, XREAD returns entries in ID order, which gives you the same guarantee with better throughput.
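The FIFO property rests on string order matching numeric order. Current epoch timestamps all have ten integer digits, so it holds in practice. If you would rather not depend on that, one option is to sort on the parsed timestamp instead. A small sketch, with `queued_in_time_order` as a hypothetical helper and deliberately short timestamps to make the difference visible:

```python
import pathlib
import tempfile


def queued_in_time_order(queue_dir: pathlib.Path) -> list:
    """Return queued event files ordered by the timestamp in their filename."""
    return sorted(queue_dir.glob("*.json"), key=lambda p: float(p.stem))


demo_dir = pathlib.Path(tempfile.mkdtemp())
for name in ("10.5.json", "9.5.json", "10.25.json"):
    (demo_dir / name).write_text("{}")

lexicographic = [p.name for p in sorted(demo_dir.glob("*.json"))]
numeric = [p.name for p in queued_in_time_order(demo_dir)]
print(lexicographic)  # ['10.25.json', '10.5.json', '9.5.json']
print(numeric)        # ['9.5.json', '10.25.json', '10.5.json']
```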
Fan-Out: Multiple Consumers, One Event
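Once the publish/consume split is in place, adding more downstream tasks costs almost nothing on Claude's side. Spin up more consumer processes, each watching the same queue directory with its own filter logic. One caveat: the consumer shown earlier deletes each event file after handling it, so for every consumer to see every event, each one needs its own view of the queue (its own subdirectory or its own read position) instead of unlinking shared files.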
/tmp/hook_queue/
  1746782400.123.json     ← post hook drops this
    consumer_lint.py      ← picks it up, runs ruff
    consumer_test.py      ← picks it up, runs pytest on changed module
    consumer_notify.py    ← picks it up, posts to Slack
Each consumer process runs independently. If the test runner is slow, it doesn't affect the Slack notification. If Slack is down, the linter still runs. This is the same fan-out pattern I use for Draw Things image generation events: the post hook fires when a 20-step render finishes, and two separate consumers handle thumbnail saving and Ollama caption generation simultaneously.
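For every consumer to receive every event, no consumer can delete files out from under the others. One way to arrange that, sketched here under assumptions, is to give each consumer a private cursor file holding the last timestamp it handled. `read_new_events` and the `.cursor` files are hypothetical, and cleanup of old event files would have to move to a separate job.

```python
import json
import pathlib
import tempfile


def read_new_events(queue_dir: pathlib.Path, cursor_file: pathlib.Path) -> list:
    """Return events this consumer has not seen yet and advance its cursor."""
    last_seen = float(cursor_file.read_text()) if cursor_file.exists() else 0.0
    fresh = []
    for path in sorted(queue_dir.glob("*.json"), key=lambda p: float(p.stem)):
        ts = float(path.stem)  # the filename is the event timestamp
        if ts > last_seen:
            fresh.append(json.loads(path.read_text()))
            last_seen = ts
    cursor_file.write_text(repr(last_seen))
    return fresh


root = pathlib.Path(tempfile.mkdtemp())
queue = root / "hook_queue"
queue.mkdir()
event = {"ts": 1746782400.123, "file": "src/app.py"}
(queue / "1746782400.123.json").write_text(json.dumps(event))

lint_first = read_new_events(queue, root / "lint.cursor")
notify_first = read_new_events(queue, root / "notify.cursor")
lint_second = read_new_events(queue, root / "lint.cursor")
print(len(lint_first), len(notify_first), len(lint_second))  # 1 1 0
```

Both consumers see the event once, and neither blocks or starves the other. The trade-off is that an event landing in the directory with a timestamp older than a consumer's cursor would be skipped, so this suits a single local publisher better than several machines with drifting clocks.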
Run each consumer as a persistent daemon. On macOS, a minimal launchd plist handles this cleanly:
<?xml version="1.0" encoding="UTF-8"?>
<!-- ~/Library/LaunchAgents/com.hooks.consumer-lint.plist -->
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.hooks.consumer-lint</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/bin/python3</string>
        <string>/opt/hooks/consumer_lint.py</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>KeepAlive</key>
    <true/>
</dict>
</plist>
Load it with launchctl load ~/Library/LaunchAgents/com.hooks.consumer-lint.plist. On Linux, the equivalent systemd unit is two dozen lines and works the same way.
| Approach | Claude block time | Add new task | Failure isolation |
|---|---|---|---|
| Synchronous hooks | ~2.1s per write | Slows Claude further | One failure blocks all |
| File queue + consumer | < 0.08s per write | New process, no Claude change | Consumers fail independently |
| Redis Streams | < 0.08s per write | Consumer group, no Claude change | Best — replay on crash |
The Redis Streams upgrade is worth it once you have more than three consumers or need event replay after a consumer crash. For most solo dev setups, the file queue is enough.
Closing
The architecture is deliberately boring: hooks only append, consumers only process. That separation is the entire trick. Pre/post hooks in Claude Code are not a scripting system — they're a publish interface. Treat them that way and Claude stays fast regardless of how much downstream automation you pile on.
Next step worth exploring: add a pre-hook that writes a "session start" event, and a corresponding consumer that aggregates all file-change events in that session into a single end-of-session diff report.