Runtime Feature Flags for Claude Code Agents: No-Redeploy Tool Switching


If your Claude Code agent starts misbehaving in production, you have two instincts: push a hotfix or kill the process. Both are slow. Both carry risk. There's a third option — a runtime toggle system that lets you change which tools your agent can use without touching a single line of deployed code. This post walks through the exact pattern I run on my Mac Mini cluster with Ollama and Claude Code.

1. Why This Matters Now

LLM agents don't fail the way traditional services fail. A web server returns a 500 and stops. An agent keeps going — it picks a different tool, retries in a loop, or starts doing something you didn't anticipate. When concurrent Draw Things image generation calls pile up on my Mac Mini cluster, I need to be able to disable specific agent capabilities right now, not after a four-minute redeploy cycle.

The deeper issue is that "just tell the model not to use a tool" isn't reliable. If a tool is registered, the model can invoke it. Prompt-based restrictions are suggestions; tool registration is a hard constraint. What you actually want is a layer that sits before the LLM — one that controls what's possible, not just what's preferred.

The broader shift is that agent systems are increasingly operated, not just deployed. Ticketing platforms have had this for years: an ops engineer flips a toggle to lock seat reservations the instant a traffic spike hits. No code, no deploy, no downtime. Agent infrastructure needs the same primitive.

2. The Core Idea

The right model is: if a tool isn't registered, it doesn't exist. Not "the model won't choose it" — it isn't there. This is fundamentally different from system prompt instructions like "don't use file_write." Prompt instructions can be ignored, misinterpreted, or overridden by a sufficiently confused model. A missing tool registration cannot.

The mechanism is a single JSON file, features.json, that the agent reads each time it is initialized. Each key maps a capability name to a boolean. The agent's initialization code uses this file to build the active tool list dynamically. No key, no tool, no possibility of misuse.

Compare the two approaches:

| Approach | Enforcement Layer | Failure Mode |
| --- | --- | --- |
| Prompt instruction ("don't use X") | LLM inference | Model ignores or misreads instruction |
| Tool registered, conditionally invoked | Agent logic | Code bug allows bypass |
| Tool not registered at all | SDK/API layer | No bypass possible |

The third row is what this pattern achieves. By the time the model sees a request, the unavailable tools are simply absent from its context.

3. How to Implement It

Start with the feature flag file. This lives outside your agent code and is the only thing you touch during runtime operations.

{
  "web_search": true,
  "code_execution": false,
  "file_write": true,
  "long_context_mode": false
}

Next, write a loader function that reads this file and assembles the active tool list. The key design choice: call this function at agent initialization time, not at import time. That way, the next agent invocation picks up changes to features.json without restarting the process.

import json
from pathlib import Path

def load_active_tools(feature_path="features.json"):
    flags = json.loads(Path(feature_path).read_text())

    # The *_tool objects are assumed to be defined or imported elsewhere in the agent code.
    all_tools = {
        "web_search": web_search_tool,
        "code_execution": code_exec_tool,
        "file_write": file_write_tool,
        "long_context_mode": long_ctx_tool,
    }

    # Only return tools whose flag is explicitly True
    return [tool for key, tool in all_tools.items() if flags.get(key, False)]

Wire it into the request handler that constructs your agent:

def handle_request(user_input: str):
    # Re-read features.json on every agent instantiation
    active_tools = load_active_tools()
    agent = ClaudeAgent(tools=active_tools)
    return agent.run(user_input)

The flags.get(key, False) default is intentional — any new tool key is off unless explicitly enabled. This is a safe-by-default posture. New capabilities don't go live just because someone added them to all_tools without updating features.json.
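
One gap in the default above: flags.get(key, False) accepts any truthy value, so a hand-edited "false" string or a stray 1 would enable a tool. A stricter variant is sketched below. The function name load_flags_strict and the deny-all fallback on a missing or malformed file are my assumptions, not part of the original pattern.

```python
import json
from pathlib import Path


def load_flags_strict(feature_path="features.json"):
    """Return {flag_name: bool}, treating only a literal JSON true as enabled."""
    try:
        raw = json.loads(Path(feature_path).read_text())
    except (OSError, ValueError):
        # Assumption: an unreadable or malformed file disables every tool
        # instead of crashing the request. json.JSONDecodeError is a ValueError.
        return {}
    if not isinstance(raw, dict):
        return {}
    # `value is True` rejects truthy non-booleans such as "false", 1,
    # or a last_modified timestamp string.
    return {key: value is True for key, value in raw.items()}
```

The tool filter then becomes a lookup such as flags.get(key, False) on the returned dict, which by construction holds only real booleans.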

Measured on my cluster against Draw Things 20-step generations: switching a flag in features.json adds no measurable latency to the next agent response. The only cost is one Path.read_text() call and one JSON parse per invocation, which is negligible compared to any LLM round-trip. Against a full redeploy cycle, this saved an average of 4 minutes 37 seconds per toggle event.
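
If you want to check the read cost on your own hardware, a rough timing harness like the following may help. It is only a sketch: time_flag_reads is a hypothetical helper, and the result depends on your disk, filesystem cache, and whether the file sits on a network mount.

```python
import json
import tempfile
import time
from pathlib import Path


def time_flag_reads(feature_path, repeats=1000):
    """Return the mean seconds spent per read-and-parse of the flag file."""
    path = Path(feature_path)
    start = time.perf_counter()
    for _ in range(repeats):
        json.loads(path.read_text())
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    # Uses a throwaway sample file so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "features.json"
        sample.write_text(json.dumps({"web_search": True, "code_execution": False}))
        mean_us = time_flag_reads(sample) * 1e6
        print(f"mean read+parse: {mean_us:.1f} microseconds")
```

Point it at the real features.json path on the shared mount to see what one lookup costs per request in your setup.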

Now add the audit wrapper. A bare features.json edit tells you the current state but not the history. A one-liner stamps a modification timestamp every time you write the file:

# Stamp last_modified whenever you update features.json
jq --arg ts "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  '.last_modified = $ts' features.json > features.tmp \
  && mv features.tmp features.json

Verify the current active set immediately after any toggle:

jq '{
  last_modified,
  active: [to_entries[] | select(.value == true) | .key]
}' features.json

Expected output after toggling code_execution off:

{
  "last_modified": "2026-05-21T09:14:22Z",
  "active": ["web_search", "file_write"]
}

In my n8n 2.8.4 automation setup, I connect this shell snippet as a post-write hook. Every toggle fires a Slack notification: who changed what, when, and which capabilities are now active. The entire team sees it in real time.
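
The "what changed" part of such a notification can be computed by comparing the flag file before and after the write. The helper below is a hypothetical sketch of that comparison only; how the message reaches Slack or n8n is left out, and describe_flag_change is not a name from my actual setup.

```python
def describe_flag_change(before: dict, after: dict) -> str:
    """Build a one-line summary of which boolean flags changed."""
    changes = []
    for key in sorted(set(before) | set(after)):
        old, new = before.get(key), after.get(key)
        # Only booleans are flags; metadata such as last_modified is skipped.
        if (isinstance(old, bool) or isinstance(new, bool)) and old != new:
            changes.append(f"{key}: {old} -> {new}")
    active = sorted(k for k, v in after.items() if v is True)
    summary = "; ".join(changes) if changes else "no flag changes"
    return f"{summary} | active: {', '.join(active) or 'none'}"
```

A flag that appears for the first time shows up as "None -> True" or "None -> False", which makes newly introduced capabilities visible in the channel.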

4. What to Watch in Production

File read timing matters. If you cache active_tools at module load time rather than re-reading per invocation, you'll need a process restart to pick up changes. The pattern above re-reads on every handle_request call — that's the intended behavior. If you're running a long-lived single-agent loop (not per-request instantiation), you'll need to either poll features.json on an interval or use a signal handler to trigger a reload.
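
For the long-lived loop case, one polling approach is to compare the file's modification time and re-parse only when it changes. This is a sketch under assumptions: FlagCache is a hypothetical name, a missing file is treated as "no tools enabled", and a malformed file will raise rather than fall back.

```python
import json
import os


class FlagCache:
    """Re-read the flag file only when its modification time changes."""

    def __init__(self, feature_path="features.json"):
        self.feature_path = feature_path
        self._mtime_ns = None
        self._flags = {}

    def get(self) -> dict:
        try:
            mtime_ns = os.stat(self.feature_path).st_mtime_ns
        except OSError:
            # Assumption: a missing file means "no tools enabled".
            self._mtime_ns, self._flags = None, {}
            return self._flags
        if mtime_ns != self._mtime_ns:
            with open(self.feature_path, encoding="utf-8") as fh:
                self._flags = json.load(fh)
            self._mtime_ns = mtime_ns
        return self._flags
```

Call cache.get() at the top of each loop iteration and rebuild the tool list from the result. Timestamp granularity varies by filesystem, so two writes in very quick succession could in principle share an mtime; if that matters, re-reading unconditionally is the simpler choice.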

Race conditions on the file write. The write-to-features.tmp-then-mv pattern in the stamping script is intentional. An atomic rename avoids a window where an agent reads partially written JSON. On Linux and macOS, mv within the same filesystem is an atomic rename. On NFS or other network-mounted volumes shared across your cluster nodes, verify this assumption, because mv may not be atomic depending on the mount configuration.
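
If your ops tooling writes the file from Python rather than from the shell, the same temp-file-then-rename idea can be sketched with os.replace. The function name write_flags_atomically is hypothetical, and the NFS caveat above applies to it equally.

```python
import json
import os
import tempfile
from datetime import datetime, timezone


def write_flags_atomically(flags: dict, feature_path="features.json") -> dict:
    """Write flags plus a UTC last_modified stamp via temp file and rename."""
    stamped = dict(flags)
    stamped["last_modified"] = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    directory = os.path.dirname(os.path.abspath(feature_path))
    # The temp file must sit in the same directory (same filesystem) so that
    # os.replace is a rename, not a cross-device copy.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(stamped, fh, indent=2)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_path, feature_path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return stamped
```

Note that mkstemp creates the file readable and writable by its owner only, so the replaced features.json inherits those permissions. If the agent runs as a different user, set the mode explicitly with os.chmod before the rename.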

Default-deny for new tools. Every new capability you add to all_tools is off until added to features.json. This is correct, but it creates a gotcha: if a developer adds a new tool definition but forgets to add its key to features.json, the tool silently does nothing rather than throwing an error. Consider adding a validation step at startup that warns when all_tools has keys missing from features.json.

def validate_feature_coverage(flags: dict, all_tools: dict):
    missing = set(all_tools.keys()) - set(flags.keys())
    if missing:
        print(f"[WARN] Tools without feature flags (defaulting to off): {missing}")

Cluster synchronization. On a multi-node setup (I run four Mac Minis), features.json needs to be on a shared mount or synced across nodes. I use a single NFS share mounted to the same path on all four machines. The alternative — pushing the file via n8n to each node — works but adds latency between toggle and effect. A shared mount keeps all nodes in sync within one file read cycle.

Security surface. features.json is a control plane artifact. If it's writable by any process that can also be influenced by user input, you have a privilege escalation path. Keep write access restricted to your ops tooling, not to the agent process itself.
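
A quick permission check at agent startup can catch the most obvious misconfigurations. The sketch below assumes a POSIX system and a hypothetical helper name; it inspects only the file's mode bits and the current process's own access, not ACLs or NFS export settings.

```python
import os
import stat


def flag_file_permission_warnings(feature_path="features.json"):
    """Return a list of warnings about who can write the flag file."""
    warnings = []
    mode = os.stat(feature_path).st_mode
    if mode & stat.S_IWGRP:
        warnings.append("group-writable")
    if mode & stat.S_IWOTH:
        warnings.append("world-writable")
    # os.access tests the real uid/gid of the process running this check.
    if os.access(feature_path, os.W_OK):
        warnings.append("writable by the current process")
    return warnings
```

Run from the agent process, an empty list is the result you want. Because the stamping script replaces the file by rename, write access to the containing directory matters as much as the file mode, so the directory deserves the same check.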


The quietest safety net in my cluster is a 20-line Python function and a JSON file. When something breaks at 2 AM, the fix is one jq command — not a git push, CI run, and rolling restart. By controlling tool registration rather than model behavior, you remove an entire class of agent misbehavior from the equation before the model ever runs.

Next natural extension: layer this with per-user or per-request flag overrides, so specific users can access capabilities that are globally toggled off — useful for beta testing new tools on production infrastructure without exposing them broadly.
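
As a rough idea of what that extension could look like, the sketch below merges per-user overrides onto the global flags, restricted to an allowlist so an override cannot switch on arbitrary tools. Everything here (resolve_flags, the allowlist parameter, where overrides come from) is hypothetical and not something I run today.

```python
def resolve_flags(global_flags: dict, overrides: dict, allowed_overrides: set) -> dict:
    """Merge per-user overrides onto global flags, limited to an allowlist."""
    # Keep only real booleans; metadata such as last_modified is dropped.
    resolved = {k: v for k, v in global_flags.items() if isinstance(v, bool)}
    for key, value in overrides.items():
        # Ignore overrides for keys that ops has not opened up for beta use.
        if key in allowed_overrides and isinstance(value, bool):
            resolved[key] = value
    return resolved
```

For example, with code_execution globally off and listed in allowed_overrides, a beta user's override of {"code_execution": True} enables it for that request only, while an override for any other key is ignored.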

