
Real-Time Security Reviews Inside Claude Code

If you use Claude Code daily to generate code, your workflow probably has a gap: security checks happen after the code is already wired in. The new Claude Code security review plugin closes that gap by flagging vulnerabilities the moment you save a file, not after a PR lands or a scanner runs overnight.

This tutorial walks through what the plugin actually checks, how to enable and configure it, what to verify in production, and where it fits alongside tools you already run.

1. Why this matters now

The standard security workflow in most teams looks like this: write code, push a branch, wait for CI, get a comment from a reviewer or static analysis tool, then go back and untangle a fix from logic that's had time to calcify. By the time a SQL injection pattern or hardcoded API key surfaces in review, the function that contains it has often been called from three other places.

Claude Code changes that timing. The model generates code fast — faster than most developers can audit in the moment. The faster the output, the higher the probability that a common AI-generated antipattern slips through: credential literals baked into source, authentication checks that can be bypassed with a crafted header, queries assembled by string concatenation. These aren't hypothetical risks; they're the patterns that appear repeatedly across AI-assisted codebases precisely because language models learned from real code that contained them.

The security plugin shifts detection into the same feedback loop as autocomplete. Think of it the way a spell-checker draws a red line under a typo while you type rather than waiting for you to submit the document. The vulnerability is flagged before it becomes load-bearing.

2. The core idea

The security review plugin runs a vulnerability scan inline, triggered on file save, scoped to what Claude Code just wrote or modified.

The categories it targets map directly to what shows up most often in AI-generated code:

Vulnerability class	Why AI code generates it	Example
SQL injection	String interpolation is the path of least resistance	`f"SELECT * FROM users WHERE id={uid}"`
Hardcoded secrets	Keys are inlined to "make the example work"	`API_KEY = "sk-live-..."`
Auth bypass patterns	Conditional logic is sometimes inverted or incomplete	`if not is_admin: pass`
Insecure deserialization	Convenience imports from `pickle`, `yaml.load`	`yaml.load(data)` without `Loader=`
Overly permissive CORS / headers	Default headers omitted for brevity	`Access-Control-Allow-Origin: *`
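
To make the categories concrete, here is a minimal sketch of what line-by-line pattern matching for four of them could look like. The rule names reuse the category labels from this article, but the regular expressions and the `scan` function are hypothetical illustrations, not the plugin's actual detection logic, which is not documented here.

```python
import re

# Hypothetical, deliberately simple rules for illustration only.
# They are NOT the plugin's real rule set.
RULES = {
    "sql-injection": re.compile(
        r"""f["'].*\b(SELECT|INSERT|UPDATE|DELETE)\b.*\{""", re.IGNORECASE
    ),
    "hardcoded-secrets": re.compile(
        r"""(API_KEY|SECRET|TOKEN|PASSWORD)\s*=\s*["'][^"']+["']"""
    ),
    # yaml.load( with no Loader= argument before the closing parenthesis
    "insecure-deserialization": re.compile(r"yaml\.load\((?![^)]*Loader=)"),
    "permissive-headers": re.compile(r"Access-Control-Allow-Origin\W+\*"),
}


def scan(source: str) -> list[tuple[int, str]]:
    """Return (line_number, category) for every rule that matches a line."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for category, pattern in RULES.items():
            if pattern.search(line):
                findings.append((lineno, category))
    return findings
```

A matcher this naive also shows why false positives happen: it sees text, not data flow, so a query builder that merely looks like interpolation can trip the same rule.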

Anthropic ran this plugin internally at scale before the public release. That matters because internal usage on real production code surfaces edge cases that a limited beta misses — false positive rates on legitimate query builders, interaction with ORMs, behavior on templated SQL with parameterized inputs. The public release is post-validation, not an experiment.

Claude Sandbox, announced alongside the plugin, handles a related but distinct problem: it constrains what Claude Code's agent can touch when it's executing code autonomously. File system scope, network egress, and subprocess access are bounded to an isolated environment. If you run automation pipelines where Claude Code executes scripts on your behalf, the sandbox is the boundary that prevents unexpected side effects.

3. How to implement it

Enable the security review plugin

Make sure you're on a supported version first.

claude --version
# Requires 1.x or later with plugin support
claude update

Enable the plugin from the CLI:

claude plugin enable security-review
claude plugin list
# Expected output:
# security-review   enabled   v1.0.0

If you manage configuration as code, add it to your project's .claude/settings.json:

{
  "plugins": {
    "security-review": {
      "enabled": true,
      "severity_threshold": "medium",
      "categories": [
        "sql-injection",
        "hardcoded-secrets",
        "auth-bypass",
        "insecure-deserialization",
        "permissive-headers"
      ],
      "on_save": true
    }
  }
}
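
If the settings file lives in version control, a small pre-commit check can catch typos in category names before they silently disable a rule. The sketch below assumes the schema shown in the settings example above; `check_settings` is a hypothetical helper, and the `"low"` severity level is an assumption (only medium and high appear in this article).

```python
import json

# Category names copied from the settings example; treat them as the
# article's schema rather than a verified specification.
KNOWN_CATEGORIES = {
    "sql-injection",
    "hardcoded-secrets",
    "auth-bypass",
    "insecure-deserialization",
    "permissive-headers",
}
# Assumption: "low" exists alongside the "medium" and "high" levels shown here.
KNOWN_SEVERITIES = {"low", "medium", "high"}


def check_settings(raw_json: str) -> list[str]:
    """Return a list of human-readable problems; empty means the block looks sane."""
    plugin = json.loads(raw_json).get("plugins", {}).get("security-review")
    if plugin is None:
        return ["no security-review block found"]
    problems = []
    if plugin.get("enabled") is not True:
        problems.append("plugin is not enabled")
    if plugin.get("severity_threshold") not in KNOWN_SEVERITIES:
        problems.append("unrecognized severity_threshold")
    unknown = sorted(set(plugin.get("categories", [])) - KNOWN_CATEGORIES)
    if unknown:
        problems.append("unknown categories: " + ", ".join(unknown))
    return problems
```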

Trigger a manual scan

You can run the scanner against an existing file without waiting for a save event:

claude security-scan ./src/api/users.py

Example output for a vulnerable file:

[HIGH]   Line 42 — SQL injection: string interpolation in query
         f"SELECT * FROM users WHERE email='{email}'"
         → Use parameterized queries: cursor.execute("... WHERE email=%s", (email,))

[HIGH]   Line 17 — Hardcoded secret detected
         API_KEY = "sk-live-4f8a..."
         → Move to environment variable or secrets manager

[MEDIUM] Line 89 — Permissive CORS header
         response.headers["Access-Control-Allow-Origin"] = "*"
         → Restrict to known origins in production
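
The first suggestion in that output is worth seeing end to end. The sketch below uses Python's built-in `sqlite3` module and a hypothetical `users` table to compare the interpolated query with a parameterized one. Note that `sqlite3` uses `?` placeholders, while drivers such as psycopg2 use the `%s` style shown in the scanner output.

```python
import sqlite3


def find_user_unsafe(conn, email):
    # Vulnerable: the value is spliced directly into the SQL text.
    return conn.execute(
        f"SELECT id FROM users WHERE email='{email}'"
    ).fetchall()


def find_user_safe(conn, email):
    # Parameterized: the driver passes the value separately from the SQL text.
    return conn.execute(
        "SELECT id FROM users WHERE email=?", (email,)
    ).fetchall()


# Hypothetical in-memory table, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("a@example.com",), ("b@example.com",)],
)

payload = "nobody@example.com' OR '1'='1"
print(len(find_user_unsafe(conn, payload)))  # 2: the injected OR clause matches every row
print(find_user_safe(conn, payload))         # []: the payload is treated as a literal email
```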

Fix and re-verify

After applying the suggestions:

claude security-scan ./src/api/users.py
# Expected:
# No issues found. (0 findings)
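
For the hardcoded-secret finding, the fix usually amounts to reading the key at runtime and failing loudly when it is missing. A minimal sketch, assuming the key is exported as an environment variable named `API_KEY` (the variable name and the `load_api_key` helper are illustrative, not something the plugin prescribes):

```python
import os


def load_api_key(env_var: str = "API_KEY") -> str:
    """Read the key from the environment instead of a source-code literal."""
    key = os.environ.get(env_var)
    if not key:
        # Fail early with a clear message rather than sending empty credentials.
        raise RuntimeError(
            f"{env_var} is not set; export it or configure your secrets manager"
        )
    return key
```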

Configure the sandbox for agent workflows

If you use Claude Code in agentic mode (where it executes scripts autonomously), scope the sandbox in your settings:

{
  "sandbox": {
    "enabled": true,
    "allow_network": false,
    "allow_filesystem_write": ["./output", "/tmp/claude-work"],
    "allow_subprocess": false
  }
}

Verify the sandbox boundary is active:

claude sandbox status
# Expected:
# Sandbox: active
# Network: blocked
# Filesystem write: ./output, /tmp/claude-work
# Subprocess: blocked
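
It helps to have a mental model of what a filesystem write allowlist has to get right, in particular path traversal and look-alike directory names. The sketch below is a hypothetical `is_write_allowed` helper written with `pathlib`. It illustrates the kind of check involved and makes no claim about how the sandbox itself enforces the boundary.

```python
from pathlib import Path


def is_write_allowed(target: str, allowed_roots: list[str], cwd: str = ".") -> bool:
    """Return True if target resolves to an allowed root or a path beneath one."""
    base = Path(cwd).resolve()
    # resolve() collapses ".." segments and symlinks before comparing.
    resolved = (base / target).resolve()
    for root in allowed_roots:
        root_path = (base / root).resolve()
        if resolved == root_path or root_path in resolved.parents:
            return True
    return False


roots = ["./output", "/tmp/claude-work"]
print(is_write_allowed("output/report.txt", roots))          # True
print(is_write_allowed("output/../secrets.txt", roots))      # False
print(is_write_allowed("/tmp/claude-workspace/x.log", roots))  # False
```

Comparing resolved path components rather than string prefixes is what keeps `/tmp/claude-workspace` from passing as `/tmp/claude-work`.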

4. What to watch in production

False positives on legitimate query builders. ORMs and query DSLs sometimes look like string interpolation to a pattern-based scanner. If you use SQLAlchemy's text() with bound parameters, or Django's RawSQL or extra() with params=, add them to your allowlist rather than suppressing the entire category.

"allowlist": [
  "sqlalchemy.text with params",
  "django.db.models.expressions.RawSQL with params"
]

Overlap with existing scanners. If you already run Semgrep, Bandit, or Snyk in CI, you'll see findings duplicated between the plugin and your pipeline. That's not a problem in itself, but it's worth aligning severity thresholds so developers aren't getting conflicting signals. The plugin is faster feedback; CI scanners are the authoritative gate.

Secrets already committed. The plugin catches secrets as you write them. It doesn't retroactively scan git history. Run git-secrets or trufflehog on your repo history separately if you're enabling this on an established codebase.

Sandbox scope for network-dependent agents. If your automation pipeline needs to call external APIs (fetching data, posting results), you'll need to allowlist specific hosts rather than blocking all network access. Overly strict sandbox settings can break legitimate workflows silently: the agent fails without a clear error.

"sandbox": {
  "allow_network": ["api.github.com", "hooks.slack.com"],
  "block_network_by_default": true
}
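
Because a blocked request can fail without a clear error, it can be worth checking outbound URLs against the same host list in your own pipeline code so the failure is explicit. A minimal sketch using `urllib.parse`; `ALLOWED_HOSTS`, `is_host_allowed`, and `require_allowed_host` are hypothetical names, and the hosts are the two from the config above.

```python
from urllib.parse import urlsplit

# Mirrors the allow_network list in the sandbox config above.
ALLOWED_HOSTS = frozenset({"api.github.com", "hooks.slack.com"})


def is_host_allowed(url: str, allowed=ALLOWED_HOSTS) -> bool:
    """Exact hostname match; substrings and look-alike suffixes do not pass."""
    host = urlsplit(url).hostname  # lowercased by urlsplit, None if absent
    return host is not None and host in allowed


def require_allowed_host(url: str) -> str:
    """Raise a descriptive error instead of letting the call fail silently."""
    if not is_host_allowed(url):
        raise PermissionError(f"host not in sandbox allowlist: {url}")
    return url
```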

macOS vs. Linux behavior. The sandbox uses OS-level isolation primitives that behave differently across platforms. On macOS it relies on Seatbelt sandbox profiles; on Linux it uses bubblewrap, which is built on namespaces and seccomp. Test your sandbox configuration on the same OS your CI runs on. A configuration that passes locally on macOS may be more or less restrictive in a Linux container.

FAQ

When should I use Claude Code?

Claude Code is most useful when you're generating non-trivial code fast: new API endpoints, data processing functions, authentication flows. The security plugin makes it especially appropriate for security-sensitive surfaces where you'd normally want a second pair of eyes. The inline feedback means you're not trading speed for safety; you get both. For trivial one-liners or boilerplate with no security surface, the overhead is low enough that it doesn't matter either way.

What should I check before applying Claude Code in production?

Three things: first, confirm the plugin is active and on a version that covers your language and framework (Python, TypeScript, Go, and Rust have the most complete rule sets at launch). Second, compare the plugin's category list against your existing threat model — if your team already tracks OWASP Top 10 in Jira, map the plugin categories to those items so nothing falls between the cracks. Third, if you're enabling the sandbox for agentic workflows, run it in observation mode for a week before enforcing it; that lets you see what it would block without breaking anything.

What is the easiest way to verify the result?

Write a small known-vulnerable file — a function with a bare string-interpolated SQL query and a hardcoded token — save it, and confirm the plugin flags both. Then fix each one, save again, and confirm zero findings. This takes under two minutes and gives you confidence the plugin is wired up correctly before you rely on it in real sessions. After that, run claude security-scan across your most security-sensitive directory and triage the output once before trusting the on-save workflow.
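
A known-vulnerable file for this smoke test can be very small. The one below is a hypothetical fixture with a placeholder token and an interpolated query. Keep it out of production paths and delete it once the plugin has been verified.

```python
# smoke_test_vulnerable.py
# Intentionally insecure fixture for verifying the scanner. Do not ship.

# Hardcoded token the plugin should flag (placeholder value, not a real secret).
API_TOKEN = "test-token-not-a-real-secret"


def build_user_query(user_id: str) -> str:
    # String-interpolated SQL the plugin should flag.
    return f"SELECT * FROM users WHERE id={user_id}"
```

If neither line is flagged on save, check that the plugin is enabled and that both categories are listed in your settings before trusting it on real code.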

Closing

The shift here is timing: security feedback at write time instead of review time. Enable the plugin, verify it fires on your stack, scope your sandbox to match your actual agent workflows, and reconcile it with whatever CI scanner you already run.

If you're building automation pipelines with Claude Code's agent capabilities, the sandbox configuration is where to spend the most time — that's where the blast radius is largest if the scope is wrong.

TAGS: claude-code, security, AI automation, Anthropic, SQL injection, developer tools


Seunghyeon's Agentic Lab
