Autonomous Debugging with Claude Code: Error Logs to Green Tests

hero

If you've already got Claude Code running basic tasks, the next level is making it fix its own mistakes — and yours. This guide walks through building a self-contained error-fixing loop: from feeding Claude the right context, to having it run tests until everything passes, without you babysitting every step.

overall debugging loop flow

Section 1: The Problem With Pasting Just the Error

The most common mistake I see — and made myself — is copying only the error message into Claude. Something like:

TypeError: Cannot read properties of undefined (reading 'map')

Claude will give you a generic answer. It doesn't know your data model, your async patterns, or whether that array is supposed to come from a DB call or a local fixture. You get boilerplate that doesn't fit.

The fix is to pass the entire terminal output, not just the red line. Here's the difference in practice:

Bad prompt:

Fix this: TypeError: Cannot read properties of undefined (reading 'map')

Good prompt:

Here's the full terminal output from `npm run dev`:

[paste full log — stack trace, surrounding log lines, the file paths Node printed]

The relevant file is src/api/users.ts. Fix the root cause.

With full context, Claude can trace the call chain. It knows users.ts called into a database helper that returned null instead of an empty array — and it patches that function, not just adds a null-guard in the wrong place.

shallow vs deep context input comparison

What consistently worked for me: wrap the log in a shell heredoc and pipe it straight to Claude Code from the CLI, so nothing gets truncated by copy-paste.

claude "Fix the root cause of this error. Full log below:
$(cat ./logs/last-run.log)"

Section 2: CLAUDE.md — Your Inline Senior Engineer

Here's the real leverage point. Drop a CLAUDE.md file in your project root. This file is automatically read by Claude Code at session start, and it acts as standing instructions for every interaction in that project.

Without it, Claude guesses at your conventions. With it, Claude behaves like a developer who's been on the team for a year.

Minimal CLAUDE.md skeleton:

# Project: payments-service

## Language & Runtime
- Node.js 20, TypeScript strict mode
- No `any` types — use `unknown` and narrow

## Error Handling Rules
- All async functions must return `Result<T, AppError>` not throw
- Never swallow errors silently — always log with `logger.error()`
- Database null returns should produce a typed `NotFoundError`

## Testing
- Test runner: Vitest
- Run tests: `npm test`
- Every fix must have a corresponding test in `__tests__/`

## Code Style
- Prefer early returns over nested if-else
- Max function length: 40 lines — extract if longer

Once this is in place, when you ask Claude to fix something, tell it explicitly:

Review CLAUDE.md before proposing any fix. Your solution must follow the Result<T, AppError> error pattern and include a Vitest test.

That single instruction transforms the output quality. The gotcha I hit early: Claude will read CLAUDE.md automatically, but it won't always apply every rule unless you remind it. For critical constraints (like the error-handling pattern), repeat them in your prompt.

CLAUDE.md injection into fix pipeline

Section 3: The Fix Loop — Test-Driven Debugging

This is where it gets interesting. Don't just ask Claude to fix the code — ask it to fix the code and verify the fix by running tests. Here's the exact prompt pattern I use:

Fix the error in src/api/users.ts shown in this log:
[paste log]

After fixing:
1. Run `npm test -- --reporter=verbose`
2. If any tests fail, identify why and fix them
3. Repeat until all tests pass
4. Show me the final diff only

Claude Code can execute shell commands directly. It will run npm test, read the output, adjust the fix if something new breaks, and iterate. On a well-scoped bug this usually converges in 2-3 cycles.

Here's what that loop looks like in practice when Claude Code is working through a failing test:

# Claude runs:
npm test -- --reporter=verbose

# Output:
FAIL src/__tests__/users.test.ts
  ✗ getUserById returns NotFoundError for missing user

# Claude reads the failure, finds the missing else-branch, patches it:
# [edits users.ts]

# Claude runs again:
npm test -- --reporter=verbose

# Output:
PASS src/__tests__/users.test.ts
  ✓ getUserById returns NotFoundError for missing user
  ✓ getUserById returns user for valid id

For Python projects, same pattern with pytest:

# Prompt Claude to run:
pytest -x --tb=short 2>&1 | tee /tmp/test-output.txt

The -x flag stops on the first failure, keeping the output tight and focused — Claude handles it better than a wall of failures at once.

test-driven fix iteration loop

Section 4: Variations and Gotchas

Multi-file errors. When the stack trace spans three files, list all of them explicitly in your prompt. Claude won't always chase import chains automatically.

The error originates in src/api/users.ts but the root cause is likely in
src/db/userRepository.ts or src/models/user.ts. Read all three before proposing a fix.

Environment differences. If you're running inside Docker and Claude is running tests from your host, path resolution can break. Use the full container-relative path in your CLAUDE.md and prompts. On Mac, watch out for case-insensitive filesystem behavior masking import errors that would blow up on Linux CI.

Environment	Gotcha	Workaround
macOS local	Case-insensitive FS hides import casing bugs	Run tests in Docker to match CI
Docker dev container	Volume mounts can cache stale files	Restart container before final test run
CI (GitHub Actions)	NODE_ENV differs from local	Explicitly set `NODE_ENV=test` in your test command

Claude stops iterating too early. Sometimes Claude will apply a fix, see one green test, and call it done — even if others are failing. Counter this in your prompt:

Do not stop until ALL tests pass, not just the test directly related to this error.
Run the full test suite as the final step.

CLAUDE.md length. Keep it under 200 lines. I tried a 600-line CLAUDE.md with exhaustive rules and Claude started ignoring the bottom half. Prioritize: error handling pattern, test runner command, and the two or three conventions you care most about.

CLAUDE.md length impact on rule adherence

Closing

The core insight here is simple: Claude Code is not a search engine for fixes — it's an execution engine. Give it full log context, give it your conventions upfront in CLAUDE.md, and tell it to run tests until they pass. That loop removes you from the mechanical part of debugging entirely.

Next natural step: wire this into your CI pipeline. When a build fails, have the CI job write the log to a file, then trigger a Claude Code session that runs the fix loop automatically before pinging you. That's the build that heals itself. ⚡

🐦 Faster updates on X: @baegseungh7061
📚 More in this series: Code Practical
💌 Subscribe: Follow on X or grab the RSS

Seunghyeon's Agentic Lab

이 블로그 검색