Dynamic Priority Queue Scheduler for Sub-Agent Execution: Urgency and Cost Weighting

The Short Answer

A sub-agent priority queue scheduler assigns each task an urgency score and a cost weight, then combines them into a composite score that determines execution order. Because the queue is reordered whenever a new task arrives or a score changes, urgent work rises to the top ahead of expensive, low-urgency tasks.

The core formula is: composite score = urgency × w1 + (1 - cost_ratio) × w2, where cost_ratio is the task's estimated token cost divided by the per-task token budget. Adjust w1 and w2 to reflect your team's operational priorities.

Why This Matters Now

Agent systems have evolved beyond simple single-task automation. Teams now run multiple sub-agents concurrently, and a slow agent calling an external API can stall a shorter, more urgent task waiting at the end of a fixed queue.

Consider a PR summary agent and a full-repo static analysis agent sharing the same queue. If the analysis job arrived first, the PR summary waits ten minutes for no good reason. Without dynamic reordering based on urgency and cost, this is structurally unavoidable.

Step-by-Step Implementation

Here are the five steps to build a priority queue scheduler from scratch.

Define the task metadata schema
Attach id, label, urgency (float 0-1), estimated_cost_tokens (integer), and created_at (unix timestamp) to every sub-agent task. Urgency can be set explicitly by the caller or assigned automatically from a per-task-type default table.

Example: { id: 'pr-42', label: 'PR Summary', urgency: 0.9, estimated_cost_tokens: 800, created_at: 1718000000 }
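
The schema above could be sketched as a Python dataclass. This is a minimal sketch, not the article's implementation: the class name Task and the range checks are assumptions added for illustration, while the field names come from the schema.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Task:
    """Metadata for one sub-agent task (class name is hypothetical)."""
    id: str
    label: str
    urgency: float                 # expected range: 0.0 to 1.0
    estimated_cost_tokens: int     # rough token estimate for the task
    created_at: float = field(default_factory=time.time)  # unix timestamp

    def __post_init__(self) -> None:
        # Assumed validation: reject values that would distort the score.
        if not 0.0 <= self.urgency <= 1.0:
            raise ValueError(f"urgency must be in [0, 1], got {self.urgency}")
        if self.estimated_cost_tokens < 0:
            raise ValueError("estimated_cost_tokens must be non-negative")


task = Task(id="pr-42", label="PR Summary", urgency=0.9,
            estimated_cost_tokens=800, created_at=1718000000)
```

Validating at construction time keeps out-of-range urgency values from silently reaching the score function.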

Write the composite score function
Start with score = urgency * 0.6 + (1 - cost_ratio) * 0.4, where cost_ratio = estimated_cost_tokens / MAX_TOKENS. MAX_TOKENS is the maximum token budget your team allows for a single task. If an estimate can exceed MAX_TOKENS, clamp cost_ratio to 1 so the cost term never goes negative.

One-line Python example: score = task['urgency'] * 0.6 + (1 - task['estimated_cost_tokens'] / MAX_TOKENS) * 0.4
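
As a slightly fuller sketch of the same formula, the function below takes the weights as parameters and clamps cost_ratio to the 0 to 1 range. The function name, the clamping, and the MAX_TOKENS value of 10,000 are assumptions for illustration.

```python
MAX_TOKENS = 10_000  # assumed per-task token budget


def composite_score(urgency: float, estimated_cost_tokens: int,
                    w1: float = 0.6, w2: float = 0.4,
                    max_tokens: int = MAX_TOKENS) -> float:
    """Return urgency * w1 + (1 - cost_ratio) * w2."""
    # Clamp so an over-budget estimate cannot push the cost term below zero.
    cost_ratio = min(max(estimated_cost_tokens / max_tokens, 0.0), 1.0)
    return urgency * w1 + (1.0 - cost_ratio) * w2


# The PR summary task from the schema example: urgency 0.9, 800 tokens.
print(round(composite_score(0.9, 800), 3))  # 0.908
```

Passing w1 and w2 as arguments makes it easier to load them from configuration later rather than editing the function.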

Implement the queue with a heap
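Python's built-in heapq or TypeScript's @datastructures-js/priority-queue works well. Store each entry as a (-score, created_at, task) tuple so higher scores pop first. The created_at field acts as a tiebreaker, preferring tasks that arrived earlier. In Python, if two entries can share both score and created_at, put a unique sequence number before task in the tuple, because comparing two dicts raises a TypeError.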

Example: heapq.heappush(queue, (-score, task['created_at'], task))
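
The sketch below shows the tuple layout in use, with one addition that is not part of the article's format: a sequence counter placed before the task, so Python never has to compare two task dicts when score and created_at are equal. The helper names push and pop and the task ids are hypothetical.

```python
import heapq
import itertools

_seq = itertools.count()  # assumed extra tiebreaker, unique per entry
queue: list = []


def push(queue: list, score: float, task: dict) -> None:
    # Negate the score because heapq is a min-heap.
    heapq.heappush(queue, (-score, task["created_at"], next(_seq), task))


def pop(queue: list) -> dict:
    # The task is always the last element of the entry.
    return heapq.heappop(queue)[-1]


push(queue, 0.67, {"id": "pr-42", "created_at": 1718000000})
push(queue, 0.85, {"id": "deploy-7", "created_at": 1718000005})
# Same score and timestamp as pr-42: the counter breaks the tie.
push(queue, 0.67, {"id": "pr-43", "created_at": 1718000000})

order = [pop(queue)["id"] for _ in range(3)]
print(order)  # ['deploy-7', 'pr-42', 'pr-43']
```

Reading the task with [-1] keeps the pop logic unchanged whether or not the extra tiebreaker is present.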

Add dynamic re-insertion logic
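heappush keeps the heap ordered when a new task arrives, so no extra step is needed there. When an existing task's urgency changes, recompute its score, replace its entry, and call heapify to restore heap order. For high-frequency updates, use a dirty-flag pattern: mark the queue dirty on each change and rebuild it once before the next pop. Always validate that a task is still active when you pop it from the queue.

Control concurrency with MAX_WORKERS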
Only pull from the queue when the number of active agents is below MAX_WORKERS. Python asyncio example: while len(active) < MAX_WORKERS and queue: task = heapq.heappop(queue)[-1]
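
The loop above can be expanded into a small runnable asyncio sketch. The run_agent coroutine is a stand-in for a real sub-agent call, and the function names, the MAX_WORKERS value of 2, and the sample tasks are all assumptions for illustration.

```python
import asyncio
import heapq

MAX_WORKERS = 2  # assumed cap; tune to your rate limits


async def run_agent(task: dict, finished: list) -> None:
    """Stand-in for a real sub-agent call (hypothetical)."""
    await asyncio.sleep(0.01)
    finished.append(task["id"])


async def drain(queue: list, max_workers: int = MAX_WORKERS):
    """Run queued tasks in score order with at most max_workers in flight."""
    active: set = set()
    finished: list = []
    peak = 0
    while queue or active:
        # Fill free worker slots with the highest-scoring tasks.
        while queue and len(active) < max_workers:
            task = heapq.heappop(queue)[-1]
            active.add(asyncio.create_task(run_agent(task, finished)))
        peak = max(peak, len(active))
        # Block until at least one agent finishes, then loop to refill.
        done, active = await asyncio.wait(
            active, return_when=asyncio.FIRST_COMPLETED)
        for completed in done:
            completed.result()  # re-raise any exception from the agent
    return finished, peak


queue: list = []
for score, created_at, task_id in [(0.67, 1, "pr-42"),
                                   (0.85, 2, "deploy-7"),
                                   (0.40, 3, "stats-1")]:
    heapq.heappush(queue, (-score, created_at, {"id": task_id}))

finished, peak = asyncio.run(drain(queue))
print(peak)  # 2
```

Here the two highest-scoring tasks start together and the lowest-scoring one waits for a free slot, which is the behavior the cap is meant to enforce.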

Real-World Examples

Scenario A: Code review bot vs. pre-deploy check agent running simultaneously
Set pre-deploy urgency to 0.95 and PR review urgency to 0.5. Even though the pre-deploy check costs nearly four times as many tokens (3,000 versus 800), with the urgency weight at 0.6 it still scores higher and runs first.

Score breakdown (MAX_TOKENS = 10,000):
- Pre-deploy check: 0.95 × 0.6 + (1 - 3000/10000) × 0.4 = 0.57 + 0.28 = 0.85
- PR review: 0.5 × 0.6 + (1 - 800/10000) × 0.4 = 0.30 + 0.37 = 0.67
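
To see how sensitive this ordering is to the weights, the sketch below recomputes both scores for several values of w1. It assumes the weights sum to 1 (w2 = 1 - w1) and a MAX_TOKENS of 10,000; the function names are hypothetical.

```python
MAX_TOKENS = 10_000  # assumed per-task token budget


def score(urgency: float, cost_tokens: int, w1: float) -> float:
    w2 = 1.0 - w1  # assumption: the two weights sum to 1
    return urgency * w1 + (1 - cost_tokens / MAX_TOKENS) * w2


def first_to_run(w1: float) -> str:
    """Return which Scenario A task scores higher for a given w1."""
    pre_deploy = score(0.95, 3000, w1)
    pr_review = score(0.50, 800, w1)
    return "pre-deploy" if pre_deploy > pr_review else "pr-review"


for w1 in (0.2, 0.3, 0.4, 0.6):
    print(w1, first_to_run(w1))
```

Under these assumptions the order flips near w1 ≈ 0.33: below that, the cheaper PR review runs first despite its lower urgency.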

Scenario B: Live urgency update from a Slack tag
A user marks a task as 'urgent' in Slack, raising its urgency from 0.4 to 0.95. The scheduler calls heapify immediately. Currently running agents are not interrupted; the updated task claims the next available worker slot.
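
One way to handle such a live update without rebuilding the whole heap is lazy invalidation, the pattern described in the implementation notes of the Python heapq documentation. Note that this swaps in a different technique from the heapify call described here. The function names, task ids, and score values are hypothetical.

```python
import heapq
import itertools

REMOVED = object()        # sentinel marking a superseded entry
_seq = itertools.count()  # unique tiebreaker per entry
queue: list = []          # heap of [neg_score, seq, task_id_or_REMOVED]
entries: dict = {}        # task id -> its live heap entry


def upsert(task_id: str, score: float) -> None:
    """Add a task, or replace its entry when its score changes."""
    if task_id in entries:
        # Invalidate the old entry in place; heap order is unaffected
        # because ordering depends only on the first two fields.
        entries.pop(task_id)[-1] = REMOVED
    entry = [-score, next(_seq), task_id]
    entries[task_id] = entry
    heapq.heappush(queue, entry)


def pop_next():
    """Return the highest-scoring live task id, skipping stale entries."""
    while queue:
        _, _, task_id = heapq.heappop(queue)
        if task_id is not REMOVED:
            del entries[task_id]
            return task_id
    return None


upsert("report-3", 0.56)  # urgency 0.4 (illustrative score)
upsert("pr-42", 0.67)
upsert("report-3", 0.89)  # urgency raised to 0.95 after the Slack tag

order = [pop_next(), pop_next(), pop_next()]
print(order)  # ['report-3', 'pr-42', None]
```

Each update costs one push instead of a full rebuild, and the stale-entry check on pop doubles as the "is this task still active" validation.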

Common Mistakes

Using only binary urgency (0 or 1) collapses the queue into two coarse tiers, with order inside each tier decided by cost alone. You need continuous values between 0.0 and 1.0 for the composite score to be meaningful.

Skipping cost_ratio normalization lets raw token counts overpower the urgency field, making urgency settings irrelevant.

Forgetting to call heapify after a metadata update leaves the queue sorted by stale scores.

Running without a MAX_WORKERS cap causes agent count to explode, slowing the entire system through contention.

Checklist

Every task has urgency (0.0-1.0) and estimated_cost_tokens fields
cost_ratio is normalized against MAX_TOKENS in the score function
Heap entries follow the (-score, created_at, task) tuple format
heapify is called immediately after any metadata change
Task validity (not cancelled) is checked on every pop
Concurrent agents are capped by MAX_WORKERS
Weights w1 and w2 are stored in config, not hardcoded

What happened in testing

No benchmark was run for this article, so execution time, memory use, success rate, and productivity figures are not measured and none are reported here.
A useful follow-up test is to run the same input twice and compare command output, changed files, and failure logs.

Failure notes and caveats

The common failure is not the first generated answer. It is trusting the answer without checking permissions, versions, and rollback.
No real error log was captured for this article, so the risks described here are caveats rather than observed failures.
Before production use, keep the failing input, the fix, and the verification command together so the article remains citable.

Sources and checks

Verified on: 2026-06-10

Claim	Evidence	How to verify	Limit
The priority queue scheduler design (dynamic reordering of sub-agent execution by urgency and cost weights) should be checked against the original source before reuse.	code.claude.com	Check the source page, version, date, and setup notes.	Source content can change after this article is published.
Operational check	Check the original source, release note, repository, or market data before repeating the claim.	Reproduce on a small input and record input, output, and environment.	A local test does not prove every production path.
Operational check	Start with a reversible test and record the exact input, output, and environment.	Reproduce on a small input and record input, output, and environment.	A local test does not prove every production path.
Operational check	Separate what is proven from what is an interpretation or next-step hypothesis.	Reproduce on a small input and record input, output, and environment.	A local test does not prove every production path.

FAQ

Q. How should I set default urgency values?

Build a per-task-type default table. For example: deployment-related tasks default to 0.85, document summarization to 0.4, and statistics aggregation to 0.3. Callers can override these defaults by passing urgency explicitly. Starting with five task categories is more than enough.
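
A default table like this could be sketched as a plain dictionary with an override path. The task-type keys, the fallback value of 0.5 for unknown types, and the function name are assumptions; the three default values come from the answer above.

```python
from typing import Optional

DEFAULT_URGENCY = {
    "deploy": 0.85,
    "doc_summary": 0.4,
    "stats_aggregation": 0.3,
}
FALLBACK_URGENCY = 0.5  # assumed value for task types not in the table


def resolve_urgency(task_type: str, explicit: Optional[float] = None) -> float:
    """Prefer the caller's explicit urgency, else the per-type default."""
    if explicit is not None:
        if not 0.0 <= explicit <= 1.0:
            raise ValueError(f"urgency must be in [0, 1], got {explicit}")
        return explicit
    return DEFAULT_URGENCY.get(task_type, FALLBACK_URGENCY)


print(resolve_urgency("deploy"))            # 0.85
print(resolve_urgency("doc_summary", 0.9))  # 0.9
```

Checking `explicit is not None` rather than truthiness matters here, because an explicit urgency of 0.0 is a valid override.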

Q. What happens to tasks that are already running when priorities change?

Running tasks are never interrupted. The priority queue only controls which task claims the next available worker slot. If you genuinely need mid-execution cancellation, implement a separate cancellation token mechanism. That is outside the scope of the scheduler itself.

Q. How do I tune w1 and w2 in production?

Start at w1=0.6 and w2=0.4. If teams report that urgent tasks are being beaten by expensive ones, raise w1. If costs are running high, raise w2 to push expensive tasks further back. Store these weights in environment variables or a config file so you can adjust them without a redeploy.
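
Loading the weights from the environment could look like the sketch below. The variable names SCHED_W1 and SCHED_W2 are hypothetical, and normalizing the pair so it sums to 1 is an assumption that keeps scores comparable when only one weight is changed.

```python
import os
from typing import Mapping, Tuple


def load_weights(env: Mapping[str, str] = os.environ) -> Tuple[float, float]:
    """Read w1 and w2 from the environment, defaulting to 0.6 and 0.4."""
    w1 = float(env.get("SCHED_W1", "0.6"))  # hypothetical variable name
    w2 = float(env.get("SCHED_W2", "0.4"))  # hypothetical variable name
    if w1 < 0 or w2 < 0 or w1 + w2 == 0:
        raise ValueError("weights must be non-negative and not both zero")
    total = w1 + w2
    # Assumed normalization: rescale so the weights sum to 1.
    return w1 / total, w2 / total


print(load_weights({}))                                    # (0.6, 0.4)
print(load_weights({"SCHED_W1": "3", "SCHED_W2": "1"}))    # (0.75, 0.25)
```

Environment variables are normally read when the process starts, so a change typically takes effect after a restart; a config file re-read on a timer is one alternative if restarts are a concern.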

Wrapping Up

A priority queue scheduler combining urgency and cost weighting is foundational infrastructure for any sub-agent system operating at team scale. Two axes, urgency and cost, are enough to improve on fixed ordering, although the size of the gain depends on the workload and was not measured here. The easiest place to start: add a single urgency field to the agent calls you already have in production.

Citation-ready summary

Definition: a priority queue scheduler that dynamically reorders sub-agent execution by urgency and cost weights is the article's central term; cite it together with the source and verification limits listed in this article.
Main answer: score each task as urgency × w1 + (1 - cost_ratio) × w2, keep tasks in a heap ordered by that score, and let the highest-scoring task claim the next free worker slot.
Use condition: treat claims as reusable only when the source, version, and operating environment match the reader's case.

Key terms

Priority queue scheduler (dynamic reordering of sub-agent execution by urgency and cost weights): the concrete subject this article explains and evaluates.
Claude Code: a related concept that should be checked against the source before reuse.
Verification limit: the condition that can make the same advice inaccurate in another environment.

Test environment and baseline

Baseline scope: this article explains the priority queue scheduler (dynamic reordering of sub-agent execution by urgency and cost weights) as a reproducible workflow, not as a universal benchmark.
Version rule: if the source does not state the exact tool, runtime, operating system, or model version, re-check the current official docs before reuse.
Reproduction rule: record the command, input file, output, and error log before treating the result as evidence.

Seunghyeon's Agentic Lab