Idea In 30 Seconds
Too Many Tools is an anti-pattern where an agent is given too many tools without clear task boundaries.
As a result, action selection becomes noisy: the agent picks an irrelevant tool more often and spends extra steps re-selecting. This increases latency, cost, and response instability.
Simple rule: for each scenario, the agent should see only the minimal set of tools that are truly needed.
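A minimal sketch of this rule, assuming a per-scenario registry (the name `SCENARIO_TOOLS`, the scenario keys, and the helper `get_tools_for` are all illustrative, not a real API):

```python
# Illustrative sketch: expose only scenario-relevant tools, never the full registry.
SCENARIO_TOOLS = {
    "payment": ["get_payment_status", "get_invoice"],
    "order": ["get_order"],
}

def get_tools_for(scenario: str) -> list[str]:
    # unknown scenarios get no tools instead of "everything"
    return SCENARIO_TOOLS.get(scenario, [])
```

The key design choice is the default: when no scenario matches, the agent sees an empty tool list rather than the full catalog.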
Anti-Pattern Example
The team builds a support agent for order, return, and payment requests.
Instead of a narrow set, the team gives the agent a large list of tools "for all cases".
```python
response = agent.run(
    "User: Why is payment failing for order #8341?"
)
```
In this setup, the agent sees dozens of options and may go in the wrong direction:
```python
tools = ["get_payment_status", "get_invoice", "get_order"]  # plus dozens of unrelated tools
selected_tool = agent.pick_tool(tools)
result = run_tool(selected_tool, order_id)
# the agent may choose get_invoice first,
# fail to find the failure reason,
# and only then move to get_payment_status
```
For this case, a short route with a few tools is enough:
```python
payment_data = get_payment_status(order_id)
return format_payment_answer(payment_data)
```
In this case, tool overload adds:
- noise in action selection
- extra tool calls
- higher risk of wrong routing
Why It Happens And What Goes Wrong
This anti-pattern often appears when a team tries to build a "universal" agent for all requests at once.
Typical causes:
- adding new tools without removing old ones
- no per-task allowlist for tools
- fear that "one tool may not be enough"
- copying large toolsets from demos without validating production cases
As a result, teams get problems:
- unstable selection - the agent picks a tool that formally fits but is not the best one
- higher latency - selection loops and repeated calls take more time
- higher cost - extra tool or LLM steps for a typical request
- bloated context - the prompt includes many tool descriptions and intermediate results
- hard debugging - it is difficult to trace why the agent picked a given path
Typical production signals that there are already too many tools:
- a typical user request triggers 3-5 tool calls where 1-2 would be enough
- the same task follows different paths across runs
- the team cannot clearly explain why tool A is chosen instead of tool B
- adding one new tool degrades quality of existing scenarios
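One of these signals, the same task following different paths across runs, can be measured directly from run logs. A minimal sketch, assuming each run is logged as a dict with `task` and `tools` keys (this log format is an assumption):

```python
from collections import defaultdict

def distinct_paths(runs: list[dict]) -> dict[str, int]:
    """Count distinct tool-call paths per task type across runs.

    A stable route has one dominant path; several distinct paths for
    the same task type is a too-many-tools signal.
    """
    paths = defaultdict(set)
    for run in runs:
        paths[run["task"]].add(tuple(run["tools"]))
    return {task: len(variants) for task, variants in paths.items()}

runs = [
    {"task": "payment_status", "tools": ["get_invoice", "get_payment_status"]},
    {"task": "payment_status", "tools": ["get_payment_status"]},
]
# two distinct paths for the same task type: a sign of unstable selection
```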
As a result, the agent spends more steps selecting an action than solving the actual task. Tool selection is itself part of LLM inference: the model chooses an action from the options it sees in the prompt. As the number of tools grows, the number of possible paths grows with it, and selection becomes less stable: the model more often picks a tool that formally fits but is not the best one.
As this setup expands, it becomes hard to understand why the agent chose a particular tool path without traces and execution visualization. That is why production systems usually include a dedicated observability layer for agent runs.
Correct Approach
Start with the simplest route that reliably handles most requests today. Add tools only when there is a measurable failure, risk, or limitation in the current design.
Practical framework:
- define a clear tool allowlist for each task type
- keep a small tool set in each route
- add a new tool only when there is a measurable reason (for example, improved success rate or fewer errors without sharp growth in latency and cost per request)
```python
def answer_payment_question(order_id: str, user_message: str) -> str:
    route = classify_intent(user_message)  # simple classifier or rules
    if route == "payment_status":
        # narrow route: one tool is enough, no agent loop needed
        data = run_tool("get_payment_status", order_id)
        return format_payment_answer(data)
    # broader route: small, explicit allowlist
    allowed_tools = ["get_payment_status", "get_invoice"]
    return agent.run(
        user_message=user_message,
        allowed_tools=allowed_tools,
    )
```
In this setup, the agent works with a small and relevant tool set, so action selection becomes more stable.
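The allowlist is most effective when it is also enforced at the execution layer, so a mis-selected tool never runs at all. A minimal guard sketch (the tool implementations here are stubs, and `run_tool_guarded` is an illustrative name):

```python
# Stub tool registry for illustration only.
TOOLS = {
    "get_payment_status": lambda order_id: {"order_id": order_id, "status": "failed"},
    "get_invoice": lambda order_id: {"order_id": order_id, "total": 49.90},
}

def run_tool_guarded(tool_name: str, allowed_tools: list[str], *args):
    # reject anything outside the route's allowlist before execution
    if tool_name not in allowed_tools:
        raise PermissionError(f"tool {tool_name!r} is not allowed on this route")
    return TOOLS[tool_name](*args)
```

This turns the allowlist from a prompt-level hint into a hard boundary: even if the model names a tool outside the route, the call is rejected before it executes.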
Quick Test
If you answer "yes" to these questions, you have a too-many-tools risk:
- Does a typical request regularly need 3+ tool calls where 1-2 should be enough?
- Does the same request go through different tool paths across different runs?
- Are new tools added faster than the team can constrain, remove, or review them?
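The first question can be checked directly from logs. A small sketch, assuming each run is logged as a list of tool names (this log format is an assumption):

```python
def avg_tool_calls(runs: list[list[str]]) -> float:
    """Average number of tool calls per request; compare against the expected 1-2."""
    return sum(len(tools) for tools in runs) / len(runs)

# one run with 1 call, one run with 2 calls -> average of 1.5
```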
How It Differs From Other Anti-Patterns
Tool Calling Everywhere vs Too Many Tools
| Tool Calling Everywhere | Too Many Tools |
|---|---|
| Main problem: tools are called even where reasoning or a simple workflow would be enough. | Main problem: there are too many tools, and the agent chooses between them unstably. |
| When it appears: when almost every request is automatically converted into a tool call. | When it appears: when one route has an overloaded tool set without a clear allowlist. |
Multi-Agent Overkill vs Too Many Tools
| Multi-Agent Overkill | Too Many Tools |
|---|---|
| Main problem: too many agents and complex coordination across roles. | Main problem: tool overload inside a single agent route. |
| When it appears: when one request goes through too many handoffs between agents. | When it appears: when the agent repeatedly tries multiple tools to find a relevant one. |
Giant System Prompt vs Too Many Tools
| Giant System Prompt | Too Many Tools |
|---|---|
| Main problem: one large system prompt with conflicting instructions. | Main problem: agent sees too many tools and makes action-selection mistakes. |
| When it appears: when new rules are continuously added into one monolithic prompt. | When it appears: when tools are added without reviewing outdated or duplicated tools. |
FAQ
Q: Does this mean many tools are always bad?
A: No. The problem is not the number itself. The problem is that the agent sees too many irrelevant options in a specific scenario.
Q: When should we add a new tool?
A: When there is a concrete signal: coverage gaps, quality failures, or route limits that the current set cannot solve without disproportionate growth in latency, cost, or debugging complexity.
Q: How can we reduce tool-selection chaos without a large refactor?
A: Start simple: introduce task-type allowlists, enforce step limits, and remove tools that are rarely used or produce duplicate results.
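The step limit mentioned above can be sketched as a hard budget on the select-act loop. A minimal sketch (the `pick_action(trace) -> (tool_name, done)` signature is illustrative, not a real agent API):

```python
def agent_loop(pick_action, max_steps: int = 3) -> list[str]:
    """Run the select-act loop under a hard step budget.

    pick_action(trace) returns (tool_name, done); done=True ends the loop.
    Raising on budget exhaustion surfaces re-selection loops instead of
    letting them silently burn latency and cost.
    """
    trace = []
    for _ in range(max_steps):
        tool_name, done = pick_action(trace)
        trace.append(tool_name)
        if done:
            return trace
    raise RuntimeError(f"step limit {max_steps} exceeded; trace so far: {trace}")
```

A loop that hits the limit is a direct production signal of the anti-pattern: the trace shows exactly which tools the agent cycled through.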
What Next
Related anti-patterns:
- Agent Everywhere Problem - when an agent is added even for deterministic tasks.
- Overengineering Agents - when the system grows extra layers without measurable benefit.
- Multi-Agent Overkill - when there are too many agents with blurry role boundaries.
What to build instead:
- Allowed Actions - how to set clear boundaries for what the agent is allowed to do.
- Routing Agent - how to route tasks and expose only relevant tools to the agent.
- Tool Execution Layer - where to control tool calls and access policies in architecture.