Anthropic’s Practical Guide to AI Agents: From Selection to Efficient Implementation
This article offers a detailed, Anthropic‑based guide on building effective AI agents and workflows, covering selection criteria, design patterns such as prompt chains, routing, parallelization, orchestrator‑worker and evaluation‑optimization, real‑world case studies, and concrete implementation recommendations that stress simplicity and composability.
Agent vs. Workflow
Anthropic distinguishes two concepts:
Workflow : a predefined orchestration of LLM calls and tools.
Agent : the LLM dynamically decides its processing steps and which tools to use.
Workflows are suited for predictable, well‑defined tasks; agents handle open‑ended problems where the solution path cannot be fixed in advance.
When to Use an Agent
Simplicity Principle : start with the simplest possible solution (a single LLM call with retrieval‑augmentation and good examples). Add complexity only when the task requires flexibility, dynamic decision‑making, or a feedback loop.
Key Trade‑off : class‑Agent systems improve performance at the cost of higher latency and expense.
Framework Trade‑offs
Frameworks hide prompts and LLM calls, which can make debugging difficult and encourage unnecessary complexity.
Developers are encouraged to call the LLM API directly; most patterns can be implemented in a few lines of code. If a framework is used, its internals must be understood.
Design Patterns for Class‑Agent Systems
Enhanced LLM
The basic module is an enhanced LLM that can retrieve information, invoke tools, and retain memory. Anthropic’s Model Context Protocol (MCP) provides a client‑side way to integrate third‑party tools.
Prompt Chain
A prompt chain breaks a task into ordered steps; each LLM call processes the output of the previous step. Gates can be inserted to keep the flow on track.
Suitable when the task can be cleanly decomposed into fixed subtasks.
Example: generate marketing copy, then translate it.
Routing
Routing classifies input and directs it to specialized downstream tasks, improving modularity.
Use when tasks belong to distinct categories that require different handling.
Example: route customer queries to separate flows for billing, refunds, or technical support.
Parallelization
Parallel execution lets the LLM handle multiple subtasks simultaneously and aggregates results.
Sectioning : split the task into independent subtasks.
Voting : run the same task multiple times and combine outcomes.
Applicable when parallelism improves speed or when multiple perspectives increase confidence.
Content moderation: separate LLMs evaluate violence, hate, profanity, political expression, then vote.
Code vulnerability scanning: parallel LLMs review code and flag issues.
Orchestrator‑Worker
The orchestrator LLM plans and delegates subtasks to specialized worker LLMs, then aggregates results.
Suitable for complex tasks where sub‑tasks cannot be predefined.
Evaluation‑Optimization Loop
A second LLM evaluates the output of the first LLM and provides feedback for improvement. The loop repeats until quality criteria are met.
Use when clear evaluation metrics exist (e.g., code tests, translation quality).
Example: literary translation – generate, evaluate, refine up to three iterations.
Example: complex information search – generate initial results, assess completeness, re‑search as needed.
Full Agent Mode
When the problem is open‑ended, an autonomous agent repeatedly plans, perceives the environment, and loops until a stop condition (e.g., max iterations) is reached.
Start: receive user command or clarify task.
Plan: devise a sequence of actions, request clarification if needed.
Perceive: gather facts from tool calls or code execution.
Feedback: pause for human input at checkpoints.
Terminate: stop when goals are met or limits are reached.
Agents can handle sophisticated tasks, but their implementation is often straightforward: LLMs using tools in a loop. Clear toolsets and documentation are crucial.
Agents are appropriate when the number of steps cannot be predicted and a fixed solution path is impossible. They incur higher cost and risk of error accumulation, so extensive sandbox testing and safety measures are recommended.
Practical Guidelines
Begin with simple prompts and evaluate comprehensively.
Add agent complexity only when necessary.
Maintain transparency by exposing the agent’s planning steps.
Design robust tool interfaces with detailed documentation and thorough testing.
References
[1] Anthropic Cookbook – https://github.com/anthropics/anthropic-cookbook/tree/main/patterns/agents
[2] Model Context Protocol – https://www.anthropic.com/news/model-context-protocol
[3] SWE‑bench – https://www.anthropic.com/research/swe-bench-sonnet
[4] Computer Use Demo – https://github.com/anthropics/anthropic-quickstarts/tree/main/computer-use-demo
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Smart Era Software Development
Committed to openness and connectivity, we build frontline engineering capabilities in software, requirements, and platform engineering. By integrating digitalization, cloud computing, blockchain, new media and other hot tech topics, we create an efficient, cutting‑edge tech exchange platform and a diversified engineering ecosystem. Provides frontline news, summit updates, and practical sharing.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
