The Evolution, Challenges, and Future Directions of AI Agents
An in‑depth overview that traces the development of AI agents from early LLM milestones to modern "Agent‑like" models, examines core components such as memory, tool use, planning, and reflection, analyzes current limitations, and outlines emerging solutions such as workflows, multi‑agent systems, and the model‑as‑product paradigm.
Timeline
We review the LLM‑based Agent development timeline, starting from the 2017 introduction of the attention mechanism.
Before 2017, progress in NLP had largely stalled on RNN/LSTM architectures.
The 2017 "Attention Is All You Need" paper introduced the Transformer, opening a new era.
GPT‑3's 2020 release popularized large‑scale text generation, and its code‑focused descendants powered GitHub Copilot.
ChatGPT (GPT‑3.5) brought conversational LLMs to the masses.
GPT‑4 (2023) was rumored to cross the trillion‑parameter mark; it introduced plugins and Function Calling and sparked a boom in LLM application frameworks.
2024 saw rapid Agent development; scaling laws appeared to fail, and GPT‑4 plateaued.
2025 marks the era beyond pre‑training scaling laws, with reinforcement‑learning‑driven Agents emerging.
How AI Agents Are Built
AI Agents are a specific form of LLM application built on text completion. The core workflow is simple: the model receives a prompt and generates a continuation. Behind that simplicity, however, prompt engineering and model pre‑training are each massive sub‑domains.
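The completion loop above can be sketched in a few lines. This is a minimal illustration, not a real API: `call_llm` is a stand‑in for any completion endpoint, and the canned reply exists only so the example runs.

```python
# Minimal sketch of the completion loop at the heart of every LLM app.
# call_llm is a stub standing in for a real model endpoint.

def call_llm(prompt: str) -> str:
    """Stub: a real implementation would call a completion API."""
    return "Paris."  # canned continuation for demonstration

def answer(question: str) -> str:
    prompt = f"Question: {question}\nAnswer:"  # prompt engineering happens here
    return call_llm(prompt)                    # model generates a continuation

print(answer("What is the capital of France?"))
```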
What Is an AI Agent?
Many mistakenly label any LLM‑based chatbot as an AI Agent. In reality, an AI Agent = Large Model + Memory + Tool Use + Autonomous Planning.
Multi‑turn Dialogue & Memory
Memory means the Agent can recall past interactions. Simple memory is achieved by appending prior dialogue to the prompt, but appending everything quickly fills the context window and runs into the model's token limit.
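The simplest form of this memory is just a growing message list resent with every call. A hedged sketch, where `chat()` is a stub for a chat‑completion API:

```python
# Hedged sketch: "memory" as a message list prepended to every model call.
# chat() is a stub; in practice it would be a chat-completion API call.

history: list[dict] = []

def chat(messages: list[dict]) -> str:
    return f"(reply to: {messages[-1]['content']})"  # stub model

def ask(user_text: str) -> str:
    history.append({"role": "user", "content": user_text})
    reply = chat(history)          # the FULL history goes into every call
    history.append({"role": "assistant", "content": reply})
    return reply

ask("My name is Ada.")
ask("What is my name?")
print(len(history))  # 4 messages: the window grows with every turn
```

Because the list only ever grows, every extra turn consumes context budget, which is exactly the token‑limit problem described above.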
Tool Use
Agents must be able to invoke external tools (e.g., web search) to augment their answers. The difference between manually triggered and model‑initiated tool calls is what marks true Agent autonomy.
Function Call
LLMs can output a structured JSON command describing a tool invocation. OpenAI first baked this capability into the model, and other providers have followed.
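In practice the application parses that JSON and dispatches to a local function. A hedged sketch, where the tool registry and the sample model output are illustrative, not any provider's actual schema:

```python
import json

# Hedged sketch of Function Calling: the model emits structured JSON naming
# a tool and its arguments; the application parses and dispatches the call.
# The registry and the model_output string below are illustrative only.

def web_search(query: str) -> str:
    return f"results for '{query}'"   # stub tool

TOOLS = {"web_search": web_search}

model_output = '{"name": "web_search", "arguments": {"query": "MCP spec"}}'

call = json.loads(model_output)
result = TOOLS[call["name"]](**call["arguments"])
print(result)  # the result is fed back to the model as a tool message
```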
MCP (Model Context Protocol)
Anthropic’s MCP standardizes tool discovery and execution, allowing Agents to read/write files or call APIs directly.
Autonomous Planning & Reflection
Planning is often implemented via Chain‑of‑Thought (CoT) prompting, which breaks complex tasks into smaller steps. ReAct (Reason + Act) adds a loop of thought → action → observation → answer, much like the PDCA cycle.
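The thought → action → observation loop can be sketched as below. Both `llm()` and the search tool are stubs, and the text protocol (`Thought:`/`Action:`/`Answer:` lines) is a simplified stand‑in for real ReAct prompting:

```python
# Hedged sketch of the ReAct loop: thought -> action -> observation, repeated
# until the model emits a final answer. llm() and search() are stubs.

def llm(transcript: str) -> str:
    # Stub: answer once an observation is present, otherwise act.
    if "Observation:" in transcript:
        return "Answer: Paris"
    return "Thought: I should look this up.\nAction: search[capital of France]"

def search(q: str) -> str:
    return "Paris is the capital of France."  # stub tool

def react(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)
        if "Answer:" in step:
            return step.split("Answer:")[-1].strip()
        # Extract the action argument, run the tool, append the observation.
        arg = step.split("search[")[1].rstrip("]")
        transcript += step + f"\nObservation: {search(arg)}\n"
    return "gave up"

print(react("What is the capital of France?"))  # -> Paris
```

The `max_steps` cap matters: without it, a looping model would call tools forever.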
Why Agents Fail
Agents struggle due to hallucinations and memory‑management challenges. The probabilistic nature of LLMs means errors compound across multiple tool calls, reducing overall success rates.
Context Window Limits
Fixed‑size windows (e.g., GPT‑4’s 32k tokens) cause truncation of long histories, and attention efficiency degrades as context grows.
Relevant Memory Retrieval
Storing dialogue in vector databases enables selective retrieval, but retrieval quality directly impacts answer correctness.
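The mechanism can be sketched with a toy embedding: index each stored turn as a vector, then return only the turns most similar to the query. The bag‑of‑words `embed()` is a deliberately crude stand‑in for a real embedding model:

```python
import math

# Hedged sketch of relevant-memory retrieval: embed stored turns, then pull
# back only the most similar ones. embed() is a toy bag-of-words stand-in.

def embed(text: str) -> dict:
    vec: dict = {}
    for w in text.lower().split():
        vec[w] = vec.get(w, 0) + 1
    return vec

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

store = ["user likes green tea", "meeting moved to friday", "cat named miso"]
index = [(embed(t), t) for t in store]

def retrieve(query: str, k: int = 1) -> list:
    scored = sorted(index, key=lambda p: cosine(embed(query), p[0]), reverse=True)
    return [t for _, t in scored[:k]]

print(retrieve("what tea does the user like?"))  # -> ['user likes green tea']
```

The sketch also shows the failure mode the text warns about: if the embedding ranks the wrong memory first, the Agent answers from the wrong context.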
Solutions
Three major remedy categories have emerged:
Introduce fixed workflows to increase determinism.
Engineer extreme optimizations on top of the ReAct framework.
Adopt multi‑Agent collaboration to emulate human teams.
Workflow’s Second Spring
Workflows combine low‑code platforms with LLMs, allowing rapid prompt iteration and visual debugging, though they lack true autonomous reasoning.
Beyond ReAct
Plan‑and‑Execute adds a global plan stage, reducing token buildup. ReWOO separates planning, parallel tool execution, and solving, eliminating observation steps. LLMCompiler compiles task DAGs for parallel execution and dynamic re‑planning.
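The LLMCompiler idea of compiling a task DAG can be sketched without any LLM at all: tasks with no unmet dependencies run in one parallel wave. The task graph below is illustrative, not LLMCompiler's actual API:

```python
from concurrent.futures import ThreadPoolExecutor

# Hedged sketch of DAG-style execution: independent tasks run in parallel
# waves, dependents run once their inputs exist. Tasks here are illustrative.

tasks = {
    "search_a": (lambda deps: "fact A", []),
    "search_b": (lambda deps: "fact B", []),
    "solve":    (lambda deps: f"combined: {deps['search_a']} + {deps['search_b']}",
                 ["search_a", "search_b"]),
}

def run_dag(tasks: dict) -> dict:
    done: dict = {}
    with ThreadPoolExecutor() as pool:
        while len(done) < len(tasks):
            # Every task whose dependencies are satisfied runs in this wave.
            ready = [n for n, (_, deps) in tasks.items()
                     if n not in done and all(d in done for d in deps)]
            futures = {n: pool.submit(tasks[n][0],
                                      {d: done[d] for d in tasks[n][1]})
                       for n in ready}
            for n, f in futures.items():
                done[n] = f.result()
    return done

print(run_dag(tasks)["solve"])  # -> combined: fact A + fact B
```

Here `search_a` and `search_b` execute concurrently, which is exactly where the latency and token savings over a sequential ReAct loop come from.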
Multi‑Agent Architectures
Two primary forms:
Social‑cooperative simulations (e.g., Stanford Town) where Agents interact freely.
Task‑oriented pipelines (e.g., MetaGPT, AutoGen, CrewAI) that assign specific roles to achieve a concrete goal.
Network, Supervisor, and Hierarchical topologies define communication patterns.
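Of these topologies, the Supervisor pattern is the easiest to sketch: one router decides which worker Agent handles each request. `route()` is a stub for what would really be an LLM routing call, and both workers are toys:

```python
# Hedged sketch of the Supervisor topology: a router picks the worker that
# handles each request. route() stands in for an LLM routing decision.

WORKERS = {
    "math":   lambda q: str(eval(q)),        # toy calculator (never eval untrusted input)
    "writer": lambda q: f"Draft about {q}",  # toy copywriting worker
}

def route(query: str) -> str:
    """Stub supervisor: a real one would ask an LLM to choose a worker."""
    return "math" if any(c in query for c in "+-*/") else "writer"

def supervise(query: str) -> str:
    return WORKERS[route(query)](query)

print(supervise("2 + 3"))      # routed to the math worker
print(supervise("green tea"))  # routed to the writer worker
```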
Agentic Workflow
Proposed by Andrew Ng, it combines tool use, multi‑Agent collaboration, planning, and reflection to dynamically decompose and solve complex tasks.
Example: CrewAI Customer‑Discount Recommendation
A workflow defines three steps (extract purchase history, match best discount, generate notification) and three specialized Agents (history analysis, discount management, creative copy). The hierarchical mode uses a manager LLM to dynamically schedule sub‑Agents.
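The shape of that pipeline can be sketched in plain Python (this is not the actual CrewAI API): a manager walks the plan and dispatches each step to the matching specialist, accumulating shared context. All Agents are stubs with hard‑coded outputs:

```python
# Hedged plain-Python sketch of the hierarchical pattern described above
# (NOT the CrewAI API): a manager dispatches plan steps to specialist
# Agents, threading a shared context dict through them. Requires Python 3.9+.

AGENTS = {
    "history":  lambda ctx: ctx | {"purchases": ["tea", "mug"]},
    "discount": lambda ctx: ctx | {"offer": "10% off tea"},
    "copy":     lambda ctx: ctx | {"message": f"Deal for you: {ctx['offer']}"},
}

PLAN = [("history", "extract purchase history"),
        ("discount", "match best discount"),
        ("copy", "generate notification")]

def manager(customer_id: str) -> dict:
    ctx = {"customer": customer_id}
    for agent_name, _desc in PLAN:   # a manager LLM would schedule dynamically
        ctx = AGENTS[agent_name](ctx)
    return ctx

print(manager("c42")["message"])  # -> Deal for you: 10% off tea
```

In hierarchical mode the fixed `PLAN` would be replaced by a manager LLM choosing the next sub‑Agent at runtime based on the context so far.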
Rise of "Agent‑like" Models
OpenAI's o1 and DeepSeek R1 exemplify models that reason internally before generating an answer, blurring the line between model and Agent.
o1
o1 conducts an internal "thinking" phase hidden from users, for both policy and efficiency reasons.
DeepSeek R1
R1 openly publishes its reasoning chain, uses multi‑objective reinforcement learning (GRPO) for training, and demonstrates that targeted RL can outperform sheer parameter scaling.
Productization: Model‑as‑Product
Deep Research (OpenAI) and o3 are end‑to‑end trained Agent models that encapsulate the entire workflow (question → tool use → verification → final report) within a single model, eliminating the need for external engineering.
Three Co‑existing Agent Types
Pure engineering Agents built from prompt engineering + code (quick MVPs).
SFT‑tuned Agents that reduce prompt cost and improve instruction following.
End‑to‑end RL‑trained Agent models for high‑volume, vertical use‑cases.
Agent Social Collaboration
The A2A (Agent‑to‑Agent) protocol assigns each Agent an identity card (Agent Card), enabling secure handshakes, communication, and coordinated action across global ecosystems.
Leadership in the AI Era
Professionals must transition from individual contributors to AI leaders, setting goals, managing AI collaboration, and validating outputs while deepening domain expertise.
Future Outlook
AI will reshape every industry, creating new roles such as Prompt Engineer, AI Ethicist, and Metaverse Architect. Embracing AI quickly is essential to stay relevant.
Tencent Technical Engineering
Official account of Tencent Technology. A platform for publishing and analyzing Tencent's technological innovations and cutting-edge developments.