How AgentCore’s Four New Features Simplify Deploying Trustworthy AI Agents

Amazon Bedrock AgentCore introduces four preview‑stage capabilities: fine‑grained Policy, continuous Evaluations, long‑term Memory, and bidirectional Runtime streaming. Customer examples show AgentCore boosting efficiency, cutting costs, and improving operational control for large‑scale AI agents.

Amazon Cloud Developers

Announcement Overview

At re:Invent 2025, Amazon Web Services announced four new preview‑stage features for Amazon Bedrock AgentCore, aimed at removing the obstacles that keep AI agents from reaching production.

Real‑World Impact

PGA TOUR built a multi‑agent content generation system that increased article‑writing efficiency by 10× and reduced costs by 95%.

Workday leverages the AgentCore Code Interpreter to add secure data‑protection and financial‑analysis capabilities, shortening planning‑analysis time by 30% and saving roughly 100 hours per month.

Grupo Elfa uses AgentCore Observability to achieve 100% traceability of agent decisions and cut problem‑resolution time by 50%.

Key Challenges in Scaling Agents

Enterprises face difficulties in defining safe operating boundaries and performing quality validation for agents that have autonomous decision‑making abilities. Risks include unauthorized data access, unintended actions, and difficulty guaranteeing consistent quality at scale.

New Features

Policy in AgentCore (preview)

This feature introduces fine‑grained permission policies that intercept calls to the AgentCore Gateway before execution, establishing clear boundaries for agent behavior. Policies are independent of the agent’s inference flow and can be authored in natural language or the open‑source Cedar policy language, allowing security and compliance teams to create, read, and audit rules without programming expertise.

Example policy (Cedar syntax):

permit(
  principal is AgentCore::OAuthUser,
  action == AgentCore::Action::"RefundTool__process_refund",
  resource == AgentCore::Gateway::"<GATEWAY_ARN>"
)
when {
  principal.hasTag("role") &&
  principal.getTag("role") == "refund-agent" &&
  context.input.amount < 200
};
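
To make the rule easier to reason about, the Cedar policy's condition can be mirrored in plain Python. This is only an illustrative sketch, not how AgentCore evaluates policies: the `Principal` class and `refund_permitted` function are hypothetical names, and the sketch models this single policy under Cedar's default‑deny behavior (a request is refused unless a policy permits it).

```python
from dataclasses import dataclass, field


@dataclass
class Principal:
    """Hypothetical stand-in for an AgentCore::OAuthUser principal."""
    tags: dict = field(default_factory=dict)


def refund_permitted(principal: Principal, action: str, amount: float) -> bool:
    """Mirror the `when` clause of the example policy (illustration only)."""
    # The policy applies to exactly one action; anything else is not permitted by it.
    if action != "RefundTool__process_refund":
        return False
    # hasTag("role") && getTag("role") == "refund-agent" && amount < 200
    return principal.tags.get("role") == "refund-agent" and amount < 200


agent = Principal(tags={"role": "refund-agent"})
print(refund_permitted(agent, "RefundTool__process_refund", 150))
print(refund_permitted(agent, "RefundTool__process_refund", 200))
```

The second call is refused because the policy uses a strict `<` comparison, so a refund of exactly 200 falls outside the permitted range.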

AgentCore Evaluations (preview)

A fully managed service that continuously monitors and scores agents on core dimensions such as correctness, helpfulness, tool‑selection accuracy, safety, goal success rate, and contextual relevance. Built‑in evaluators cover these dimensions, while custom evaluators let users define business‑specific scoring prompts and models. Results are visualized alongside Observability metrics in Amazon CloudWatch, enabling alerts when quality drops below thresholds.

Typical usage scenarios:

Testing phase: Run baseline evaluations before release to catch defects.

Production phase: Continuously track quality and trigger alerts (e.g., if a customer‑service agent’s satisfaction score falls by more than 10% within eight hours).
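
The production alert condition can be sketched as a simple check over timestamped scores. This is a minimal illustration, not the Evaluations service or a CloudWatch alarm: `score_dropped` is a hypothetical helper, and it assumes "falls by more than 10% within eight hours" means the latest score sits more than 10% below the highest score seen in the preceding eight‑hour window.

```python
from datetime import datetime, timedelta


def score_dropped(samples, window=timedelta(hours=8), threshold=0.10):
    """Return True if the latest score is more than `threshold` below the window peak.

    `samples` is a list of (timestamp, score) pairs sorted by timestamp.
    """
    if not samples:
        return False
    latest_time, latest_score = samples[-1]
    # Keep only scores recorded within the window ending at the latest sample.
    recent = [score for ts, score in samples if latest_time - ts <= window]
    peak = max(recent)
    if peak <= 0:
        return False  # a relative drop is undefined for a non-positive peak
    return (peak - latest_score) / peak > threshold


start = datetime(2025, 12, 1, 9, 0)
history = [(start, 4.5), (start + timedelta(hours=6), 4.0)]
print(score_dropped(history))
```

In practice the threshold, window, and definition of "drop" would be whatever the alarm is configured to use; the values here simply echo the example in the text.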

AgentCore Memory – Episodic Memory

This long‑term memory strategy lets agents learn from past interactions and apply that knowledge to similar future tasks, improving consistency and performance. For example, an agent that books travel learns a user’s preference for later flights after a meeting‑driven schedule change and proactively suggests flexible return options in subsequent bookings.
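
The idea of recording a lesson from one interaction and recalling it for a similar task can be shown with a toy in‑memory store. This is a conceptual sketch only and does not use the AgentCore Memory API: `EpisodeStore`, `record`, and `recall` are hypothetical names, and the exact‑match lookup on task name is an assumption standing in for whatever similarity matching the managed service performs.

```python
from collections import defaultdict


class EpisodeStore:
    """Toy long-term memory keyed by (user, task); illustration only."""

    def __init__(self):
        self._episodes = defaultdict(list)

    def record(self, user_id: str, task: str, lesson: str) -> None:
        """Save what the agent learned from a finished interaction."""
        self._episodes[(user_id, task)].append(lesson)

    def recall(self, user_id: str, task: str) -> list:
        """Return lessons from earlier episodes of the same task, oldest first."""
        return list(self._episodes.get((user_id, task), []))


store = EpisodeStore()
store.record("user-1", "book_flight", "prefers later return flights after meetings")

# On the next booking, the agent consults past lessons before proposing options.
for lesson in store.recall("user-1", "book_flight"):
    print("Apply lesson:", lesson)
```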

AgentCore Runtime – Bidirectional Streaming

The Runtime now supports bidirectional streaming, enabling voice agents to listen and respond in real time. Users can interrupt an agent mid‑reply, and the agent instantly adapts its context, delivering a more natural conversational flow compared to traditional turn‑based interactions.
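
The difference from turn‑based interaction is that the reply stream and the user's input are live at the same time, so an interruption can stop the reply partway through. The following `asyncio` simulation illustrates that pattern under stated assumptions: it does not call the AgentCore Runtime, and `stream_reply` and `simulate_barge_in` are hypothetical names.

```python
import asyncio


async def stream_reply(chunks, outbound: asyncio.Queue, interrupted: asyncio.Event):
    """Send reply chunks one at a time, stopping as soon as the user interrupts."""
    for chunk in chunks:
        if interrupted.is_set():
            break
        await outbound.put(chunk)
        await asyncio.sleep(0.01)  # simulated time to speak one chunk


async def simulate_barge_in(chunks):
    """Simulate a user who interrupts after hearing the first chunk."""
    outbound = asyncio.Queue()
    interrupted = asyncio.Event()
    reply = asyncio.create_task(stream_reply(chunks, outbound, interrupted))

    delivered = [await outbound.get()]  # user hears the first chunk
    interrupted.set()                   # user starts talking over the agent
    await reply                         # the reply task notices and stops

    # Collect anything that was sent before the interruption took effect.
    while not outbound.empty():
        delivered.append(outbound.get_nowait())
    return delivered


print(asyncio.run(simulate_barge_in(["Your", "flight", "departs", "at"])))
```

In a turn‑based design the agent would finish all four chunks before reading any input; here the remaining chunks are never sent once the interruption is flagged.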

Getting Started

In the AgentCore console, users create a policy engine, associate it with one or more Gateways, and define rules either as natural‑language descriptions or directly in Cedar. Evaluation tasks are created in the “Evaluations” panel by selecting a data source (agent endpoints or CloudWatch log groups) and configuring built‑in or custom evaluators, a sampling rate, and an optional IAM role.

Availability and Pricing

AgentCore (including the Policy preview) is available in US East (Ohio, N. Virginia), US West (Oregon), APAC (Mumbai, Singapore, Sydney, Tokyo), and EU (Frankfurt, Ireland). AgentCore Evaluations preview is available in US East, US West, APAC Sydney, and EU Frankfurt. Usage is billed on a pay‑as‑you‑go basis with no upfront commitments.

Compatibility

The new capabilities work with open‑source frameworks such as CrewAI, LangGraph, LlamaIndex, and Strands Agents, and support any foundation model. Developers can also use the open‑source MCP server for rapid prototyping.
