AWS's newly released suite pushes serverless squarely into generative AI.

Artificial intelligence in the enterprise is at a turning point. Agents built on LLMs have moved from proof-of-concept status to being seriously considered for integration into business operations. Yet operationalizing these agents remains challenging due to issues surrounding secure execution, integration complexity, memory persistence, and auditability at production scale. Today, AWS debuted Amazon Bedrock AgentCore in public preview, marking a significant step toward making agent-based automation both standardizable and manageable across enterprise environments.

Amazon Bedrock AgentCore is structured as a suite of seven managed primitives, each focused on a specific operational challenge. At its core is the Runtime, which provides strict session-level isolation using Firecracker microVMs. Each agent invocation receives a dedicated, ephemeral microVM, ensuring that memory and execution state persist only for the duration of the session. This eliminates cross-tenant leakage and supports high-assurance teardown requirements that are crucial for regulated industries and any enterprise handling sensitive workloads. Cold start latency is present, typically ranging from 300 to 800 milliseconds, but teardown is deterministic. The microVMs are wiped clean at the end of every session, and the state is preserved for up to 15 minutes only if the agent is awaiting further user input; after this time, the environment is entirely destroyed.
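The session lifecycle described above, where state survives at most 15 minutes while the agent awaits input, can be sketched as a small state model. This is a hypothetical illustration, not part of the AgentCore SDK; the class and method names are invented for clarity:

```python
import time
from dataclasses import dataclass, field

IDLE_TTL_SECONDS = 15 * 60  # state survives at most 15 minutes while awaiting input


@dataclass
class AgentSession:
    """Models the Runtime's per-session lifecycle: one ephemeral microVM per invocation."""
    started_at: float = field(default_factory=time.monotonic)
    last_input_at: float = field(default_factory=time.monotonic)
    destroyed: bool = False

    def touch(self):
        """Record user input, resetting the idle clock."""
        self.last_input_at = time.monotonic()

    def is_expired(self, now=None):
        """True once the 15-minute idle window has elapsed; the microVM is then wiped."""
        now = time.monotonic() if now is None else now
        return self.destroyed or (now - self.last_input_at) > IDLE_TTL_SECONDS


session = AgentSession()
assert not session.is_expired()
# Simulate 16 minutes of inactivity without actually sleeping:
assert session.is_expired(now=session.last_input_at + 16 * 60)
```

The key property the Runtime guarantees, and this sketch mimics, is that expiry is deterministic: there is no path by which session state outlives the idle window.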

The memory primitive in AgentCore supports both short-term event retention and long-term, insight-driven storage. Memory can be segmented by user, business unit, or custom namespace, and organizations can configure how frequently insights are extracted and for how long context persists. File uploads for memory are currently limited to 250MB per session, as documented in the AWS preview. Both short-term and long-term memory are accessible via standard SDK calls. The code interpreter is an ephemeral Linux environment, pre-loaded with libraries such as pandas, NumPy, SciPy, and Matplotlib for Python, as well as standard stacks for R and TypeScript. It allows up to 25 concurrent sessions per account per region and supports file uploads up to 250MB. All execution is sandboxed in Firecracker microVMs, ensuring that code runs with complete isolation and is easily torn down.
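The segmentation model above, short-term events per session plus long-term insights per namespace, can be illustrated with a toy in-memory store. Everything here (class, method, and namespace format) is an invented sketch, not the actual AgentCore Memory API:

```python
from collections import defaultdict


class MemoryStore:
    """Toy model of namespace-segmented memory: short-term events vs. long-term insights."""

    def __init__(self):
        self.events = defaultdict(list)    # short-term, keyed by (namespace, session)
        self.insights = defaultdict(list)  # long-term, keyed by namespace

    @staticmethod
    def namespace(business_unit, user_id):
        """Hierarchical key so memories are segmented per business unit and user."""
        return f"/{business_unit}/users/{user_id}"

    def record_event(self, ns, session_id, event):
        self.events[(ns, session_id)].append(event)

    def extract_insight(self, ns, insight):
        """In the real service, insight extraction runs on a configurable schedule."""
        self.insights[ns].append(insight)


store = MemoryStore()
ns = MemoryStore.namespace("finance", "u-123")
store.record_event(ns, "sess-1", "user asked for Q3 revenue")
store.extract_insight(ns, "user tracks quarterly revenue")
assert store.insights[ns] == ["user tracks quarterly revenue"]
```

The point of the segmentation is that a lookup scoped to one namespace can never surface another user's or unit's context, mirroring the isolation story of the Runtime.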

Gateway simplifies tool integration by allowing teams to expose APIs, Lambda functions, and OpenAPI/Smithy endpoints to agents without bespoke adapters or wrappers. The managed browser brings headless Chromium automation to the agent runtime, enabling agents to perform web-based retrieval, automation, and scraping tasks. Each browser session runs within its own microVM, and AgentCore supports scaling up to 3,000 concurrent browser or runtime sessions per account, subject to region and service quotas. The Identity primitive is designed for enterprise SSO (e.g., Okta, Active Directory, Cognito) and supports outbound delegation (e.g., three-legged OAuth), allowing agents to securely act on behalf of users in both internal and third-party SaaS environments. Observability is built-in across all primitives: every tool invocation, approval, error, and memory access is captured as a span, with metrics and logs streamable to external observability platforms via OpenTelemetry.
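The span model behind the observability layer can be sketched in a few lines of OpenTelemetry-style Python. This is a minimal stand-in to show the shape of the data, not AgentCore's actual schema; the attribute names and classes are illustrative:

```python
import time
from dataclasses import dataclass


@dataclass
class Span:
    """One captured unit of work: a tool invocation, approval, error, or memory access."""
    name: str
    attributes: dict
    start_ns: int = 0
    end_ns: int = 0


class Tracer:
    """Collects finished spans; a real exporter would ship these via OpenTelemetry."""

    def __init__(self):
        self.finished = []

    def span(self, name, **attributes):
        return _SpanContext(self, Span(name, attributes))


class _SpanContext:
    def __init__(self, tracer, span):
        self.tracer, self.span = tracer, span

    def __enter__(self):
        self.span.start_ns = time.time_ns()
        return self.span

    def __exit__(self, exc_type, exc, tb):
        self.span.end_ns = time.time_ns()
        self.span.attributes["error"] = exc_type is not None
        self.tracer.finished.append(self.span)
        return False  # never swallow the exception


tracer = Tracer()
with tracer.span("tool.invoke", tool="search_api"):
    pass  # the actual tool call would happen here
assert tracer.finished[0].name == "tool.invoke"
assert tracer.finished[0].attributes["error"] is False
```

Because the span is recorded in `__exit__`, even failing tool calls land in the trace, which is exactly what an audit trail requires.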

At launch, AgentCore is available in us-east-1, eu-west-1, and ap-southeast-1, with further regional rollout planned.

AgentCore standardizes agent execution with these key capabilities: the Runtime delivers Firecracker-based microVM isolation for every session; Memory offers segmented, event-driven short- and long-term storage; the Code Interpreter supplies a secure, pre-loaded analytics sandbox; the Managed Browser powers scalable, isolated headless automation; Gateway enables frictionless integration of APIs and tools; Identity provides inbound SSO and outbound delegation; Observability ensures comprehensive, exportable traces and metrics across the platform.

Operationalizing research agents for competitive intelligence or regulatory monitoring typically presents several hurdles: securing browser automation, ensuring auditability, and managing the surface area of tools that interact with the open web. With AgentCore, browser sessions are strictly isolated in microVMs, and each session is securely torn down at termination. The agent’s memory can persist insights by project or analyst, while every action is fully traceable in the observability stack. This eliminates the need for teams to handcraft wrappers or manage custom browser sandboxes—a common pain point in traditional open-source agent frameworks.

In analytics and data science, enterprise teams often struggle to provide managed, reproducible, and secure code execution environments. Legacy workflows require maintaining Jupyter servers, handling user isolation, and manually orchestrating ephemeral compute. With AgentCore’s code interpreter, data files can be uploaded (within the 250MB per-session limit) and analyzed in isolated, short-lived sandboxes pre-configured with standard scientific libraries. Session memory can be checkpointed and results pushed directly to S3 or other downstream systems, all while ensuring sensitive data does not escape its assigned execution environment.
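A client-side guard for the documented 250 MB per-session cap keeps failed uploads out of the interpreter in the first place (the service enforces the limit regardless). The helper below is a hypothetical sketch, not an SDK function:

```python
MAX_UPLOAD_BYTES = 250 * 1024 * 1024  # documented 250 MB per-session cap


def validate_upload(size_bytes):
    """Raise before submitting a file the interpreter session would reject anyway."""
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError(
            f"{size_bytes} bytes exceeds the 250 MB per-session limit; "
            "split the file or stage it in S3 and read it from the sandbox instead"
        )


validate_upload(10 * 1024 * 1024)  # a 10 MB file passes
try:
    validate_upload(300 * 1024 * 1024)  # a 300 MB file is rejected
except ValueError as err:
    print(err)
```

For datasets above the cap, staging in S3 and pulling slices into the sandbox is the natural workaround, and it composes with pushing results back to S3 as described above.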

Security automation is another area where AgentCore’s primitives offer clear architectural advantages. Deploying bots or remediation agents that access sensitive IAM or log data requires careful segregation of permissions, credential management, and full audit trails. By exposing internal tools through the Gateway, managing delegation via Identity, and capturing every event through Observability, AgentCore enables organizations to automate privileged actions without introducing new attack surfaces or compliance gaps. Traditional chatbots or RPA frameworks often leave these requirements as custom engineering challenges, while AgentCore offers them as built-in features.
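The audit-trail requirement above can be pictured as a decorator that records every privileged call, success or failure, before control returns. This is a client-side analogue of what Observability captures as spans; the decorator and record fields are invented for illustration:

```python
import functools
import time


def audited(action_log):
    """Wrap a privileged action so every invocation lands in the audit log."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            record = {"action": fn.__name__, "ts": time.time(), "ok": True}
            try:
                return fn(*args, **kwargs)
            except Exception:
                record["ok"] = False
                raise  # the failure is logged, never hidden
            finally:
                action_log.append(record)
        return inner
    return wrap


audit_log = []


@audited(audit_log)
def rotate_access_key(user):
    """Stand-in for a privileged IAM remediation action."""
    return f"rotated key for {user}"


assert rotate_access_key("svc-reporting") == "rotated key for svc-reporting"
assert audit_log[0]["action"] == "rotate_access_key" and audit_log[0]["ok"]
```

The `finally` clause is the essential piece: an exception in the privileged action still produces an audit record, so the trail has no gaps.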

For teams already building retrieval-augmented generation (RAG) or other advanced pipelines, AgentCore does not demand migration of existing agent frameworks. Its primitives can be integrated selectively—layering in microVM isolation, context-specific memory, and full telemetry—allowing technical leaders to add security and observability guarantees on top of their existing investments.

AgentCore’s public preview has certain boundaries. Currently, only three AWS regions are supported, and some primitives, such as the code interpreter and memory, have concurrency and file size limits—25 concurrent interpreter sessions per account and a 250MB cap on uploads, respectively. Agent-to-agent pass-through Identity and broader region coverage remain on the roadmap. Cold start latency is non-negligible, especially for latency-sensitive interactive applications, and organizations should assess how this might affect their user experience in high-frequency scenarios.
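Preview quotas like the 25-concurrent-interpreter-session cap are easiest to respect with a client-side throttle, so an application degrades gracefully instead of hitting service errors. A minimal sketch using a bounded semaphore (the class name and quota handling are illustrative, not SDK features):

```python
import threading

MAX_INTERPRETER_SESSIONS = 25  # preview quota per account per region


class InterpreterSessionPool:
    """Client-side throttle so we never request more sessions than the quota allows."""

    def __init__(self, limit=MAX_INTERPRETER_SESSIONS):
        self._sem = threading.BoundedSemaphore(limit)

    def acquire(self, blocking=True):
        """Reserve a session slot; returns False if none are free and blocking is off."""
        return self._sem.acquire(blocking)

    def release(self):
        """Return a slot once the interpreter session is torn down."""
        self._sem.release()


pool = InterpreterSessionPool(limit=2)  # small limit just for demonstration
assert pool.acquire(blocking=False)
assert pool.acquire(blocking=False)
assert not pool.acquire(blocking=False)  # quota exhausted: queue or back off here
pool.release()
assert pool.acquire(blocking=False)  # a freed slot is immediately reusable
```

`BoundedSemaphore` (rather than a plain `Semaphore`) also catches bookkeeping bugs: releasing more slots than were acquired raises instead of silently inflating the quota.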

Amazon Bedrock AgentCore provides a standardized, composable foundation for deploying AI agents with enterprise requirements in mind. Its microVM-based Runtime, flexible memory architecture, integrated code execution, browser automation, tool gateway, enterprise identity, and observability combine to address long-standing obstacles to production agent adoption. Unlike legacy agent platforms that require piecemeal solutions and significant custom engineering, AgentCore delivers each operational primitive as a managed service with well-defined limits and integration points. As enterprise teams move beyond experimentation, AgentCore offers a clear, technically grounded path to deploying agentic workloads at scale, with a focus on security, auditability, and operational clarity.

A defining advantage of Amazon Bedrock AgentCore is its truly serverless architecture. Every primitive—whether Runtime, memory, interpreter, or browser—is delivered as a managed on-demand service. This removes the complexity burden that traditionally falls on development and operations teams, such as provisioning infrastructure, managing containers, or tuning resource allocation. When agents are idle, AgentCore automatically scales their environments down to zero, ensuring that organizations only consume resources (and incur costs) when actual work is being performed. This ability to scale seamlessly up or down—the cold-start latency noted earlier notwithstanding—makes AgentCore accessible and efficient for teams of any size, from small product teams running targeted pilots to large enterprises orchestrating thousands of concurrent agent sessions. The result is a platform where out-of-the-box manageability becomes a reality, freeing technical leaders to focus on agent design and business logic rather than undifferentiated infrastructure.