OpenAI Upgrades Its Agents SDK With Sandboxing and a New Model Harness

Building enterprise-grade AI agents just got a little less risky. OpenAI has released a significant update to its Agents SDK, adding two capabilities that development teams have been waiting for: native sandboxing and an in-distribution model harness. Together, these additions push the SDK from a promising framework into something closer to a production-ready platform.

From Swarm to Something Serious

If you’ve been following OpenAI’s agent development story, the Agents SDK has come a long way in a short time. It launched in early 2025 as the production-ready successor to Swarm, an experimental, lightweight framework for exploring multi-agent patterns. The SDK brought structure to what Swarm only suggested — formalizing four core primitives: Agents, Handoffs, Guardrails, and Tracing.

That foundation gave developers a working starting point. But building agents that can run complex, multi-step tasks over extended periods — what the industry calls long-horizon agents — required developers to fill in many gaps themselves. That’s what this update begins to address.

Sandboxing: Running Agents Without the Risk

The most significant addition is sandboxing. Agents running without guardrails in a production environment carry real risk. They’re capable, but not always predictable. A sandboxed agent operates inside a controlled, siloed workspace — it can access the files and code it needs for a specific operation, but it can’t wander into parts of the system it shouldn’t touch.

For enterprise teams that have been cautious about deploying agents in sensitive environments, that isolation matters: it protects the system's overall integrity without cutting the agent off from the resources a task genuinely requires.
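The containment idea can be sketched in plain Python. To be clear, the class and method names below are illustrative assumptions for explanation, not the SDK's actual API: every path the agent asks for is fully resolved first, then checked against the workspace root.

```python
from pathlib import Path

class SandboxViolation(Exception):
    """Raised when a requested path escapes the workspace."""

class Sandbox:
    """Confines file access to one workspace directory (illustrative only)."""

    def __init__(self, root: str):
        self.root = Path(root).resolve()

    def resolve(self, candidate: str) -> Path:
        # Resolve symlinks and ".." segments *before* the containment check,
        # so a path like "../../etc/passwd" can't slip through.
        target = (self.root / candidate).resolve()
        if target != self.root and self.root not in target.parents:
            raise SandboxViolation(f"{candidate!r} escapes the workspace")
        return target

workspace = Sandbox("/tmp/agent-workspace")
workspace.resolve("notes/plan.md")        # fine: stays inside the workspace
# workspace.resolve("../../etc/passwd")   # raises SandboxViolation
```

Real sandbox providers enforce boundaries at the process or container level rather than in application code, but the contract is the same: requests inside the workspace succeed, everything else is refused.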

Karan Sharma from OpenAI’s product team put it plainly. He described the launch as being about making the existing Agents SDK compatible with sandbox providers, so teams can build long-horizon agents using OpenAI’s harness with whatever infrastructure they already have.

The Harness: Getting Agents to Work Like Codex

The second major addition is an in-distribution harness for frontier models. In agent development, the “harness” refers to everything around the model itself — the instructions, tools, approvals, tracing, handoffs, and state management that make an agent functional in a real environment. Without a strong harness, even the best models underperform.

The evolved Agents SDK provides the model with a harness that includes instructions, tools, approvals, tracing, handoffs, and resume bookkeeping, enabling the kind of behavior seen in Codex-style agents. That last part is telling. OpenAI is essentially bringing the same scaffolding that powers Codex into the broader Agents SDK, which signals where the company sees all of this heading.
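To make the list of harness components concrete, here is a toy sketch of what such a bundle might look like; every name and the approval policy are hypothetical, invented for illustration rather than taken from the SDK:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Harness:
    """Toy harness: the scaffolding around a model, not the model itself."""
    instructions: str                       # system-level guidance for the agent
    tools: dict[str, Callable[[str], str]]  # callable tools the agent may invoke
    approve: Callable[[str], bool]          # approval gate for sensitive actions
    trace: list[str] = field(default_factory=list)  # audit trail of each step

    def call_tool(self, name: str, arg: str) -> str:
        # Every tool call passes through the approval gate and is traced.
        if not self.approve(f"{name}({arg})"):
            self.trace.append(f"denied {name}")
            return "action denied"
        result = self.tools[name](arg)
        self.trace.append(f"ran {name} -> {result}")
        return result

harness = Harness(
    instructions="You are a deployment agent.",
    tools={"echo": lambda s: s.upper()},
    approve=lambda action: "rm" not in action,  # toy policy, not a real one
)
print(harness.call_tool("echo", "ship it"))  # SHIP IT
```

The point of the sketch is the division of labor: the model proposes actions, while the harness decides what is allowed, records what happened, and keeps enough state to resume later.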

The harness is launching in Python first, with TypeScript support planned for a later release.

Provider-Agnostic From the Start

One thing worth noting: the SDK isn't locked to OpenAI's own models. While optimized for them, it works with more than 100 other LLMs through the Chat Completions API, a design choice that avoids vendor lock-in while keeping the development experience straightforward. That kind of flexibility matters to enterprise teams evaluating long-term infrastructure decisions.
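That portability rests on the shared Chat Completions wire format: the same request body works against any compatible endpoint, so swapping providers mostly means changing a base URL and a model name. A minimal sketch, with placeholder endpoints and model names rather than a real provider list:

```python
import json

# Illustrative provider table; the URLs and model names are placeholders.
PROVIDERS = {
    "openai": {"base_url": "https://api.openai.com/v1", "model": "gpt-4.1"},
    "other":  {"base_url": "https://llm.example.com/v1", "model": "some-model"},
}

def build_chat_request(provider: str, user_message: str) -> tuple[str, bytes]:
    """Build the URL and JSON body for a Chat Completions-style request."""
    cfg = PROVIDERS[provider]
    url = f"{cfg['base_url']}/chat/completions"
    body = json.dumps({
        "model": cfg["model"],
        "messages": [{"role": "user", "content": user_message}],
    }).encode()
    return url, body
```

Because the body is identical across providers, only the configuration layer changes when infrastructure decisions do, which is exactly the kind of optionality enterprise teams look for.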

The Assistants API Clock Is Ticking

This update also comes with a reminder that isn’t getting enough attention. OpenAI plans to formally deprecate the Assistants API, with a target sunset date in mid-2026, and will provide a migration guide to help developers move their applications to the Responses API. Teams still running workloads on the Assistants API should treat this update as a signal to start planning their migration now, not later.

What This Means for Enterprise DevOps Teams

The Agents SDK update reflects something broader happening across the industry. Enterprises aren’t just experimenting with agents anymore — they’re trying to deploy them at scale, in real environments, with real consequences if something goes wrong.

“Sandboxing enforces execution boundaries; the harness anchors orchestration with approvals, tracing, handoffs, and resume bookkeeping. OpenAI now competes directly with vendors building standalone agent control planes,” according to Mitch Ashley, VP and practice lead for software lifecycle engineering at The Futurum Group.

“Platform selection criteria shift accordingly. Enterprise teams must weigh execution and governance primitives alongside model quality. Teams on the Assistants API cannot defer migration planning given the mid-2026 sunset, and vendors without comparable primitives face procurement pressure.”

The combination of sandboxing and a structured harness doesn’t solve every challenge, but it removes two significant barriers that have made enterprise teams hesitant. Agents can now operate in controlled environments with the kind of scaffolding that supports tracing, approvals, and resumability — features that are table stakes for production software.

OpenAI has said it will continue expanding the SDK. But even in its current form, this update makes a clear statement: the era of AI agents as side projects is ending. The build-for-production era is here.
