model releaseOpenAI

OpenAI launches GPT-5.4 with native computer use capabilities for autonomous agents

TL;DR

OpenAI has launched GPT-5.4, its latest model with native computer use capabilities that allow it to operate computers and complete tasks across applications. The release represents a step toward autonomous AI agents that can handle complex jobs independently. The model includes advancements in reasoning, coding, and professional work with spreadsheets, documents, and presentations.

March 5, 2026 · 6:06 PM1 min read

OpenAI Launches GPT-5.4 With Native Computer Use Capabilities

OpenAI has released GPT-5.4, its latest model featuring native computer use capabilities—a significant development toward the autonomous agent systems that AI companies are pursuing. The model can operate computers on behalf of users and complete tasks across different applications without human intervention.

Key Capabilities

GPT-5.4 combines improvements in three core areas:

Reasoning: Enhanced logical problem-solving and complex task analysis
Coding: Improved code generation and technical implementation
Professional work: Native support for spreadsheets, documents, and presentations

The native computer use capability is the defining feature, enabling GPT-5.4 to interact with software interfaces directly. This represents a departure from previous models that required structured APIs or human intermediaries.

The Agentic AI Push

GPT-5.4 arrives amid an industry-wide shift toward agentic AI systems. OpenAI previously introduced ChatGPT Agent and has been building toward a future where networks of AI-powered agents operate autonomously in the background to complete complex online tasks and software operations.

Competitors have launched similar capabilities. Anthropic released Claude Opus 4.5 with agentic features, and Microsoft integrated AI agents into Windows 11, signaling that autonomous agent development has become a priority across the sector.

What This Means

GPT-5.4's computer use capability represents a practical step toward agents that can execute real-world tasks without human guidance. The focus on reasoning and coding suggests OpenAI is addressing the technical requirements for autonomous systems to handle complex, multi-step workflows. However, the actual performance, reliability, and safety of computer use across diverse applications remain unverified by independent benchmarks. OpenAI's specific context window size, pricing, and detailed benchmark scores for GPT-5.4 have not been disclosed.

The timeline for widespread deployment and whether computer use will be available to all users or limited to certain tiers requires clarification from OpenAI.

Source: theverge.com ↗

gpt-5-4 openai computer-use autonomous-agents model-release agentic-ai

model releaseJune 21, 2026

Poolside releases Laguna M.1: 225B parameter MoE model scores 74.6% on SWE-bench Verified

Poolside has released Laguna M.1, a 225B total parameter Mixture-of-Experts model with 23B activated parameters per token, designed for agentic coding tasks. The model scores 74.6% on SWE-bench Verified and 63.1% on SWE-bench Multilingual, released under Apache 2.0 license.

model releaseJune 18, 2026

Mistral Releases Voxtral TTS: 4B Parameter Text-to-Speech Model at $0.016 per 1k Characters

Mistral AI has released Voxtral TTS, a 4B parameter text-to-speech model supporting 9 languages including English, French, German, Spanish, Dutch, Portuguese, Italian, Hindi, and Arabic. The model achieves 70ms latency for typical inputs and can clone voices from as little as 3 seconds of audio, priced at $0.016 per 1,000 characters.

model releaseJune 18, 2026

Mistral Releases Mistral 3 Family: 675B-Parameter Large 3 MoE and Three Edge Models Under Apache 2.0

Mistral has released Mistral 3, including Mistral Large 3—a sparse mixture-of-experts model with 41B active and 675B total parameters—and three Ministral 3 edge models (3B, 8B, 14B). All models are released under Apache 2.0 license with multimodal capabilities and are available today on multiple platforms.

model releaseJune 18, 2026

Google releases Gemini 3.1 Flash Image, claims Pro-level quality at $0.50 per 1M tokens

Google has released Gemini 3.1 Flash Image, internally codenamed "Nano Banana 2," an image generation and editing model with a 131K context window. The model is priced at $0.50 per 1M input tokens and $3 per 1M output tokens.

OpenAI launches GPT-5.4 with native computer use capabilities for autonomous agents

OpenAI Launches GPT-5.4 With Native Computer Use Capabilities

Key Capabilities

The Agentic AI Push

What This Means

Related Articles

Poolside releases Laguna M.1: 225B parameter MoE model scores 74.6% on SWE-bench Verified

Mistral Releases Voxtral TTS: 4B Parameter Text-to-Speech Model at $0.016 per 1k Characters

Mistral Releases Mistral 3 Family: 675B-Parameter Large 3 MoE and Three Edge Models Under Apache 2.0

Google releases Gemini 3.1 Flash Image, claims Pro-level quality at $0.50 per 1M tokens

Comments