model release

26 articles tagged with model release

May 22, 2026
model releaseNVIDIA

NVIDIA releases Nemotron-Labs-Diffusion-14B with tri-mode decoding achieving 3.3x speed-up on GB200

NVIDIA released Nemotron-Labs-Diffusion-14B, a 14-billion parameter language model that supports three decoding modes by switching attention patterns during inference. The model achieves 850 tokens per second on GB200 hardware at concurrency 1, representing a 3.3x speed-up over standard autoregressive decoding and outperforming Qwen3-8B-Eagle3 by 2.2x in self-speculation mode.

May 21, 2026
model release

Alibaba Releases Qwen3.7 Max with 1M Token Context Window for Agent and Coding Tasks

Alibaba has released Qwen3.7 Max, the flagship model in its Qwen3.7 series, featuring a 1 million token context window. The text-only model is designed for agent-centric workloads with strengths in coding, office productivity, and long-horizon autonomous execution, and includes explicit prompt caching support.

May 20, 2026
model releasexAI

xAI Launches Grok Build 0.1: Coding Model with 256K Context for Agentic Workflows

xAI has released Grok Build 0.1, a coding-specialized model with a 256K context window and unlimited text output. The model is designed for agentic software engineering workflows and powers xAI's Grok Build CLI tool.

April 30, 2026
model releasexAI

xAI releases Grok 4.3 reasoning model with 1M token context at $1.25/M input tokens

xAI has released Grok 4.3, a reasoning model with a 1 million token context window and no output token limit. The model accepts text and image inputs, has always-on reasoning that cannot be disabled, and uses tiered pricing starting at $1.25 per million input tokens and $2.50 per million output tokens.

April 29, 2026
model releaseOpenAI

OpenAI releases GPT-5.5 with 82.7% Terminal-Bench score, API priced at $5/$30 per million tokens

OpenAI released GPT-5.5 on April 23, its first retrained base model since GPT-4.5, scoring 82.7% on Terminal-Bench 2.0 versus GPT-5.4's 75.1% and Claude Opus 4.7's 69.4%. API pricing is set at $5 per million input tokens and $30 per million output tokens, exactly double GPT-5.4 rates.

model releaseNVIDIA+1

NVIDIA Releases Nemotron 3 Nano Omni: 31B-Parameter Multimodal Model with 256K Context and Reasoning Mode

NVIDIA has released Nemotron 3 Nano Omni 30B-A3B, a multimodal large language model with 31 billion parameters using a Mamba2-Transformer hybrid Mixture of Experts architecture. The model supports video, audio, image, and text inputs with a 256K token context window and includes a dedicated reasoning mode with chain-of-thought capabilities.

April 27, 2026
analysis

Qwen releases three new Qwen3.6 models ranging from 27B to flagship Max Preview

Qwen has released three models in its Qwen3.6 series: a flagship Max Preview model, a 35B parameter A3B variant, and a 27B parameter base model. All three models are now accessible through OpenRouter's API platform.

April 24, 2026
model releaseOpenAI

OpenAI Releases GPT-5.5 Pro with 1M+ Token Context Window, $30 Per Million Input Tokens

OpenAI has released GPT-5.5 Pro, a high-capability model with a 1,050,000 token context window (922K input, 128K output) priced at $30 per million input tokens and $180 per million output tokens. The model supports text and image inputs and is optimized for deep reasoning, agentic coding, and multi-step workflows.

model releaseDeepSeek

DeepSeek V4 Flash Released: 284B Parameter MoE Model with 1M Context Window at $0.14 per Million Tokens

DeepSeek has released V4 Flash, a Mixture-of-Experts model with 284B total parameters and 13B activated parameters per request. The model supports a 1,048,576-token context window and is priced at $0.14 per million input tokens and $0.28 per million output tokens.

analysis

DeepSeek Releases V4-Flash and V4-Pro Models as Tencent Ships Hy3-Preview

DeepSeek has released two new models in its V4 series: DeepSeek-V4-Flash and DeepSeek-V4-Pro, both now available on Hugging Face. Separately, Tencent has shipped Hy3-Preview, marking simultaneous releases from two major Chinese AI labs.

April 23, 2026
model releaseOpenAI+1

OpenAI releases GPT-5.5 with 400K context window, higher pricing than GPT-5.4

OpenAI released GPT-5.5 on April 23, 2026, seven weeks after GPT-5.4. The model features a 400K context window in Codex and claims improvements in multi-step tasks, agentic coding, and computer use, though at higher pricing than its predecessor.

model releaseOpenAI

OpenAI releases GPT-5.5 with improved reasoning and agentic capabilities

OpenAI released GPT-5.5 on April 23, 2026, positioning it as a step toward agentic computing and a unified 'superapp' combining ChatGPT, Codex, and browser capabilities. The company claims the model outperforms GPT-5.4, Google's Gemini 3.1 Pro, and Anthropic's Claude Opus 4.5 across multiple benchmarks.

model releaseOpenAI

OpenAI releases GPT-5.5 with improved coding efficiency, fewer tokens needed

OpenAI has released GPT-5.5, one month after GPT-5.4, claiming improved performance on coding tasks and reduced token usage in its Codex platform. The model rolls out April 24 to paid ChatGPT tiers.

model releaseOpenAI

OpenAI releases GPT-5.5 with improved coding and computer control capabilities

OpenAI released GPT-5.5, its latest AI model with enhanced coding, computer operation, and research capabilities. The model is rolling out to paid subscribers in ChatGPT and Codex, with API access coming soon.

model releaseOpenAI+1

OpenAI teases GPT-5.5 in cryptic Base64-encoded message 'NS41'

OpenAI posted the cryptic message 'NS41' on X, which decodes to '5.5' through Base64 encoding or mathematical interpretation. The teaser suggests an imminent announcement for GPT-5.5, though no release date, pricing, or technical specifications have been disclosed.

April 22, 2026
analysis

Qwen 3.6 27B Released With FP8 Quantization, OpenAI Deploys Privacy Filter Model

Alibaba Cloud released Qwen 3.6 27B, a 27-billion parameter language model, alongside an FP8 quantized version for deployment efficiency. Separately, OpenAI published a privacy filter model on Hugging Face, marking a rare public model release from the company.

model release

Alibaba releases Qwen3.6-27B with 262K context window, scores 53.5% on SWE-bench Pro

Alibaba has released Qwen3.6-27B, a 27-billion parameter language model with a native 262,144 token context window (extensible to 1,010,000 tokens). The model achieves 53.5% on SWE-bench Pro and 77.2% on SWE-bench Verified, with FP8 quantization providing near-identical performance to the full-precision version.

model releaseArcee Ai

Arcee AI Releases Trinity Large Preview: 400B-Parameter MoE Model with 512K Context Window

Arcee AI has released Trinity Large Preview, a 400B-parameter sparse Mixture-of-Experts model with 13B active parameters per token using 4-of-256 expert routing. The model supports context windows up to 512K tokens and is available with open weights under permissive licensing.

model releaseXiaomi

Xiaomi Launches MiMo-V2.5-Pro with 1M Context Window for Complex Agentic Tasks

Xiaomi released MiMo-V2.5-Pro on April 22, 2026, its flagship model featuring a 1,048,576 token context window and pricing at $1 per million input tokens and $3 per million output tokens. According to Xiaomi, the model ranks highly on ClawEval, GDPVal, and SWE-bench Pro benchmarks, designed for autonomous completion of professional tasks requiring thousands of tool calls.

April 21, 2026
model releaseOpenAI

OpenAI Releases GPT-5.4 Image 2 with 272K Context Window and Image Generation

OpenAI has released GPT-5.4 Image 2, combining the GPT-5.4 reasoning model with image generation capabilities. The multimodal model features a 272K token context window and is priced at $8 per million input tokens and $15 per million output tokens.

model release

InclusionAI releases Ling-2.6-flash: 104B parameter model with 7.4B active parameters, free on OpenRouter

InclusionAI has released Ling-2.6-flash, an instruction-tuned model with 104 billion total parameters and 7.4 billion active parameters, available free through OpenRouter. The model features a 262,144-token context window and is designed for agent workflows requiring fast responses and high token efficiency.

April 16, 2026
model releaseAnthropic

Anthropic releases Claude Opus 4.7 with improved coding and vision, confirms it trails unreleased Mythos model

Anthropic released Claude Opus 4.7 with improved coding capabilities, higher-resolution vision, and a new reasoning level. The company publicly acknowledged the model underperforms its unreleased Mythos system, which remains restricted due to safety concerns.

April 15, 2026
model releaseOpenAI

OpenAI releases GPT-5.4-Cyber, a cybersecurity-focused model limited to verified security professionals

OpenAI has released GPT-5.4-Cyber, a fine-tuned variant of GPT-5.4 built for defensive cybersecurity work including binary reverse engineering. Access is initially restricted to a few hundred verified security professionals, with expansion planned to thousands of individuals and hundreds of teams in coming weeks.

April 14, 2026
model releaseOpenAI

OpenAI releases GPT-5.4-Cyber with tiered access verification system for cybersecurity work

OpenAI released GPT-5.4-Cyber, a model variant designed for defensive cybersecurity tasks with fewer restrictions on dual-use queries. Access is controlled through a tiered verification system in the Trusted Access for Cyber program, targeting thousands of vetted users compared to Anthropic's 40-organization Mythos Preview rollout.

April 10, 2026
model releaseAnthropic+1

Powell, Bessent convene bank CEOs over Anthropic's Mythos AI cyber threat

Federal Reserve Chairman Jerome Powell and Treasury Secretary Scott Bessent met with major U.S. bank CEOs this week to assess cyber risks from Anthropic's newly released Mythos model. Anthropic deployed Mythos in limited capacity due to concerns that hackers could exploit its capabilities.

April 8, 2026
model release

Meta launches proprietary Muse Spark model, abandoning open-source commitment

Meta has released Muse Spark, a proprietary AI model with restricted access via API and portal invite only—a striking reversal from CEO Mark Zuckerberg's 2024 manifesto championing open-source AI. The model claims performance matching top competitors from OpenAI, Anthropic, and Google, trained with an order of magnitude less compute than Llama 4.