coding

36 articles tagged with coding

July 16, 2026

model release

Google delays Gemini 3.5 Pro release after disappointing coding performance in June training update

Google has delayed the release of Gemini 3.5 Pro past its June deadline due to coding performance issues. The company retrained the model in late June with new data but saw disappointing results, according to Bloomberg. An upgraded Flash model is now in testing with partners.

July 16, 2026 · 7:50 PM

model release · Moonshot AI

Moonshot AI Releases Kimi K3: Open-Weight Multimodal Reasoning Model with 1M Context Window

Moonshot AI has released Kimi K3, an open-weight multimodal reasoning model with a 1-million token context window. The model is priced at $3 per 1M input tokens and $15 per 1M output tokens, available through OpenRouter.

July 16, 2026 · 3:35 PM

July 9, 2026

model release · OpenAI

OpenAI Releases GPT-5.6 Terra: Mid-Tier Model at $2.50 Input/$15 Output per 1M Tokens

OpenAI has released GPT-5.6 Terra, a mid-tier model in its GPT-5.6 series priced at $2.50 per million input tokens and $15 per million output tokens. The model features a 1 million token context window and a February 2026 knowledge cutoff, and is positioned between the flagship Sol and cost-efficient Luna tiers.

July 9, 2026 · 5:20 PM
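
To put the quoted list prices in per-request terms, here is a minimal sketch. The `request_cost` helper and the 200,000-input / 4,000-output token workload are illustrative assumptions, not figures from the article, and real bills may differ with caching or tiered pricing.

```python
def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Return the USD cost of one request at per-million-token list prices."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# GPT-5.6 Terra list prices quoted in the summary above (USD per 1M tokens).
TERRA_INPUT_PRICE = 2.50
TERRA_OUTPUT_PRICE = 15.00

# Hypothetical workload: a long prompt with a short answer.
cost = request_cost(200_000, 4_000, TERRA_INPUT_PRICE, TERRA_OUTPUT_PRICE)
print(f"${cost:.2f}")
```

With this assumed workload the input side dominates the bill, even though the output price per token is six times higher.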

July 6, 2026

model release · Nex AGI

Nex AGI releases Nex-N2-Mini: open-source agentic MoE model with 262K context window

Nex AGI has released Nex-N2-Mini, an open-source agentic mixture-of-experts model with a 262K-token context window. The model accepts text and image inputs and is priced at $0.025 per 1M input tokens and $0.10 per 1M output tokens.

July 6, 2026 · 7:20 PM

June 30, 2026

model release · Anthropic

Anthropic releases Claude Sonnet 5 at $2/1M input tokens, 63.2% agentic coding benchmark

Anthropic has released Claude Sonnet 5, its new mid-tier model optimized for agentic tasks, priced at $2 per million input tokens through August 31 before rising to $3/1M. The model scores 63.2% on agentic coding benchmarks, approaching Opus 4.8's 69.2% performance at a significantly lower price point.

June 30, 2026 · 6:06 PM

June 22, 2026

model release

Z.ai's GLM-5.2 Matches Claude Opus 4.8 in Agent Tasks, First Open Model to Compete in Coding

Z.ai released GLM-5.2 on June 16, 2026, the first open-weight model to match proprietary models like Claude Opus 4.8 on agent benchmarks. The MIT-licensed model narrows the gap with frontier labs to 6.8 months, down from the 9+ months expected as compute scales.

June 22, 2026 · 3:06 PM

June 18, 2026

model release · Zhipu AI · +1

Zhipu AI releases GLM-5.2 with 1M token context and 62.1% SWE-bench Pro score

Zhipu AI released GLM-5.2, a 753 billion parameter model with a 1 million token context window. The model scores 62.1% on SWE-bench Pro and introduces the IndexShare architecture, which reduces per-token FLOPs by 2.9× at 1M context length. It is released under the MIT license with no regional restrictions.

June 18, 2026 · 8:06 AM

June 17, 2026

model release

Z.AI releases GLM-5.2 with 1M token context, outperforms GPT-5.5 on long-horizon coding benchmarks

Z.AI has released GLM-5.2, an open-source model with a 1M-token context window under an MIT license. On FrontierSWE, a long-horizon coding benchmark, GLM-5.2 trails Claude Opus 4.8 by 1% while outperforming GPT-5.5 by 1%, and achieves 81.0 on Terminal-Bench 2.1 compared to Opus 4.8's 85.0.

June 17, 2026 · 9:21 AM

June 16, 2026

model release

GLM-5.2 Released with 1M Token Context and 753B Parameters Under MIT License

Zhipu AI has released GLM-5.2, a 753 billion parameter model featuring a 1 million token context window and MIT open-source license. The model scores 62.1% on SWE-bench Pro and 91.2% on GPQA-Diamond, with flexible reasoning effort levels for coding tasks.

June 16, 2026 · 6:21 PM

June 12, 2026

benchmark

Gemini 3.5 Flash ranks 6th in Android coding benchmark at 3x cost of Gemini 3.1 Pro

Google's latest Android Bench results show Gemini 3.5 Flash ranking 6th with a 63.7% success rate, despite averaging $147.10 per benchmark run compared to Gemini 3.1 Pro Preview's $47.90. The newer model used 355.9 tokens per run versus 73.3 for its predecessor, while GPT 5.5 leads the benchmark at 74% success rate.

June 12, 2026 · 3:20 PM
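
The "3x cost" in the headline can be checked against the per-run figures in the summary. The sketch below uses only the quoted numbers. The summary does not state the unit of the token counts, so the token ratio is meaningful only if both figures share the same unit.

```python
# Figures quoted in the Android Bench summary above.
FLASH_COST_USD = 147.10  # Gemini 3.5 Flash, average per benchmark run
PRO_COST_USD = 47.90     # Gemini 3.1 Pro Preview, average per benchmark run
FLASH_TOKENS = 355.9     # token figure as quoted (unit not stated in the summary)
PRO_TOKENS = 73.3

cost_ratio = FLASH_COST_USD / PRO_COST_USD
token_ratio = FLASH_TOKENS / PRO_TOKENS

print(f"cost ratio:  {cost_ratio:.2f}x")
print(f"token ratio: {token_ratio:.2f}x")
```

The cost ratio comes out slightly above 3, in line with the headline. The token ratio is close to 5, which would imply a lower cost per token for the newer model despite the higher cost per run.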

June 2, 2026

model release · Microsoft

Microsoft launches MAI-Code-1 and MAI-Thinking-1 models to reduce OpenAI dependence

Microsoft announced two proprietary AI models at its Build developer conference: MAI-Code-1 for code generation and MAI-Thinking-1 for reasoning tasks. The models are designed to run on Azure infrastructure, allowing Microsoft to reduce costs from its $13 billion OpenAI investment while competing directly with Anthropic and Google.

June 2, 2026 · 6:50 PM

May 28, 2026

model release · Anthropic

Anthropic releases Claude Opus 4.8 with 69.2% agentic coding score, 2.5x faster performance

Anthropic released Claude Opus 4.8 on May 28, 2026, six weeks after version 4.7. The model achieves 69.2% on agentic coding benchmarks (up from 64.3%) and runs 2.5 times faster in fast mode at one-third the cost, while keeping the same pricing as version 4.7.

May 28, 2026 · 5:21 PM

May 21, 2026

model release

Alibaba Releases Qwen3.7 Max with 1M Token Context Window for Agent and Coding Tasks

Alibaba has released Qwen3.7 Max, the flagship model in its Qwen3.7 series, featuring a 1 million token context window. The text-only model is designed for agent-centric workloads with strengths in coding, office productivity, and long-horizon autonomous execution, and includes explicit prompt caching support.

May 21, 2026 · 3:50 PM

May 19, 2026

model release

Google releases Gemini 3.5 Flash with autonomous coding and agent capabilities, claims 4x speed boost

Google released Gemini 3.5 Flash, positioning it as an agent-first model designed for autonomous coding and multi-hour workflows. The company claims the model outperforms its 3.1 Pro predecessor on coding and agentic benchmarks while running 4x faster than competing frontier models, with an optimized version achieving 12x speed gains.

May 19, 2026 · 6:07 PM

model release

Google Releases Gemini 3.5 Flash with 1M Token Context and Configurable Thinking Modes at $1.50/$9 Per Million Tokens

Google has released Gemini 3.5 Flash, a multimodal model with a 1 million token context window priced at $1.50 per million input tokens and $9 per million output tokens. The model supports text, image, video, audio, and PDF inputs with configurable thinking effort levels from minimal to high.

May 19, 2026 · 6:06 PM

May 7, 2026

model release

Zyphra Releases ZAYA1-8B: 8.4B Parameter MoE Model with 760M Active Parameters Matches 80B+ Models on Math Benchmarks

Zyphra has released ZAYA1-8B, a mixture-of-experts language model with 760M active parameters and 8.4B total parameters. The model scores 89.1% on AIME 2026, competitive with models exceeding 100B parameters, while maintaining efficiency for on-device deployment.

May 7, 2026 · 12:35 AM

April 29, 2026

model release · Mistral AI

Mistral Releases Medium 3.5: 128B Dense Model With 256k Context and Configurable Reasoning

Mistral AI released Mistral Medium 3.5, a 128B parameter dense model with a 256k context window that unifies instruction-following, reasoning, and coding capabilities. The model features configurable reasoning effort per request and a vision encoder trained from scratch for variable image sizes.

April 29, 2026 · 7:21 PM

April 27, 2026

model release

Alibaba's Qwen Team Releases Qwen3.6 27B With 262K Context Window and Video Processing

Alibaba's Qwen Team has released Qwen3.6 27B, a 27-billion parameter multimodal language model with a 262,144-token context window. The model accepts text, image, and video inputs and includes a built-in thinking mode for extended reasoning, with pricing at $0.195 per million input tokens and $1.56 per million output tokens.

April 27, 2026 · 3:05 AM

April 25, 2026

changelog · OpenAI

OpenAI discontinues separate Codex line, merges coding capabilities into GPT-5.5

OpenAI will not release a separate GPT-5.5-Codex model, according to Romain Huet. The company unified its Codex coding model with the main GPT line starting with GPT-5.4, with GPT-5.5 featuring enhanced agentic coding and computer use capabilities.

April 25, 2026 · 12:20 PM

April 24, 2026

model release · DeepSeek

DeepSeek releases V4 preview, claims parity with GPT-4o and Claude 3.5 Sonnet

DeepSeek released a preview of its V4 model on April 24, 2026, claiming the open-source system matches leading closed-source models from Anthropic, Google, and OpenAI. The company emphasized improved coding capabilities and compatibility with domestic Huawei chips, but did not disclose training costs or hardware specifications.

April 24, 2026 · 10:05 AM

model release · DeepSeek

DeepSeek V4 Flash Released: 284B Parameter MoE Model with 1M Context Window at $0.14 per Million Tokens

DeepSeek has released V4 Flash, a Mixture-of-Experts model with 284B total parameters and 13B activated parameters per token. The model supports a 1,048,576-token context window and is priced at $0.14 per million input tokens and $0.28 per million output tokens.

April 24, 2026 · 4:21 AM

April 23, 2026

model release · Tencent

Tencent Releases Hy3-Preview: 295B-Parameter MoE Model with 21B Active Parameters

Tencent has released Hy3-preview, a 295-billion-parameter Mixture-of-Experts model with 21 billion active parameters and a 256K context window. The model scores 76.28% on MATH and 34.86% on LiveCodeBench-v6, with particularly strong performance on coding agent tasks.

April 23, 2026 · 8:51 PM

model release · OpenAI

OpenAI GPT-5.5 Powers Codex Coding Agent on NVIDIA GB200 Infrastructure

OpenAI has released GPT-5.5, its latest frontier model, according to NVIDIA. The model powers Codex, OpenAI's agentic coding application, running on NVIDIA GB200 NVL72 rack-scale systems.

April 23, 2026 · 7:05 PM

model release · OpenAI

OpenAI releases GPT-5.5 with improved coding efficiency, fewer tokens needed

OpenAI has released GPT-5.5, one month after GPT-5.4, claiming improved performance on coding tasks and reduced token usage in its Codex platform. The model rolls out April 24 to paid ChatGPT tiers.

April 23, 2026 · 6:21 PM

model release · OpenAI

OpenAI releases GPT-5.5 with improved coding and computer control capabilities

OpenAI released GPT-5.5, its latest AI model with enhanced coding, computer operation, and research capabilities. The model is rolling out to paid subscribers in ChatGPT and Codex, with API access coming soon.

April 23, 2026 · 6:20 PM

April 22, 2026

model release

Alibaba releases Qwen3.6-27B with 262K context window, scores 53.5% on SWE-bench Pro

Alibaba has released Qwen3.6-27B, a 27-billion parameter language model with a native 262,144 token context window (extensible to 1,010,000 tokens). The model achieves 53.5% on SWE-bench Pro and 77.2% on SWE-bench Verified, with FP8 quantization providing near-identical performance to the full-precision version.

April 22, 2026 · 10:36 PM

model release

Alibaba Qwen Releases 27B Parameter Model That Claims to Match 397B Performance on Coding Tasks

Alibaba Qwen released Qwen3.6-27B, a 27B-parameter dense model for which it claims flagship-level coding performance, surpassing the team's previous 397B MoE model across major coding benchmarks. The full model is 55.6GB compared to 807GB for the predecessor.

April 22, 2026 · 5:06 PM
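
The two file sizes can be related to the parameter counts with a quick calculation. This is a rough sketch that assumes 1 GB = 10^9 bytes and treats the nominal "27B" and "397B" labels as exact parameter counts. Neither assumption is stated in the summary.

```python
def bytes_per_parameter(size_gb: float, params_billions: float) -> float:
    """Approximate storage per parameter, assuming 1 GB = 1e9 bytes."""
    return (size_gb * 1e9) / (params_billions * 1e9)


# Sizes and nominal parameter counts quoted in the summary above.
qwen_27b = bytes_per_parameter(55.6, 27)
previous_moe = bytes_per_parameter(807, 397)
size_ratio = 807 / 55.6

print(f"Qwen3.6-27B:        {qwen_27b:.2f} bytes/param")
print(f"previous 397B MoE:  {previous_moe:.2f} bytes/param")
print(f"download size ratio: {size_ratio:.1f}x")
```

Both checkpoints work out to about 2 bytes per parameter, which is consistent with 16-bit weights. The size difference therefore tracks the parameter count rather than a change in precision.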

model release

Alibaba Qwen Releases 27B Parameter Model with 262K Context Window, Claims 77.2% on SWE-bench Verified

Alibaba Qwen released Qwen3.6-27B, a 27-billion parameter model with a 262,144 token context window extensible to 1,010,000 tokens. The model claims 77.2% on SWE-bench Verified and 53.5% on SWE-bench Pro, with open weights available on Hugging Face.

April 22, 2026 · 2:06 PM

April 16, 2026

model release · Anthropic

Anthropic releases Claude Opus 4.7 with improved coding and vision, confirms it trails unreleased Mythos model

Anthropic released Claude Opus 4.7 with improved coding capabilities, higher-resolution vision, and a new reasoning level. The company publicly acknowledged the model underperforms its unreleased Mythos system, which remains restricted due to safety concerns.

April 16, 2026 · 4:36 PM

model release · Anthropic

Anthropic ships Claude Opus 4.7 with improved coding reliability and multimodal capabilities

Anthropic has released Claude Opus 4.7, its latest generally available AI model focused on advanced software engineering. The model shows improvements in handling complex coding tasks with less supervision, enhanced vision capabilities, and better instruction following, while introducing a new tokenizer that produces 1.0-1.35× as many tokens as before, depending on content type.

April 16, 2026 · 3:06 PM

model release · +1

Alibaba Releases Qwen3.6-35B-A3B: 35B Parameter MoE Model with 262K Context Window

Alibaba has released Qwen3.6-35B-A3B, the first open-weight model in the Qwen3.6 series. The model has 35B total parameters with 3B activated, uses 256 experts with 8 activated per token, offers a native 262K context window extensible to 1.01M tokens, and achieves 73.4% on SWE-bench Verified.

April 16, 2026 · 2:21 PM
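
The sparsity figures in the summary can be turned into fractions with a few lines of arithmetic. The sketch below uses only the quoted numbers and treats the rounded "35B" and "3B" values as exact, which is an assumption.

```python
# Figures quoted in the Qwen3.6-35B-A3B summary above.
TOTAL_PARAMS_B = 35    # total parameters, in billions (nominal)
ACTIVE_PARAMS_B = 3    # parameters activated per token, in billions (nominal)
TOTAL_EXPERTS = 256
ACTIVE_EXPERTS = 8     # experts activated per token

param_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
expert_fraction = ACTIVE_EXPERTS / TOTAL_EXPERTS

print(f"active parameters per token: {param_fraction:.1%}")
print(f"active experts per token:    {expert_fraction:.3%}")
```

About 8.6% of parameters are active per token, against about 3.1% of experts. The parameter share is higher because components outside the routed experts, such as attention layers and embeddings, typically run for every token.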

April 9, 2026

model release · Zhipu AI

Zhipu AI's GLM-5.1 outperforms GPT-5.4 and Claude Opus 4.6 on SWE-Bench Pro through iterative strategy refinement

Zhipu AI has released GLM-5.1, a freely available open-weight model designed for long-running programming tasks that achieves 58.4% on SWE-Bench Pro, edging out GPT-5.4 (57.7%) and Claude Opus 4.6 (57.3%). The model's core capability is iterative strategy refinement—it rethinks its approach across hundreds of iterations and thousands of tool calls, recognizing dead ends and shifting tactics without human intervention. However, GLM-5.1 trails on reasoning and knowledge benchmarks, scoring 31% on Humanity's Last Exam compared to Gemini 3.1 Pro's 45%.

April 9, 2026 · 11:20 AM

April 8, 2026

model release

Alibaba's Qwen3.6 Plus reaches 78.8 on SWE-bench with 1M context window

Alibaba released Qwen3.6 Plus on April 2, 2026, featuring a 1 million token context window at $0.50 per million input tokens and $3 per million output tokens. The model combines linear attention with sparse mixture-of-experts routing to achieve a 78.8 score on SWE-bench Verified, with significant improvements in agentic coding, front-end development, and reasoning tasks.

April 8, 2026 · 1:05 AM

April 2, 2026

product update

Gemini 3.1 Pro launches in Augment Code at 2.6x cheaper than Claude Opus 4.6

Augment Code now offers Gemini 3.1 Pro alongside Claude Opus 4.6 and GPT-5.4. In head-to-head testing on structural refactoring tasks, Gemini matched or outperformed Opus while consuming 268 credits per task—46% cheaper than Opus's 488 credits—making it 2.6x more cost-effective per message in real-world usage.

April 2, 2026 · 5:35 PM
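
The per-task credit figures in the summary can be compared directly. This is a small sketch using only the two quoted numbers. It does not reproduce the per-message measurement, which the summary does not break down.

```python
# Credits per task quoted in the Augment Code summary above.
GEMINI_CREDITS = 268
OPUS_CREDITS = 488

savings = 1 - GEMINI_CREDITS / OPUS_CREDITS
ratio = OPUS_CREDITS / GEMINI_CREDITS

print(f"credits saved per task: {savings:.1%}")
print(f"Opus / Gemini credits:  {ratio:.2f}x")
```

The per-task numbers give a saving of roughly 45% and a ratio of roughly 1.8x. The 46% and 2.6x figures in the summary presumably rest on unrounded or per-message data that is not shown here.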

model release

Alibaba releases Qwen3.6-Plus with 1M token context, claims performance near Claude 4.5 Opus

Alibaba has released Qwen3.6-Plus, its third proprietary AI model in days, featuring a 1 million token context window and available via the Alibaba Cloud Model Studio API. The model claims improved agentic coding capabilities and partially outperforms Anthropic's Claude 4.5 Opus in Alibaba-conducted benchmarks, though it trails Claude 4.6 Opus, released in December 2025.

April 2, 2026 · 3:05 PM

March 17, 2026

model release · OpenAI

OpenAI releases GPT-5.4 mini and nano with 3-4x price increases but major performance gains

OpenAI has released GPT-5.4 mini and GPT-5.4 nano, compact models optimized for coding and subagent tasks. The new models deliver significant performance improvements—GPT-5.4 mini reaches 54.4% on SWE-Bench Pro versus 45.7% for GPT-5 mini—but cost 3-4x more per input token than their predecessors.

March 17, 2026 · 7:35 PM
