context-window

23 articles tagged with context-window

July 14, 2026

model releaseKwaipilot+1

Kwaipilot Releases KAT-Coder-Air V2.5 with 256K Context Window at $0.15/$0.60 Per Million Tokens

Kwaipilot has released KAT-Coder-Air V2.5, a coding-specialized model with a 256K token context window. The model is priced at $0.15 per million input tokens and $0.60 per million output tokens, positioning it as a mid-tier coding assistant option.

July 14, 2026 · 3:51 AM

July 9, 2026

model releaseOpenAI

OpenAI Releases GPT-5.6 Luna Pro with Extended Reasoning Mode at $1/$6 Per Million Tokens

OpenAI has released GPT-5.6 Luna Pro, a reasoning-enhanced variant of GPT-5.6 Luna with a 1 million token context window. The model is priced at $1 per million input tokens and $6 per million output tokens, with a knowledge cutoff date of February 2026.

July 9, 2026 · 5:51 PM

model releaseOpenAI

OpenAI Releases GPT-5.6 Terra Pro with Enhanced Reasoning Mode at $2.50/$15 Per Million Tokens

OpenAI has released GPT-5.6 Terra Pro, a variant of GPT-5.6 Terra configured with enhanced reasoning capabilities for complex tasks. The model features a 1 million token context window and is priced at $2.50 per million input tokens and $15 per million output tokens.

July 9, 2026 · 5:51 PM

model releaseOpenAI

OpenAI Releases GPT-5.6 Sol Pro with Extended Reasoning Mode at $5 Input/$30 Output per 1M Tokens

OpenAI has released GPT-5.6 Sol Pro, a reasoning-enhanced variant of GPT-5.6 Sol designed for complex tasks. The model features a 1 million token context window, February 2026 knowledge cutoff, and is priced at $5 per 1M input tokens and $30 per 1M output tokens.

July 9, 2026 · 5:50 PM

July 4, 2026

product update

Google AI Plus at $4.99/month and AI Pro at $19.99/month expand Gemini context windows to 128K and 1M tokens

Google has detailed pricing and features for its Gemini app subscription tiers. AI Plus costs $4.99/month and includes 128,000 token context windows, while AI Pro at $19.99/month provides 1 million token context windows. Free users are limited to 32,000 tokens.

July 4, 2026 · 8:50 PM

June 30, 2026

model releaseAnthropic

Claude Sonnet 5 ships with 1M token context and new tokenizer that increases costs 30-40% for English text

Anthropic released Claude Sonnet 5 with a 1 million token context window and 128,000 token maximum output. The model removes traditional sampling parameters and introduces a new tokenizer that generates approximately 30% more tokens than Sonnet 4.6 for the same English text—effectively a significant price increase despite unchanged nominal rates of $3/million input and $15/million output tokens.

June 30, 2026 · 9:36 PM

June 17, 2026

product updateGitHub

GitHub Copilot cuts token usage with improved context handling and model routing

GitHub has improved how Copilot handles context and routes requests to models, reducing token usage per session. The changes aim to make user credits last longer by eliminating wasted tokens.

June 17, 2026 · 7:50 PM

June 4, 2026

product updateOpenAI

OpenAI expands ChatGPT memory to free users, doubles storage capacity for paid tiers

OpenAI is rolling out an upgraded memory system for ChatGPT that synthesizes context more efficiently across conversations. The company reduced compute requirements by approximately 5x, enabling it to offer the memory feature to free users for the first time while doubling storage capacity for Plus and Pro subscribers.

June 4, 2026 · 4:51 PM

June 3, 2026

model release

Alibaba's Qwen Releases Qwen3.7 Plus: 1M Context Window at $0.40 Per Million Input Tokens

Alibaba's Qwen has released Qwen3.7 Plus, a multimodal model with a 1 million token context window. The model accepts text and image input with text output, priced at $0.40 per million input tokens and $1.60 per million output tokens through OpenRouter's API.

June 3, 2026 · 1:20 PM

May 21, 2026

product updateAmazon Web Services

AWS launches AgentCore Code Interpreter to process documents beyond context window limits using recursive LLM architectu

Amazon Web Services released AgentCore Code Interpreter, a sandboxed Python environment that enables recursive language models to process documents of unlimited length by treating context as an external environment rather than loading it into the model's context window. The system orchestrates sub-LLM calls from within the sandbox, maintaining intermediate results as Python variables across a persistent session.

May 21, 2026 · 4:21 PM

May 19, 2026

model release

Google Releases Gemini 3.5 Flash with 1M Token Context and Configurable Thinking Modes at $1.50/$9 Per Million Tokens

Google has released Gemini 3.5 Flash, a multimodal model with a 1 million token context window priced at $1.50 per million input tokens and $9 per million output tokens. The model supports text, image, video, audio, and PDF inputs with configurable thinking effort levels from minimal to high.

May 19, 2026 · 6:06 PM

May 12, 2026

changelogAnthropic

Anthropic releases Claude Opus 4.7 Fast with 6x pricing for higher output speed

Anthropic has released Claude Opus 4.7 Fast, a speed-optimized variant of its Opus 4.7 model. The fast-mode version delivers identical capabilities with higher output speed at premium pricing: $30 per 1M input tokens and $150 per 1M output tokens, representing a 6x increase over standard pricing.

May 12, 2026 · 7:20 PM

May 7, 2026

model release

Google releases Gemini 3.1 Flash Lite with 1M context at $0.25 per million input tokens

Google has released Gemini 3.1 Flash Lite, a high-efficiency multimodal model with a 1,048,576 token context window priced at $0.25 per million input tokens and $1.50 per million output tokens. The model supports text, image, video, audio, and PDF inputs with four thinking levels for cost-performance optimization.

May 7, 2026 · 4:21 PM

April 27, 2026

model releaseOpenAI

OpenAI Launches GPT Mini Latest with 400,000 Token Context Window

OpenAI released GPT Mini Latest on April 27, 2025, featuring a 400,000 token context window. The model automatically redirects to the latest version in the OpenAI GPT Mini family, allowing developers to stay current without manual updates.

April 27, 2026 · 7:52 PM

changelog

Google Launches Gemini Pro Latest Router with 1M+ Context Window

Google has released Gemini Pro Latest on OpenRouter, a dynamic model router that automatically redirects to the most current model in the Gemini Pro family. The router supports a 1,048,576-token context window and includes reasoning capabilities.

April 27, 2026 · 7:51 PM

changelog

Alibaba releases Qwen3.5 Plus with 1M token context window at $0.40 per million input tokens

Alibaba released an updated version of Qwen3.5 Plus on April 27, 2026, with a 1 million token context window. The multimodal model accepts text, image, and video input and is priced at $0.40 per million input tokens and $2.40 per million output tokens, with tiered pricing above 256K tokens.

April 27, 2026 · 4:36 AM

model release

Alibaba Qwen Releases Qwen3.6 Flash with 1M Context Window at $0.25 per 1M Input Tokens

Alibaba's Qwen team has released Qwen3.6 Flash, a multimodal language model supporting text, image, and video input with a 1 million token context window. The model is priced at $0.25 per 1M input tokens and $1.50 per 1M output tokens, with tiered pricing above 256K tokens.

April 27, 2026 · 4:35 AM

April 25, 2026

changelogOpenAI

OpenCode v1.14.25 Adds Roslyn LSP Support for Razor and Fixes GPT-5.5 Context Limits

OpenCode shipped v1.14.25 with Roslyn LSP support for Razor, .cshtml, and C# script files. The release fixes GPT-5.5 with OpenAI OAuth to use correct context limits and adds request details to LSP permission prompts.

April 25, 2026 · 2:50 PM

April 24, 2026

model releaseDeepSeek

DeepSeek V4 Pro launches with 1.6 trillion parameters, 1M token context at $0.145 per million input tokens

Chinese AI lab DeepSeek has released preview versions of DeepSeek V4 Flash and V4 Pro, mixture-of-experts models with 1 million token context windows. The V4 Pro has 1.6 trillion total parameters (49 billion active), making it the largest open-weight model available, while both models significantly undercut frontier model pricing.

April 24, 2026 · 1:50 PM

April 16, 2026

model releaseAnthropic

Anthropic releases Claude Opus 4.7 with 1M context window for long-running agent tasks

Anthropic has released Claude Opus 4.7, the latest version of its flagship Opus family designed for long-running, asynchronous agent tasks. The model features a 1 million token context window and costs $5 per million input tokens and $25 per million output tokens.

April 16, 2026 · 2:51 PM

March 31, 2026

model releasexAI

xAI releases Grok 4.20 Multi-Agent with 2M context window and parallel agent reasoning

xAI has released Grok 4.20 Multi-Agent, a variant designed for collaborative agent-based workflows with a 2-million-token context window. The model scales from 4 agents at low/medium reasoning effort to 16 agents at high/xhigh effort levels, priced at $2 per million input tokens and $6 per million output tokens.

March 31, 2026 · 7:20 PM

March 18, 2026

product updateAmazon Web Services

Amazon Nova 2 Lite surpasses Nova 1 Pro with 1M token context and extended thinking at 7x lower cost

Amazon Nova 2 Lite expands context window to 1 million tokens, introduces extended thinking with developer controls, and adds native tool use and web grounding. AWS claims Nova 2 Lite surpasses Nova 1 Pro on multi-step reasoning while costing 7x less and running up to 5x faster.

March 18, 2026 · 3:20 PM

February 28, 2026

benchmarkOpenAI

Frontier LLMs lose up to 33% accuracy in long conversations, study finds

Frontier language models including GPT-5.2 and Claude 4.6 experience accuracy degradation of up to 33% as conversations lengthen, according to new research. The finding suggests that extended context use within a single conversation introduces performance challenges even in state-of-the-art models.

February 28, 2026 · 6:05 PM

← Back to all news