context-window
17 articles tagged with context-window
OpenAI expands ChatGPT memory to free users, doubles storage capacity for paid tiers
OpenAI is rolling out an upgraded memory system for ChatGPT that synthesizes context more efficiently across conversations. The company reduced compute requirements by approximately 5x, enabling it to offer the memory feature to free users for the first time while doubling storage capacity for Plus and Pro subscribers.
Alibaba's Qwen Releases Qwen3.7 Plus: 1M Context Window at $0.40 Per Million Input Tokens
Alibaba's Qwen has released Qwen3.7 Plus, a multimodal model with a 1 million token context window. The model accepts text and image input with text output, priced at $0.40 per million input tokens and $1.60 per million output tokens through OpenRouter's API.
AWS launches AgentCore Code Interpreter to process documents beyond context window limits using recursive LLM architecture
Amazon Web Services released AgentCore Code Interpreter, a sandboxed Python environment that enables recursive language models to process documents of unlimited length by treating context as an external environment rather than loading it into the model's context window. The system orchestrates sub-LLM calls from within the sandbox, maintaining intermediate results as Python variables across a persistent session.
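The pattern described in this summary, keeping the document outside the model and reducing it through sub-LLM calls whose results live in ordinary variables, can be sketched roughly as follows. This is a minimal illustration, not the AgentCore API: `recursive_process`, `chunk_text`, and `fake_sub_llm` are hypothetical names, character counts stand in for tokens, and the stub replaces a real model call.

```python
from typing import Callable, List


def chunk_text(text: str, chunk_chars: int) -> List[str]:
    """Split text into fixed-size character chunks (a stand-in for token-based chunking)."""
    return [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]


def recursive_process(text: str,
                      call_sub_llm: Callable[[str], str],
                      chunk_chars: int = 1000) -> str:
    """Reduce an arbitrarily long text to one result through repeated sub-LLM calls."""
    # Base case: the text fits in a single call.
    if len(text) <= chunk_chars:
        return call_sub_llm(text)
    # Intermediate results are plain Python values, never loaded into one prompt.
    partials = [call_sub_llm(chunk) for chunk in chunk_text(text, chunk_chars)]
    combined = "\n".join(partials)
    # Guard against a sub-LLM that does not shrink its input (assumed requirement).
    if len(combined) >= len(text):
        raise ValueError("sub-LLM outputs are not shrinking; cannot converge")
    return recursive_process(combined, call_sub_llm, chunk_chars)


def fake_sub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call: keeps the first 20 characters."""
    return prompt[:20]
```

Calling `recursive_process(long_text, fake_sub_llm)` processes each chunk separately and then recurses on the joined partial results, so no single call ever sees more than `chunk_chars` characters.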
Google Releases Gemini 3.5 Flash with 1M Token Context and Configurable Thinking Modes at $1.50/$9 Per Million Tokens
Google has released Gemini 3.5 Flash, a multimodal model with a 1 million token context window priced at $1.50 per million input tokens and $9 per million output tokens. The model supports text, image, video, audio, and PDF inputs with configurable thinking effort levels from minimal to high.
Anthropic releases Claude Opus 4.7 Fast with 6x pricing for higher output speed
Anthropic has released Claude Opus 4.7 Fast, a speed-optimized variant of its Opus 4.7 model. The fast-mode version delivers identical capabilities with higher output speed at premium pricing: $30 per 1M input tokens and $150 per 1M output tokens, six times the standard pricing.
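The multiple can be checked against the standard Opus 4.7 prices quoted elsewhere in this listing ($5 input, $25 output per million tokens). The sketch below is only a worked example: the helper name `request_cost_usd` and the token counts are hypothetical, chosen to make the arithmetic easy to follow.

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost of one request, given per-million-token prices in USD."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Prices as quoted in this listing (USD per 1M input tokens, per 1M output tokens).
STANDARD = (5.0, 25.0)
FAST = (30.0, 150.0)

# Hypothetical request: 200,000 input tokens and 10,000 output tokens.
standard_cost = request_cost_usd(200_000, 10_000, *STANDARD)
fast_cost = request_cost_usd(200_000, 10_000, *FAST)
ratio = fast_cost / standard_cost
```

Because both the input and output prices are scaled by the same factor, the ratio is the same for any mix of input and output tokens.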
Google releases Gemini 3.1 Flash Lite with 1M context at $0.25 per million input tokens
Google has released Gemini 3.1 Flash Lite, a high-efficiency multimodal model with a 1,048,576 token context window priced at $0.25 per million input tokens and $1.50 per million output tokens. The model supports text, image, video, audio, and PDF inputs with four thinking levels for cost-performance optimization.
OpenAI Launches GPT Mini Latest with 400,000 Token Context Window
OpenAI released GPT Mini Latest on April 27, 2025, featuring a 400,000 token context window. The model automatically redirects to the latest version in the OpenAI GPT Mini family, allowing developers to stay current without manual updates.
Google Launches Gemini Pro Latest Router with 1M+ Context Window
Google has released Gemini Pro Latest on OpenRouter, a dynamic model router that automatically redirects to the most current model in the Gemini Pro family. The router supports a 1,048,576-token context window and includes reasoning capabilities.
Alibaba releases Qwen3.5 Plus with 1M token context window at $0.40 per million input tokens
Alibaba released an updated version of Qwen3.5 Plus on April 27, 2026, with a 1 million token context window. The multimodal model accepts text, image, and video input and is priced at $0.40 per million input tokens and $2.40 per million output tokens, with tiered pricing above 256K tokens.
Alibaba Qwen Releases Qwen3.6 Flash with 1M Context Window at $0.25 per 1M Input Tokens
Alibaba's Qwen team has released Qwen3.6 Flash, a multimodal language model supporting text, image, and video input with a 1 million token context window. The model is priced at $0.25 per 1M input tokens and $1.50 per 1M output tokens, with tiered pricing above 256K tokens.
OpenCode v1.14.25 Adds Roslyn LSP Support for Razor and Fixes GPT-5.5 Context Limits
OpenCode shipped v1.14.25 with Roslyn LSP support for Razor, .cshtml, and C# script files. The release also fixes the context limits applied to GPT-5.5 when used with OpenAI OAuth and adds request details to LSP permission prompts.
DeepSeek V4 Pro launches with 1.6 trillion parameters, 1M token context at $0.145 per million input tokens
Chinese AI lab DeepSeek has released preview versions of DeepSeek V4 Flash and V4 Pro, mixture-of-experts models with 1 million token context windows. The V4 Pro has 1.6 trillion total parameters (49 billion active), making it the largest open-weight model available, while both models significantly undercut frontier model pricing.
Anthropic releases Claude Opus 4.7 with 1M context window for long-running agent tasks
Anthropic has released Claude Opus 4.7, the latest version of its flagship Opus family designed for long-running, asynchronous agent tasks. The model features a 1 million token context window and costs $5 per million input tokens and $25 per million output tokens.
xAI releases Grok 4.20 Multi-Agent with 2M context window and parallel agent reasoning
xAI has released Grok 4.20 Multi-Agent, a variant designed for collaborative agent-based workflows with a 2-million-token context window. The model scales from 4 agents at low/medium reasoning effort to 16 agents at high/xhigh effort levels, priced at $2 per million input tokens and $6 per million output tokens.
Alibaba releases Qwen 3.6 Plus Preview with 1M token context, free via OpenRouter
Alibaba's Qwen division has released Qwen 3.6 Plus Preview, a free multimodal model available via OpenRouter with a 1,000,000 token context window. Alibaba claims stronger reasoning and more reliable agentic behavior than the 3.5 series, with particular strength in coding and complex problem-solving tasks.
Amazon Nova 2 Lite surpasses Nova 1 Pro with 1M token context and extended thinking at 7x lower cost
Amazon Nova 2 Lite expands its context window to 1 million tokens, introduces extended thinking with developer controls, and adds native tool use and web grounding. AWS claims Nova 2 Lite surpasses Nova 1 Pro on multi-step reasoning while costing 7x less and running up to 5x faster.
Frontier LLMs lose up to 33% accuracy in long conversations, study finds
Frontier language models including GPT-5.2 and Claude 4.6 lose up to 33% accuracy as conversations lengthen, according to new research. The finding suggests that accumulating context within a single conversation degrades performance even in state-of-the-art models.