pricing

50 articles tagged with pricing

June 18, 2026
model releaseMistral AI

Mistral OCR 3 launches at $2 per 1,000 pages with 74% win rate over previous version

Mistral AI released Mistral OCR 3, a document extraction model priced at $2 per 1,000 pages ($1 with Batch API discount). The model achieves a 74% overall win rate over its predecessor on forms, scanned documents, complex tables, and handwriting according to internal benchmarks.

product updateMistral AI

Mistral Launches OCR API at $1 Per 1,000 Pages, Claims 94.89% Accuracy on Document Benchmarks

Mistral AI has released Mistral OCR, an API for extracting text and images from documents at $1 per 1,000 pages (approximately $0.50 with batch inference). The company claims 94.89% overall accuracy on its internal test set, comparing favorably to GPT-4o (89.77%), Gemini 2.0 Flash (88.69%), and Azure OCR (89.52%).

June 17, 2026
product updateMicrosoft

Microsoft evaluates DeepSeek V3 for Copilot to cut agent costs, will offer cheaper tier within weeks

Microsoft is evaluating a self-hosted version of DeepSeek V3 to power Copilot Cowork as agent costs spiral. The company plans to launch a lower-cost tier within weeks while moving to usage-based pricing, charging enterprises for actual compute consumed rather than flat fees.

June 10, 2026
product update

Google Cuts AI Plus Subscription to $4.99, Doubles Storage in Direct Challenge to OpenAI

Google cut its AI Plus subscription price from $7.99 to $4.99 per month while doubling included storage from 200GB to 400GB. The move brings the pricing pressure already visible in markets like India — where OpenAI launched ChatGPT Go at $4.60 last August — directly to U.S. consumers.

June 9, 2026
model releaseAnthropic+1

Anthropic releases Fable 5, bringing capabilities of restricted Mythos model to public with $10/$50 per 1M token pricing

Anthropic has released Fable 5, making capabilities from its previously restricted Mythos model available to the public. The company claims Fable 5 beats GPT-5.5, Gemini 3.1 Pro, and its own Opus 4.8 in internal testing, with pricing set at $10 per million input tokens and $50 per million output tokens after a free trial period ending June 22.

model releaseAnthropic

Anthropic releases Claude Fable 5, first public Mythos-class model at $10/$50 per million tokens

Anthropic has released Claude Fable 5, its first publicly available Mythos-class model, at $10 per million input tokens and $50 per million output tokens—less than half the price of Claude Mythos Preview. The model includes safeguards that redirect sensitive queries to Claude Opus 4.8 in less than 5% of sessions.

model releaseAnthropic

Anthropic releases Claude Fable 5 with Mythos-class capabilities at $10/$50 per million tokens

Anthropic released Claude Fable 5, a Mythos-class model, to enterprise customers and paid subscribers two months after limiting its advanced Mythos model to select users. The new model costs $10 per million input tokens and $50 per million output tokens—twice the price of Claude Opus 4.8—and includes safeguards that block responses in high-risk areas like cybersecurity and biology.

model releaseAnthropic

Anthropic releases Claude Fable 5, a safety-limited version of Mythos, at $10/$50 per million tokens

Anthropic released Claude Fable 5, the first publicly available version of its Mythos model, with built-in safety restrictions that automatically block high-risk queries in cybersecurity, biology, chemistry, and related fields. The model costs $10 per million input tokens and $50 per million output tokens, double the price of Claude Opus 4.8.

model releaseAnthropic

Anthropic releases Claude Fable 5, first public Mythos-class model at $10/$50 per million tokens

Anthropic has released Claude Fable 5, marking the first broad release from its Mythos class of AI models. The company previously deemed this model family too dangerous for public release due to exceptional cybersecurity capabilities, but new safeguards that block responses in high-risk areas now make it available at $10 per million input tokens and $50 per million output tokens.

June 8, 2026
product update

Google cuts AI Plus subscription to $5/month, doubles storage to 400GB

Google lowered its AI Plus subscription from $8 to $5 per month and doubled included storage from 200GB to 400GB. The plan includes access to Gemini 3 Pro, Nano Banana Pro, Deep Research, and the newly announced Gemini Omni video generation model.

product updateApple

Apple waives cloud API fees for developers under 2M downloads using Private Cloud Compute

Apple announced it will waive cloud API fees for developers with fewer than 2 million first-time App Store downloads who use its Foundation Models running in Private Cloud Compute. The company also expanded its Foundation Models framework to include image input and support for server models from third-party cloud providers.

product updateApple

Apple's new Siri AI introduces usage caps and paid upgrades for image generation

Apple unveiled Siri AI at WWDC 2026, a rebuilt version of its assistant powered by Apple Intelligence and enhanced with Google Gemini. The company confirmed daily usage limits for features like image generation, with increased access requiring iCloud+ subscriptions. Siri AI won't launch in China and faces EU restrictions.

changelog

Google AI Plus drops to $4.99/month with 400GB storage, down from $7.99

Google reduced its AI Plus subscription from $7.99 to $4.99 per month and doubled storage from 200GB to 400GB. The plan includes 2x higher Gemini usage limits with a 128,000 token context window, along with features like daily briefs and video generation.

June 3, 2026
model release

Alibaba's Qwen Releases Qwen3.7 Plus: 1M Context Window at $0.40 Per Million Input Tokens

Alibaba's Qwen has released Qwen3.7 Plus, a multimodal model with a 1 million token context window. The model accepts text and image input with text output, priced at $0.40 per million input tokens and $1.60 per million output tokens through OpenRouter's API.

May 30, 2026
product updateMicrosoft

GitHub Copilot switches to token-based billing June 1, some users report costs jumping from $50 to $3,000

Microsoft is ending GitHub Copilot's flat-rate subscription model in favor of token-based billing starting June 1. Some developers report monthly costs rising from approximately $29-50 to $750-3,000, while others claim the increases only affect inefficient "vibe-coders" who iterate excessively without clear direction.

May 29, 2026
model release

StepFun releases Step-3.7-Flash: 198B-parameter MoE model with 256K context at $0.20/M input tokens

StepFun has released Step-3.7-Flash, a 198B-parameter sparse Mixture-of-Experts vision-language model that activates 11B parameters per token and delivers up to 400 tokens per second. The model supports a 256K context window, three selectable reasoning levels, and is priced at $0.20 per million input tokens (cache miss) and $1.15 per million output tokens.

May 28, 2026
model releaseAnthropic

Anthropic releases Claude Opus 4.8 with improved agentic coding and reasoning benchmarks

Anthropic released Claude Opus 4.8 on May 28, 2026, with improved performance in agentic coding, computer use, and reasoning benchmarks. Pricing remains at $5 per million input tokens and $25 per million output tokens, while the model's fast mode is now three times cheaper than previous versions.

changelogAnthropic

Anthropic Launches Claude Opus 4.8 Fast Mode at 2x Price for Higher Output Speed

Anthropic has released a fast-mode variant of Claude Opus 4.8 that delivers higher output speed at double the pricing of the standard version. The model offers identical capabilities to regular Opus 4.8 with input pricing at $10 per million tokens and output at $50 per million tokens.

product updateMistral AI+1

Mistral rebrands Le Chat to Vibe, launches autonomous coding agent and work automation platform

Mistral AI has rebranded Le Chat as Vibe, introducing two new agent modes: Work Mode for multi-step business tasks across connected apps, and Code Mode for autonomous coding from pull request to merge. The service includes a new VS Code extension and starts at $14.99/month for Pro tier.

model releaseMistral AI

Mistral OCR 3 launches at $2 per 1,000 pages with 74% win rate over previous version

Mistral AI released OCR 3, a document parsing model priced at $2 per 1,000 pages with a 50% batch API discount. The company claims a 74% overall win rate compared to Mistral OCR 2 on forms, scanned documents, complex tables, and handwriting.

model releaseMistral AI

Mistral Medium 3 launches at $0.4/$2 per million tokens, matching 90% of Claude 3.7 Sonnet performance

Mistral AI launched Mistral Medium 3 on May 7, 2025, priced at $0.4 per million input tokens and $2 per million output tokens. The company claims the model performs at or above 90% of Claude Sonnet 3.7 on benchmarks while being significantly less expensive, and surpasses Llama 4 Maverick and Cohere Command A.

product updateMistral AI

Mistral Releases OCR API at $1 per 1,000 Pages, Claims 94.89% Accuracy on Document Benchmarks

Mistral AI has released an OCR API priced at $1 per 1,000 pages with batch inference costs approximately half that rate. The company claims 94.89% overall accuracy on internal benchmarks, ahead of GPT-4o (89.77%), Gemini 2.0 Flash (88.69%), and Azure OCR (89.52%). The model processes up to 2,000 pages per minute on a single node.

May 23, 2026
changelogDeepSeek

DeepSeek cuts V4 Pro pricing by 75% to $0.003625 per million input tokens

DeepSeek has permanently reduced pricing for its V4 Pro model by 75%, bringing input token costs down to $0.003625 per million tokens from $0.0145. The move makes permanent a promotional discount that was set to expire May 31, 2026.

May 22, 2026
model release

Google releases Gemini 3.5 Flash and autonomous agent Gemini Spark at I/O 2026

Google announced Gemini 3.5 Flash and Gemini Spark at I/O 2026. Gemini 3.5 Flash now powers Google's AI Mode search, while Spark is a cloud-based autonomous agent that can monitor credit card statements, track emails, and interact with third-party services like OpenTable and Instacart.

May 21, 2026
product updateReplit

Replit launches self-serve Enterprise with instant SSO setup, unlimited seats, credit-based pricing

Replit today launched self-serve Enterprise, allowing organizations to purchase and configure an Enterprise account with SSO, SCIM, and RBAC in minutes without sales calls. The credit-based model provides unlimited seats, with all spending pooled across Replit Agent, Deployments, and Storage.

product update

Google announces Spark AI agent, Information agents, and Android Halo at I/O 2026—all paywalled behind $100/month Ultra

Google announced multiple AI agent products at I/O 2026, including Spark for managing digital tasks, Information agents for 24/7 topic monitoring, and Android Halo for notifications. All features remain paywalled behind the $100/month Gemini Ultra plan, with free access timeline unspecified.

product update

Google cuts AI Ultra plan to $200/month, launches new $100 developer tier

Google announced pricing changes to its Gemini AI subscription tiers at I/O 2026, cutting its top AI Ultra plan from $250 to $200 per month while introducing a new $100/month developer-focused tier. All plans now get access to Gemini 3.5 Flash and the new Gemini Omni video generation model.

May 19, 2026
model release

Google releases Gemini 3.5 Flash with 4x faster output and agentic capabilities, 3.5 Pro coming June

Google released Gemini 3.5 Flash today with 4x faster output token generation than competing frontier models while surpassing Gemini 3.1 Pro on coding, agentic, and multimodal benchmarks. The company announced Gemini 3.5 Pro will launch next month and introduced Gemini Omni, a new multimodal series that outputs video.

product update

Google repositions Antigravity as agentic development suite with CLI, SDK, and $100/month tier

Google has repositioned Antigravity 2.0 as a unified suite for agentic AI development, introducing parallel agent orchestration, a new CLI that replaces the Gemini CLI, an SDK for custom agents, and Managed Agents in the Gemini API. The company also launched a new $100/month AI Ultra tier with 5X the usage limits of the $20 Pro plan, alongside a dedicated Android app for AI Studio.

changelog

Google switches Gemini to compute-based limits, cuts AI Ultra to $100/month

Google is replacing Gemini's daily prompt limits with a compute-based system that factors in prompt complexity, features used, and chat length. Limits refresh every five hours until reaching a weekly cap. AI Ultra, aimed at developers and technical leads, now starts at $100/month—down from its previous entry point—with 5x higher usage limits than the Pro plan.

model release

Google Releases Gemini 3.5 Flash with 1M Token Context and Configurable Thinking Modes at $1.50/$9 Per Million Tokens

Google has released Gemini 3.5 Flash, a multimodal model with a 1 million token context window priced at $1.50 per million input tokens and $9 per million output tokens. The model supports text, image, video, audio, and PDF inputs with configurable thinking effort levels from minimal to high.

model release

Google releases Gemini 3.5 Flash at half the price of frontier models, announces Omni world model

Google released Gemini 3.5 Flash, priced at half to one-third the cost of comparable frontier models, and announced it will become the default model in the Gemini app globally. The company also unveiled Omni, a world model for simulating physical environments, and Gemini Spark, an AI agent in beta testing.

May 12, 2026
changelogAnthropic

Anthropic releases Claude Opus 4.7 Fast with 6x pricing for higher output speed

Anthropic has released Claude Opus 4.7 Fast, a speed-optimized variant of its Opus 4.7 model. The fast-mode version delivers identical capabilities with higher output speed at premium pricing: $30 per 1M input tokens and $150 per 1M output tokens, representing a 6x increase over standard pricing.

product updateGitHub

GitHub Copilot adds flexible model switching to Pro tier, launches Max plan on June 1

GitHub is restructuring its Copilot individual subscription tiers effective June 1, 2025. The company is introducing flexible model allotments to Pro and Pro+ plans and launching a new Max tier, responding to user feedback on plan structure.

May 7, 2026
model release

Google releases Gemini 3.1 Flash Lite with 1M context at $0.25 per million input tokens

Google has released Gemini 3.1 Flash Lite, a high-efficiency multimodal model with a 1,048,576 token context window priced at $0.25 per million input tokens and $1.50 per million output tokens. The model supports text, image, video, audio, and PDF inputs with four thinking levels for cost-performance optimization.

May 5, 2026
changelog+1

Google preparing 'AI Ultra Lite' tier between $20 Pro and $250 Ultra plans, adding usage dashboard

Google is developing an intermediate subscription tier called 'AI Ultra Lite' to slot between its $20 Pro and $250 Ultra plans, according to code discovered in the Gemini macOS app. The company is also preparing a usage dashboard showing token budgets across five-hour and weekly limits.

April 28, 2026
product updateMicrosoft

GitHub Copilot switches to metered token billing June 1 as flat-rate model proves unsustainable

Microsoft's GitHub is ending flat-rate billing for Copilot on June 1, 2026, switching to usage-based metered tokens after acknowledging the request-based model is no longer sustainable. Copilot Pro subscribers ($10/month) will receive 1,000 GitHub AI Credits monthly, with each credit worth $0.01.

April 27, 2026
product updateGitHub

GitHub Copilot switches to token-based pricing June 1, ending unlimited usage model

GitHub Copilot transitions to token-based pricing effective June 1, 2026, replacing its premium request unit system. Base subscription prices remain unchanged at $10/month for Pro and $39/month for Pro+, but users now receive equivalent monthly AI Credits that deplete with usage—and service stops when credits run out.

product updateGitHub

GitHub Copilot switches to usage-based billing with AI Credits starting June 1, 2025

GitHub will replace Copilot's flat subscription model with usage-based billing starting June 1, 2025. Users will consume GitHub AI Credits based on their actual Copilot usage, marking a significant shift in the company's pricing strategy.

changelog

Alibaba releases Qwen3.5 Plus with 1M token context window at $0.40 per million input tokens

Alibaba released an updated version of Qwen3.5 Plus on April 27, 2026, with a 1 million token context window. The multimodal model accepts text, image, and video input and is priced at $0.40 per million input tokens and $2.40 per million output tokens, with tiered pricing above 256K tokens.

model release

Alibaba Qwen Releases Qwen3.6 Flash with 1M Context Window at $0.25 per 1M Input Tokens

Alibaba's Qwen team has released Qwen3.6 Flash, a multimodal language model supporting text, image, and video input with a 1 million token context window. The model is priced at $0.25 per 1M input tokens and $1.50 per 1M output tokens, with tiered pricing above 256K tokens.

April 24, 2026
model releaseOpenAI

OpenAI Releases GPT-5.5 Pro with 1M+ Token Context Window, $30 Per Million Input Tokens

OpenAI has released GPT-5.5 Pro, a high-capability model with a 1,050,000 token context window (922K input, 128K output) priced at $30 per million input tokens and $180 per million output tokens. The model supports text and image inputs and is optimized for deep reasoning, agentic coding, and multi-step workflows.

model releaseDeepSeek

DeepSeek V4 Pro launches with 1.6 trillion parameters, 1M token context at $0.145 per million input tokens

Chinese AI lab DeepSeek has released preview versions of DeepSeek V4 Flash and V4 Pro, mixture-of-experts models with 1 million token context windows. The V4 Pro has 1.6 trillion total parameters (49 billion active), making it the largest open-weight model available, while both models significantly undercut frontier model pricing.

model releaseDeepSeek

DeepSeek V4 Pro launches with 1.6T parameters at $1.74/M tokens, undercutting Claude Sonnet 4.6 by 42%

DeepSeek released two preview models: V4 Pro (1.6T total parameters, 49B active) and V4 Flash (284B total, 13B active), both with 1 million token context windows. V4 Pro is priced at $1.74/M input tokens and $3.48/M output—42% cheaper than Claude Sonnet 4.6—while V4 Flash at $0.14/$0.28 per million tokens undercuts all small frontier models.

April 22, 2026
model releaseXiaomi

Xiaomi Launches MiMo-V2.5-Pro with 1M Context Window for Complex Agentic Tasks

Xiaomi released MiMo-V2.5-Pro on April 22, 2026, its flagship model featuring a 1,048,576 token context window and pricing at $1 per million input tokens and $3 per million output tokens. According to Xiaomi, the model ranks highly on ClawEval, GDPVal, and SWE-bench Pro benchmarks, designed for autonomous completion of professional tasks requiring thousands of tool calls.

product updateAnthropic+1

Anthropic silently tests 5x price increase for Claude Code, reverses within hours after backlash

Anthropic updated its pricing page on April 22, 2026, removing Claude Code from the $20/month Pro plan and restricting it to $100-200/month Max plans. The company reversed the change within hours after significant backlash across Reddit, Hacker News, and Twitter.

changelogAnthropic

Anthropic removes Claude Code from Pro plan pricing page, says only 2% of new signups affected by test

Anthropic removed Claude Code from its Pro subscription plan pricing page on Tuesday, though the company claims the change is only a test affecting approximately 2% of new prosumer signups. Existing Pro and Max subscribers remain unaffected, according to head of growth Amol Avasare.

April 21, 2026
model releaseOpenAI

OpenAI releases ChatGPT Images 2.0 with 3840x2160 resolution at $30 per 1M output tokens

OpenAI released ChatGPT Images 2.0, pricing output tokens at $30 per million with maximum resolution of 3840x2160 pixels. CEO Sam Altman claims the improvement from gpt-image-1 to gpt-image-2 equals the jump from GPT-3 to GPT-5.

April 20, 2026
product updateMicrosoft

GitHub halts Copilot Pro signups as agentic AI workloads overwhelm infrastructure

GitHub has paused new subscriptions for Copilot Pro, Pro+, and Student plans due to compute capacity constraints. The company cites agentic workflows as consuming significantly more resources than its original pricing structure anticipated, forcing tighter usage limits and a shift away from flat-rate billing.

product updateGitHub

GitHub Copilot Individual Plans Change Structure, Details Not Yet Disclosed

GitHub has announced changes to its Copilot Individual subscription plans, citing the need for reliability and predictability for existing customers. The company has not yet disclosed specific details about pricing adjustments, feature modifications, or implementation timelines.