changelog

Alibaba releases Qwen3.5 Plus with 1M token context window at $0.40 per million input tokens

TL;DR

Alibaba released an updated version of Qwen3.5 Plus on April 27, 2026, with a 1 million token context window. The multimodal model accepts text, image, and video input and is priced at $0.40 per million input tokens and $2.40 per million output tokens, with tiered pricing above 256K tokens.

1 min read
0

Qwen3.5 Plus (April 2026) — Quick Specs

Context window1000K tokens
Input$0.4/1M tokens
Output$2.4/1M tokens

Alibaba releases Qwen3.5 Plus with 1M token context window at $0.40 per million input tokens

Alibaba released an updated version of Qwen3.5 Plus on April 27, 2026, expanding its context window to 1 million tokens. The multimodal model accepts text, image, and video input and produces text output.

Pricing and specifications

The model is priced at $0.40 per million input tokens and $2.40 per million output tokens. According to the model page, tiered pricing applies above 256K tokens, though specific tier rates are not disclosed.

Qwen3.5 Plus supports three input modalities:

  • Text
  • Image
  • Video

The model outputs text only.

Context window expansion

The 1 million token context window represents a significant expansion for the Qwen series. This capacity allows the model to process substantially longer documents, codebases, or multi-turn conversations within a single context.

The model is currently available through OpenRouter, which routes requests across multiple providers with automatic fallbacks. No other distribution channels were disclosed in the announcement.

What this means

Qwen3.5 Plus's pricing positions it in the mid-range of multimodal models with extended context windows. At $0.40 per million input tokens, it's more expensive than text-only models but competitive for multimodal capabilities. The tiered pricing structure suggests Alibaba is targeting use cases that require long context while managing compute costs for shorter inputs. The lack of disclosed benchmark scores or technical specifications makes direct performance comparison with competing models difficult.

Related Articles

changelog

Anthropic releases Claude Opus 4.7 Fast with 6x pricing for higher output speed

Anthropic has released Claude Opus 4.7 Fast, a speed-optimized variant of its Opus 4.7 model. The fast-mode version delivers identical capabilities with higher output speed at premium pricing: $30 per 1M input tokens and $150 per 1M output tokens, representing a 6x increase over standard pricing.

changelog

Google DeepMind Releases Quantization-Aware Training Versions of Gemma 4 Models in GGUF Format

Google DeepMind has released quantization-aware training (QAT) optimized versions of its Gemma 4 model family in GGUF Q4_0 format. The QAT versions preserve similar quality to bfloat16 while dramatically reducing memory requirements, with models available across the entire Gemma 4 lineup: E2B, E4B, 12B, 26B A4B, and 31B.

changelog

Google AI Plus drops to $4.99/month with 400GB storage, down from $7.99

Google reduced its AI Plus subscription from $7.99 to $4.99 per month and doubled storage from 200GB to 400GB. The plan includes 2x higher Gemini usage limits with a 128,000 token context window, along with features like daily briefs and video generation.

changelog

Anthropic Launches Claude Opus 4.8 Fast Mode at 2x Price for Higher Output Speed

Anthropic has released a fast-mode variant of Claude Opus 4.8 that delivers higher output speed at double the pricing of the standard version. The model offers identical capabilities to regular Opus 4.8 with input pricing at $10 per million tokens and output at $50 per million tokens.

Comments

Loading...