Alibaba releases Qwen3.5-35B-A3B, a 35B multimodal model with Apache 2.0 license

TL;DR

Alibaba's Qwen team has released Qwen3.5-35B-A3B-Base, a 35-billion-parameter multimodal model for image-text-to-text tasks. The model is available under the Apache 2.0 license and is listed as compatible with hosted inference endpoints, including deployment on Azure.

Alibaba's Qwen division has released Qwen3.5-35B-A3B-Base, a 35-billion-parameter multimodal language model designed for image-text-to-text tasks.

Model Details

The model was published on Hugging Face on February 24, 2026 and carries an Apache 2.0 license, which permits both commercial and research use subject to the license's attribution and notice conditions. It is tagged as part of the Qwen3.5 MoE (mixture of experts) family, indicating that the model uses conditional computation: a router activates only a subset of expert subnetworks for each token rather than running the full network.
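To make the idea of conditional computation concrete, here is a minimal sketch of top-k expert routing in plain Python. It is a generic illustration, not Qwen's routing code: the function names, the toy scalar experts, and the choice of k=2 are all assumptions made for the example.

```python
import math

def softmax(scores):
    """Convert raw router scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so that the selected experts sum to 1.
    """
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_scores, k=2):
    """Evaluate only the selected experts and mix their outputs."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))

# Hypothetical toy experts: each one is just a scalar function.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: -x, lambda x: x * x]
router_scores = [0.1, 2.0, -1.0, 2.0]  # illustrative values, not learned

selected = route_top_k(router_scores, k=2)
output = moe_forward(3.0, experts, router_scores, k=2)
```

In this toy setup only two of the four experts are evaluated for the input, which is the source of the efficiency gain: the unselected experts cost no compute for that token.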

Qwen3.5-35B-A3B-Base supports multimodal inputs, processing both images and text to generate text outputs. The model is compatible with the Transformers library and stores its weights in the SafeTensors format, which keeps raw tensor data behind a JSON header and, unlike pickle-based checkpoints, does not execute arbitrary code when loaded.
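The safety property of SafeTensors comes from its simple layout: an 8-byte little-endian length, a JSON header describing each tensor, then the raw bytes. The sketch below parses that header using only the standard library. The tensor name `toy.weight` and the in-memory buffer are hypothetical stand-ins for a real checkpoint file, and the helper names are invented for illustration.

```python
import json
import struct

def read_safetensors_header(raw: bytes) -> dict:
    """Parse the JSON header of a SafeTensors buffer without reading tensor data."""
    if len(raw) < 8:
        raise ValueError("buffer too short for a SafeTensors header")
    # First 8 bytes: unsigned little-endian 64-bit header length.
    (header_len,) = struct.unpack("<Q", raw[:8])
    if 8 + header_len > len(raw):
        raise ValueError("declared header length exceeds buffer size")
    # The header is plain JSON, so parsing it cannot run arbitrary code.
    return json.loads(raw[8:8 + header_len].decode("utf-8"))

def build_example_buffer() -> bytes:
    """Build a tiny buffer holding one hypothetical float32 tensor of shape [2]."""
    header = {"toy.weight": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]}}
    header_bytes = json.dumps(header).encode("utf-8")
    payload = struct.pack("<2f", 1.0, 2.0)  # 2 floats x 4 bytes = 8 bytes
    return struct.pack("<Q", len(header_bytes)) + header_bytes + payload

header = read_safetensors_header(build_example_buffer())
```

In practice the `safetensors` Python package handles this; the point of the sketch is only that inspecting a checkpoint requires JSON parsing and byte offsets, not deserializing Python objects.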

Availability and Deployment

The model had recorded 1,937 downloads and 62 likes on Hugging Face as of publication. It is listed as compatible with hosted inference endpoints, including deployment options on Azure, which makes it easier to put into production.

The "Base" suffix indicates that this is the pretrained foundation model, without instruction tuning or task-specific fine-tuning, leaving that adaptation to end users or downstream applications.

Context

This release continues the momentum of Alibaba's Qwen series in the open-weight model space. The Qwen3.5 line is an iteration beyond Qwen3. Following the naming convention of earlier Qwen MoE releases such as Qwen3-30B-A3B, the A3B suffix most likely denotes roughly 3 billion parameters activated per token out of the 35 billion total.

The mixture-of-experts architecture typically reduces per-token compute at inference relative to a dense model of the same total parameter count, because only some experts run for each token. Exact computational requirements for this model have not yet been published.
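A rough back-of-the-envelope calculation shows why this matters. The sketch below rests on several assumptions that the release does not confirm: that "A3B" means about 3 billion activated parameters per token, that weights are stored in bfloat16 (2 bytes each), and that a forward pass costs roughly 2 FLOPs per active parameter per token, a common rule of thumb. The constant and function names are invented for illustration.

```python
TOTAL_PARAMS = 35e9       # from the model name (35B)
ACTIVE_PARAMS = 3e9       # assumption: "A3B" read as ~3B activated parameters per token
BYTES_PER_PARAM_BF16 = 2  # assumption: weights stored in bfloat16

def active_fraction(active: float, total: float) -> float:
    """Share of parameters used for any single token."""
    return active / total

def weight_memory_gb(total: float, bytes_per_param: int) -> float:
    """Memory needed to hold all weights; every expert must stay resident."""
    return total * bytes_per_param / 1e9

def forward_flops_per_token(active: float) -> float:
    """Rule-of-thumb estimate: about 2 FLOPs per active parameter per token."""
    return 2 * active

fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
memory_gb = weight_memory_gb(TOTAL_PARAMS, BYTES_PER_PARAM_BF16)
moe_flops = forward_flops_per_token(ACTIVE_PARAMS)
dense_flops = forward_flops_per_token(TOTAL_PARAMS)
```

Under these assumptions, fewer than one in ten parameters is active per token, so per-token compute is closer to that of a 3B dense model, while the memory footprint is still that of all 35B parameters (about 70 GB in bfloat16). These are estimates, not published figures.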

What This Means

Alibaba is positioning Qwen3.5-35B-A3B as an open alternative for organizations that need multimodal capabilities at the 35B scale. The Apache 2.0 license permits commercial deployment, and cloud provider integration reduces the infrastructure work needed to run the model. It joins a competitive field of open multimodal models with 30B or more parameters from Meta, Mistral, and others, each making different architectural choices and trade-offs in performance, efficiency, and licensing.
