model releaseStepFun

StepFun launches Step 3.7 Flash: 196B MoE model with 256K context and adjustable reasoning levels at $0.20/$1.15 per 1M

TL;DR

StepFun has released Step 3.7 Flash, a 196B-parameter Mixture-of-Experts model that activates approximately 11B parameters per token. The multimodal model supports a 256K context window and introduces selectable reasoning levels (high/medium/low), priced at $0.20 per 1M input tokens and $1.15 per 1M output tokens.

2 min read
0

Step-3.7-Flash — Quick Specs

Context window256K tokens
Input$0.2/1M tokens
Output$1.15/1M tokens

StepFun Launches Step 3.7 Flash with Adjustable Reasoning Levels

StepFun has released Step 3.7 Flash, a 196B-parameter Mixture-of-Experts (MoE) model that activates roughly 11B parameters per token during inference. The model includes native image and video understanding capabilities through an integrated vision encoder.

Technical Specifications

Step 3.7 Flash supports a 256K token context window and is priced at $0.20 per 1M input tokens and $1.15 per 1M output tokens. The model was released on May 28, 2025, according to OpenRouter's listing.

The architecture combines a 196B-parameter language backbone with a vision encoder, making it StepFun's latest multimodal offering. By activating only 11B parameters per token through its MoE design, the model aims to balance performance with computational efficiency.

Selectable Reasoning Levels

A distinctive feature is the model's three selectable reasoning levels—high, medium, and low—allowing developers to trade off between processing speed, cost, and reasoning depth based on specific use cases. This gives callers direct control over how the model allocates compute resources per query.

Target Use Cases

According to StepFun, Step 3.7 Flash is designed for coding tasks, agentic workflows, structured output generation, and long-context productivity applications. The 256K context window positions it for document analysis, extended code review, and multi-turn conversations requiring substantial memory.

The model is currently available through OpenRouter, which routes requests across multiple providers to handle different prompt sizes and parameters.

What This Means

Step 3.7 Flash represents StepFun's entry into the competitive space of large-context multimodal models, directly competing with offerings from Anthropic, Google, and others in the 200K+ context range. The adjustable reasoning levels are a notable differentiation—most models offer fixed inference patterns, while this approach lets developers optimize for their specific latency and quality requirements. The $0.20/$1.15 pricing puts it in the mid-tier range, though real-world performance benchmarks will determine whether the selectable reasoning modes deliver meaningful value beyond standard inference optimization.

Related Articles

model release

StepFun releases Step-3.7-Flash: 198B-parameter MoE model with 256K context at $0.20/M input tokens

StepFun has released Step-3.7-Flash, a 198B-parameter sparse Mixture-of-Experts vision-language model that activates 11B parameters per token and delivers up to 400 tokens per second. The model supports a 256K context window, three selectable reasoning levels, and is priced at $0.20 per million input tokens (cache miss) and $1.15 per million output tokens.

model release

Mistral AI Releases Small 4: 119B Parameter Open-Source Model with 256K Context Under Apache 2.0

Mistral AI has released Mistral Small 4, a 119B total parameter mixture-of-experts model with 256K context window and native multimodal capabilities. The model uses 128 experts with 4 active per token (6B active parameters) and is released under the Apache 2.0 license, marking Mistral's first unified model combining reasoning, multimodal, and coding capabilities.

model release

Mistral Releases Mistral Large 3 with 675B Parameters and Three Ministral 3 Models Under Apache 2.0

Mistral AI has released Mistral 3, consisting of Mistral Large 3—a sparse mixture-of-experts model with 675B total parameters and 41B active parameters—and three Ministral 3 models at 3B, 8B, and 14B parameters. All models are released under the Apache 2.0 license with multimodal capabilities including image understanding.

model release

Anthropic releases Claude Opus 4.8 with improved agentic coding and reasoning benchmarks

Anthropic released Claude Opus 4.8 on May 28, 2026, with improved performance in agentic coding, computer use, and reasoning benchmarks. Pricing remains at $5 per million input tokens and $25 per million output tokens, while the model's fast mode is now three times cheaper than previous versions.

Comments

Loading...