StepFun releases Step-3.7-Flash: 198B-parameter MoE model with 256K context at $0.20/M input tokens
StepFun has released Step-3.7-Flash, a 198B-parameter sparse Mixture-of-Experts vision-language model that activates 11B parameters per token and delivers up to 400 tokens per second. The model supports a 256K context window and three selectable reasoning levels. It is priced at $0.20 per million input tokens (cache miss) and $1.15 per million output tokens.
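To make the pricing concrete, here is a minimal sketch of the per-request arithmetic using the two quoted rates. The function name and the example token counts are hypothetical. It assumes every input token is billed at the cache-miss rate, since the announcement gives no cache-hit price here.

```python
# Prices quoted in the announcement, in USD per million tokens.
INPUT_PRICE_PER_M = 0.20   # input tokens, cache miss
OUTPUT_PRICE_PER_M = 1.15  # output tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost, assuming all input tokens are cache misses."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical request: 100,000 input tokens and 2,000 output tokens.
cost = estimate_cost_usd(100_000, 2_000)
print(f"${cost:.4f}")  # prints $0.0223
```

In this example the input side costs $0.0200 and the output side $0.0023, so a long prompt still dominates the bill even though output tokens are priced almost six times higher per token.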