model release | Arcee AI

Arcee AI Releases Trinity Large Preview: 400B-Parameter MoE Model with 512K Context Window

TL;DR

Arcee AI has released Trinity Large Preview, a 400B-parameter sparse Mixture-of-Experts model with 13B active parameters per token using 4-of-256 expert routing. The model supports context windows up to 512K tokens and is available with open weights under permissive licensing.

Arcee AI has released Trinity Large Preview, a 400B-parameter sparse Mixture-of-Experts (MoE) language model that activates 13B parameters per token. The model uses a 4-of-256 expert-routing architecture, meaning each token is processed by 4 of 256 experts, and is currently available for free through OpenRouter.

Technical Specifications

Trinity Large Preview features:

  • Total parameters: 400 billion (sparse)
  • Active parameters per token: 13 billion
  • Expert routing: 4-of-256 MoE architecture
  • Context window: Up to 512K tokens (native support)
  • Current deployment: 128K context window using 8-bit quantization
  • Pricing: Free during preview period (ends April 22, 2026)
  • License: Open weights with permissive licensing
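
To make the sparse-activation numbers concrete, here is a minimal, generic sketch of top-k expert routing of the kind MoE models use. This is an illustration of the general technique, not Trinity's actual router: the scoring function, softmax normalization details, and any load-balancing logic in the real model are not described in the release.

```python
import math
import random

def top_k_route(router_logits, k=4):
    """Generic top-k MoE routing sketch: pick the k highest-scoring experts
    for a token and softmax-normalize their weights. Illustrative only."""
    # Indices of the k largest router scores, highest first.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over just the selected experts (subtract max for stability).
    m = max(router_logits[i] for i in top)
    exp = [math.exp(router_logits[i] - m) for i in top]
    z = sum(exp)
    return top, [e / z for e in exp]

random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(256)]  # one score per expert
experts, weights = top_k_route(logits, k=4)            # 4-of-256 routing
```

With 4 of 256 experts active per token, only a small fraction of the total parameters participate in each forward pass, which is how a 400B-parameter model can run with roughly 13B active parameters per token.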

Capabilities and Target Use Cases

According to Arcee AI, Trinity Large Preview excels in creative writing, storytelling, role-play, chat scenarios, and real-time voice assistance. The company claims the model performs better in these areas than typical reasoning models.

The model was trained specifically for agentic workflows and designed to operate within agent frameworks such as OpenCode, Cline, and Kilo Code. Arcee AI states that it handles complex toolchains and long, constraint-filled prompts effectively.
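
In practice, agent frameworks drive models like this through tool definitions passed in OpenAI-style requests. The sketch below shows the general shape of such a definition; the `read_file` tool and its parameters are hypothetical examples, not part of any framework named above.

```python
# OpenAI-style tool definition of the kind agent frameworks pass to a
# tool-capable model. The "read_file" tool here is a hypothetical example.
tools = [
    {
        "type": "function",
        "function": {
            "name": "read_file",
            "description": "Read a file from the workspace.",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "File path to read."},
                },
                "required": ["path"],
            },
        },
    },
]
```

A model tuned for agentic use is expected to emit well-formed calls against schemas like this, even when many tools and long instructions are present in the prompt.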

Deployment Details

The Preview API currently serves the model at a 128K context window using 8-bit quantization for practical deployment, though the architecture natively supports context windows of up to 512K tokens. The model is available through OpenRouter's OpenAI-compatible API.
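
Since the endpoint is OpenAI-compatible, a request can be built with nothing but the standard library. Note that the model slug `arcee-ai/trinity-large-preview` below is an assumption for illustration; check OpenRouter's model list for the actual identifier.

```python
import json
import urllib.request

# Hypothetical model slug; verify the real identifier on OpenRouter.
MODEL = "arcee-ai/trinity-large-preview"
API_KEY = "sk-or-..."  # your OpenRouter API key

# Standard OpenAI-compatible chat-completions payload.
payload = {
    "model": MODEL,
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Outline a heist story in two sentences."},
    ],
    "max_tokens": 256,
}

req = urllib.request.Request(
    "https://openrouter.ai/api/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)
# Uncomment to actually send the request:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the format matches the OpenAI Chat Completions API, existing clients can switch to this model by changing only the base URL, API key, and model name.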

Benchmark scores have not been disclosed. The free preview period will end on April 22, 2026, though future pricing has not been announced.

What This Means

Trinity Large Preview represents Arcee AI's entry into frontier-scale models with a focus on efficiency through sparse MoE architecture. The 400B total parameter count with only 13B active per token aims to deliver large model capabilities at lower computational cost. The open weights and permissive licensing lower barriers for developers and researchers to experiment with frontier-scale models, particularly for agentic applications. However, the lack of published benchmark scores makes it difficult to assess performance against competing models in the 400B+ parameter class.
