Arcee AI Releases Trinity Large Thinking: Free 262K Context Reasoning Model
Arcee AI has released Trinity Large Thinking, an open source reasoning model with a 262,144-token context window. The model is available free via OpenRouter and claims strong performance in PinchBench, agentic workloads, and reasoning tasks.
Trinity Large Thinking — Quick Specs
Arcee AI Releases Trinity Large Thinking: Free 262K Context Reasoning Model
Arcee AI has released Trinity Large Thinking, an open source reasoning model with a 262,144-token context window, available free via OpenRouter as of April 1, 2026.
Key Specifications
- Context window: 262,144 tokens
- Pricing: $0 per million input tokens, $0 per million output tokens
- Model type: Reasoning model with step-by-step thinking capabilities
- Availability: Free tier on OpenRouter platform
Performance Claims
According to Arcee AI, Trinity Large Thinking shows strong performance across:
- PinchBench evaluations
- Agentic workloads
- Reasoning tasks
Specific benchmark scores have not been disclosed at launch.
Technical Implementation
The model supports OpenRouter's reasoning-enabled API, which allows developers to access the model's internal reasoning process through the reasoning_details array in API responses. Developers can use the reasoning parameter to enable step-by-step thinking output.
When continuing conversations, the complete reasoning_details must be preserved in message history for the model to maintain reasoning continuity.
Open Source Availability
Model weights are available for download, following Arcee AI's open source approach. The company has published a launch video detailing the model's capabilities and use cases.
What This Means
Trinity Large Thinking enters a competitive reasoning model market currently dominated by OpenAI's o1 and o3 series, Anthropic's extended thinking modes, and DeepSeek's R1. The 262K context window positions it among the larger-context reasoning models available, though it's unclear how this compares to specialized long-context models.
The free tier availability via OpenRouter makes it immediately accessible for developers to test against commercial alternatives. However, without published benchmark scores on standard reasoning evaluations like AIME or GPQA, performance relative to established models remains unverified. The emphasis on "agentic workloads" suggests optimization for multi-step tool use and planning tasks rather than pure mathematical reasoning.
Related Articles
Sakana AI Releases Fugu Ultra: Multi-Agent Orchestration System with 1M Context Window at $5/$30 per Million Tokens
Sakana AI has released Fugu Ultra, a multi-agent orchestration system that routes tasks across pools of underlying models rather than operating as a single monolithic model. The system supports a 1M token context window and is priced at $5 per million input tokens and $30 per million output tokens.
China's Z.ai releases GLM-5.2, open-source model matching Claude and GPT-5.5 in cybersecurity tasks
Z.ai's GLM-5.2 performs on par with Claude Opus 4.8 and OpenAI's GPT-5.5 in cybersecurity benchmarks while costing roughly half as much to run. Security evaluations from Graphistry and Semgrep confirm the open-weight model's capabilities in vulnerability discovery and cyber investigation, raising concerns about accessibility of advanced hacking tools.
Unconventional AI releases Un-0 image model on simulated oscillator chip claiming 1000x power reduction
Unconventional AI released Un-0, an image generation model that runs on a software simulation of oscillator-based hardware. Founder Naveen Rao claims the architecture could reduce AI power consumption by 1000x compared to conventional chips, though no physical hardware exists yet.
Unconventional AI releases Un0 image model on oscillator-based architecture, claims 1,000x power reduction potential
Unconventional AI, led by former Databricks AI chief Naveen Rao, has released Un0, an image generation model built on a simulated oscillator-based architecture. The company claims this approach could reduce inference power consumption by up to 1,000x compared to conventional computing, though the technology currently runs only in software simulation.
Comments
Loading...