model release

Rakuten releases RakutenAI-3.0, 671B-parameter Japanese-optimized mixture-of-experts model

TL;DR

Rakuten Group has released RakutenAI-3.0, a 671-billion-parameter mixture-of-experts (MoE) model designed specifically for Japanese language tasks. The model activates 37 billion parameters per token and supports a 128K context window. It is available under the Apache License 2.0 on Hugging Face.


Rakuten Releases 671B Parameter Model Optimized for Japanese

Rakuten Group has published RakutenAI-3.0, a 671-billion-parameter mixture-of-experts language model engineered for Japanese language understanding and generation. The model activates 37 billion parameters per token and supports a 128,000-token context window.

Technical Specifications

The model uses a mixture-of-experts architecture, a design pattern that maintains computational efficiency by selectively activating only a subset of parameters for each input token. RakutenAI-3.0 is trained on a combination of publicly available open-source data and Rakuten's proprietary bilingual Japanese-English datasets.
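The sparse-activation idea behind this ratio (37B active out of 671B total) can be sketched with a toy top-k router. This is an illustrative example only, not RakutenAI-3.0's actual routing code; the expert count, top-k value, and linear experts are assumptions for demonstration.

```python
import numpy as np

def moe_forward(x, gate_w, experts_w, top_k=2):
    """Toy MoE layer: route each token to its top-k experts and mix outputs.

    x:         (tokens, d_model) input activations
    gate_w:    (d_model, n_experts) router weights
    experts_w: (n_experts, d_model, d_model) toy linear experts
    """
    logits = x @ gate_w                             # (tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]   # indices of the k chosen experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        chosen = logits[t, top[t]]
        weights = np.exp(chosen - chosen.max())
        weights /= weights.sum()                    # softmax over the selected experts only
        for w, e in zip(weights, top[t]):
            out[t] += w * (x[t] @ experts_w[e])     # only k of n_experts ever run per token
    return out

rng = np.random.default_rng(0)
tokens, d, n_experts = 4, 8, 16
y = moe_forward(rng.normal(size=(tokens, d)),
                rng.normal(size=(d, n_experts)),
                rng.normal(size=(n_experts, d, d)))
print(y.shape)  # (4, 8): each token touched only 2 of the 16 experts
```

The compute saving comes from the inner loop: per token, only `top_k` expert matrices are multiplied, while the remaining experts sit idle, which is how a 671B-parameter model can run with 37B active parameters per token.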

Key specifications:

  • Total parameters: 671 billion
  • Active parameters per token: 37 billion
  • Context window: 128,000 tokens
  • Supported languages: Japanese and English
  • Model format: F32, BF16, and F8_E4M3 quantization variants available
  • License: Apache License 2.0

Deployment and Access

RakutenAI-3.0 is available on Hugging Face for download and local deployment. The company provides inference instructions using SGLang, recommending a tensor-parallel degree of 8 and a static GPU memory allocation of 85%. The model has recorded 425 downloads in its first month on Hugging Face.

No official inference API or hosted endpoints have been announced. The model card indicates the model is not currently deployed by commercial inference providers.
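A launch command consistent with those recommendations might look like the following. This is a sketch, not Rakuten's official instructions; the Hugging Face repo id is an assumption and should be checked against the model card.

```shell
# Sketch: serve the model with SGLang using tensor parallelism across 8 GPUs
# and 85% static GPU memory allocation, per the recommendations above.
# The repo id "Rakuten/RakutenAI-3.0" is assumed -- verify on Hugging Face.
python -m sglang.launch_server \
  --model-path Rakuten/RakutenAI-3.0 \
  --tp 8 \
  --mem-fraction-static 0.85 \
  --context-length 128000
```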

Positioning

Rakuten positions RakutenAI-3.0 as delivering "superior grasp of Japanese language and culture" compared to existing models. The emphasis on Japanese-optimized training reflects increasing focus by regional technology companies on language-specific LLMs, following similar releases from companies like Alibaba (Qwen) and Baidu.

Limitations

Rakuten's documentation explicitly acknowledges that, like other large language models, RakutenAI-3.0 can generate biased, inaccurate, or unsafe outputs. The company recommends implementing appropriate safeguards for production deployments.

What This Means

Rakuten's entry into open-source Japanese-optimized LLMs signals sustained competition in regional language models. At 671B parameters with a 128K context window, it competes in scale with existing open models but targets a specific linguistic niche. The Apache 2.0 license and community release suggest Rakuten is prioritizing ecosystem participation over proprietary monetization, similar to Meta's approach with Llama. The model's availability only through local deployment (no hosted API) limits accessibility for developers without substantial compute resources.
