Tencent Releases Hy-MT2: 1.8B Translation Model Compressed to 440MB With 1.25-Bit Quantization

TL;DR

Tencent has open-sourced Hy-MT2, a family of multilingual translation models available in 1.8B, 7B, and 30B-A3B parameter sizes. The models support translation across 33 languages and include extreme quantization down to 1.25-bit, reducing the 1.8B model to 440MB storage while increasing inference speed by 1.5x.

May 22, 2026 · 5:21 AM

On May 21, 2026, Tencent open-sourced Hy-MT2, a family of multilingual translation models designed for complex real-world scenarios. The family includes three model sizes: 1.8B, 7B, and 30B-A3B (a mixture-of-experts architecture).

Model Specifications

All three models support translation among 33 languages and can follow translation instructions written in multiple languages. The 1.8B base model is also available in several quantized and packaged formats:

Standard FP8 quantization
GGUF format for llama.cpp deployment
2-bit quantization
1.25-bit extreme quantization via AngelSlim

The 1.25-bit quantization reduces storage requirements to just 440MB and improves inference speed by 1.5x compared to the unquantized version, according to Tencent.
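As a rough sanity check on the 440MB figure, the storage arithmetic can be sketched as below. The parameter count of exactly 1.8 billion and the use of decimal megabytes are assumptions, and the helper names are illustrative rather than part of any Tencent tooling.

```python
# Back-of-the-envelope storage arithmetic for a quantized model.
# Assumptions (not from Tencent): exactly 1.8e9 parameters, 1 MB = 10**6 bytes.

PARAMS = 1.8e9


def size_mb(params: float, bits_per_param: float) -> float:
    """Storage in decimal MB if every parameter used the same bit width."""
    return params * bits_per_param / 8 / 1e6


def effective_bits(params: float, total_mb: float) -> float:
    """Average bits per parameter implied by a total file size."""
    return total_mb * 1e6 * 8 / params


if __name__ == "__main__":
    print(f"16-bit weights:   {size_mb(PARAMS, 16):.0f} MB")
    print(f"1.25-bit weights: {size_mb(PARAMS, 1.25):.0f} MB")
    print(f"Implied by 440MB: {effective_bits(PARAMS, 440):.2f} bits/param")
```

Under these assumptions, uniform 1.25-bit weights would occupy about 281 MB, so the reported 440MB works out to roughly 1.96 bits per parameter on average. One plausible reading is that some tensors or quantization metadata are stored at higher precision, though the source does not break this down.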

Performance Claims

Tencent claims the 7B and 30B-A3B models outperform open-source models including DeepSeek-V4-Pro and Kimi K2.6 in "fast-thinking mode." The company states the lightweight 1.8B model surpasses mainstream commercial translation APIs from Microsoft and Doubao (ByteDance's service) overall.

The models were evaluated across general translation, real-world business scenarios, domain-specific tasks, and instruction-following capabilities. Tencent has released IFMTBench, a new benchmark specifically for evaluating translation instruction-following performance.

Inference Configuration

For the 1.8B and 7B models, Tencent recommends:

Temperature: 0.7
Top-p: 0.6
Top-k: 20
Repetition penalty: 1.05
Max tokens: 4096

The 30B-A3B model uses different parameters: top-p of 1.0, top-k of -1, and no repetition penalty.
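The recommended settings can be captured as plain configuration, sketched below. The dict layout and the function name sampling_for are illustrative and not a Tencent API. The article does not state a temperature or max-token value for the 30B-A3B model, so those keys are left out rather than guessed, and "no repetition penalty" is assumed to map to the neutral value 1.0.

```python
# Recommended sampling settings as reported in the article.
# The dict layout and function name are illustrative, not a Tencent API.

SMALL_MODEL_SAMPLING = {
    "temperature": 0.7,
    "top_p": 0.6,
    "top_k": 20,
    "repetition_penalty": 1.05,
    "max_tokens": 4096,
}

# Only the values the article states for 30B-A3B are listed here.
# Assumption: "no repetition penalty" is written as 1.0, the neutral value
# in common inference stacks. Temperature and max_tokens are not stated for
# this model, so they are omitted instead of being guessed.
MOE_MODEL_SAMPLING = {
    "top_p": 1.0,
    "top_k": -1,  # in vLLM-style APIs, -1 means "consider all tokens"
    "repetition_penalty": 1.0,
}


def sampling_for(model_size: str) -> dict:
    """Return a copy of the settings for '1.8B', '7B', or '30B-A3B'."""
    if model_size in ("1.8B", "7B"):
        return dict(SMALL_MODEL_SAMPLING)
    if model_size == "30B-A3B":
        return dict(MOE_MODEL_SAMPLING)
    raise ValueError(f"unknown model size: {model_size!r}")
```

The key names match the keyword arguments of vLLM's SamplingParams, so a dict like this could be unpacked there. Other runtimes may name them differently, for example max_new_tokens in transformers.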

Deployment and Training

The models support deployment via transformers (version 5.6.0+), vLLM, and SGLang. Tencent provides a complete training pipeline supporting both full-parameter fine-tuning and LoRA fine-tuning with DeepSpeed ZeRO configurations and LLaMA-Factory integration.
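Because the transformers requirement is a minimum version, a deployment script might guard against older installs before loading a model. The following is a minimal sketch with hypothetical helper names; it compares only the numeric release segments and ignores pre-release suffixes, so packaging.version would be the better choice where strict PEP 440 ordering matters.

```python
# Guard for the minimum transformers version reported in the article (5.6.0).
# Helper names are illustrative; importlib.metadata is the only real API used.
from importlib import metadata

MIN_TRANSFORMERS = "5.6.0"


def parse_release(version: str) -> tuple:
    """Turn '5.6.0' or '5.6.1.dev0' into a 3+ tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric segment such as 'dev0'
        parts.append(int(piece))
    while len(parts) < 3:
        parts.append(0)  # treat '6.0' the same as '6.0.0'
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = MIN_TRANSFORMERS) -> bool:
    """Compare numeric release segments only."""
    return parse_release(installed) >= parse_release(minimum)


def check_installed() -> bool:
    """True if an installed transformers satisfies the minimum version."""
    try:
        return meets_minimum(metadata.version("transformers"))
    except metadata.PackageNotFoundError:
        return False
```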

The company has also released Hy-MT2-Translator Skill for easier integration of the model series into translation workflows.

WMT26 Partnership

Tencent announced an official partnership with WMT26 (Workshop on Machine Translation) for the "Video Subtitle Translation Task." Participants using Hy-MT models in the general machine translation and video subtitle translation tasks are eligible for special awards sponsored by Hunyuan.

All models are available on HuggingFace and ModelScope. Pricing for API access has not been disclosed.

What This Means

Fitting a 1.8B model into 440MB through 1.25-bit quantization is notable for on-device deployment, where model size is a critical constraint. However, Tencent's performance claims require independent verification: comparisons against commercial APIs such as Microsoft Translator need a reproducible methodology. The WMT26 partnership suggests Tencent is seeking academic credibility for these models in addition to commercial deployment. If the extreme quantization approach is validated, it could bring translation models of this quality to resource-constrained devices that previously could not run them.

Source: huggingface.co
