Microsoft releases Harrier embedding models with 32K context window, achieving 74.3 on MTEB v2
Microsoft released the Harrier-OSS embedding model family, comprising three variants with 270M, 600M, and 27B parameters. The largest model achieves 74.3 on the Multilingual MTEB v2 benchmark. All models support a 32,768-token context window and multilingual inputs across 40+ languages.
Microsoft has released Harrier-OSS, a family of multilingual text embedding models designed for retrieval, clustering, semantic similarity, classification, and reranking tasks. The open-source models are available on Hugging Face.
Model Specifications
The Harrier family includes three variants:
| Model | Parameters | Embedding Dimension | Max Context | MTEB v2 Score |
|---|---|---|---|---|
| harrier-oss-v1-270m | 270M | 640 | 32,768 tokens | 66.5 |
| harrier-oss-v1-0.6b | 600M | 1,024 | 32,768 tokens | 69.0 |
| harrier-oss-v1-27b | 27B | 5,376 | 32,768 tokens | 74.3 |
All models use decoder-only architectures with last-token pooling and L2 normalization to generate dense embeddings. The 270M and 600M variants employ knowledge distillation from larger embedding models during training.
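The pooling step described above can be sketched in a few lines. This is a minimal illustration, not Microsoft's implementation: it assumes the model has already produced per-token hidden states and a padding mask, and it simply selects the last non-padding token's vector and L2-normalizes it.

```python
import numpy as np

def last_token_pool(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Pick the hidden state of the last non-padding token and L2-normalize it.

    hidden_states: (seq_len, dim) array of per-token vectors for one sequence.
    attention_mask: (seq_len,) array of 1s for real tokens, 0s for padding.
    """
    last_idx = int(attention_mask.sum()) - 1          # index of last real token
    vec = hidden_states[last_idx]
    # L2 normalization makes cosine similarity a plain dot product
    return vec / np.linalg.norm(vec)

# Toy example: 3 positions, 2 dims, third position is padding
hidden = np.array([[1.0, 0.0], [3.0, 4.0], [9.0, 9.0]])
mask = np.array([1, 1, 0])
print(last_token_pool(hidden, mask))  # [0.6 0.8]
```

With right-padded batches this reduces to indexing by the mask's sum; left-padded batches would instead take the final position directly.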
Training and Capabilities
Microsoft trained all variants using contrastive learning on multilingual datasets covering diverse embedding tasks. The models support 40+ languages including English, Spanish, French, German, Chinese, Japanese, Arabic, and Hindi.
Key capabilities span:
- Dense passage retrieval
- Semantic similarity scoring
- Text clustering
- Bitext mining
- Zero-shot classification and reranking
Each model requires a task-specific instruction prepended to queries during inference, for example: "Instruct: Retrieve semantically similar text\nQuery: [user query]". Documents do not require instructions.
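The query format shown above can be wrapped in a small helper. This is a sketch based on the template in the article; the exact instruction strings for each task would come from the model card.

```python
def format_query(task_instruction: str, query: str) -> str:
    """Prepend a task instruction to a query, per the Harrier prompt template.

    Documents are embedded as-is; only queries carry an instruction.
    """
    return f"Instruct: {task_instruction}\nQuery: {query}"

formatted = format_query("Retrieve semantically similar text",
                         "how do embedding models work?")
print(formatted)
# Instruct: Retrieve semantically similar text
# Query: how do embedding models work?
```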
Technical Details
The models are compatible with both the Sentence Transformers library and native Hugging Face Transformers. They use BF16 tensor precision and are serialized in Safetensors format; the smallest variant's Safetensors checkpoint is roughly 0.3B parameters.
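A typical Sentence Transformers workflow for an embedding model of this kind looks like the sketch below. The model id `microsoft/harrier-oss-v1-270m` is an assumption (the article does not give the exact Hugging Face id), so the encoding calls are left commented out; the similarity helper relies on the embeddings being L2-normalized, as described earlier.

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    # For L2-normalized embeddings, cosine similarity is just the dot product
    return float(np.dot(a, b))

# Hypothetical usage, assuming the model id below (not confirmed in the article):
# from sentence_transformers import SentenceTransformer
# model = SentenceTransformer("microsoft/harrier-oss-v1-270m")
# q = model.encode("Instruct: Retrieve semantically similar text\nQuery: what is RAG?",
#                  normalize_embeddings=True)
# d = model.encode("Retrieval-augmented generation combines search with LLMs.",
#                  normalize_embeddings=True)
# print(cosine_sim(q, d))
```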
Microsoft notes that reproduced scores may differ slightly from reported benchmarks due to library version differences in PyTorch and Transformers.
Performance Claims
According to Microsoft, the Harrier models achieve state-of-the-art results on the Multilingual MTEB v2 benchmark as of the release date. The 27B model clearly outperforms the smaller variants, scoring 74.3 against 69.0 and 66.5, respectively.
What This Means
Harrier fills a gap for production embedding models that handle long sequences (32K tokens) and multilingual content without reliance on proprietary APIs. The three-tier parameter design allows organizations to choose between efficiency (270M for edge deployment) and accuracy (27B for complex retrieval). The requirement for task-specific instructions during inference adds operational complexity but enables customization across different search and classification scenarios. Open-source availability means researchers can fine-tune variants for domain-specific embeddings without vendor lock-in.
Related Articles
Microsoft releases Harrier embedding models with 32K token context, tops multilingual benchmark
Microsoft has released Harrier-OSS-v1, a family of multilingual text embedding models trained with contrastive learning and knowledge distillation. The 0.6B parameter variant achieves a 69.0 score on the Multilingual MTEB v2 benchmark with support for 32,768 token context windows and 45+ languages.
Arcee AI releases Trinity Large Thinking, open-source reasoning model with 262K context window
Arcee AI has released Trinity Large Thinking, an open-source reasoning model featuring a 262,144 token context window. The model is priced at $0.25 per million input tokens and $0.90 per million output tokens, with free access available through OpenRouter for the first five days.
Microsoft expands Copilot Cowork with AI model critique feature and cross-model comparison
Microsoft is expanding Copilot Cowork availability and introducing a Critique function that enables one AI model to review another's output. The update also includes a new Researcher agent claiming best-in-class deep research performance, outperforming Perplexity by 7 points, and a Model Council feature for direct model comparison.
Microsoft Copilot Researcher adds multi-model features using GPT and Claude
Microsoft has enabled its Copilot Researcher tool to simultaneously leverage OpenAI's GPT and Anthropic's Claude through two new features: Critique, which uses GPT responses refined by Claude, and Model Council, which displays side-by-side outputs with agreement/disagreement analysis. Both features are rolling out in the Microsoft 365 Copilot Frontier early access program.