
Microsoft open-sources Harrier embedding model with 27B parameters, 131K context window

TL;DR

Microsoft's Bing team has open-sourced Harrier, a 27-billion-parameter embedding model that supports over 100 languages and features a 131,072-token context window. The model ranks first on the MTEB v2 multilingual benchmark, outperforming proprietary offerings from OpenAI and Amazon, and is available on Hugging Face under the MIT license.



Microsoft's Bing team has released Harrier, an open-source embedding model trained on over two billion examples augmented with synthetic data from GPT-5. The model is available in three sizes: a full 27-billion-parameter version, a 0.6-billion-parameter variant, and a 270-million-parameter lightweight option.

Key Specifications

The flagship Harrier-OSS-v1-27B model features:

  • Context window: 131,072 tokens (4x larger than comparable models)
  • Embedding dimension: 5,376
  • Active parameters: 25.6B of 27.0B total
  • Language support: 100+ languages
  • License: MIT (fully open-source)
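The 5,376-dimension output has practical storage implications for anyone self-hosting a vector index. A rough sketch of the arithmetic (assuming unquantized float32 vectors; a production index would likely quantize to shrink this):

```python
# Back-of-envelope storage cost for Harrier's 5,376-dim embeddings.
# Assumes float32 storage; real deployments often quantize (e.g., int8).

EMBED_DIM = 5376          # Harrier-OSS-v1-27B output dimension
BYTES_PER_FLOAT32 = 4

def index_size_bytes(num_docs: int, dim: int = EMBED_DIM) -> int:
    """Raw vector storage for num_docs embeddings, excluding index overhead."""
    return num_docs * dim * BYTES_PER_FLOAT32

print(index_size_bytes(1))          # 21,504 bytes (~21 KB) per document
print(index_size_bytes(1_000_000))  # 21,504,000,000 bytes (~21.5 GB) for 1M docs
```

At roughly 21 KB per vector, a million-document corpus needs about 21.5 GB of raw vector storage before any index overhead, which is one reason the smaller variants exist.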

According to Microsoft's team, the synthetic portion of the training data was generated with GPT-5, though no independent verification of the training methodology has been published.

Benchmark Performance

Harrier achieves a Borda score of 78% on the MTEB v2 multilingual benchmark, ranking it first overall. Microsoft claims this outperforms proprietary models from OpenAI and Amazon; Google's Gemini Embedding 001, by comparison, scores 99% zero-shot accuracy but ranks 5th on Borda scoring. Direct head-to-head comparisons on identical benchmarks are not provided in the available documentation.

Other top performers include KaLM-Embedding-Gemma3-12B (73% Borda), Llama-Embed-Nemotron-8B (7.0B params), and Qwen3-Embedding-8B (6.9B params).

Model Variants and Distribution

Smaller variants address different computational requirements:

  • Harrier-OSS-v1-0.6B: 0.44B active parameters, 32K context window, designed for edge deployment
  • 270M variant: Ultra-lightweight option for resource-constrained environments

All models are hosted on Hugging Face under MIT licensing, enabling commercial and research use without restrictions.

Intended Applications

Microsoft plans to integrate Harrier into Bing search and next-generation AI agent grounding services. The company describes embedding models as "increasingly critical" for multi-step agent tasks requiring information retrieval and organization.

What This Means

Harrier represents a strategic shift toward open-source tooling for enterprise AI infrastructure. By releasing a top-performing multilingual embedding model under permissive licensing, Microsoft reduces friction for developers building retrieval-augmented generation (RAG) systems and AI agents. The 131K context window positions Harrier above many commercial alternatives, addressing a specific gap in the market where context size matters for document-heavy retrieval tasks.
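In a typical RAG pipeline, embeddings like Harrier's are compared by cosine similarity to rank candidate documents. A minimal, self-contained sketch of that retrieval step (toy 4-dimensional vectors stand in for real 5,376-dimensional model output):

```python
import numpy as np

def top_k(query_vec: np.ndarray, doc_matrix: np.ndarray, k: int = 2):
    """Rank documents by cosine similarity to the query embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    scores = d @ q                      # cosine similarity per document
    order = np.argsort(-scores)[:k]     # indices of the k best matches
    return order.tolist(), scores[order].tolist()

# Toy embeddings; a real system would encode documents with the model itself.
docs = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.7, 0.7, 0.0, 0.0],
])
query = np.array([1.0, 0.1, 0.0, 0.0])
idx, scores = top_k(query, docs)
print(idx)  # [0, 2]: document 0 is the closest match, document 2 second
```

The retrieved passages are then fed to a generator model as grounding context; the embedding model's quality determines how relevant that context is.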

The release also signals competitive pressure in the embedding model space—historically dominated by closed APIs from OpenAI and Cohere. Open alternatives from Meta (Llama Embeddings) and now Microsoft may accelerate adoption of self-hosted embedding infrastructure among enterprises concerned with vendor lock-in or data residency.

The pricing advantage is significant: self-hosted Harrier incurs only compute costs, versus per-API-call charges from proprietary services. However, third-party evaluation of multilingual quality parity across all 100+ supported languages is still pending.
