LLM News

Every LLM release, update, and milestone.

Filtered by: mixture-of-experts
research

TSEmbed combines mixture-of-experts with LoRA to scale multimodal embeddings across conflicting tasks

Researchers propose TSEmbed, a multimodal embedding framework that combines Mixture-of-Experts (MoE) with Low-Rank Adaptation (LoRA) to handle task conflicts in universal embedding models. The approach introduces Expert-Aware Negative Sampling (EANS) to improve discriminative power and achieves state-of-the-art results on the Massive Multimodal Embedding Benchmark (MMEB).
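The core combination can be illustrated with a minimal numpy sketch: a shared frozen projection plus per-expert low-rank (LoRA-style) deltas, with a router choosing one expert per input. All names, dimensions, and the top-1 router here are illustrative assumptions, not TSEmbed's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank, n_experts = 8, 6, 2, 4

# Shared base projection (frozen, as in LoRA; only the low-rank deltas train).
W = rng.standard_normal((d_out, d_in))

# Per-expert low-rank adapters: delta_e = B[e] @ A[e], with rank << d.
A = rng.standard_normal((n_experts, rank, d_in)) * 0.1
B = rng.standard_normal((n_experts, d_out, rank)) * 0.1

# Hypothetical linear gating network (the paper's router may differ).
G = rng.standard_normal((n_experts, d_in))

def moe_lora_embed(x):
    """Route x to its top-1 expert, apply base + that expert's LoRA delta."""
    e = int(np.argmax(G @ x))            # top-1 expert for this input
    y = W @ x + B[e] @ (A[e] @ x)        # base projection + low-rank update
    return y / np.linalg.norm(y)         # unit-norm embedding

emb = moe_lora_embed(rng.standard_normal(d_in))
```

Because each task family can be absorbed by a different low-rank expert, conflicting tasks stop competing for the same dense weights, which is the conflict-resolution idea the summary describes.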

research

Researchers propose Mixture of Universal Experts to scale MoE models via depth-width transformation

Researchers have introduced Mixture of Universal Experts (MoUE), a generalization of Mixture-of-Experts architectures that adds a new scaling dimension called virtual width. The approach reuses a shared expert pool across layers while maintaining fixed per-token computation, achieving up to 1.3% improvements over standard MoE baselines and enabling 4.2% gains when converting existing MoE checkpoints.
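The layer-sharing idea can be sketched as one universal expert pool reused by every layer, with a per-layer router and fixed top-k per-token compute. This is a toy illustration under assumed shapes; the actual MoUE routing and virtual-width mechanics are more involved.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_experts, n_layers, top_k = 8, 6, 3, 2

# One shared expert pool for all layers: parameter count grows with the pool
# size, not with n_layers * n_experts, while depth reuses the same experts.
experts = rng.standard_normal((n_experts, d, d)) * 0.1
routers = rng.standard_normal((n_layers, n_experts, d))  # per-layer routers

def moue_forward(x):
    """Run n_layers of top-k routing over the single shared expert pool."""
    for layer in range(n_layers):
        scores = routers[layer] @ x
        top = np.argsort(scores)[-top_k:]        # fixed per-token compute
        w = np.exp(scores[top]); w /= w.sum()    # softmax over selected experts
        x = x + sum(wi * (experts[e] @ x) for wi, e in zip(w, top))
    return x

y = moue_forward(rng.standard_normal(d))
```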

research

ButterflyMoE achieves 150× memory reduction for mixture-of-experts models via geometric rotations

Researchers introduce ButterflyMoE, a technique that replaces independent expert weight matrices with learned geometric rotations applied to a shared quantized substrate. The method reduces memory scaling from linear to sub-linear in the number of experts, achieving 150× compression at 256 experts with negligible accuracy loss on language modeling tasks.
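The memory argument can be made concrete with a small sketch: one quantized substrate stored once, and per-expert weights materialized by applying a few learned Givens rotations (a butterfly-style pairing) to it. The pairing scheme, angle count, and int8 quantization here are assumptions for illustration; storing only angles per expert is what makes memory sub-linear in the expert count.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n_experts = 8, 4

# Shared substrate, stored once and quantized to int8. Independent experts
# would need n_experts full copies; here each expert adds only d/2 angles.
W_fp = rng.standard_normal((d, d))
scale = np.abs(W_fp).max() / 127.0
W_q = np.round(W_fp / scale).astype(np.int8)

# Fixed butterfly-style coordinate pairs and per-expert rotation angles.
pairs = [(i, i + d // 2) for i in range(d // 2)]
angles = rng.uniform(-np.pi, np.pi, size=(n_experts, len(pairs)))

def expert_weight(e):
    """Materialize expert e: dequantize the substrate, apply its rotations."""
    W = W_q.astype(np.float32) * scale
    for (i, j), t in zip(pairs, angles[e]):
        c, s = np.cos(t), np.sin(t)
        row_i, row_j = W[i].copy(), W[j].copy()
        W[i], W[j] = c * row_i - s * row_j, s * row_i + c * row_j
    return W

W0, W1 = expert_weight(0), expert_weight(1)
```

Since rotations are orthogonal, each expert reshapes the shared substrate without distorting its norms, which is one plausible reason accuracy survives such aggressive sharing.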

model release

Segmind releases SegMoE, a mixture-of-experts diffusion model for faster image generation

Segmind has released SegMoE, a mixture-of-experts (MoE) diffusion model designed to accelerate image generation while reducing computational overhead. The model applies MoE techniques traditionally used in large language models to the diffusion model architecture, enabling selective expert activation during inference.
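Selective expert activation at inference can be sketched as a toy denoising step in which a router picks a single expert per step and the rest stay idle, so per-step compute is a fraction of the dense model. The update rule and router below are hypothetical stand-ins, not SegMoE's actual diffusion architecture.

```python
import numpy as np

rng = np.random.default_rng(3)
d, n_experts = 8, 4

# Expert blocks inside the denoiser; only one runs per step.
experts = rng.standard_normal((n_experts, d, d)) * 0.1
router = rng.standard_normal((n_experts, d))

def moe_denoise_step(x_t, t):
    """One toy denoising step with top-1 selective expert activation."""
    e = int(np.argmax(router @ x_t))     # activate one expert; others are idle
    eps_hat = experts[e] @ x_t           # chosen expert predicts the noise
    return x_t - 0.1 * t * eps_hat       # hypothetical update, for shape only

x_next = moe_denoise_step(np.ones(d), 1.0)
```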