llm

4 articles tagged with llm

April 2, 2026
benchmark, NVIDIA

Nvidia claims 291 MLPerf wins with 288-GPU setup; AMD MI355X crosses 1M tokens/sec

MLCommons published MLPerf Inference v6.0 results on April 1, 2026, with Nvidia, AMD, and Intel each claiming top spots in different configurations. Nvidia's 288-GPU GB300-NVL72 system achieved 2.49 million tokens per second on DeepSeek-R1, while AMD's MI355X crossed one million tokens per second for the first time. Direct comparisons remain difficult as each chipmaker targets different market segments and benchmarks.

March 23, 2026
model release

Rakuten releases RakutenAI-3.0, 671B-parameter Japanese-optimized mixture-of-experts model

Rakuten Group has released RakutenAI-3.0, a 671-billion-parameter mixture-of-experts (MoE) model designed specifically for Japanese language tasks. The model activates 37 billion parameters per token and supports a 128K context window. It is available under the Apache License 2.0 on Hugging Face.

March 1, 2026
research, Anthropic

Researchers link pseudonymous users to real identities using AI for under $10 per person

Researchers from ETH Zurich and Anthropic have demonstrated that pseudonymous internet users can be de-anonymized using commercially available AI models at a cost of under $10 per person. The attack completes in minutes and calls fundamental assumptions about online anonymity into question.

February 20, 2026
model release

Google announces Gemini 3.1 Pro for complex problem-solving tasks

Google announced Gemini 3.1 Pro, positioning the model for complex problem-solving tasks requiring deeper reasoning than previous versions. The release follows Gemini 3 Pro (November 2025) and Gemini 3 Flash (December 2025).