LLM News

Every LLM release, update, and milestone.

changelogOpenAI

Cline CLI 3.0.6 Adds Support for GPT-5.2, GPT-5.4, and GPT-5.4-mini Models

Cline released CLI version 3.0.6 with updated ChatGPT provider model list. The patch adds support for codex variants and three new GPT-5 series models: gpt-5.2, gpt-5.4, and gpt-5.4-mini.

May 17, 2026 · 11:50 PM1 min read

cline cli-update gpt-5

via github.com ↗

product update

Google Gemini app adds Extended thinking mode, prepares Canva, Instacart, OpenTable integrations

Google is rolling out a new "Thinking level" option in the Gemini app, allowing users to toggle between Standard and Extended modes when using Gemini 3 Flash or Gemini 3.1 Pro. The app is also preparing integrations with Canva, Instacart, and OpenTable, expanding beyond its current third-party connections.

May 17, 2026 · 6:20 PM2 min read

google-gemini gemini-app thinking-mode

via 9to5google.com ↗

research

Gemma 4, DeepSeek V4, and ZAYA1 Deploy KV Cache Compression to Cut Long-Context Memory Costs

Recent open-weight LLM releases from Google, DeepSeek, and others are adopting architectural techniques that reduce KV cache size by approximately 50% at long contexts. These include cross-layer KV sharing in Gemma 4, which saves 2.7 GB at 128K context for the E2B model, and compressed convolutional attention in ZAYA1-8B.

May 16, 2026 · 11:50 AM3 min read

architecture kv-cache memory-optimization

via magazine.sebastianraschka.com ↗

model releaseMicrosoft

Microsoft Releases Fara-7B: 7B Parameter Computer Use Agent Trained in 2.5 Days on 64 H100s

Microsoft Research has released Fara-7B, a 7-billion parameter small language model designed for computer automation tasks. The model, which took 2.5 days to train on 64 H100 GPUs, can navigate websites to complete tasks like booking restaurants and shopping, using screenshots as input with a 128K token context window.

May 15, 2026 · 11:20 PM2 min read

Fara-7B Microsoft computer use

via huggingface.co ↗

benchmark

Augment Code's agent matches Claude Code quality at 33% lower cost on Opus 4.7

Augment Code benchmarked its Auggie agent against Claude Code on Claude Opus 4.7, reporting a 67.4% pass rate versus 66.3% while cutting costs by 33%. The company attributes savings to a semantic context engine that reduces cache read tokens by 32% and output tokens by 37% compared to Claude Code's keyword-based retrieval.

May 15, 2026 · 8:05 PM3 min read

augment-code anthropic claude-code

via augmentcode.com ↗

product update

Google's Gemini Intelligence requires 12GB RAM and Nano v3, excludes Pixel 9 and Galaxy Z Fold 7

Google's new Gemini Intelligence features require 12GB of RAM, Gemini Nano v3 or higher, flagship chips, and 5 years of OS upgrades. The requirements exclude recent devices including the entire Pixel 9 series and Samsung's Galaxy Z Fold 7, which run Gemini Nano v2.

May 15, 2026 · 5:50 PM2 min read

Google Gemini Gemini Nano

via 9to5google.com ↗

product updateReplit

Replit ships iPhone app update with Agent 4 after four-month App Store review delay

Replit released its first iPhone app update in four months after resolving App Store review issues with Apple. The update brings Agent 4, the company's latest AI coding assistant, along with parallel agent support and cross-workspace project viewing.

May 15, 2026 · 4:35 PM2 min read

replit app-store ai-coding

via 9to5mac.com ↗

product updateOpenAI

OpenAI launches ChatGPT financial account integration for Pro users via Plaid partnership

OpenAI is rolling out a preview feature allowing ChatGPT Pro users in the US to connect their bank accounts through Plaid for personalized financial advice. The integration provides access to 12,000+ financial institutions and generates visual dashboards of user finances.

May 15, 2026 · 4:20 PM2 min read

ChatGPT OpenAI Plaid

via engadget.com ↗

product updateGitHub

GitHub pilots AI agent to automate accessibility testing and remediation

GitHub is piloting an experimental AI agent designed to automate accessibility testing and remediation. The tool aims to help developers identify and fix accessibility issues in their code and user interfaces without requiring specialized expertise.

May 15, 2026 · 4:06 PM2 min read

GitHub accessibility AI agent

via github.blog ↗

product updateGitHub

GitHub pilots experimental AI accessibility agent for assistive technology users

GitHub is piloting an experimental accessibility agent that uses AI to help users with disabilities navigate software interfaces. The company disclosed the project in a technical blog post detailing lessons learned during development.

May 15, 2026 · 4:05 PM2 min read

github accessibility assistive-technology

via github.blog ↗

product updateAnthropic

Anthropic launches contract review tool in Claude for Small Business that flags risky clauses

Anthropic has released Claude for Small Business, a collection of 31 AI skills for Claude Cowork subscribers. The standout feature is /review-contract, which analyzes legal contracts and flags problematic clauses in approximately five minutes. The tool requires at minimum a $20/month Claude Pro subscription.

May 15, 2026 · 12:50 PM2 min read

anthropic claude contract-review

via zdnet.com ↗

product updatexAI

xAI launches Grok Build coding agent at $300/month, available only to SuperGrok Heavy subscribers

xAI has released Grok Build, a coding agent and CLI tool positioned to compete with Anthropic's Claude Code and other AI coding assistants. The early beta is available exclusively to SuperGrok Heavy subscribers at $300 per month.

May 15, 2026 · 6:20 AM2 min read

xAI Grok Build coding agent

via engadget.com ↗

product updateGitHub

GitHub Copilot API adds team-level usage metrics for enterprise tracking

GitHub has expanded its Copilot usage metrics API to include team-level reporting. The new user-teams endpoint maps each Copilot-licensed user to their team memberships, allowing organizations to analyze AI coding assistant adoption and usage patterns across team structures.

May 14, 2026 · 11:20 PM2 min read

GitHub Copilot API

via github.blog ↗

researchAnthropic

Security researchers use Anthropic's Mythos Preview to bypass Apple's M5 memory protection in 5 days

Security researchers at Calif used Anthropic's Mythos Preview model to develop a working macOS kernel memory corruption exploit on M5 silicon in five days, bypassing Apple's Memory Integrity Enforcement (MIE) system. The exploit chain targets macOS 26.4.1 and escalates from unprivileged local user to root shell using two vulnerabilities and several techniques.

May 14, 2026 · 10:35 PM3 min read

anthropic mythos-preview security

via 9to5mac.com ↗

product updateOpenAI

OpenAI brings Codex coding agent to iOS and Android with remote environment monitoring

OpenAI has integrated its Codex coding agent into the ChatGPT mobile app for iOS and Android, allowing developers to monitor live development environments and manage workflows from their phones. The update, announced May 14, 2026, is now available in preview across all ChatGPT plans.

May 14, 2026 · 9:05 PM2 min read

OpenAI Codex ChatGPT

via techcrunch.com ↗

product updateOpenAI

OpenAI adds remote Codex control to ChatGPT mobile apps for iOS and Android

OpenAI has integrated remote Codex control into the ChatGPT mobile apps for iPhone and Android. Users can now approve tasks, review outputs, and manage Codex running on Mac computers, laptops, or remote environments directly from their smartphones.

May 14, 2026 · 8:20 PM2 min read

OpenAI Codex ChatGPT

via 9to5mac.com ↗

product updateOpenAI

OpenAI brings Codex desktop AI control to ChatGPT mobile apps

OpenAI is rolling out mobile access to Codex, its desktop AI tool that can write code and control applications on users' computers. The feature allows users to control Codex on their desktop from the ChatGPT iOS and Android apps, with real-time updates flowing back to mobile devices.

May 14, 2026 · 8:05 PM1 min read

openai codex chatgpt

via theverge.com ↗

product updateMicrosoft

Microsoft Cancels Claude Code Licenses, Pushes Developers to GitHub Copilot CLI

Microsoft is removing Claude Code access from its Experiences + Devices division by June 30, 2026, redirecting thousands of engineers to GitHub Copilot CLI instead. The decision follows six months of Claude Code proving more popular than Microsoft's own coding tool among internal developers.

May 14, 2026 · 7:06 PM2 min read

microsoft anthropic github

via theverge.com ↗

model releaseIbm

IBM Releases 97M-Parameter Granite Embedding Model With 60.3 MTEB Score — Highest Retrieval Quality Under 100M Parameter

IBM released two new multilingual embedding models under Apache 2.0: a 97M-parameter compact model scoring 60.3 on MTEB Multilingual Retrieval (highest in its size class) and a 311M full-size model scoring 65.2. Both support 200+ languages with enhanced retrieval for 52 languages, handle 32K-token context (64x increase over predecessors), and include code retrieval across 9 programming languages.

May 14, 2026 · 7:06 PM3 min read

IBM Granite embeddings

via huggingface.co ↗

analysisAnthropic

Anthropic's Mythos Preview solves previously unsolvable cybersecurity test in updated checkpoint

A month after its initial release, a newer checkpoint of Anthropic's Mythos Preview became the first model to complete the UK AI Safety Institute's 'Cooling Tower' cyber range, solving it in 3 of 10 attempts. The model also completed 'The Last Ones' range in 6 of 10 attempts, surpassing OpenAI's GPT-5.5 and demonstrating capability improvements within a single model version.

May 14, 2026 · 5:50 PM3 min read

Anthropic Mythos AI Safety

via zdnet.com ↗

← PreviousPage 5 of 47Next →