Mira Murati's Thinking Machines announces full-duplex AI model with 0.40-second response time

TL;DR

Thinking Machines Lab, founded by former OpenAI CTO Mira Murati, announced TML-Interaction-Small, a full-duplex AI model that processes input while it generates its response. The company claims a 0.40-second response time, which it says matches the speed of natural human conversation.

Thinking Machines Lab, the AI startup founded by former OpenAI CTO Mira Murati, announced what it calls "interaction models" — AI systems that process input and generate responses simultaneously rather than in traditional turn-taking fashion.

The company's first model, TML-Interaction-Small, has a claimed response time of 0.40 seconds, which Thinking Machines says matches natural human conversation speed and is "significantly faster than comparable models from OpenAI and Google." The architecture is full-duplex, meaning the model can listen while it speaks, more like a phone call than an exchange of text messages.

Release timeline and availability

TML-Interaction-Small is a research preview, not a public product. Thinking Machines plans a "limited research preview" within the next few months, with a wider release scheduled for later in 2026. No pricing information has been disclosed.

The company has not released benchmark scores, parameter count, context window specifications, or other technical details beyond the response latency claim.

Technical approach

Most current conversational AI models follow a turn-taking protocol: the user's input is processed completely before response generation begins. Thinking Machines' approach processes incoming audio or text while simultaneously generating output, a capability the company argues should be "native to a model, not bolted on."

The distinction matters for applications requiring real-time interaction, such as voice assistants or conversational interfaces where interruption and natural flow are expected.
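The difference between turn-taking and full-duplex behavior can be illustrated with a toy sketch. Thinking Machines has not published its architecture, so the code below is not the company's method. It is only a minimal illustration, assuming a stand-in "model" that echoes each input chunk in upper case, and the function names `turn_based` and `full_duplex` are hypothetical. It records the order in which input and output events occur under each scheme.

```python
import asyncio


async def turn_based(chunks, step=0.01):
    """Half-duplex toy: record every input chunk, then produce every output."""
    events = []
    for chunk in chunks:
        await asyncio.sleep(step)            # simulated arrival time of one input chunk
        events.append(("in", chunk))
    for chunk in chunks:
        await asyncio.sleep(step)            # simulated time to produce one output chunk
        events.append(("out", chunk.upper()))
    return events


async def full_duplex(chunks, step=0.01):
    """Full-duplex toy: a listener and a speaker run concurrently."""
    events = []
    queue = asyncio.Queue()

    async def listen():
        for chunk in chunks:
            await asyncio.sleep(step)        # simulated arrival time of one input chunk
            events.append(("in", chunk))
            await queue.put(chunk)           # hand the chunk over while input continues
        await queue.put(None)                # sentinel: no more input

    async def speak():
        # Respond to each chunk as soon as it is available (upper-casing is a
        # placeholder for real generation, not an actual model).
        while (chunk := await queue.get()) is not None:
            events.append(("out", chunk.upper()))
            await asyncio.sleep(step)        # simulated time to produce one output chunk

    await asyncio.gather(listen(), speak())
    return events


if __name__ == "__main__":
    words = ["hi", "there", "friend"]
    print(asyncio.run(turn_based(words)))
    print(asyncio.run(full_duplex(words)))
```

In the turn-based run, no output event is recorded until every input chunk has arrived. In the full-duplex run, the first output follows the first input chunk while later input is still arriving. A real system would also need to handle interruption and revise output that is already in progress, which this sketch does not attempt.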

What this means

Full-duplex communication would represent a meaningful architectural shift if the claimed performance holds under real-world conditions. A 0.40-second response time would fall within human conversation norms (typical human response latency ranges from 0.2 to 0.6 seconds).

However, without independent verification, public testing, or detailed benchmarks, it is impossible to assess whether this translates into a genuinely better user experience or only a marginal improvement over existing streaming architectures. The value proposition depends on whether simultaneous processing produces noticeably more natural interactions than current streaming implementations, and that remains unproven until researchers and users can test the system directly.
