world-models

11 articles tagged with world-models

June 2, 2026
model release · NVIDIA

NVIDIA Releases Cosmos 3: 64B-Parameter Omnimodal World Model for Physical AI

NVIDIA released Cosmos 3, an omnimodal world foundation model platform for Physical AI spanning robotics, autonomous driving, and industrial environments. The flagship Cosmos3-Super variant contains 64 billion parameters and generates video, images, audio, and action commands from text, image, video, and action trajectory inputs using a Mixture-of-Transformers architecture.

model release · NVIDIA

NVIDIA Releases Cosmos3-Super: 64B-Parameter Omnimodal World Model for Physical AI

NVIDIA released Cosmos3-Super, a 64-billion parameter omnimodal foundation model that generates video, images, audio, and action commands from combinations of text, image, video, and action trajectory inputs. The model, part of the Cosmos3 collection, targets Physical AI applications including robotics, autonomous vehicles, and industrial automation.

model release · NVIDIA

NVIDIA Releases Cosmos3-Nano: 16B-Parameter Omnimodal World Model for Physical AI with 256K Token Context

NVIDIA has released Cosmos3-Nano, a 16-billion parameter omnimodal world model capable of generating video, audio, images, and robot action commands from combinations of text, image, video, and action trajectory inputs. The model supports a 256K token context window and is designed for Physical AI applications including robotics, autonomous vehicles, and smart manufacturing environments.

May 19, 2026
product update · Google DeepMind

Google DeepMind connects Genie world model to 280 billion Street View images; Waymo already using it to train self-driving cars

Google DeepMind has integrated its Genie world model with Street View's 280 billion images spanning 110 countries, enabling users to explore AI-generated simulations of real locations. Waymo is already using Genie 3 to train self-driving cars on rare scenarios like tornadoes and unexpected obstacles.

product update

Google DeepMind Integrates Street View With Genie 3 World Model for Real-World Environment Simulation

Google DeepMind launched Street View integration with its Genie 3 world model at I/O 2026, allowing users to simulate real-world locations from 280 billion images across 110 countries. The feature lets users modify environments, including changing the weather, and supports robotics training. Access starts with U.S. Ultra subscribers and is set to expand globally.

model release

Google releases Gemini 3.5 Flash at half the price of frontier models, announces Omni world model

Google released Gemini 3.5 Flash, priced at half to one-third the cost of comparable frontier models, and announced it will become the default model in the Gemini app globally. The company also unveiled Omni, a world model for simulating physical environments, and Gemini Spark, an AI agent in beta testing.

May 18, 2026
research · NVIDIA

NVIDIA releases LoRA/DoRA fine-tuning guide for Cosmos Predict 2.5 to generate synthetic robot training data

NVIDIA published a technical guide for parameter-efficient fine-tuning of its Cosmos Predict 2.5 world model using LoRA and DoRA adapters. The method allows teams to adapt the 2B-parameter model to robot manipulation tasks on a single 80GB GPU, generating synthetic training trajectories from just 92 demonstration videos.

April 16, 2026
model release · Tencent

Tencent Releases HY-World 2.0: Open-Source Multi-Modal Model Generates 3D Worlds from Text and Images

Tencent has released HY-World 2.0, an open-source multi-modal world model that generates navigable 3D environments from text prompts, single images, multi-view images, or video. The model produces editable 3D assets, including meshes and 3D Gaussian splats, that can be imported directly into game engines such as Unity and Unreal Engine.

April 2, 2026
analysis · OpenAI

OpenAI's Brockman claims GPT reasoning models have 'line of sight' to AGI

OpenAI President Greg Brockman stated that GPT reasoning models have 'line of sight' to AGI and that the debate over whether text-based models can achieve general intelligence is settled. The company is prioritizing this approach over multimodal world models like Sora, which Brockman views as 'a different branch of the tech tree.' The stance contradicts prominent AI researchers including Yann LeCun and Demis Hassabis, who argue that LLMs alone are insufficient for human-level intelligence.

March 10, 2026
funding

Yann LeCun's AMI Labs raises $1.03B to develop world models

AMI Labs, cofounded by Turing Award winner Yann LeCun, has raised $1.03 billion at a $3.5 billion pre-money valuation. The funding will support the company's effort to develop world models, marking a major commitment to foundational AI research outside of existing tech giants.

February 20, 2026
funding

Fei-Fei Li's World Labs raises $1B to develop spatial intelligence AI systems

World Labs, the AI startup founded by Fei-Fei Li, has raised $1 billion in new funding to develop spatial intelligence—AI systems capable of understanding and operating in three-dimensional physical environments. The capital will fund the development of world models, a class of AI architecture designed to reason about spatial relationships and physical interactions.