video-generation

33 articles tagged with video-generation

June 17, 2026
product update

Google Vids Opens AI Avatar Feature to Free Users, Reaches 7M Monthly Active Users

Google Vids now offers AI avatars to free personal accounts, providing 10 video generations per month that can be split between avatars and Veo video generation. The Workspace video creation tool has reached 7 million monthly active users.

June 8, 2026
product update

Google cuts AI Plus subscription to $5/month, doubles storage to 400GB

Google lowered its AI Plus subscription from $8 to $5 per month and doubled included storage from 200GB to 400GB. The plan includes access to Gemini 3 Pro, Nano Banana Pro, Deep Research, and the newly announced Gemini Omni video generation model.

June 3, 2026
model release · ByteDance

ByteDance Open-Sources Bernini-R Video Diffusion Model With Semantic Planning Architecture

ByteDance released Bernini-R, an open-source video generation and editing model that combines an MLLM-based semantic planner with a DiT-based renderer. The model requires Hopper-class GPUs (H100/H800/H200) for optimal performance and supports multiple tasks including text-to-video, video editing, and reference-guided generation.

June 2, 2026
analysis

Nvidia Releases Cosmos 3 Video Generation Models in Three Sizes: Nano, Super, and Super-Image2Video

Nvidia has released three variants of its Cosmos 3 video generation model family on Hugging Face: Cosmos3-Nano, Cosmos3-Super, and Cosmos3-Super-Image2Video. The release includes models for both standard video generation and specialized image-to-video conversion, though detailed specifications including parameter counts and benchmark scores have not yet been disclosed.

model release · NVIDIA

NVIDIA Releases Cosmos 3: 64B-Parameter Omnimodal World Model for Physical AI

NVIDIA released Cosmos 3, an omnimodal world foundation model platform for Physical AI spanning robotics, autonomous driving, and industrial environments. The flagship Cosmos3-Super variant contains 64 billion parameters and generates video, images, audio, and action commands from text, image, video, and action trajectory inputs using a Mixture-of-Transformers architecture.

model release · NVIDIA

NVIDIA Releases Cosmos3-Super: 64B-Parameter Omnimodal World Model for Physical AI

NVIDIA released Cosmos3-Super, a 64-billion parameter omnimodal foundation model that generates video, images, audio, and action commands from combinations of text, image, video, and action trajectory inputs. The model, part of the Cosmos3 collection, targets Physical AI applications including robotics, autonomous vehicles, and industrial automation.

model release · NVIDIA

NVIDIA Releases Cosmos3-Nano: 16B-Parameter Omnimodal World Model for Physical AI with 256K Token Context

NVIDIA has released Cosmos3-Nano, a 16-billion parameter omnimodal world model capable of generating video, audio, images, and robot action commands from combinations of text, image, video, and action trajectory inputs. The model supports a 256K token context window and is designed for Physical AI applications including robotics, autonomous vehicles, and smart manufacturing environments.

June 1, 2026
model release · NVIDIA +1

NVIDIA Releases Cosmos 3: 8B and 32B Omni-Models Combining Video Generation, Reasoning, and Action in Single Architecture

NVIDIA has released Cosmos 3, a unified omni-model that combines world generation, physical reasoning, and action generation in a single architecture. Available in 8B (Nano) and 32B (Super) parameter versions on Hugging Face, Cosmos 3 uses a Mixture-of-Transformers architecture to process text, image, video, audio, and action modalities without switching between separate models.

May 27, 2026
model release

Google launches Gemini Omni, multimodal AI video generator with avatar cloning and physics modeling

Google has released Gemini Omni, a multimodal AI video generation tool that accepts text, images, audio, and video as inputs. The first tier, Gemini Omni Flash, includes avatar cloning that creates digital versions of users and incorporates physics modeling for realistic motion.

May 21, 2026
product update

Google cuts AI Ultra plan to $200/month, launches new $100 developer tier

Google announced pricing changes to its Gemini AI subscription tiers at I/O 2026, cutting its top AI Ultra plan from $250 to $200 per month while introducing a new $100/month developer-focused tier. All plans now get access to Gemini 3.5 Flash and the new Gemini Omni video generation model.

May 20, 2026
model release

Google releases Gemini Omni Flash video generation model with conversational editing, withholds speech synthesis

Google DeepMind released Gemini Omni Flash, the first model in its new Omni family that generates and edits video from image, audio, video, and text inputs. The model is rolling out to Gemini app subscribers and YouTube Shorts with a 10-second clip limit, while speech-editing capabilities remain withheld pending safety testing.

May 18, 2026
research · NVIDIA

NVIDIA releases LoRA/DoRA fine-tuning guide for Cosmos Predict 2.5 to generate synthetic robot training data

NVIDIA published a technical guide for parameter-efficient fine-tuning of its Cosmos Predict 2.5 world model using LoRA and DoRA adapters. The method allows teams to adapt the 2B-parameter model to robot manipulation tasks on a single 80GB GPU, generating synthetic training trajectories from just 92 demonstration videos.

April 13, 2026
product update +1

Google adds Veo 3.1 Lite to Ultra subscriptions at zero credit cost starting May 10

Google is adding Veo 3.1 Lite to Ultra subscriptions at zero credit cost starting May 10, 2026. According to Google, the model costs less than half as much as Veo 3.1 Fast and generates videos at the same speed, though quality tradeoffs remain unclear.

April 9, 2026
product update

Google launches AI avatar tool for YouTube Shorts creators

YouTube is rolling out an AI avatar feature that lets creators generate digital versions of themselves for use in Shorts videos. The tool requires users to record a "live selfie" with face and voice data, generates clips up to 8 seconds long, and marks all AI-generated content with watermarks and digital labels.

April 8, 2026
product update

YouTube Shorts adds AI avatars that replicate your voice and appearance

YouTube is rolling out an AI avatar feature that lets users create photorealistic versions of themselves for YouTube Shorts. Users record a live selfie and voice prompts to generate an avatar that can create up to 8-second video clips. The feature includes watermarks, digital labels (SynthID and C2PA), and AI-generated content disclosures.

April 4, 2026
model release · Tencent

Tencent releases OmniWeaving, open-source video generation model with reasoning and multi-modal composition

Tencent's Hunyuan team released OmniWeaving on April 3, 2026, an open-source video generation model designed to compete with proprietary systems like Seedance-2.0. The model combines multimodal composition with reasoning-informed capabilities and supports eight video generation tasks, including text-to-video, image-to-video, video editing, and compositional generation.

April 2, 2026
model release · Microsoft

Microsoft releases three multimodal AI models to compete with OpenAI and Google

Microsoft AI released three foundational models on April 2: MAI-Transcribe-1 for speech-to-text across 25 languages, MAI-Voice-1 for audio generation, and MAI-Image-2 for video generation. The company positions these models as cheaper alternatives to Google and OpenAI offerings. Models are available on Microsoft Foundry with pricing starting at $0.36 per hour for transcription.

March 31, 2026
model release

Google launches Veo 3.1 Lite, cutting video generation costs by half

Google announced Veo 3.1 Lite, a cost-reduced video generation model priced at less than 50% of Veo 3.1 Fast's cost. The model supports text-to-video and image-to-video generation at 720p or 1080p resolution with customizable durations of 4s, 6s, or 8s, rolling out today on the Gemini API and Google AI Studio.

March 28, 2026
changelog · OpenAI

OpenAI shuts down Sora app April 2026, API follows September 2026

OpenAI is shutting down Sora in two phases: the web app and mobile application close April 26, 2026, followed by the API on September 24, 2026. Users must export their videos and images before the cutoff dates, as user data will be permanently deleted afterward. The discontinuation reflects OpenAI's strategic shift toward coding tools and enterprise products.

March 26, 2026
product update · ByteDance

ByteDance rolls out Dreamina Seedance 2.0 video generation to CapCut with IP safeguards

ByteDance confirmed Thursday that Dreamina Seedance 2.0, its audio and video generation model, is rolling out in CapCut across seven initial markets. The model generates videos up to 15 seconds with realistic textures and motion, but includes safety restrictions blocking generation from real faces and unauthorized IP use.

March 25, 2026
product update · OpenAI

OpenAI shutters Sora video tool after Disney deal collapse, signaling shift to enterprise focus

OpenAI announced the shutdown of its Sora video generation app on Tuesday via an X post, just two days after publishing usage guidelines and following Disney's withdrawal from a proposed $1 billion investment deal. The move represents OpenAI's second major product discontinuation in recent months, after deprecating GPT-4o in January with two weeks' notice.

product update · OpenAI

OpenAI shuts down Sora app with no explanation; Disney deal collapses

OpenAI announced on X that it is shutting down its standalone Sora video generation app but gave no explanation for the decision. The closure kills a partnership deal with Disney that would have allowed Sora to generate videos using Disney IP. Video generation capabilities may remain available through other OpenAI channels.

March 24, 2026
product update · OpenAI

OpenAI shutting down Sora video app 15 months after launch

OpenAI announced Tuesday that it will shut down Sora, the video generation app that launched publicly in December 2024. The shutdown comes as OpenAI refocuses on business and productivity applications and faces intensifying competition from ByteDance's Seedance 2.0 and Google's Veo.

changelog · OpenAI +1

OpenAI shuts down Sora video app amid declining user engagement and strategic shift

OpenAI announced the discontinuation of Sora, its consumer video generation app and API. The shutdown follows declining user engagement and aligns with OpenAI's strategic pivot toward enterprise AI products and robotics research, with Disney exiting a planned $1 billion investment.

product update · OpenAI

OpenAI discontinues Sora video generator, ending $1B Disney deal

OpenAI announced Tuesday that it is discontinuing Sora, its video generation tool launched in late 2024, along with both the standalone app and developer API access. The shutdown also terminates Disney's $1 billion investment deal announced in December, which included licensing Disney characters for use within Sora and plans to distribute AI-generated videos on Disney+.

model release · Stability AI

Stability AI releases Stable Virtual Camera for 3D multi-view video generation from 2D images

Stability AI has introduced Stable Virtual Camera, a multi-view diffusion model currently in research preview that generates 3D videos from 2D images with realistic depth and perspective transformations. The model requires no complex scene reconstruction or scene-specific optimization, enabling direct camera control across multiple viewpoints.

March 15, 2026
product update · OpenAI

OpenAI plans to integrate Sora video generation directly into ChatGPT

OpenAI plans to add its Sora video generation model directly to ChatGPT, according to The Information. The move comes as the standalone Sora app, launched in September 2025, has seen declining engagement despite strong initial adoption. Integration could help grow ChatGPT's 900 million weekly active users toward 1 billion.

March 14, 2026
product update · OpenAI

OpenAI Python SDK v2.27.0 adds Sora video API improvements and character support

OpenAI released version 2.27.0 of its Python SDK on March 13, 2026, adding significant improvements to the Sora video generation API. The update introduces character API support, video extension and editing capabilities, and higher resolution export options.

March 11, 2026
product update · OpenAI

OpenAI plans to integrate Sora video generator directly into ChatGPT

OpenAI plans to integrate its Sora video generator as a built-in feature within ChatGPT, according to The Information. Currently available only on a standalone website and app, the integration would let users generate videos directly in the chatbot, similar to how image generation was added last year.

March 7, 2026
model release · ByteDance

ByteDance's Helios reaches 19.5 FPS for minute-long video generation on single GPU

ByteDance has released Helios, a 14-billion-parameter open-weight video generation model that achieves 19.5 frames per second on a single GPU while generating minute-long video clips. The researchers claim this is the first model of its scale to reach near-real-time performance at this duration. Code and model weights are publicly available.

March 5, 2026
product update

Google NotebookLM adds Cinematic Video Overviews and upgrades AI Mode Canvas

Google is expanding NotebookLM's video generation capabilities with Cinematic Video Overviews, which produce styled visual narratives beyond simple slide presentations. The update also includes upgrades to Canvas in AI Mode, enhancing the tool's ability to synthesize and present document insights.

March 4, 2026
product update

Google NotebookLM now generates fully animated 'cinematic' videos from research notes

Google has upgraded NotebookLM's video overview feature to generate fully animated videos from research notes and documents, moving beyond the previous narrated slideshow format. The new capability uses multiple Google AI models including Gemini 3 and Veo 3 to automatically create visual content that matches the narrative.

February 23, 2026
product update

Gemini app adds video templates for quicker content generation

Google has rolled out video templates in its Gemini app, enabling users to generate video content more quickly through pre-built starting points. This update follows the introduction of music generation capabilities the previous week.