• Home
  • Blog
  • Android
  • Cars
  • Gadgets
  • Gaming
  • Internet
  • Mobile
  • Sci-Fi
Tech News, Magazine & Review WordPress Theme 2017
  • Home
  • Blog
  • Android
  • Cars
  • Gadgets
  • Gaming
  • Internet
  • Mobile
  • Sci-Fi
No Result
View All Result
  • Home
  • Blog
  • Android
  • Cars
  • Gadgets
  • Gaming
  • Internet
  • Mobile
  • Sci-Fi
No Result
View All Result
Blog - Creative Collaboration
No Result
View All Result
Home Gadgets

Google launches Gemini Omni Flash, a conversational video-generation model with avatar mode held back

May 20, 2026
Share on FacebookShare on Twitter

The first model in DeepMind’s new Omni family will generate and edit video from any combination of image, audio, video, and text inputs. Speech-editing is being withheld; SynthID watermarking is on by default.

Google introduced Gemini Omni on Tuesday at the I/O 2026 developer conference, a new multimodal model family from Google DeepMind designed to generate and edit video from any combination of image, audio, video, and text inputs.

The first model in the family, Gemini Omni Flash, started rolling out the same day to the Gemini app and Google Flow for Google AI Plus, Pro, and Ultra subscribers, and to YouTube Shorts and the YouTube Create app at no cost. API access for developers and enterprise customers will follow in the coming weeks.

The product framing, from Koray Kavukcuoglu, CTO of Google DeepMind and Chief AI Architect at Google, is that Omni ‘combines images, audio, video, and text as input and generates high-quality videos grounded in Gemini’s real-world knowledge.’ Inputs can be mixed in a single prompt.

TNW City Coworking space – Where your best work happens

A workspace designed for growth, collaboration, and endless networking opportunities in the heart of tech.

Edits are made conversationally, with each instruction building on the previous one, so that characters, physics, and scene context persist across turns. Output modalities beyond video, including image and audio generation, are ‘coming in time,’ Kavukcuoglu wrote on the company’s blog.

Omni’s positioning, on the published materials, rests on three claims. First, the model has an improved intuitive understanding of physical forces, including gravity, kinetic energy, and fluid dynamics, allowing it to generate scenes with more accurate physics.

Second, it draws on Gemini’s existing world knowledge to connect language, imagery, and meaning beyond pattern-matching, with the company demonstrating prompts that range from claymation protein-folding explainers to chain-reaction physics tracks. Third, the conversational-editing layer preserves consistency across multi-turn revisions, where prior video models have tended to drift on character identity or scene continuity.

The release also extends the Omni family to digital-avatar generation. Avatars let users record their own voice and likeness to create videos that look and sound like them, with onboarding requiring recording yourself and speaking a series of numbers aloud.

]Beyond avatars, Google is explicitly withholding general-purpose audio and speech editing inside Omni for now. ‘We are still working to test this and better understand how we can bring this capability to users responsibly,’ Kavukcuoglu wrote, in a paragraph that third-party coverage has read as a deliberate step back from the deepfake-adjacent territory of consent-free voice editing.

All videos generated with Omni will carry Google’s SynthID imperceptible digital watermark by default. Users can verify whether a clip was generated by Omni through the Gemini app, Gemini in Chrome and Google Search, the company said.

The SynthID layer is the same watermarking infrastructure OpenAI adopted earlier this year under the C2PA open standard, and is now positioned as the cross-industry default for AI-generated visual provenance.

On the disclosed initial limits, Flash-tier clips are capped at 10 seconds at launch, a deployment decision rather than a model constraint. The cap is shorter than OpenAI’s Sora maximum of 60 seconds, where Sora’s tokenisation-of-spatiotemporal-patches architecture is the closest published frontier-model comparison.

Google has not disclosed the per-clip cost structure, the compute footprint per generation, or the benchmark suite it used to evaluate Omni against Veo 3 or third-party models such as ByteDance’s Seedance.

Omni is the headline model in a wider I/O 2026 announcement that also included Gemini 3.5 and what Sundar Pichai called the ‘agentic Gemini era’ in his keynote post. The strategic question for the model, on the announcement and immediate analyst reads, is whether the multi-input conversational editing flow is genuinely a new product category or a tighter integration of capabilities the broader frontier-video field has already demonstrated.

The next visible proof point will be the API rollout to developers and enterprise customers in the coming weeks, where the cost structure and the upper bound on clip length under paid tiers will become public.

What Google has not yet disclosed: the underlying Omni model architecture relative to Veo 3, the per-generation compute footprint, pricing for clips beyond the Flash tier, benchmark scores against DeepMind’s own prior video models and competing frontier offerings, and the timeline for general-purpose audio and speech editing inside the Omni family.

The avatar onboarding process and SynthID enforcement are, on the announcement, the company’s formal answer to the consent-and-provenance questions the launch invites.

Next Post

Berlin’s Dunia Innovations commits €280M to an autonomous AI-materials GigaLab

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

No Result
View All Result

Recent Posts

  • Forza Horizon 6 gives Game Pass its next must-play
  • Alibaba unveils the Zhenwu M890 as China’s NVIDIA alternative push hardens
  • I used a futuristic ultra-thin power bank and forgot it was on my phone
  • Google announces Pics, a Workspace-native AI image generator that goes after Canva on precision editing
  • Robot vacuum deals are live ahead of Memorial Day — the Shark AV2501S AI Ultra robot vacuum is under $300

Recent Comments

    No Result
    View All Result

    Categories

    • Android
    • Cars
    • Gadgets
    • Gaming
    • Internet
    • Mobile
    • Sci-Fi
    • Home
    • Shop
    • Privacy Policy
    • Terms and Conditions

    © CC Startup, Powered by Creative Collaboration. © 2020 Creative Collaboration, LLC. All Rights Reserved.

    No Result
    View All Result
    • Home
    • Blog
    • Android
    • Cars
    • Gadgets
    • Gaming
    • Internet
    • Mobile
    • Sci-Fi

    © CC Startup, Powered by Creative Collaboration. © 2020 Creative Collaboration, LLC. All Rights Reserved.

    Get more stuff like this
    in your inbox

    Subscribe to our mailing list and get interesting stuff and updates to your email inbox.

    Thank you for subscribing.

    Something went wrong.

    We respect your privacy and take protecting it seriously