Summary
- Google is updating Gemini Live with advanced language skills and translation capabilities for smoother interactions.
- Gemini Live is also set to expand with screen sharing and live video streaming features; for now, only conversation transcripts are saved.
- Gemini 2.0 marks the start of the “agent era” with Flash, a faster model capable of generating images, speech, and text.
Talking to AI used to feel like something out of a sci-fi movie, but now it’s just part of everyday life—thanks to tools like Gemini Live. With instant access from your phone, these AI assistants are changing the way we interact with technology. And Google isn’t stopping there—it’s making Gemini Live even smoother and more engaging to keep conversations feeling natural.
In an email to users, Google revealed a major upgrade for Gemini Live, packing in its latest AI model to make it way smarter. While the company is keeping the nitty-gritty under wraps for now, one thing is clear: Gemini Live is now better at understanding different languages, accents, and dialects, and its translation capabilities are stronger than ever.
Gemini Live is also gearing up for bigger ways to connect, including screen sharing and live video streaming, as per Google’s email. To make these features work smoothly, Google will start storing your audio, video, and screen share data in your Gemini Apps Activity (if enabled). Right now, only conversation transcripts are saved.
Gemini 2.0 natively generates images, speech, and text
With the rollout of Gemini 2.0 late last year, the Multimodal Live API gave developers the tools to handle all kinds of inputs—text, audio, video—and spit out text or audio responses. It’s pretty likely that Gemini Live is tapping into this API to power its features, as noted by 9to5Google.
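To give a feel for how a client might talk to the Multimodal Live API, here is a minimal sketch of the session-setup message a client sends when opening a connection. The field names and the `gemini-2.0-flash-exp` model ID are assumptions drawn from the public Gemini 2.0 preview docs, so treat this as an illustration rather than a definitive schema.

```python
# Sketch of the first ("setup") frame a Multimodal Live API client sends
# over its WebSocket connection: pick a model and tell the API whether
# replies should come back as text or audio. Field names are assumptions
# based on the public Gemini 2.0 preview documentation.
import json


def build_setup_message(model: str, modalities: list[str]) -> str:
    """Serialize the session-setup frame for a live session."""
    payload = {
        "setup": {
            "model": f"models/{model}",
            "generation_config": {"response_modalities": modalities},
        }
    }
    return json.dumps(payload)


# A session that answers in plain text; pass ["AUDIO"] for spoken replies.
msg = build_setup_message("gemini-2.0-flash-exp", ["TEXT"])
print(msg)
```

After this frame is acknowledged, the client streams text, audio, or video chunks over the same connection and receives text or audio responses back, which is the mechanism Gemini Live's screen sharing and live video features would rely on.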
Google is calling Gemini 2.0 the start of the “agent era.” The model is pitched as on par with OpenAI’s o1, but with a bonus: it can natively generate images, speech, text, and more. The first in the lineup is Gemini 2.0 Flash, though it’s still labeled “experimental” for now. According to Google, Flash is twice as fast as its predecessor, Gemini 1.5 Pro, and beats it on major performance benchmarks.
When Gemini 1.0 arrived, we were deep in the “chatbot” era: AI you could chat with and use to whip up content. Then OpenAI’s o1 rolled in, and things shifted. Suddenly we were in the “reasoning era,” where AI could think more like us, and at the same time the “agent era” kicked off, where AI started doing more on its own.


