• Home
  • Blog
  • Android
  • Cars
  • Gadgets
  • Gaming
  • Internet
  • Mobile
  • Sci-Fi
Tech News, Magazine & Review WordPress Theme 2017
Blog - Creative Collaboration

I tested Gemini’s video analysis feature and the results were predictable

July 7, 2025

Most of Google’s updates to Gemini don’t stand out to me. I’ve yet to see a significant improvement in its hallucination rate, and its ability to summarize the news and weather leaves a lot to be desired. However, a recent update that added video analysis capabilities to Gemini caught my eye as a tool I might use regularly.

Video analysis in Gemini is founded on the AI’s existing ability to summarize YouTube videos. I took this tool for a test run to see just how powerful it is and whether I would use it in everyday life.

How well does Gemini’s video analysis work?

For testing, I selected a variety of videos from my camera roll and asked Gemini different questions each time. Gemini analyses a video differently depending on what you ask, so for each clip I asked the questions most relevant to its content.

Test 1: Object recognition

Gemini correctly identified the type of ducks in my video with some prompting, and even managed to correctly identify where the video was taken, thanks to a sign in the background.

The sign only showed the business name, but Gemini managed to identify where the video was recorded to within 100 meters. However, the clues in the video (the business name, Mandarin ducks, and canal) would have also led a human to the correct answer within minutes.

Test 2: Location recognition

I was quite impressed by Gemini’s ability to identify where my video was taken, but there were plenty of clues to help it. For my next test, I used a video of an eruption of the Kilauea volcano in Hawaii taken in May. Gemini managed to correctly identify the volcano, but it was unable to identify the date (the video was taken on May 26).

Test 2, continued: Location recognition

Just like with Gemini’s other analysis features, you need to ask it the right question to get the right answer. This video I took of a small parade at Karneval in Cologne last year stumped Gemini.

It was unable to answer me when I asked where the video was taken, but it managed to identify the country with further prompting. Interestingly, this prompt revealed that it recognised that the video was of a Karneval parade, but it couldn’t identify the city.

I tested Gemini again using a video of the main parade of Karneval (which contained significantly more visual clues), but it was still unable to identify that the video was taken in Cologne despite the number of street signs, shop fronts, and Karneval costumes shown in the video.

Test 3: Audio recognition

I was personally interested in Gemini’s audio recognition. Identifying songs that are currently playing is useful, but picking up a song in the background from an old video is even more helpful for me. Unfortunately, Gemini’s results here were spotty at best. Here are some of my results:

  • Incorrectly identified a 22-second recording of ‘Solid Rock’ by Dire Straits as ‘I Know Alone’ by HAIM.
  • Incorrectly identified a 15-second recording of ‘Surfing with the Alien’ by Joe Satriani as ‘Can’t Stop’ by the Red Hot Chili Peppers.
  • Correctly identified a 57-second recording of ‘Like a Rolling Stone’ by Bob Dylan. It also identified the song from an 11-second recording.
  • Incorrectly identified an 11-second recording of ‘Wildflowers’ by Tom Petty as ‘You Belong To Me’ by the Duprees.

I tested Gemini more times with varying lengths of videos. Its accuracy was positively correlated with the length of the recording, but what surprised me was how far off its wrong answers were.
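As a rough illustration of that pattern, here is a minimal Python sketch that tallies only the five recordings listed above. The extra tests mentioned in the previous paragraph are not included because their individual results are not given, so this is far too small a sample to prove anything. The variable names and the `mean` helper are my own, chosen for illustration.

```python
# Results from the list above: (song, clip length in seconds, identified correctly?)
# Only the five recordings described in the article are included.
results = [
    ("Solid Rock", 22, False),
    ("Surfing with the Alien", 15, False),
    ("Like a Rolling Stone", 57, True),
    ("Like a Rolling Stone", 11, True),
    ("Wildflowers", 11, False),
]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Split clip lengths by whether Gemini named the song correctly.
correct = [secs for _, secs, ok in results if ok]
wrong = [secs for _, secs, ok in results if not ok]

accuracy = len(correct) / len(results)
mean_correct = mean(correct)
mean_wrong = mean(wrong)

print(f"Accuracy: {accuracy:.0%}")
print(f"Mean clip length when correct: {mean_correct:.1f}s")
print(f"Mean clip length when wrong: {mean_wrong:.1f}s")
```

On these five clips, the correctly identified recordings are longer on average than the misidentified ones, which is consistent with the trend described above. Note that both correct results come from the same song.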

I highly recommend you compare the tracks above to hear how different Gemini’s guesses are from the actual songs. Honestly, Gemini, how does Tom Petty sound like The Duprees?

Test 4: Explaining what happens in a video

One of the more practical uses of Gemini is to explain what happens in a video if you don’t have time to watch it yourself. I used one of my favourite videos, a clip of my friend’s cats fighting. Gemini had a fascinating take on this clip.

While you can clearly see the black and white cat attack and then chase away the black cat, Gemini concluded that the cats began to fight (notably using the passive voice here, although there was clearly an aggressor), then the black cat chased the black and white cat away.

Gemini’s take here is misleading and would leave the user with a completely incorrect understanding of the situation.

However, a follow-up question prompted Gemini to correctly identify the aggressor in the video. This is a funny example involving a harmless interaction between cats, but it’s a great example of how Gemini can mislead users. What if you used Gemini to analyze a video of people fighting?

Gemini’s video analysis is as unreliable as the rest of the AI’s services

The first test I did of Gemini’s video analysis was the Kilauea volcano eruption. This impressed me, but in most of my subsequent tests, Gemini failed to deliver. It needed hard data like signs to accurately identify locations, and its song recognition is inferior to Google’s Song Search tool (which is also included in the Gemini app).

I found the most interesting test was Gemini analyzing the cat fight, as it drew the wrong conclusions from the video despite clear video evidence. I managed to get it to correctly analyze the video after multiple prompts, but this took longer than watching the video. In conclusion, I’ll stick to watching and analyzing videos myself and shelve Gemini again.


    © CC Startup, Powered by Creative Collaboration. © 2020 Creative Collaboration, LLC. All Rights Reserved.
