Blog - Creative Collaboration

Cheap AI “video scraping” can now extract data from any screen recording

October 18, 2024

Video scraping is just one of many new tricks made possible now that the latest large language models (LLMs), such as Google’s Gemini and OpenAI’s GPT-4o, are actually “multimodal” models that accept audio, video, image, and text input. These models translate any multimedia input into tokens (chunks of data), which they use to predict which tokens should come next in a sequence.
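To make "predicting which tokens should come next" concrete, here is a deliberately tiny sketch. It is not how Gemini or GPT-4o work internally: it assumes plain word tokens and simple follow-on counts (a bigram table), and the function names `train_bigram` and `predict_next` are made up for illustration. Real multimodal models learn far richer patterns over tokens derived from text, images, audio, and video, but the basic task of picking a likely next token is the same.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count, for each token, which tokens follow it in the sequence."""
    follows = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, token):
    """Return the most frequently seen successor of `token`, or None."""
    if token not in follows:
        return None
    return follows[token].most_common(1)[0][0]


# Toy "training data": a short sequence of word tokens (an assumption;
# real models use sub-word, image, audio, and video tokens).
tokens = "the cat sat on the mat and the cat slept".split()
model = train_bigram(tokens)

print(predict_next(model, "the"))    # "cat" follows "the" twice, "mat" once
print(predict_next(model, "slept"))  # nothing ever follows "slept"
```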

A term like “token prediction model” (TPM) might be more accurate than “LLM” these days for AI models with multimodal inputs and outputs, but a generalized alternative term hasn’t really taken off yet. Whatever you call it, an AI model that can take video input has interesting implications, both good and potentially bad.

Breaking down input barriers

AI researcher Simon Willison is far from the first person to feed video into AI models to achieve interesting results (a 2015 paper already uses the “video scraping” term), but as soon as Gemini launched its video input capability, he began to experiment with it in earnest.

In February, Willison demonstrated another early application of AI video scraping on his blog, where he took a seven-second video of the books on his bookshelves, then got Gemini 1.5 Pro to extract all of the book titles it saw in the video and put them in a structured, or organized, list.
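The bookshelf experiment can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Willison's actual code or prompt: `call_model` is a hypothetical stand-in for whatever multimodal API you use (it takes a video path and a prompt and returns the model's text reply), and the prompt wording, `parse_titles`, and `scrape_titles` are invented for the example. The part worth noticing is the last step, where free-form model output becomes a validated, de-duplicated list.

```python
import json
from typing import Callable, List

# Assumed prompt wording; asking for JSON makes the reply machine-readable.
PROMPT = (
    "List every book title visible in this video. "
    "Respond with only a JSON array of strings."
)

FENCE = "`" * 3  # Markdown code-fence marker that models often wrap JSON in


def parse_titles(response_text: str) -> List[str]:
    """Turn the model's text reply into a clean, de-duplicated title list."""
    text = response_text.strip()
    if text.startswith(FENCE):
        # Drop the surrounding fence and an optional "json" language tag.
        text = text.strip("`").strip()
        if text.lower().startswith("json"):
            text = text[4:]
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of titles")
    seen, titles = set(), []
    for item in data:
        title = str(item).strip()
        # The same spine may be read in several frames; keep one copy.
        if title and title.lower() not in seen:
            seen.add(title.lower())
            titles.append(title)
    return titles


def scrape_titles(video_path: str, call_model: Callable[[str, str], str]) -> List[str]:
    """Send the video and prompt to the (hypothetical) model, then parse."""
    return parse_titles(call_model(video_path, PROMPT))


def fake_model(video_path: str, prompt: str) -> str:
    """Stand-in for a real multimodal API call, used here for demonstration."""
    return FENCE + 'json\n["Dune", "Neuromancer", "dune "]\n' + FENCE


print(scrape_titles("bookshelf.mp4", fake_model))
```

Keeping the model call behind a plain callable means the parsing and validation logic can be tested without network access, and the provider can be swapped without touching the rest.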

Converting unstructured data into structured data is important to Willison, because he’s also a data journalist. Willison has created tools for data journalists in the past, such as the Datasette project, which lets anyone publish data as an interactive website.

To every data journalist’s frustration, some sources of data prove resistant to scraping (capturing data for analysis) due to how the data is formatted, stored, or presented. In these cases, Willison delights in the potential for AI video scraping because it bypasses these traditional barriers to data extraction.

    © CC Startup, Powered by Creative Collaboration. © 2020 Creative Collaboration, LLC. All Rights Reserved.
