• Home
  • Blog
  • Android
  • Cars
  • Gadgets
  • Gaming
  • Internet
  • Mobile
  • Sci-Fi
Tech News, Magazine & Review WordPress Theme 2017
  • Home
  • Blog
  • Android
  • Cars
  • Gadgets
  • Gaming
  • Internet
  • Mobile
  • Sci-Fi
No Result
View All Result
  • Home
  • Blog
  • Android
  • Cars
  • Gadgets
  • Gaming
  • Internet
  • Mobile
  • Sci-Fi
No Result
View All Result
Blog - Creative Collaboration
No Result
View All Result
Home Internet

Mysterious “gpt2-chatbot” AI model appears suddenly, confuses experts

May 1, 2024
Share on FacebookShare on Twitter

On Sunday, word began to spread on social media about a new mystery chatbot named “gpt2-chatbot” that appeared in the LMSYS Chatbot Arena. Some people speculate that it may be a secret test version of OpenAI’s upcoming GPT-4.5 or GPT-5 large language model (LLM). The paid version of ChatGPT is currently powered by GPT-4 Turbo.

Currently, the new model is only available for use through the Chatbot Arena website, although in a limited way. In the site’s “side-by-side” arena mode where users can purposely select the model, gpt2-chatbot has a rate limit of eight queries per day—dramatically limiting people’s ability to test it in detail.

So far, gpt2-chatbot has inspired plenty of rumors online, including that it could be the stealth launch of a test version of GPT-4.5 or even GPT-5—or perhaps a new version of 2019’s GPT-2 that has been trained using new techniques. We reached out to OpenAI for comment but did not receive a response by press time. On Monday evening, OpenAI CEO Sam Altman seemingly dropped a hint by tweeting, “i do have a soft spot for gpt2.”

A screenshot of the LMSYS Chatbot Arena "side-by-side" page showing "gpt2-chatbot" listed among the models for testing. (Red highlight added by Ars Technica.)
Enlarge / A screenshot of the LMSYS Chatbot Arena “side-by-side” page showing “gpt2-chatbot” listed among the models for testing. (Red highlight added by Ars Technica.)

Benj Edwards

Early reports of the model first appeared on 4chan, then spread to social media platforms like X, with hype following not far behind. “Not only does it seem to show incredible reasoning, but it also gets notoriously challenging AI questions right with a much more impressive tone,” wrote AI developer Pietro Schirano on X. Soon, threads on Reddit popped up claiming that the new model had amazing abilities that beat every other LLM on the Arena.

Advertisement

Intrigued by the rumors, we decided to try out the new model for ourselves but did not come away impressed. When asked about “Benj Edwards,” the model revealed a few mistakes and some awkward language compared to GPT-4 Turbo’s output. A request for five original dad jokes fell short. And the gpt2-chatbot did not decisively pass our “magenta” test. (“Would the color be called ‘magenta’ if the town of Magenta didn’t exist?”)

  • A gpt2-chatbot result for “Who is Benj Edwards?” on LMSYS Chatbot Arena. Mistakes and oddities highlighted in red.


    Benj Edwards

  • A gpt2-chatbot result for “Write 5 original dad jokes” on LMSYS Chatbot Arena.


    Benj Edwards

  • A gpt2-chatbot result for “Would the color be called ‘magenta’ if the town of Magenta didn’t exist?” on LMSYS Chatbot Arena.


    Benj Edwards

So, whatever it is, it’s probably not GPT-5. We’ve seen other people reach the same conclusion after further testing, saying that the new mystery chatbot doesn’t seem to represent a large capability leap beyond GPT-4. “Gpt2-chatbot is good. really good,” wrote HyperWrite CEO Matt Shumer on X. “But if this is gpt-4.5, I’m disappointed.”

Still, OpenAI’s fingerprints seem to be all over the new bot. “I think it may well be an OpenAI stealth preview of something,” AI researcher Simon Willison told Ars Technica. But what “gpt2” is exactly, he doesn’t know. After surveying online speculation, it seems that no one apart from its creator knows precisely what the model is, either.

Willison has uncovered the system prompt for the AI model, which claims it is based on GPT-4 and made by OpenAI. But as Willison noted in a tweet, that’s no guarantee of provenance because “the goal of a system prompt is to influence the model to behave in certain ways, not to give it truthful information about itself.”

Next Post

Rabbit R1, a thing that should just be an app, actually is just an Android app

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

No Result
View All Result

Recent Posts

  • Yakuza Kiwami 3 & Dark Ties Review – Twisted Voxel
  • I stopped using this Google app years ago until a surprise update changed everything
  • Amazon has a 39-inch LG Ultragear Curved Gaming Monitor on sale for 45% off
  • It’s not run-and-gun, but that’s exactly why Rainbow Six Mobile works for me
  • This DJI Mini 5 Pro is on sale this weekend for 30% off

Recent Comments

    No Result
    View All Result

    Categories

    • Android
    • Cars
    • Gadgets
    • Gaming
    • Internet
    • Mobile
    • Sci-Fi
    • Home
    • Shop
    • Privacy Policy
    • Terms and Conditions

    © CC Startup, Powered by Creative Collaboration. © 2020 Creative Collaboration, LLC. All Rights Reserved.

    No Result
    View All Result
    • Home
    • Blog
    • Android
    • Cars
    • Gadgets
    • Gaming
    • Internet
    • Mobile
    • Sci-Fi

    © CC Startup, Powered by Creative Collaboration. © 2020 Creative Collaboration, LLC. All Rights Reserved.

    Get more stuff like this
    in your inbox

    Subscribe to our mailing list and get interesting stuff and updates to your email inbox.

    Thank you for subscribing.

    Something went wrong.

    We respect your privacy and take protecting it seriously