If you needed more evidence that GenAI is likely to make stuff up, Google's Gemini chatbot, formerly Bard, thinks the 2024 Super Bowl has already happened. It even has (fictitious) statistics to back it up.
According to a Reddit thread, Gemini, powered by Google's GenAI models of the same name, is answering questions about Super Bowl LVIII as if the game ended yesterday or weeks ago. Like most bookmakers, it favors the Chiefs over the 49ers (sorry, San Francisco fans).
In at least one case, Gemini gives a player stat breakdown suggesting that Kansas City Chiefs quarterback Patrick Mahomes ran for 286 yards and two touchdowns, versus Brock Purdy's 253 rushing yards and one touchdown.
It's not just Gemini. Microsoft's Copilot chatbot also insists the game is over and provides false citations to back up the claim. But, perhaps reflecting a San Francisco bias, it says the 49ers, not the Chiefs, won “by a final score of 24-21.”
Copilot is powered by a GenAI model similar to the model that underpins OpenAI's ChatGPT (GPT-4). But in my testing, ChatGPT was loath to make the same mistake.
This is all pretty silly, and has probably been fixed by now, given that this reporter had no luck replicating Gemini's responses from the Reddit thread. (I'd be surprised if Microsoft wasn't also working on a fix.) But it also illustrates the major limitations of today's GenAI, and the dangers of placing too much trust in it.
GenAI models have no real intelligence. Fed vast numbers of examples, usually collected from the public web, AI models learn how likely data (e.g. text) is to occur based on patterns, including the context of any surrounding data.
This probability-based approach works very well at scale. But while the range of words and their probabilities is likely to produce text that makes sense, that is far from guaranteed. LLMs can generate text that is grammatically correct but nonsensical, for example, like a claim about the Golden Gate. Or they can repeat untruths present in their training data.
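To make the probability idea concrete, here is a minimal sketch of a toy "bigram" model in Python. It is an illustration only, not how Gemini or Copilot actually work: real LLMs use neural networks and far more context. The tiny corpus and the function names (`train_bigrams`, `next_word_probs`, `generate`) are made up for this example.

```python
import random
from collections import Counter, defaultdict

# Tiny made-up corpus (an assumption for illustration, not real training data).
CORPUS = (
    "the chiefs won the game . "
    "the 49ers won the game . "
    "the chiefs lost the game . "
)

def train_bigrams(text):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def next_word_probs(counts, prev):
    """Turn the raw follow-counts for `prev` into probabilities."""
    total = sum(counts[prev].values())
    return {word: n / total for word, n in counts[prev].items()}

def generate(counts, start, max_words=6, seed=0):
    """Sample one word at a time, weighted by how often it followed the last word."""
    rng = random.Random(seed)
    out = [start]
    while len(out) < max_words and out[-1] != ".":
        probs = next_word_probs(counts, out[-1])
        if not probs:  # no known continuation for this word
            break
        words, weights = zip(*probs.items())
        out.append(rng.choices(words, weights=weights)[0])
    return " ".join(out)

counts = train_bigrams(CORPUS)
print(next_word_probs(counts, "chiefs"))  # "won" and "lost" are equally likely
print(generate(counts, "the"))
```

The toy model has no notion of whether a game has been played or who won it. It only knows which words tended to follow which in its examples, so "the chiefs won" and "the chiefs lost" are equally plausible continuations.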
The Super Bowl misinformation is certainly not the most damaging example of GenAI going off the rails. That distinction probably goes to condoning violence, reinforcing racial and ethnic stereotypes, or writing confidently about conspiracy theories. Still, it's a useful reminder to double-check statements from GenAI bots. There's a good chance they're not true.