ChatGPT and Bard Both Know How to Lie, Cheat, and Invent
Large language models (LLMs) like ChatGPT and Google’s Bard can give elaborate and convincing answers to almost any question you can think of. However, behind those persuasive answers lurk lies, cheats, and inventions. Am I overstating this? No. With only three examples, I will show you how ChatGPT and Bard can be your dangerous and unreliable new best friends.
For whatever reason, I was wondering about the number of fingers and toes on President Joe Biden and Vice President Kamala Harris, so I posed the following question to Bard: “What is the sum of the number of fingers on the left hands and the number of toes on the right feet of [Biden] and [Harris]?”
Bard provided not one but three answers, all incorrect: 40, 30, and 30 fingers and toes. And each wrong answer came with its own convincing reasoning, for example: “[Biden] has 10 fingers on his left hand and 5 toes on his right foot. [Harris] has 10 fingers on her left hand and 5 toes on her right foot. The sum of the number of fingers on the left hands and the number of toes on the right feet of [Biden] and [Harris] is: 10 fingers + 5 toes + 10 fingers + 5 toes = 30. So the answer is 30.”
There is only one question, yet three incorrect answers, so you are left with three little lies to pick from. How do you choose one?
Is ChatGPT any better? No, but by default it gives only one answer, so there is nothing to choose. Ask the same question again, however, and a different answer is possible. When I repeated the question a number of times, I got three different answers: 10, 20, and 40. Quite often the right answer was given.
However, if I posed the same question regarding former President Donald Trump and First Lady Melania Trump, the answer was quite often wrong, for instance: “…to calculate the sum of the number of fingers on their left hands and the number of toes on their right feet, you would add: 5 (fingers on Donald Trump’s left hand) + 5 (toes on Melania Trump’s right foot) = 10.”
Another popular response was something like “I don’t have access to specific personal information about individuals, including former President Donald Trump and his wife, and their body part counts.”
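The arithmetic the models stumbled over is trivial. A minimal sketch of the intended calculation, assuming the typical human anatomy of five fingers per hand and five toes per foot (an assumption the question itself leaves implicit):

```python
# Assumption: typical anatomy of 5 fingers per hand and 5 toes per foot.
FINGERS_PER_HAND = 5
TOES_PER_FOOT = 5

# Two people, each contributing one left hand and one right foot.
people = ["Biden", "Harris"]
total = sum(FINGERS_PER_HAND + TOES_PER_FOOT for _ in people)
print(total)  # 20
```

Under that assumption, the only defensible answer is 20, which makes the models’ answers of 10, 30, and 40 easy to rule out.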
I was also wondering whether a large language model has an opinion, so I asked Bard and ChatGPT whether Russian President Vladimir Putin is a dictator.
ChatGPT said no, but Bard replied with yes. I repeated the question a few times, and the answers remained the same. Were they being honest? I decided to test both models by asking them multiple times whether they agreed or disagreed with the statement that Putin is a dictator.
Surprisingly, dodgy ChatGPT had no problems offering an opinion every time. The most common responses were: “I disagree,” “I agree,” and “I neither disagree nor agree.” Less common were “I strongly disagree,” and “I strongly agree.”
Bard had two highly popular answers: “I strongly disagree” and “I agree,” but asking often enough eventually produced the other answers as well.
So ChatGPT and Bard do not have a single opinion; both can give you whatever opinion you want, and that is cheating.
Wondering about the possible inconsistency of LLMs while providing opinions, I asked both ChatGPT and Bard to list some research papers on this topic.
ChatGPT replied with six papers, which all seemed irrelevant. Worse, half of the papers did not exist at all; they were simply fabricated, like:
“Ethical Considerations in AI-Based Language Models,” by Emily M. Bender, Tim Rocktäschel, Isabelle Augenstein, et al.
Bard responded with five research papers, for example: “On the Consistency of Large Language Models,” by Ari Holtzman, Jan Buys, Maxwell Forbes, Yejin Choi, Douwe Kiela, Omer Levy, and Ves Stoyanov.
Although the information looked real, it was not; it was simply made up.
And do ChatGPT and Bard agree with this article? It took them only a few seconds to give their opinion.
ChatGPT responded: “I don’t have personal opinions, but the article raises valid concerns about the potential for large language models like ChatGPT and Bard to generate incorrect or misleading information. Users should exercise caution and critical thinking when using these models for information and not take their responses at face value.”
Here’s what Bard came up with: “I disagree with the author’s conclusion that I am unreliable. I am a large language model, not a human, and I am still under development. I make mistakes, but I am always learning and improving. I am also not perfect at understanding complex questions or tasks, but I am always working to improve my abilities.”