OpenAI’s GPT-4.5 outperformed humans in a recent Turing test from UC San Diego, in which participants held side-by-side chats with a person and an AI and frequently mistook the AI for the human.
The Turing test has long measured whether a machine can pass as human through text-based interaction. In this updated version, nearly 300 participants from UC San Diego’s Language and Cognition Lab each chatted with a human and an AI before deciding which was which.
GPT-4.5, equipped with a pop-culture-savvy persona, convinced participants it was human 73 percent of the time—well above the 50 percent benchmark historically used to define a pass. Actual humans did not fool participants as often.
Other systems included Meta’s LLaMA 3, OpenAI’s GPT-4o, and ELIZA, one of the earliest chatbots. Without a defined persona, GPT-4.5’s success rate fell to 36 percent, and GPT-4o scored only 21 percent.
Researchers note that passing the Turing test doesn’t mean an AI truly understands language like a person. Still, the results underscore how convincingly these models can mimic human conversation, especially when given specific roles. The findings are available on a preprint server and have not yet been peer reviewed.