OpenAI has upgraded its GPT-4o model to write in a more “natural, engaging, and tailored” manner. The large language model, or LLM, is the AI company’s flagship and is often used as the industry’s bar for quality.
Some have hailed the update as “insane”, posting an Eminem-styled AI-written rap. Others have tried to measure its creativity in a more data-centric way.
The maintainer of EQ-Bench, an “emotional intelligence benchmark for LLMs”, has found that GPT-4o beats out the competition by several points. This places it at the top of its leaderboards.
OpenAI cooked!
Two new creative writing leaders:
gpt-4o-2024-11-20 (tied highest creative writing score)
Mistral-Large-Instruct-2411 (highest slop score)
Fun fact: In 3 different stories, mistral-large used "a testament to" 3 times. *Per story*. Yup. pic.twitter.com/da8zKiv7Kx
— Sam Paech (@sam_paech) November 21, 2024
Other benchmarks have also put the update at the top of their leaderboards, clearing Google’s Gemini Experimental model by almost 20 points.
OpenAI doesn’t delve any further than saying that the model has been tuned to be more creative. The company’s CEO, Sam Altman, simply stated “good new model out!” in a post on X.
However, some are questioning how that creativity is being measured and improved. One user, an art advocate, asked if GPT-4o had started “observing the world [and] finding its own insights… its own non-derivative point of view?”
Hey @OpenAI @sama How did GOT-4o improve its “creativity?” Did it start observing the world & finding its own insights…its own non-derivative point of view? Is it no longer bound by the data in its training sets? Speaking of which, what’s in its data sets?