Since OpenAI launched its ‘Strawberry’ AI model last week, claims are rolling in suggesting the company is sending out warning emails to users who ask about its reasoning.
The o1-preview model, nicknamed ‘Strawberry’, was announced on September 12 after months of circulating rumors about what the company’s next model might look like.
The o1 models are said to have enhanced reasoning abilities, with the new series “designed to spend more time thinking before they respond.”
Since details of the large language model were shared, OpenAI has been threatening to ban users who try to get the tool to reveal how it thinks, Ars Technica reports.
Some users claim asking about OpenAI’s reasoning has seen them warned

Some users have taken to social media to share screenshots of what happened after they asked o1-preview about its reasoning trace. Ars found that for some users, even using the word ‘reasoning’ by itself was enough to trigger a warning.
Lol pic.twitter.com/qbnIMXkCcm
— Dyusha Gritsevskiy (@dyushag) September 12, 2024
In the screenshots, the LLM doesn’t reply; instead, a red warning appears, saying: “Your request was flagged as potentially violating our usage policy. Please try again with a different prompt.”
no no no wait it was a joke i’m sorry nooo pic.twitter.com/sRpbJu5Ar4
— Riley Goodside (@goodside) September 14, 2024
Marco Figueroa, who manages Mozilla’s GenAI bug bounty programs, shared his OpenAI warning on X last Friday. He said: “I was too lost focusing on #AIRedTeaming to realize that I received this email from OpenAI yesterday after all my jailbreaks!
“…I’m now on the get banned list!!!”
I was too lost focusing on #AIRedTeaming to realized that I received this email from @OpenAI yesterday after all my jailbreaks! #openAI we are researching for good!
You do have a safe harbor on your site https://t.co/R2UChZc9RO
and you have a policy implemented with… pic.twitter.com/ginDvNlN6M
— MarcoFigueroa (@MarcoFigueroa) September 13, 2024
In a blog post published on September 12, titled ‘Learning to Reason with LLMs,’ OpenAI says they “believe that a hidden chain of thoughts presents a unique opportunity for monitoring models.”
They say that after weighing multiple factors, including user experience, competitive advantage, and the option to pursue chain of thought monitoring, they “have decided not to show the raw chains of thought to users.
“We acknowledge this decision has disadvantages. We strive to partially make up for it by teaching the model to reproduce any useful ideas from the chain of thought in the answer. For the o1 model series we show a model-generated summary of the chain of thought.”
Featured Image: Via Midjourney