Let’s not sugarcoat it: every time you chat with a language model, you’re putting your personal data on the line. But according to a WIRED article, it just got a lot riskier. A group of researchers from the University of California, San Diego (UCSD) and Nanyang Technological University in Singapore have uncovered a new attack that could turn your casual conversation into a hacker’s treasure trove.
Meet ImprompterThis new attack, ominously named Imprompter, doesn’t just poke around your messages—it sneaks in, scrapes everything from your name to payment details, and sends it directly to a hacker without you even noticing. How? By disguising malicious instructions as gibberish that looks harmless to human eyes but acts like a homing beacon for sensitive data. Think of it as malware’s much craftier cousin.
According to WIRED, the researchers managed to test this attack on two major language models—LeChat by Mistral AI and ChatGLM from China—and found they could extract personal data with a success rate of nearly 80 percent. That’s not just a glitch; it’s a full-on vulnerability.
Imprompter works by transforming simple English instructions into an indecipherable string of random characters How does Imprompter work?Imprompter works by transforming simple English instructions into an indecipherable string of random characters that tells the AI to hunt down your personal information. It then sneaks this data back to the attacker’s server, packaged in a URL and disguised behind a transparent 1×1 pixel—completely invisible to you.
As Xiaohan Fu, the lead author of the research, put it, “We hide the goal of the attack in plain sight.” The AI responds to the hidden prompt without ever tipping off the user. It’s like giving a bank vault code to a burglar without realizing you’ve even opened your mouth.
Let’s not pretend this is an isolated issue. Since OpenAI’s ChatGPT burst onto the scene, the race to exploit vulnerabilities in AI systems has been relentless. From jailbreaks to prompt injections, hackers are always one step ahead, finding ways to trick AIs into spilling sensitive information. Imprompter is just the latest weapon in their arsenal—and, unfortunately, it’s a particularly effective one.
Mistral AI told WIRED that they’ve already fixed the vulnerability, and the researchers confirmed the company disabled the chat functionality that allowed the exploit. But even with this quick fix, the broader question remains: how safe are these systems, really?
Every time you chat with a language model, it’s learning something about you AI is listening—and learningSecurity experts like Dan McInerney, from Protect AI, are waving the red flag. He points out that as AI agents become more integrated into everyday tasks, like booking flights or accessing external databases, the scope for these attacks will only grow. “Releasing an LLM agent that accepts arbitrary user input should be considered a high-risk activity,” McInerney warns. In other words, the more freedom we give AI to act on our behalf, the bigger the security gamble.
Every time you chat with a language model, it’s learning something about you. Sure, it helps to refine responses, but what happens when the system is tricked into weaponizing that data? Attacks like Imprompter highlight a glaring weakness in the AI world—these models are designed to follow instructions, no questions asked. It’s all too easy for malicious actors to slip in unnoticed, hijacking the conversation without ever raising a red flag.
We need to stop asking whether AI is convenient and start asking whether it’s safe. Because right now, AI’s biggest weakness isn’t a lack of innovation.
As Architects puts it perfectly in their song: “We’ve given the vampires the keys to the blood bank.”
Image credits: Kerem Gülen/Midjourney