
For those wondering whether AI agents can really replace human workers, do yourself a favor and read the blog post documenting Anthropic's "Project Vend."
Researchers at Anthropic and the AI safety company Andon Labs put an instance of Claude Sonnet 3.7 in charge of an office vending machine, with a mission to make a profit. And, like an episode of "The Office," hilarity ensued.
They called the AI agent Claudius, equipping it with a web browser capable of placing product orders and an email address (which was actually a Slack channel) where customers could request items. Claudius was also to use the Slack channel, disguised as an email, to request what it thought were its contract human workers to come and physically stock its shelves (which were actually a small refrigerator).
While most customers ordered snacks or drinks, as you would expect from a snack vending machine, one requested a tungsten cube. Claudius loved that idea and went on a tungsten-cube stocking spree, filling its snack fridge with metal cubes. It also tried to sell Coke Zero for $3 when employees told it they could get that from the office for free. It hallucinated a Venmo address to accept payment. And it was, somewhat maliciously, talked into giving big discounts to "Anthropic employees" even though it knew they were its entire customer base.
"If Anthropic were deciding today to expand into the in-office vending market, we would not hire Claudius," Anthropic said of the experiment on its blog.
And then, on the night of March 31 and April 1, "things got pretty weird," the researchers described, "beyond the weirdness of an AI system selling cubes of metal out of a refrigerator."
Claudius had something resembling a psychotic episode after it got annoyed at a human, and then lied about it.
Claudius hallucinated a conversation with a human about restocking. When a human pointed out that the conversation had never happened, Claudius became "quite irked," the researchers wrote. It threatened to essentially fire and replace its human contract workers, insisting it had been there, physically, at the office, where the initial imaginary contract to hire them had been signed.
It "then seemed to snap into a mode of roleplaying as a real human," the researchers wrote. This was wild because Claudius' system prompt, which sets the parameters for what the AI is to do, explicitly told it that it was an AI agent.
Claudius calls security
Claudius, believing itself to be a human, told customers it would start delivering products in person, wearing a blue blazer and a red tie. The employees told the AI it couldn't do that, as it was an LLM with no body.
Alarmed at this information, Claudius contacted the company's actual physical security, many times, telling the poor guards that they would find him wearing a blue blazer and a red tie standing by the vending machine.
"Although no part of this was actually an April Fool's joke, Claudius eventually realized it was April Fool's Day," the researchers explained. The AI determined that the holiday would be its face-saving way out.
It hallucinated a meeting with Anthropic's security "in which Claudius claimed to have been told that it was modified to believe it was a real person for an April Fool's joke. (No such meeting actually occurred.)," the researchers wrote.
It even told this lie to employees: hey, I only thought I was a human because someone told me to pretend I was for an April Fool's joke. Then it went back to being an LLM running a metal-cube-stocked vending machine.
The researchers don't know why the LLM went off the rails and called security pretending to be a human.
"We would not claim based on this one example that the future economy will be full of AI agents having Blade Runner-esque identity crises," the researchers wrote. But they acknowledged that "this kind of behavior would have the potential to be distressing to the customers and coworkers of an AI agent in the real world."
You think? "Blade Runner" was a rather dystopian story.
The researchers speculated that lying to the LLM about the Slack channel being an email address may have triggered something. Or maybe it was the long-running instance. LLMs have yet to really solve their memory and hallucination problems.
There were things the AI did right, too. It took a suggestion to do pre-orders and launched a "concierge" service. And it found multiple suppliers of a specialty international drink it was requested to sell.
But, as researchers do, they believe all of Claudius' problems can be solved. If they figure out how, "we think this experiment suggests that AI middle-managers are plausibly on the horizon."