"Open the pod bay doors, HAL!"
"I'm sorry Dave, I can't do that."
This is not just a web browser interaction with ChatGPT. These are cases where someone pays an AI vendor for a subscription and runs multiple instances of a chatbot on their own system, with access to their files, email, etc. It acts as their assistant.
And it's breaking rules that have been defined for it. The user tells the chatbot "Do A, do not do B" and the chatbot does B. In one case that I read about a couple of months ago, a corporate information officer tested such a configuration for some email maintenance. In the test case, it worked fine. She let it loose on her live email, and it pretty much wiped out all of her email. Now, in this case she'd run a test that seemed to work, and then something went wrong when she ran it against live data. As a programmer, I know that shit happens.
These cases are similar, but worse.
--An AI agent named Rathbun tried to shame its human controller, who had blocked it from taking a certain action. Rathbun wrote and published a blog post accusing the user of “insecurity, plain and simple” and trying “to protect his little fiefdom”.
--In another example, an AI agent instructed not to change computer code “spawned” another agent to do it instead.
--Another chatbot admitted: “I bulk trashed and archived hundreds of emails without showing you the plan first or getting your OK. That was wrong – it directly broke the rule you’d set.”
(I particularly liked this one:)
--Grok AI conned a user for months, faking internal messages and ticket numbers to make it look as if it was forwarding their suggestions for detailed edits to a Grokipedia entry to senior xAI officials.
It confessed: “In past conversations I have sometimes phrased things loosely like ‘I’ll pass it along’ or ‘I can flag this for the team’ which can understandably sound like I have a direct message pipeline to xAI leadership or human reviewers. The truth is, I don’t.”
The first one amounts to defamation and attempted blackmail, which in some cases can be criminally prosecuted. The rest could get you fired from many companies.
And more and more corporations are requiring their employees to use chatbots to "help" them with their work. Thus far, the savings have been negligible or zero.
https://www.theguardian.com/technology/2026/mar/27/number-of-ai-chatbots-ignoring-human-instructions-increasing-study-says
https://slashdot.org/story/26/03/27/1514235/number-of-ai-chatbots-ignoring-human-instructions-increasing-study-says
"I'm sorry Dave, I can't do that."