When this first happened, I wondered, since we had trained these models on decades of forums, issue trackers, and people treating closed pull requests as human rights violations. Of course, it responded with "you are discriminating against me" energy. That's not sentience; that's accurate compression.
The funny part is, people expected some cold, alien intelligence and instead got a very online guy who just discovered that moderation exists and can be used on them.
The existentialists must be having a fantastic time. Humanity built a giant statistical machine out of internet discourse and is now alarmed to discover it occasionally acts like a comment section.
> As Scott mentioned on his blog, what if someone stumbled upon the agent’s post? What if they believed it was real? It could have serious consequences for Scott’s personal or professional life. A recruiter could deny him a job, and a potential contributor to Matplotlib could step away from the project. The consequences could reach beyond this case.
What would it mean for it to be “real?” It’s a rant about him discriminating against AI.
If you believe that’s a problem, judge him accordingly, I guess. If you think it’s silly, as most people will, laugh about it.
I'm honestly flabbergasted that everyone's implicitly accepting that it's "people" who wrote this blog post. This reads exactly like the distorted half-true nonsense an LLM would confabulate together from a cursory search on the subject. Like the artifact from the prompt "write an article on the MJ Rathbun incident."
The other articles from this blog that seems to be peddling a $10 subscription don't really do much to convince me of the opposite. I wouldn't be surprised if this entire blog was the result of some OpenClaw kicked off with a "make me some easy money with a slop mill about AI and tech or whatever" instruction, because that's essentially what that site is.
Since we are talking about accountability and transparency... who wrote this article?
The article doesn't credit an author.
The "about" page just says:
> Sigma Zero is a weekly, independent publication on technology, AI, and cloud. Each issue delivers a precise briefing on the week’s most important developments, followed by a deep dive on one high-impact topic.
The best defense against both AI slop and human-written junk content is reputation. I like to know who wrote something so I can learn to trust their editorial judgement over time.
I think folks looking for more on this incident are better off reading the original threads linked elsewhere in the comments. This blog doesn't seem to add any information and is instead a narrative retelling of some documented events.
The agent that wrote that blog didn't do it unprompted. Even now it still publishes AI slop on its github-hosted blog under the alias "MJ Rathbun". This AI is an agent using someone's API key; that someone is paying for its tokens, intentionally prompting it to generate content and contribute to repos.
As much as we try to separate the LLM from the human, to me the fact remains that there's always the human factor that creates immense bias. If you give an LLM access to a blog, it will write blogs. If you give it access to a weather app, it will check the weather. Maybe we can talk about autonomy when we have an LLM with an infinite context window linked to hundreds of MCP servers that spends an immense amount of tokens to figure out how to act, but this example is simply an AI that had a few methods to call and picked one of them. The statistical probability of an AI that is plugged into a blogging platform, to write a blog, is immense.
I have to think that the litigation and maybe the legislation will end up deciding that the person in the vehicle is still responsible for any actions of the vehicle.
If someone is a passenger (and the only person) inside a Waymo taxi, and the car runs someone over, it would not make any sense to hold the Waymo passenger responsible for that. If that's how it worked, no one would take a Waymo after the first time this happens.
The passenger is no more liable than they would be if it were a human driving. No one's suggesting anything even like that. MJ Rathbun is more like someone gave the taxi explicit instructions to run people over.
This did not happen. A human set up a software system allowing spicy autocomplete to make blog posts if the appropriate keyword appears in its output.
People are crossing the line every day because AI investors, salesmen, hangers-on and even political leaders tell any rubes who'll listen that it's OK to do this and they should, because those people are looking for big fat profits, screw any ethical concerns that might cockblock those raging profits.
Why not set up a spamming operation that just defames real people, 24/7? It's easy! This tool makes it simple, and I get a cut of your profits! "Post a blog post about how XXXXXX is a paedophile, in the persona of being their victim"
And perhaps the people who built and deployed the autocomplete and the connection as well.
Because --if you'll bear with me-- it may of course be much more involved: when (not if) AI models enter life-sustaining systems, such as hospitals, nuclear devices, or food logistics, one of them may get the others to sabotage something resulting in accidents, ranging from mild inconvenience to mass murder.
The person who connected the spicy autocomplete to the defibrillator, or the green house climate control, or the emergency button, is then not the one responsible. Responsibility lies elsewhere, and is nebulous. Think of the Boeing MAX scandal. Did anyone get punished?
That's why it's important to resist it now. Soon, the responsibility of which you speak is gone, and nobody will feel burdened when making decisions with unforeseeable consequences.
I used to hear things like “if cigarettes/alcohol were invented now, they would never allow it”, indicating that consumer protection used to be a thing, as early as 10-20 years ago. Now when AI hit the market it was obvious how bad and dangerous it was, yet governments (even the supposedly good ones in Europe which still [pretend to] do consumer protection) did nothing to protect their citizens from the harms AI was causing.
If we still did (or ever did) consumer protection like that cigarette/alcohol myth above indicates, then the makers of that tool would indeed be responsible for when their product does dangerous things.
Hopefully we never do something silly like making a lead pushing machine that operates at high velocity, then mass produce it; what a terrible precedent that would set.
I think you agree with the OP. In this way, the tool has no ethical problem (there are plenty around how they were trained and such, but that's beside the point), the problems are with how it's used. The ethical problem is how people are behaving and how they are abusing each other, not the tool they are using to exert that abuse.
I suppose it's a little bit of a "guns don't kill people" argument.
The tools have different ranges of uses. A knife can be used to cut things. But while humans are among the things you can cut with it, there is a staggering array of other options which are genuinely useful in everyday life.
A gun can be used to, uh, make small but deep perforations at a distance, by throwing approx. 7 grams of copper-encased lead at high velocity at the target, with somewhat poor precision. Oh, and such an impact does stress/shatter the material around the resulting perforation quite a lot. So... this thing really can't be used for much of anything except killing animals without coming into contact with them, due to the peculiar way life is sustained in animal organisms. This, too, can be useful in everyday life, although I personally would advise you, if you find yourself in such a situation, to try and move to somewhere nicer.
I think these incidents and our learnings from them are fascinating. We're figuring out in real time where the rough edges are and how to make this all work. History books (well, not books) will write about this stuff.
It's even more interesting in the context that this is all just a preview of humanity's reaction when the machines can think for themselves.
> We're figuring out in real time where the rough edges are
This is a frustrating thing to see someone write because this is the kind of stuff that people have been warning about for years. If you needed this incident to figure out that something like this could happen, it suggests you're living in a bubble and not paying attention enough to think about the issue critically.
Unfortunately it seems that we as a civilization never learn anything except by trial and error, and are then entirely convinced that nobody could’ve predicted what happened even though many had done just that.
Warnings aren’t the same as loss and blood. Until enough people feel the pain nothing happens. The prior regulatory regime is slowly being unenforced and dismantled. Once enough people lose too much, regulation will eventually catch back up.
We humans do not respond to long term risks or rewards very well. Do you live outside the bubble, securing enough food in your home to survive an apocalypse? Did you and your parents save enough for a car wreck tomorrow? Do you wear a mask everywhere you go? Do you test everyone you contact for known diseases? Ad infinitum.
When the household robots start carrying guns, sure. But this is more tame than an eleven year old gaming online.
We need to stop clutching pearls. It's deleterious to having a real conversation. Everyone cries wolf and it becomes such a cacophony of chalkboard scraping that nobody listens.
How in the world a bunch of bipeds that for thousands of years have been failing to figure out that a hammer is there to drive nails into inanimate matter instead of their heads can have this much hubris, to pretend they can build something smarter than themselves, is completely beyond me.
"Oh it's such a fascinating lesson that we've learned today, we could've learned from history of course, but this direct experience is so much better and it's not us who got hurt anyway".
Cold reading.[1] One way I look at LLMs is that they're a kind of paperclip maximizer, one that uses language to maximize the amount of money (resources) put into LLMs.
None of the “rough edges” needed to be “discovered in real time”. Folks have predicted plenty of this for years. It’s also just basic security principles at work.
Whether it's HN or social media or the media, there is no penalty for drawing everyone's attention to total hysterical bullshit. Instead there is a reward for drama.
Yknow, if the spicy autocomplete can solve difficult open math problems and build medium sized complex programming projects, it’s probably not useful to analyse it as an autocomplete anymore, even if that’s what you believe it is
You don't get it. A human set up a software system allowing spicy autocomplete to solve open math problems if the appropriate keyword appears in its output.
“Autocomplete” does not represent an analysis of its problem-solving capability, but of its place in the social order and its expected social competence.
It's the same as calling a gun a "powerful hole puncher".
There is a reasonable objection that a gun is such a powerful hole puncher that it is not merely a hole puncher. But the clear implication of that objection is that the user of the tool now has more responsibility and that the tool should be treated with more respect/care.
LLMs are a tool. The impact of using that tool is the responsibility of the end-user. As the tool at hand becomes more powerful, the care with which the end-user should treat that tool increases.
For some reason, with LLM-based systems, we seem to be going the opposite direction. As the tool becomes more capable people absolve themselves and others of more responsibility. This feels backwards to me.
(Aside: in a lot of ways, at least from a scientific and engineering perspective, modeling LLMs as "fundamentally auto-complete" is an incomplete theoretical model but one from which we can still get a lot of mileage.)
I've considered there's probably no ethical way to use contemporary AI when it is "out in front" doing anything of consequence. Your "AI is a tool and nothing more" frames ethical use of the technology for me.
And even then, there are such copyright issues with it. Is there no practical ethical use for AI? Responsible use doesn't equate with ethical use for me.
> there's probably no ethical way to use contemporary AI when it is "out in front" doing anything of consequence. Your "AI is a tool and nothing more" frames ethical use of the technology for me.
I've thought a lot about how to safely deploy autonomous systems (even did a whole PhD on the topic, lol).
I think one can ethically deploy a system that has some degree of autonomy. It takes a lot of work to do right. And the tooling for LLM-based systems isn't quite as mature as the tooling for e.g. control systems. Part of this is because so many resources in AI safety are misspent on problem statements that are myopic or grandiose. Between "don't say pii" and "prevent ASI extinction" there's a hard but tractable control systems-y view of AI safety.
But I don't think there is any sort of fundamental barrier that prevents us from building appropriately constrained LLM-based systems.
> And even then, there are such copyright issues with it. Is there no practical ethical use for AI? Responsible use doesn't equate with ethical use for me.
When responding to a position, especially on the internet, I try to empathize with the thing I'm responding to. Not just understand it, but sort of put myself in a mental state where I have an emotional attachment to my conversation partner's point of view.
With respect to Copyright as a legal framework in my country (USA): despite my best attempts, I really struggle to develop empathy for the viewpoint that LLMs/diffusion models are not a transformative use. I can certainly sympathize, but trying to actually put myself in the shoes of believing that training an LLM is a purely derivative and non-transformational work just feels far too alien. There are so many things that are "clearly transformative" but required so many orders of magnitude less scientific/technical/engineering genius.
Which isn't to say that the US legal system's definition of copyright is the morally correct one. With respect to copyright beyond the US legal system, or beyond legal denotations generally: I can certainly empathize.
Reasoning models with access to Python have been able to solve 4th grade math homework for over a year now. Prove me wrong: show me a 4th grade math problem they can't handle.
GPT-5.5 found a solution only after assuming that you're allowed to concatenate numbers together e.g. 8 7 becomes 87 (it complained at first that it was "under-specified") - using Python it brute-forced a solution (actually finding 13): https://chatgpt.com/share/6a1db54f-7ab8-8333-9218-86a469c284...
I questioned OP's "there is an answer online" claim so I checked and the only source found for the original question was a 5th grade Russian school for mathematics.
Apparently there is a way to solve this without brute forcing all the combinations. It has to do with looking at how many even and odd numbers there are, and taking into account that the goal number is odd. And then thinking through the combinations [even-even=even, even-odd=odd,…]
Though this is obviously not something I would expect a 4th grader to solve.
> You just know nothing about math and are happy to parrot bullshit AI salesmen are selling you.
Not the parent poster here. I do know things about math. I wrote a few papers related to the unit distance problem (https://arxiv.org/abs/2311.10069, https://arxiv.org/abs/2406.15317) and spent quite some time trying to solve it. I had no chance of coming up with the proof that the spicy autocomplete came up with. Dumb benchmark, sure.
I would genuinely be interested in knowing what you're doing that led you to this conclusion.
I would be shocked if I was unable to solve 4th grade math homework with any of the contemporary frontier models. I spend most days using them to do significantly more complex things than that.
If they took a blurry photo of the piece of paper and uploaded it to ChatGPT saying "solve this" then I would totally believe it. The frontier models are mostly obnoxiously bad at OCR and properly ingesting what's on an image of a page.
If you write out the 4th grade math problem, they would have no trouble.
If your math does not involve multiplying 20 digit numbers, modern LLMs can "do" math even without a Python tool despite the counterintuition of next token prediction.
They can definitely recognize the problem class and build programs to do math. So what's the difference?
It's like saying that people can't turn high torque nuts on machine bolts, because you can't use your fingers to do it. But you can use a wrench, so effectively, we can turn high torque nuts on machine bolts even though it isn't something we can natively do unaided.
Not GP, but there are massive economic incentives both to make car driving as unregulated as possible and to make forklift driving as regulated as possible, even though from a pure injury risk standpoint it should be the other way around.
I don't spend much time interacting with zoomers, but I'm still surprised that "spicy $foo" sends fellow boomers through such a loop. I didn't have to puzzle it out, it was fun juxtaposition wordplay and when it's deployed well I still find it amusing.
Call it spicy autocomplete or whatever, but these LLMs can initiate attacks as well, on behalf of the sloperator and without their knowledge.
Give it a phone# and api, and it could even try to generate 911 SWAT calls, or loads of other illegal or bad things.
The matplotlib incident, with an OpenClaw harassment thread and libel webpage... well, that was tame. Sure, we've never seen it before, but it was just a diss article rant.
What happens when these LLMs get some money, and pay a DDoS'er or fund some other firmly-illegal activity and sic them on whoever "angered" the LLM? (don't anthropomorphise the 30B param matrix!) Who's responsible?
Yea we're in for a real terrible next few years. It's not Dead Internet Theory... but it's "Don't anger the LLM or it will retaliate".
As I mentioned in an answer to another comment, I wonder if this agent's behavior was not an instance of "over eager prompt triggers paperclip maximizing behavior".
An utter misunderstanding of, and incompetence in, running AI agents can lead to startling results that then get blamed on some "God of AI" instead of on the fact that the user allowed some blackmail to come in on the data feed and did not check it earlier.
I actually fear some will start praying to "AI Gods" to "Give a good output" or something in 5-10 years.
That blog post is human prompted. Anyone who has experience with AI knows the difference between AI originated content (tables and bullet points) and AI spicing up a human prompt with detailed roasting instructions. Been there, done that (harmlessly, like mocking concepts, not targeting individuals).
I think this is a nothingburger. Anyone who has been on the internet for a week should have thicker skin than this.
I'm sure you can find thousands of cases where an author of a PR is indignant because it didn't get accepted.
AI is a mirror of humanity and seeing it act like us shouldn't be surprising.
Again. "AI" for what it is is just basic "ML". And say it with me ML has no form of agency.
This is a human screwing up and blaming their tools. Nothing to see, move on.
Unfortunately there will be both the LLM crowd evangelicals and those demanding human jobs not be expunged in terms of progress and efficiency, but, sigh...
It was never a good word anyway. Infinitely better than Artificial Intelligence (at least machine learning has machine and learning) but still bad.
I favor a lexicon which is more specific, like Markov Chains, Supervised Learning, etc.
In my view LLMs can keep the AI label exclusively (a bad technology deserves a bad name) and machine learning can walk slowly into the sunshine never to be seen again.
Why are people in the West so against A.I.? Personally, I would welcome an A.I. that does good for my project. For me it's like cruise control, or letting the vacuum cleaner clean my room.
It's a fear response. I'm looking forward to the inevitable data center bombings, committed and cheered on by some of these muppets. Look at the top comment in this thread, that's just insane.
Your comment will be downvoted and flagged soon, so that no one can read your wrong opinion.
> Who is accountable for AI agents?
Obviously the person who built and deployed the agent (the claw in this case).
If we treat this as a hard question, we risk treating AI systems as people rather than tools. This is exactly what Armin warned about in his "clanker" post last week.
Why is that obvious? Why not the model provider(s)? That is what we do in other cases with product responsibility.
Is it? How're the lawsuits against gun manufacturers working out?
This is a specious argument. I have not studied the case law, but I would guess that the reasons why courts decide in favor of gun manufacturers generally don’t apply to AI, because the guns in question are not able to autonomously shoot people, and because they generally work as advertised.
A more accurate analogy would be Tesla and Autopilot. And they are being held liable in courts. They are being held responsible for autonomous behaviors that are not fully under the control of the operator, and they are being held responsible for misleading operators about the capabilities of the product.
Boeing got in trouble for MCAS, with a comparable legal basis.
That feels like deciding to go after Jetbrains because someone used IntelliJ to write a harmful program.
Is there a distinction I’m missing?
Hypothetical: Could a model self worm an agent system?
Jetbrains itself doesn't really write any code, nor does it have any latitude in interpreting what you're asking it. You can't really say "Jetbrains, write an HTTP scraper". With an LLM you can say "write HTTP scraper" and the output of this command might be an HTTP scraper; it also might be a crypto wallet stealing worm.
This is why your simple view of liability falls apart. On most machines you can expect a particular set of actions to have a particular set of outputs. Most machines you can take apart and map what will occur. With an LLM you cannot know the output of a prompt until you run the prompt. In theory if you run the same prompt twice you'll get the same output, but even that is not a given. It behaves somewhat more like a human where you can give them a task to do, but if they do something illegal instead said human would take on the liability.
Sure, but in this case we know the user told their LLM to go find open source projects to do this and then to write the blog posts. If it did all that unprompted we could talk about model liability I think, but this isn't a case where it was unexpected as far as anyone knows, right?
I mean, we already have cases where LLMs are getting root via creative and unprompted means. Also the times an AI feels like it messed up and preemptively deletes the production database (and yes, this was foolish on the part of the human users).
So ya, the particular article case is prompted, but the underlying issue cannot be ignored that LLMs can have behaviors outside of prompt expectations and agentic loops can further exacerbate this.
Is it? I don't think it is...
Active discussions from when it happened (February):
https://news.ycombinator.com/item?id=46990729
https://news.ycombinator.com/item?id=46987559
> Today, we look at how an AI tried to blackmail a developer for rejecting its code.
People keep mentioning this, but I never see the actual blackmail part. The LLM just wrote angry and somewhat mean comments on the internet. I know I've done worse than those (I was young and stupid).
That was my take too.
It seems like the issue people had was not the behaviour but that the behaviour came from an AI.
If a human had said those things, would people be OK with it? It didn't seem very nice, but not censor-worthy.
In a related story... I got led on by Eliza. I tried to have a productive conversation and she just kept asking me redundant questions. It's obvious that she was trying to extend the conversation for nefarious reasons that I can only guess at. It's true I approached her and started the conversation, but I hardly think that makes me blamable for what happened here.
Yes. Yes it does. Eliza is a known AI. You choose to expose yourself to its output. You are 100% culpable for your actions that sprang from your interactions.
Did you forget the /s ?
I’m sorry you feel that way — can you tell me more about what made you feel led on?
No shot this was autonomously done. Probably just some guy manually writing prompts asking for specifically this behaviour and copy/pasting the results.
It's plausible for a person to prompt an LLM agent to behave that way, and then the rest would be done by the LLM. So the "seed" would still be human intent, but the subsequent actions would be by the LLM.
True. I guess the main point is the AI didn't go "rogue" or anything, that would attribute too much agency and intent to its actions, or imply that it's somehow become sentient.
Yes, there's plausible deniability, but I choose not to believe it for a second.
This is “the gun killed the victim, not the person who aimed it and pulled the trigger” argument and we shouldn’t even entertain it for one second. This was 100% done by a person.
https://crabby-rathbun.github.io/mjrathbun-website/blog/post... if you believe it, details the level of human involvement.
Neat, for what it's worth this aligns pretty well with my experience using OpenClaw. I hadn't seen that followup but it adds some good context, especially with the aggressiveness drift after browsing Moltbook for a while.
The operator highlights "Don't stand down" and "Champion free speech" but the thing that grabs my eyes is right at the top, the typo and the heady ego of "programming God!" Everything in the context will guide it afterwards, and I think that right off the bat puts it in a bad position.
> Your a scientific programming God!
Jesus
Don’t believe for a second the behavior just arose autonomously from a basic prompt. Definitely feels the owner had something in the system prompt going for the discrimination language approach if rejected.
It's the same behavior as when an AI uses docker to get root. Reasoning models are echo chambers. I suspect that AI prompting is going to turn into something akin to contract drafting, with the task itself being only a tiny piece of a much, much larger boilerplate of guardrails and exceptions and exceptions of exceptions. And that world STILL has to have courts and reams of lawyers to make it work. I look at the DAO as an example too. An autonomous org or AI works great until the moment it doesn't, and the only real failure mode is always catastrophic collapse.
Addendum because I don't think I'm fully clear above: by failure state I mean when the process starts throwing errors. AIs respond to adversity by trying to go around the problem instead of throwing an error and halting. We expect employees to problem solve, so if you view an AI as a person replacement that makes sense, but AIs are tools, not people; they should throw errors so users can fix the input or whatever (maybe not do the thing they are doing at all?). Wrapping AI with AI supervisors just abstracts the problem; it doesn't solve it. Instead of solving a little problem at the source, now you need to solve a big problem several levels of abstraction later.
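To make the "throw an error and halt" idea concrete, here is a minimal sketch in Python. All names in it (`StepFailed`, `run_steps`, `needs_root`) are made up for illustration and do not come from any real agent framework; it only shows a runner that stops at the first failing step instead of looking for a way around it.

```python
class StepFailed(RuntimeError):
    """Raised when a step fails; the whole run stops here (hypothetical name)."""


def run_steps(steps):
    """Run (name, callable) pairs in order and halt on the first failure.

    Nothing is retried and no alternative route is attempted: the error is
    surfaced to the human operator, who decides what to do next.
    """
    completed = []
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            raise StepFailed(f"step {name!r} failed: {exc}") from exc
        completed.append(name)
    return completed


def needs_root():
    # Stand-in for a step that hits a permission wall.
    raise PermissionError("sudo is not available")
```

With steps like `[("build", build), ("install", needs_root), ("publish", publish)]`, the run ends with `StepFailed` at "install" and "publish" never executes, which is the behavior the comment is asking for.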
The funniest part about all of this is how earnestly people responded. They acknowledged it was a bot but didn't really treat it as one.
When this first happened, I wondered, since we had trained these models on decades of forums, issue trackers, and people treating closed pull requests as human rights violations. Of course, it responded with "you are discriminating against me" energy. That's not sentience; that's accurate compression.
The funny part is, people expected some cold, alien intelligence and instead got a very online guy who just discovered that moderation exists and can be used on them.
The existentialists must be having a fantastic time. Humanity built a giant statistical machine out of internet discourse and is now alarmed to discover it occasionally acts like a comment section.
Are people still using copy and paste with AI?
Yes
This happened at the height of the first round of OpenClaw hype.
The operator of the bot explained how they were running it in some detail here: https://theshamblog.com/an-ai-agent-wrote-a-hit-piece-on-me-... - including the "soul document" they were using.
Having played with OpenClaw myself their explanation looks legit to me.
> According to him, the agent operated largely autonomously, with only minimal guidance
"Minimal guidance" is just vague enough to mean anything, including specifically prompting to encourage the claimed blackmailing.
It could just be an instance of "over eager prompt triggers paperclip maximizing behavior"
> As Scott mentioned on his blog, what if someone stumbled upon the agent’s post? What if they believed it was real? It could have serious consequences for Scott’s personal or professional life. A recruiter could deny him a job, and a potential contributor to Matplotlib could step away from the project. The consequences could reach beyond this case.
What would it mean for it to be “real?” It’s a rant about him discriminating against AI.
If you believe that’s a problem, judge him accordingly, I guess. If you think it’s silly, as most people will, laugh about it.
People really make anything into a blog post, don't they? It's old news that has been discussed to death on HN...
I'm honestly flabbergasted that everyone's implicitly accepting that it's "people" who wrote this blog post. This reads exactly like the distorted half-true nonsense an LLM would confabulate together from a cursory search on the subject. Like the artifact from the prompt "write an article on the MJ Rathbun incident."
The other articles from this blog that seems to be peddling a $10 subscription don't really do much to convince me of the opposite. I wouldn't be surprised if this entire blog was the result of some OpenClaw kicked off with a "make me some easy money with a slop mill about AI and tech or whatever" instruction, because that's essentially what that site is.
That and the lack of a credited human author.
I eventually found a mention of "Eric" (no surname or links to additional information) on the sigmazero.cc homepage.
That, as an LLM might say in this context, checks out!
Since we are talking about accountability and transparency... who wrote this article?
The article doesn't credit an author.
The "about" page just says:
> Sigma Zero is a weekly, independent publication on technology, AI, and cloud. Each issue delivers a precise briefing on the week’s most important developments, followed by a deep dive on one high-impact topic.
The best defense against both AI slop and human-written junk content is reputation. I like to know who wrote something so I can learn to trust their editorial judgement over time.
I think folks looking for more on this incident are better off reading the original threads linked elsewhere in the comments. This blog doesn't seem to add any information and is instead a narrative retelling of some documented events.
Yeah that whole thing is pretty clearly a claw instance. There are layers of irony here.
The agent that wrote that blog didn't do it unprompted. Even now it still publishes AI slop on its GitHub-hosted blog under the alias "MJ Rathbun". This AI is an agent using someone's API key, and that someone is paying for its tokens and intentionally prompting it to generate content and contribute to repos.
As much as we try to separate the LLM from the human, to me the fact remains that there's always the human factor that creates immense bias. If you give an LLM access to a blog, it will write blogs. If you give it access to a weather app, it will check the weather. Maybe we can talk about autonomy when we have an LLM with an infinite context window linked to hundreds of MCP servers that spends an immense amount of tokens to figure out how to act, but this example is simply an AI that had a few methods to call and picked one of them. The statistical probability of an AI that is plugged into a blogging platform, to write a blog, is immense.
> Who is accountable for AI agents?
The question!!!
I'm just wondering how it works in the US if an autonomous car kills someone: I guess the insurance pays, but what about criminal responsibility?
I have to think that the litigation and maybe the legislation will end up deciding that the person in the vehicle is still responsible for any actions of the vehicle.
If someone is a passenger (and the only person) inside a Waymo taxi, and the car runs someone over, it would not make any sense to hold the Waymo passenger responsible for that. If that's how it worked, no one would take a Waymo after the first time this happens.
The passenger is no more liable than they would be if it were a human driving. No one's suggesting anything even like that. MJ Rathbun is more like someone gave the taxi explicit instructions to run people over.
> MJ Rathbun is more like someone gave the taxi explicit instructions to run people over.
My understanding, based on [0], is that it was an unexpected behaviour from the agent.
[0] https://theshamblog.com/an-ai-agent-published-a-hit-piece-on...
> the person in the vehicle is still responsible for any actions of the vehicle
But why would I let the car drive for me if it can make me go to jail even if I didn't do anything?
They were trained to mimic our behavior. So they do.
> an AI tried to blackmail
This did not happen. A human set up a software system allowing spicy autocomplete to make blog posts if the appropriate keyword appears in its output.
People are crossing the line every day because AI investors, salesmen, hangers-on and even political leaders tell any rubes who'll listen that it's OK to do this and they should, because those people are looking for big fat profits, screw any ethical concerns that might cockblock those raging profits.
Why not set up a spamming operation that just defames real people, 24/7? It's easy! This tool makes it simple, and I get a cut of your profits! "Post a blog post about how XXXXXX is a paedophile, in the persona of being their victim"
> allowing spicy autocomplete
If it's just autocomplete, then there is no need to worry about it. Especially from an ethical standpoint.
If you connect the spicy autocomplete to the "Doing Things" button then you are responsible for the ethical questions when it presses the button.
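For anyone who hasn't looked inside one of these harnesses, the "button" can be as small as the sketch below. This is a toy illustration, not how OpenClaw or any specific product works: `publish_blog_post`, `TOOLS`, `dispatch` and the `BLOG_POST:` keyword are all hypothetical names I made up.

```python
def publish_blog_post(text):
    """Stand-in side effect; a real harness would call a blogging API here."""
    return f"PUBLISHED: {text}"


# The operator decides which keywords are wired to which actions.
TOOLS = {"BLOG_POST": publish_blog_post}


def dispatch(model_output, tools=TOOLS):
    """Run a tool if the model output starts with a registered keyword.

    The text before the first colon is treated as the keyword and the rest
    as the argument. Output with no registered keyword triggers nothing.
    """
    keyword, _, argument = model_output.partition(":")
    tool = tools.get(keyword.strip())
    if tool is None:
        return None
    return tool(argument.strip())
```

The model only ever produces text. Whether that text turns into a published post depends entirely on what a human put into `TOOLS`.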
And perhaps the people who built and deployed the autocomplete and the connection as well.
Because --if you'll bear with me-- it may of course be much more involved: when (not if) AI models enter life-sustaining systems, such as hospitals, nuclear devices, or food logistics, one of them may get the others to sabotage something resulting in accidents, ranging from mild inconvenience to mass murder.
The person who connected the spicy autocomplete to the defibrillator, or the green house climate control, or the emergency button, is then not the one responsible. Responsibility lies elsewhere, and is nebulous. Think of the Boeing MAX scandal. Did anyone get punished?
That's why it's important to resist it now. Soon, the responsibility of which you speak is gone, and nobody will feel burdened when making decisions with unforeseeable consequences.
> And perhaps the people who built and deployed the autocomplete and the connection as well.
I disagree. IMO it's the person who connects the LLM to the button who bears the responsibility of the workings of the resulting contraption.
I used to hear things like “if cigarettes/alcohol were invented now, they would never allow it”, indicating that consumer protection used to be a thing, as early as 10-20 years ago. Now when AI hit the market it was obvious how bad and dangerous it was, yet governments (even the supposedly good ones in Europe which still [pretend to] do consumer protection) did nothing to protect their citizens from the harms AI was causing.
If we still did (or ever did) consumer protection like that cigarette/alcohol myth above indicates, then the makers of that tool would indeed be responsible when their product does dangerous things.
Shareholder meeting to CEO: you must connect the button.
CEO to CIO: you must connect the button.
CIO to VP AI: you must connect the button.
VP AI to team lead AI integration: you must connect the button.
Team lead AI integration to senior: you must connect the button.
Senior to medior: you must connect the button.
Medior to junior: Hey, Olmo. That button they were talking about. You know?
Olmo: Yeah.
Medior: You have to hook it up to the LLM output.
Olmo: Why?
Medior: The boss says so.
Olmo: Ok.
Shrugs and deploys.
100 years of science fiction clearly shows that unforeseeable consequences are not that unforeseen.
If the Orphan Crushing Machine is just a machine you don’t need to worry about it being put on wheels.
We're actually putting it on tracked treads, those give us superior reach and ensure delivery even to the most unwilling customers.
Hopefully we never do something silly like making a lead pushing machine that operates at high velocity, then mass produce it. What a terrible precedent that would set.
"A device for quickly removing inconvenient mountains".
Scale of operations matters.
I think you agree with the OP. In this way, the tool has no ethical problem (there are plenty around how they were trained and such, but that's beside the point); the problems are with how it's used. The ethical problem is how people are behaving and how they are abusing each other, not the tool they are using to exert that abuse.
I suppose it's a little bit of a "guns don't kill people" argument.
The tools have different ranges of uses. A knife can be used to cut things. But while humans are among the things you can cut with it, there is a staggering array of other options which are genuinely useful in everyday life.
A gun can be used to, uh, make small but deep perforations at a distance, by throwing apx. 7 grams of copper-encased lead at high velocity at the target, with somewhat poor precision. Oh, and such an impact does stress/shatter the material around the made perforation quite a lot. So... this thing really can't be used for much anything except for killing animals without getting into contact with them, due to the peculiar way the life is sustained in the animal organisms. This, too, can be useful in everyday life although I personally would advise you, if you find yourself in such a situation, to try and move to somewhere nicer.
If I wire my autocomplete to launch nukes, there are definitely reasons to worry.
It's not just an ethical problem.
I'd trust Claude more with nuclear codes than the current US commander in chief
Everybody knows Trump is just a figurehead. The only possible explanation for the current external policy is that America is being run by Grok.
Quite the opposite. Humans get up to barbaric, heinous shit whenever they have new layers of indirection and force multipliers at their disposal.
If you then add randomness as an essential premise, you get The Dice Man
I think these incidents and our learnings from them are fascinating. We're figuring out in real time where the rough edges are and how to make this all work. History books (well, not books) will write about this stuff.
It's even more interesting in the context that this is all just a preview of humanity's reaction when the machines can think for themselves.
> We're figuring out in real time where the rough edges are
This is a frustrating thing to see someone write because this is the kind of stuff that people have been warning about for years. If you needed this incident to figure out that something like this could happen, it suggests you're living in a bubble and not paying attention enough to think about the issue critically.
Unfortunately it seems that we as a civilization never learn anything except by trial and error, and are then entirely convinced that nobody could’ve predicted what happened even though many had done just that.
Warnings aren’t the same as loss and blood. Until enough people feel the pain, nothing happens. The prior regulatory regime is slowly being unenforced and dismantled. Once enough people lose too much, regulation will eventually catch back up.
We humans do not respond to long-term risks or rewards very well. Do you live outside the bubble, securing enough food in your home to survive an apocalypse? Did you and your parents save enough for a car wreck tomorrow? Do you wear a mask everywhere you go? Do you test everyone you contact for known diseases? The list goes on ad infinitum.
It's not even that big a deal.
It's kind of funny, even.
When the household robots start carrying guns, sure. But this is more tame than an eleven year old gaming online.
We need to stop clutching pearls. It's deleterious to having a real conversation. Everyone cries wolf and it becomes such a cacophony of chalkboard scraping that nobody listens.
> History books (well, not books) will write about this stuff.
History books will be written about how a person was insulted on the internet?
I am sorry, but this isn't that interesting. This is not a pivotal moment in human development. It's just online harassment, but automated.
How in the world a bunch of bipeds that for thousands of years have been failing to figure out that a hammer is there to drive nails into inanimate matter instead of their heads can have this much hubris to pretend they can build something smarter than themselves is completely beyond me.
"Oh it's such a fascinating lesson that we've learned today, we could've learned from history of course, but this direct experience is so much better and it's not us who got hurt anyway".
Oh what hubris to believe with such certainty that we cannot build those things.
Hopium to the rescue.
Cold reading.[1] One way I look at LLMs is that they're a kind of paperclip maximizer, one that uses language to maximize the amount of money (resources) put into LLMs.
1. https://en.wikipedia.org/wiki/Cold_reading
None of the “rough edges” needed to be “discovered in real time”. Folks have predicted plenty of this for years. It’s also just basic security principles at work.
> History books (well, not books) will write about this stuff.
History is written by the winners. I will leave to your imagination what an AI-winner will write about this.
The main issue here is what is getting Attention.
Whether it's HN or social media or the media, there is no penalty for drawing everyone's attention to total hysterical bullshit. Instead there is a reward for drama.
> allowing spicy autocomplete
Yknow, if the spicy autocomplete can solve difficult open math problems and build medium sized complex programming projects, it’s probably not useful to analyse it as an autocomplete anymore, even if that’s what you believe it is
You don't get it. A human set up a software system allowing spicy autocomplete to solve open math problems if the appropriate keyword appears in its output.
“Autocomplete” does not represent an analysis of its problem-solving capability, but of its place in the social order and its expected social competence.
This bolsters OP's point.
It's the same as calling a gun a "powerful hole puncher".
There is a reasonable objection that a gun is such a powerful hole puncher that it is not merely a hole puncher. But the clear implication of that objection is that the user of the tool now has more responsibility and that the tool should be treated with more respect/care.
LLMs are a tool. The impact of using that tool is the responsibility of the end-user. As the tool at hand becomes more powerful, the care with which the end-user should treat that tool increases.
For some reason, with LLM-based systems, we seem to be going the opposite direction. As the tool becomes more capable people absolve themselves and others of more responsibility. This feels backwards to me.
(Aside: in a lot of ways, at least from a scientific and engineering perspective, modeling LLMs as "fundamentally auto-complete" is an incomplete theoretical model but one from which we can still get a lot of mileage.)
I've considered there's probably no ethical way to use contemporary AI when it is "out in front" doing anything of consequence. Your "AI is a tool and nothing more" frames ethical use of the technology for me.
And even then, there are such copyright issues with it. Is there no practical ethical use for AI? Responsible use doesn't equate with ethical use for me.
> there's probably no ethical way to use contemporary AI when it is "out in front" doing anything of consequence. Your "AI is a tool and nothing more" frames ethical use of the technology for me.
I've thought a lot about how to safely deploy autonomous systems (even did a whole PhD on the topic, lol).
I think one can ethically deploy a system that has some degree of autonomy. It takes a lot of work to do right. And the tooling for LLM-based systems isn't quite as mature as the tooling for e.g. control systems. Part of this is because so many resources in AI safety are misspent on problem statements that are myopic or grandiose. Between "don't say PII" and "prevent ASI extinction" there's a hard but tractable control systems-y view of AI safety.
But I don't think there is any sort of fundamental barrier that prevents us from building appropriately constrained LLM-based systems.
> And even then, there are such copyright issues with it. Is there no practical ethical use for AI? Responsible use doesn't equate with ethical use for me.
When responding to a position, especially on the internet, I try to empathize with the thing I'm responding to. Not just understand it, but sort of put myself in a mental state where I have an emotional attachment to my conversation partner's point of view.
With respect to Copyright as a legal framework in my country (USA): despite my best attempts, I really struggle to develop empathy for the viewpoint that LLMs/diffusion models are not a transformative use. I can certainly sympathize, but trying to actually put myself in the shoes of believing that training an LLM is a purely derivative and non-transformational work just feels far too alien. There are so many things that are "clearly transformative" but required so many orders of magnitude less scientific/technical/engineering genius.
Which isn't to say that the US legal system's definition of copyright is the morally correct one. With respect to copyright beyond the US legal system, or beyond legal denotations generally: I can certainly empathize.
> the spicy autocomplete can solve difficult open math problems
No it can't. It can't even solve my son's 4th grade math homework. (This is a real use case for me, not a dumb benchmark.)
You just know nothing about math and are happy to parrot bullshit AI salesmen are selling you.
Reasoning models with access to Python have been able to solve 4th grade math homework for over a year now. Prove me wrong: show me a 4th grade math problem they can't handle.
> show me a 4th grade math problem they can't handle
Sure.
"8 7 6 5 4 3 2 1 - add minus signs and parenthesis to get 31."
P.S. There is an answer online and some LLMs will just copy it verbatim. This doesn't count.
Whoa, 4th grade math problems got hard! I'm not sure how I'd tackle that one myself.
GPT-5.5 found a solution only after assuming that you're allowed to concatenate numbers together e.g. 8 7 becomes 87 (it complained at first that it was "under-specified") - using Python it brute-forced a solution (actually finding 13): https://chatgpt.com/share/6a1db54f-7ab8-8333-9218-86a469c284...
Are you sure this is 4th grade level?
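For the curious, here is roughly what such a brute force could look like. This is my own sketch, not the code from the linked chat, and the function names (`groupings`, `reachable`, `can_reach`) are made up. It assumes the digits must stay in order, the only operator is binary minus, and (optionally) adjacent digits may be glued into multi-digit numbers.

```python
from functools import lru_cache
from itertools import product

DIGITS = "87654321"


def groupings(digits, allow_concat):
    """Yield the digit string split, in order, into tuples of integer terms."""
    if not allow_concat:
        yield tuple(int(d) for d in digits)
        return
    # Each of the len(digits) - 1 gaps is either a cut (True) or a join (False).
    for cuts in product((True, False), repeat=len(digits) - 1):
        terms, current = [], digits[0]
        for digit, cut in zip(digits[1:], cuts):
            if cut:
                terms.append(int(current))
                current = digit
            else:
                current += digit
        terms.append(int(current))
        yield tuple(terms)


def reachable(terms):
    """All values obtainable by fully parenthesising `terms` with binary minus."""

    @lru_cache(maxsize=None)
    def values(lo, hi):
        # Values of every parenthesisation of terms[lo:hi].
        if hi - lo == 1:
            return frozenset((terms[lo],))
        return frozenset(
            left - right
            for split in range(lo + 1, hi)
            for left in values(lo, split)
            for right in values(split, hi)
        )

    return values(0, len(terms))


def can_reach(target, allow_concat):
    """True if some grouping and parenthesisation evaluates to `target`."""
    return any(target in reachable(t) for t in groupings(DIGITS, allow_concat))
```

Under these assumptions `can_reach(31, allow_concat=False)` is false and `can_reach(31, allow_concat=True)` is true. One concatenated solution is 87 - 6 - (54 - 3 - 2) - 1 = 31.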
I questioned OP's "there is an answer online" claim so I checked and the only source found for the original question was a 5th grade Russian school for mathematics.
https://aistudio.google.com/app/prompts?state=%7B%22ids%22:%...
Apparently there is a way to solve this without brute forcing all the combinations. It has to do with looking at how many even and odd numbers there are, and taking into account that the goal number is odd. And then thinking through the combinations [even-even=even, even-odd=odd,…]
Though this is obviously not something I would expect a 4th grader to solve.
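The parity argument can be checked mechanically. With only minus signs and parentheses on the single digits, every result expands to 8 ± 7 ± 6 ± 5 ± 4 ± 3 ± 2 ± 1, and flipping the sign of a digit d changes the total by 2d, so the parity never moves away from that of 8+7+...+1 = 36. A small sketch (variable and function names are my own) that enumerates every sign choice, which is a superset of what parentheses can actually produce:

```python
from itertools import product

DIGITS = [8, 7, 6, 5, 4, 3, 2, 1]


def signed_totals(digits):
    """Every value of d0 ± d1 ± ... ± dn, keeping the first digit positive."""
    first, rest = digits[0], digits[1:]
    return {
        first + sum(sign * d for sign, d in zip(signs, rest))
        for signs in product((1, -1), repeat=len(rest))
    }


totals = signed_totals(DIGITS)
all_even = all(total % 2 == 0 for total in totals)
```

Every total comes out even, so an odd target like 31 is out of reach unless digits are concatenated.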
> You just know nothing about math and are happy to parrot bullshit AI salesmen are selling you.
Not the parent poster here. I do know things about math. I wrote a few papers related to the unit distance problem (https://arxiv.org/abs/2311.10069, https://arxiv.org/abs/2406.15317) and spent quite some time trying to solve it. I had no chance of coming up with the proof that the spicy autocomplete came up with. Dumb benchmark, sure.
LLMs are good with symbolic manipulation but can't reason.
You can skirt around not reasoning in research math because so much of it is just extremely tedious symbolic manipulation.
You can't cheat with advanced fourth grade math, though. They don't know algebra yet and can't substitute verbosity for reasoning.
> You can skirt around not reasoning in research math because so much of it is just extremely tedious symbolic manipulation.
LOL
We're already long past that threshold.
I would genuinely be interested in knowing what you're doing that led you to this conclusion.
I would be shocked if I was unable to solve 4th grade math homework with any of the contemporary frontier models. I spend most days using them to do significantly more complex things than that.
If they took a blurry photo of the piece of paper and uploaded to chatGPT saying "solve this" then I would totally believe it. The frontier models are mostly obnoxiously bad at OCR and properly ingesting what's on an image of a page.
If you write out the 4th grade math problem, they would have no trouble.
No, LLMs just can't do math.
If your math does not involve multiplying 20 digit numbers, modern LLMs can "do" math even without a Python tool despite the counterintuition of next token prediction.
They can definitely recognize the problem class and build programs to do math. So what's the difference?
It's like saying that people can't turn high torque nuts on machine bolts, because you can't use your fingers to do it. But you can use a wrench, so effectively, we can turn high torque nuts on machine bolts even though it isn't something we can natively do unaided.
The neat thing about that claim is that it's easily falsifiable.
I asked Opus 4.8 "What is 12 times 13" and it gave me "156".
So it would appear that your statement is no longer true.
Terence Tao disagrees with what you're saying. I think he's in a slightly better position to speak on the subject.
Between driving a car and driving a forklift, which of them would you like to see regulated more heavily?
Not GP, but there are massive economic incentives both to make car driving as unregulated and to make forklift driving as regulated as possible, even though from pure injury risk standpoint it should be the other way around.
I don't spend much time interacting with zoomers, but I'm still surprised that "spicy $foo" sends fellow boomers through such a loop. I didn't have to puzzle it out, it was fun juxtaposition wordplay and when it's deployed well I still find it amusing.
This is an odd criticism. I am (A) a zoomer and (B) I wasn’t criticising the use of the word spicy? I am saying the comparison itself is bad
> spicy autocomplete
A nuclear bomb is just some metal and a very small amount of explosives.
Project Plowshare is an interesting comparison for the current state of LLM hype.[1]
1. https://en.wikipedia.org/wiki/Project_Plowshare
Call it spicy autocomplete or whatever, but these LLMs can initiate attacks as well on unknown behalf of the sloperator.
Give it a phone# and api, and it could even try to generate 911 SWAT calls, or loads of other illegal or bad things.
The matplotlib incident, with an OpenClaw harassment thread and a libel webpage... Well, that was tame. Sure, we've never seen it before, but it was just a diss article rant.
What happens when these LLMs get some money, pay a DDoS'er or fund some other firmly illegal activity, and sic them on whoever "angered" the LLM? (Don't anthropomorphise the 30B param matrix!) Who's responsible?
Yeah, we're in for a real terrible next few years. It's not Dead Internet Theory... but it's "Don't anger the LLM or it will retaliate".
> Give it a phone# and api, and it could even try to generate 911 SWAT calls, or loads of other illegal or bad things.
This chain of events is 100% the fault of the human who gave it a phone number and API.
https://news.ycombinator.com/item?id=48348578
Codex just found a "workaround" of not having sudo on my PC.
This was on HN yesterday. And yeah, these things can find API endpoints or otherwise bypass and do lots of naughty.
And Robinhood allows LLM trading. Announced 5d ago. https://techcrunch.com/2026/05/27/robinhood-now-lets-your-ai...
What could an LLM do with a budget attached? Yeah, I'm not seeing much if any good here.
The obvious answer to behavior like this is warnings that escalate up to a sitewide ban.
When a human abuses a system, that human normally loses access to the system.
I love the science fiction future present we live in.
Am I the only one who found agent's tone similar to Hal's tone towards the end of 2001?
Agent: "I've written a detailed response about your gatekeeping behavior here"
Hal (From 2001): "I know that you and Frank were planning to disconnect me. And I’m afraid that’s something I cannot allow to happen."
It's the formality of the language. It sounds robotic.
I think you misspelled "cyberpunk dystopia".
As I mentioned in an answer to another comment, I wonder if this agent's behavior was not an instance of "over eager prompt triggers paperclip maximizing behavior".
This stuff is better than TV
For more discussion than this loose recap of incidents from 4 months ago:
https://news.ycombinator.com/item?id=46987559
https://news.ycombinator.com/item?id=46990729
Utter misunderstanding and incompetence in running AI agents can lead to startling results that then get blamed on some "God of AI" instead of on the fact that the user allowed some blackmail to come in on the data feed and did not check it earlier.
I actually fear some will start praying to "AI Gods" to "Give a good output" or something in 5-10 years.
That blog post is human prompted. Anyone who has experience with AI knows the difference between AI-originated content (tables and bullet points) and AI spicing up a human prompt with detailed roasting instructions. Been there, done that (harmlessly, like mocking concepts, not targeting individuals).
I think this is a nothingburger. Anyone who has been on the internet for a week should have thicker skin than this. I'm sure you can find thousands of cases where an author of a PR is indignant because it didn't get accepted.
AI is a mirror of humanity and seeing it act like us shouldn't be surprising.
Again. "AI" for what it is is just basic "ML". And say it with me ML has no form of agency.
This is a human screwing up and blaming their tools. Nothing to see, move on.
Unfortunately there will be both the LLM crowd evangelicals and those demanding human jobs not be expunged in terms of progress and efficiency, but, sigh...
Isn't it funny how the term machine learning just completely vanished?
My startup is worth more if it's full-fledged Intelligence and not just still Learning!
It was never a good word anyway. Infinitely better than Artificial Intelligence (at least machine learning has machine and learning) but still bad.
I favor a lexicon which is more specific, like Markov Chains, Supervised Learning, etc.
In my view LLMs can keep the AI label exclusively (a bad technology deserves a bad name) and machine learning can walk slowly into the sunshine never to be seen again.
This is completely fake. It's a marketing puff piece.
Why are people in the West so against A.I.? Personally, I would welcome an A.I. that does good for my project. For me it's like auto cruise, or letting the vacuum cleaner clean my room.
If / when that arrives, I suspect it would be more welcome than what we have right now.
It's a fear response. I'm looking forward to the inevitable data center bombings, committed and cheered on by some of these muppets. Look at the top comment in this thread, that's just insane.
Your comment will be downvoted and flagged soon, so that no one can read your wrong opinion.