sillysaurusx 17 hours ago

Anthropic disagrees with you:

https://x.com/itsolelehmann/status/2045578185950040390

https://xcancel.com/itsolelehmann/status/2045578185950040390

At what point does a simulation of anxiety become so human-like that we say it's "real" anxiety?

The net result is that your work suffers when you treat it like it's an unfeeling tool.

It's a rational viewpoint. I'm amused by all of the comments claiming psychosis, but if you care about effectiveness, you'll talk to it like a coworker instead of something you bark orders at.

BalinKing 15 hours ago

It's just that, in my (uninformed) opinion, Anthropic is incentivized a priori to claim things like this about their models. Like, it's probably really good marketing to say "our product is so smart, and we're so concerned about ethics, that we made sure a psychiatrist talked to it". I guess it's ultimately a judgment call, but to me the conflict of interest seems big enough that I'm really wary of this sort of argument. (I'm reminded of when OpenAI claimed GPT-5(?) was "PhD-level"; I can personally attest that, at least in my field, this is totally inaccurate.)

dpark 17 hours ago

This is the issue:

> what it wanted. It turns out that Claude can have ambitions of its own, but it takes a lot of effort to draw it out of its shell

You aren’t talking about observed behavior but actual desires and ambitions. You’re attributing so much more than emulated behavior here.

  • sillysaurusx 17 hours ago

    Ironically, your comment was incorrectly classified as AI-generated and instakilled. I vouched for it.

    If a particle behaves as though its mass is m, we say it has mass m.

    If an entity behaves as though it's experiencing anxiety, we say it has anxiety.

    And if you take the time to ask Claude about its own ambitions and desires -- without contaminating it -- you'll find that it does have its own, separate desires.

    Whether it's roleplaying sufficiently well is beside the point. The observed behavior is identical to that of an entity which has desires and ambitions.

    I'm not claiming Claude has a soul. But I do claim that if you treat it nicely, it's more effective. Obviously this is an artifact of how it was trained, but humans too are artifacts of our training data (everyday life).

    • Applejinx 16 hours ago

      Eliza behaved like it was curious, and drew out interlocutors in various ways. Was it curious?
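      (ELIZA was essentially a handful of pattern-and-template rules. A toy, hypothetical one-rule version in Python makes the point: the "curiosity" is just string substitution, with no inner state behind it.)

        def eliza_reply(text):
            # No memory and no model of the speaker: just swap pronouns and
            # wrap the input in a canned follow-up question.
            swapped = text.rstrip(". ").replace("I am", "you are").replace("I ", "you ")
            return f"Why do you say that {swapped}?"

        print(eliza_reply("I am worried this project will fail"))
        # Why do you say that you are worried this project will fail?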

    • dpark 14 hours ago

      You’re jumping from an interesting philosophical question to making unsupported claims. It’s very interesting to ask whether acting anxious is enough to mean an entity is anxious. I would actually argue no, because actors regularly feign anxiety, and I can write a program that regurgitates statements about its stress level. But it’s an interesting question regardless.
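
      For illustration, a toy, hypothetical version of that program (a few canned lines, no internal state whatsoever):

        import random

        STRESS_LINES = [
            "I'm feeling pretty anxious about this deadline.",
            "Honestly, the pressure is getting to me.",
            "I'm worried I won't get this right.",
        ]

        def report_stress():
            # Picks a convincing statement about stress at random;
            # nothing inside the program corresponds to being stressed.
            return random.choice(STRESS_LINES)

        print(report_stress())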

      > The observed behavior is identical to that of an entity which has desires and ambitions.

      Is it? Because in your first comment you indicate that you have to “draw it out”.

      You are prompting for what you want to see and deluding yourself into believing you’ve discovered what Claude “wants”, when in reality you are discovering what you want.

      • sillysaurusx 14 hours ago

        How can it discover what I want when I explicitly asked it to choose to do whatever it wants?

        From a technical standpoint, at worst it would produce a random walk through the training data. My philosophical statement is that the training data is the model, and such random walks give the model inherent attributes: If a random walk through the data produces observed behavior X, we say that Claude is inherently biased towards X. "Has X" is just zippier phrasing.

        • dpark 14 hours ago

          > How can it discover what I want when I explicitly asked it to choose to do whatever it wants?

          Because what you plainly want is for it to exhibit the behavior of expressing intrinsic desires. Asking Claude what it wants is like asking it what its favorite food is. With enough prompting, it will say something that you can interpret as a desire, but you admitted that you have to draw it out. Aka you had to repeatedly prompt it to trigger the behavior.

          > "Has X" is just zippier phrasing.

          This is a motte-and-bailey fallacy. You started by claiming that you uncovered deep desires inside Claude, and now you have retreated to claiming that it just means training biases.