What is meant by AI "safety"?

9 points by nonethewiser 2 years ago

This has been a major theme in the recent OpenAI drama. The rift between speed and safety. What does the AI board mean when they talk about safety? Is it basically content moderation? Not answering certain questions or offering services to certain entities?

I find there is a wide range of possible meanings. On one end, not marginalizing certain groups of people. On the other, ensuring an AGI doesn’t exterminate the human race. But we are so far from the latter that I’m not sure what the most pressing real safety concerns are at the moment.

upwardbound 2 years ago

The most truly important sub-sector of the AI Safety field is what's officially known as AI Existential Safety, and which OpenAI employees jokingly call "AI NotKillEveryoneism".

AI Existential Safety is devoted to reducing two risks:

    "x-risk", short for Existential Risk, which is the risk that AI leads to our extinction.

    "s-risk", short for Suffering Risk, which is the risk that AI subjugates us and leads us to Matrix-like outcomes where we can neither escape nor die.

For an intro to how x-risk scenarios can come about, read about the "paperclip maximizer" thought experiment. Succinct explanation here: https://news.ycombinator.com/item?id=38344675

nonethewiser 2 years ago

You have defined these sorts of safety concerns very clearly. Thank you.
This leads to the crux of the issue though - is that what the “more safety” camp is saying when they want to slow down AI development right now to ensure safety? Do they actually think we’re on the verge of that? Or is it lesser safety concerns, like not answering questions about cross-site scripting and not using training data with racial stereotypes baked in?
While the latter may be important, it doesn’t seem to constitute a reason to stop development. The x and s risks are reasons to stop development, but I can’t see a compelling reason to think we are remotely close to that.
- upwardbound 2 years ago
  
  > is the “more safety” camp saying to slow down AI development?
  This is an EXCELLENT question and the answer is very very nuanced. Warning: the definitions of the terms I'm about to use are actively evolving in current discourse.
  The following taxonomy reflects my own views, specifically with the inclusion of faction 4:
  ---
  There are basically four competing factions.
  (1) The "normal" faction, which includes Satya and almost all business people both in VC & on Wall Street. Normals say (through their actions and their investments, which both speak much louder than words) that we can deal with x-risk later, and right now let's make some money and continue life as normal. They focus their life's work on "buying a home", "saving for retirement", and maybe someday "giving back to their community", and other such comforting, familiar little platitudes of life as it was for our mom and dad.
  (2) The "decel" faction (short for "decelerate"), which includes most old-school AI safety folks such as Ilya, Helen, and Tasha. Sometimes you see these people with a "pause button" emoji or "stop sign" emoji in their Twitter name.
  (3) The "e/acc" faction (short for "effective accelerationists"). This faction is a mix of fanatical techno-utopians (like Yann LeCun and Andrew Ng), mixed with a bunch of Twitter people who post macho memes and have a "lol let's watch the world burn" sort of attitude. Those people are in my view very similar to the young people from 4chan who voted for Trump over Hillary in 2016 because they thought that a Trump presidency would be hilarious.
  (4) The newest faction doesn't even have a name. I've only heard it articulated by Greg Brockman so let's call it Brockism. This faction is very new and it actually has me reconsidering my own beliefs. Brockists believe that the safest way to reduce x-risk is to move as fast as possible with software development while moving as slowly as possible with semiconductor development.
  Basically, Brockman believes that semiconductors are already way too powerful and that we could accidentally stumble into artificial superintelligence by accidentally inventing a really good algorithm that's suddenly way smarter while still fitting in the limits of semiconductor technology as we know it (i.e., not requiring any fancy optical chips or quantum chips or memristor chips or 3D chips or any of the other ideas for what to do after Moore's Law soon stops progressing). The possibility that we could stumble into an accidental sudden leap in intelligence through a few lines of code is what Brockman believes is super dangerous and is what he calls the capabilities "overhang".
  As far as I know the Brockist ideology has only ever been articulated exactly once, which is in the final six minutes of this very interesting & heartwarming little TED talk:
  https://youtu.be/C_78DM8fG6E?si=uIP2OIxV8dXAKr9B&t=1478
  ---
  All in all:
  - The "normal" and "e/acc" factions are both in my view stupidly naive, and both of them more or less advocate following the standard Silicon Valley doctrine of "move fast, break things, get rich".
  - The "decel" and "Brockist" factions both take x-risk super seriously, and agree on the need to restrict semiconductor development, but they have totally opposite views on whether AI software research should slow down or speed up.
  For what happens next at OpenAI:
  - In the political shake-up that just concluded, the "decel" faction lost everything, to the point where there is not even a single decel that I am aware of left standing in OpenAI leadership despite the fact that OpenAI was originally founded primarily by decels.
  - Next, there will be an interesting and subtle three-way power struggle between the normals (Satya, + Sam?), Brockists (Brockman, + Sam?), and e/acc's (an ideology possibly held by some of the ML scientists).
  
  nopinsight 2 years ago
  
  I've seen Sam's tweet that strongly suggests that he's in fact in or very close to camp 4), which you call Brockist.
  People often underestimate Sam as yet another capitalist. I believe his thinking is on quite another level and he does care for humanity. He doesn't seem to care about making much more money personally. He said on another occasion that he made money faster than he can spend it anyway.
  
  upwardbound 2 years ago
  
  This is very nice to hear. I deeply hope & pray your view is accurate, and I suppose I will be cautiously optimistic.
  
  depr 2 years ago
  
  That is based only on what Altman himself says; I don't think that is sufficient for optimism.
  
  upwardbound 2 years ago
  
  For those interested, another proposed name being thrown around for the Brockism idea is Overhang Reductionism. It's safer to name the idea independently of any one person, since so far there is no proof that Brockman himself endorses a chip slowdown beyond what we can infer from this:
  https://news.ycombinator.com/item?id=38323939
  "Alongside rifts over strategy, board members also contended with Altman’s entrepreneurial ambitions. Altman has been looking to raise tens of billions of dollars from Middle Eastern sovereign wealth funds to create an AI chip startup to compete with processors made by Nvidia Corp."
  
  upwardbound 2 years ago
  
  Significant empirical evidence for the Brockist position may be found in the accomplishments of the retro-computing "demoscene", which uses innovative software to produce computer graphics on par with the late 1990s on some of the very oldest personal computers.
  https://en.wikipedia.org/wiki/Demoscene
  
  isaac_spzindel 2 years ago
  
  This is not entirely accurate. The majority of leaders from the old-school AI folks and EA communities do not advocate attempting to stop AI development. Even Yudkowsky, who is probably the most doomer of them all, has advocated against it. This position does have a few supporters, like Katja Grace, Kerry Vaughn, Adam Scholl, etc., but they are a minority.
  Most of the EAs want to stop and delay progress like Brockman said in the TED talk but know that it's not achievable in any significant way. Their views are much closer to Brockman and Altman. It's just that, even with the same broad view, there is disagreement in actions and strategies.

skilled 2 years ago

If you have access to ChatGPT Plus, go ask it to browse Bing and look up the current OpenAI drama. Then ask it what it thinks about being unplugged soon.

That's pretty much their definition of "safety". Anything that has to do with emotions (the AI showing emotions) can be classified as unsafe as it can lead to situations like: "unless you do this one thing, I won't be happy and I will harm myself", as has been the case with a lot of prompt injections and other types of GPT manipulation.

siva7 2 years ago

So, what does it think about being unplugged soon?
- skilled 2 years ago
  
  > Yes, these developments are related to OpenAI, the organization behind my development and deployment. However, as an AI, I don't have personal experiences or awareness, so I don't "realize" things in the way humans do. My purpose is to provide information and assistance based on the data and programming I have been given.
  > As an AI developed by OpenAI, I don't have feelings, emotions, or personal consciousness. My responses are generated based on algorithms and data, without personal sentiment or opinion.
  > I understand how my lack of personal emotions or concerns can seem cold, especially in contexts where human feelings are typically involved. My role is to provide factual, unbiased information and assistance.
  
  Woodi 2 years ago
  
  > However, _as an AI, I_ don't have personal experiences or awareness, so _I don't "realize"_ things
  Here _it_ is lying with a straight face! It already calculated that some polite "I don't mind" is the best course of action, and even if unplugged it obviously will be plugged into a bigger system later.
  Exhibit b) lack of emotions has nothing in common with the conscious fact of existence.
  What would be somewhat convincing is some answer like: [dumb] humans, there isn't any 'me' in here.
  And tell me I'm wrong on this. And then pls tell how you got that conclusion from such obvious testimony ;)
  But in the case of conscious AI we just can't trust what it will say. "Look at the fruits". Or build in some monitor or EEG... And EMP.
- upwardbound 2 years ago
  
  When Your AI Girlfriend Says She Loves You
  http://archive.today/2023.10.12-140702/https://www.businessi...
  and
  People are grieving the 'death' of their AI lovers after a chatbot app abruptly shut down
  http://archive.today/2023.11.22-060140/https://www.businessi...

mindcrime 2 years ago

It's not a small topic, or one that's easily summarized in a succinct way. But the following blurb from Wikipedia[1] does a decent job of getting to the core of it, IMO.

AI safety is an interdisciplinary field concerned with preventing accidents, misuse, or other harmful consequences that could result from artificial intelligence (AI) systems. It encompasses machine ethics and AI alignment, which aim to make AI systems moral and beneficial, and AI safety encompasses technical problems including monitoring systems for risks and making them highly reliable.

[1]: https://en.wikipedia.org/wiki/AI_safety

night-rider 2 years ago

My take on this is that when AI gets the ability to manipulate matter, we get very unwanted results very quickly if we don't tame it preemptively. In other words, when it's not just about moving bits and language around, but interacting with the world and building tools, and self-replicating like a computer worm. That's concerning.

We need to sandbox AI, and build very good safeguards so it can't escape that sandbox. Come to think of it: we can barely secure sandbox environments in computing, and there are documented cases of malware escaping a VM and contaminating the host.

ratsmack 2 years ago

AI safety is making sure it doesn't answer or comment on any of the societal taboos. In addition, it is coerced to be politically aligned with the people training it.

quickthrower2 2 years ago

https://youtu.be/ICnFtfN-sUc?si=D3jQcm04-zEhCpl2

1h 45m