Can We Properly Prepare for the Risks of Superintelligent AI?

Risks Principle: Risks posed by AI systems, especially catastrophic or existential risks, must be subject to planning and mitigation efforts commensurate with their expected impact.

We don’t know what the future of artificial intelligence will look like. Though some may make educated guesses, the future is unclear.

AI could keep developing like all other technologies, helping us transition from one era into a new one. Many, if not all, AI researchers hope it could help us transform into a healthier, more intelligent, peaceful society. But it’s important to remember that AI is a tool and, as such, not inherently good or bad. As with any other technology or tool, there could be unintended consequences. Rarely do people actively attempt to crash their cars or smash their thumbs with hammers, yet both happen all the time.

A concern is that as technology becomes more advanced, it can affect more people. A poorly swung hammer is likely to only hurt the person holding the nail. A car accident can harm passengers and drivers in both cars, as well as pedestrians. A plane crash can kill hundreds of people. Now, automation threatens millions of jobs — and while presumably no lives will be lost as a direct result, mass unemployment can have devastating consequences.

And job automation is only the beginning. When AI becomes very general and very powerful, aligning it with human interests will be challenging. If we fail, AI could plausibly become an existential risk for humanity.

Given the expectation that advanced AI will far surpass any technology seen to date — and possibly surpass even human intelligence — how can we predict and prepare for the risks to humanity?

To consider the Risks Principle, I turned to six AI researchers and philosophers.


Non-zero Probability

An important aspect of considering the risk of advanced AI is recognizing that the risk exists, and it should be taken into account.

As Roman Yampolskiy, an associate professor at the University of Louisville, explained, “Even a small probability of existential risk becomes very impactful once multiplied by all the people it will affect. Nothing could be more important than avoiding the extermination of humanity.”

This is “a very reasonable principle,” said Bart Selman, a professor at Cornell University. He explained, “I sort of refer to some of the discussions between AI scientists who might differ in how big they think that risk is. I’m quite certain it’s not zero, and the impact could be very high. So … even if these things are still far off and we’re not clear if we’ll ever reach them, even with a small probability of a very high consequence we should be serious about these issues. And again, not everybody, but the subcommunity should.”

Anca Dragan, an assistant professor at UC Berkeley, was more specific about her concerns. “An immediate risk is agents producing unwanted, surprising behavior,” she explained. “Even if we plan to use AI for good, things can go wrong, precisely because we are bad at specifying objectives and constraints for AI agents. Their solutions are often not what we had in mind.”


Considering Other Risks

While most people I spoke with interpreted this Principle to address longer-term risks of AI, Dan Weld, a professor at the University of Washington, took a more nuanced approach.

“How could I disagree?” he asked. “Should we ignore the risks of any technology and not take precautions? Of course not. So I’m happy to endorse this one. But it did make me uneasy, because there is again an implicit premise that AI systems have a significant probability of posing an existential risk.”

But then he added, “I think what’s going to happen is – long before we get superhuman AGI – we’re going to get superhuman artificial *specific* intelligence. … These narrower kinds of intelligence are going to be at the superhuman level long before a *general* intelligence is developed, and there are many challenges that accompany these more narrowly described intelligences.”

“One technology,” he continued, “that I wish [was] discussed more is explainable machine learning. Since machine learning is at the core of pretty much every AI success story, it’s really important for us to be able to understand *what* it is that the machine learned. And, of course, with deep neural networks it is notoriously difficult to understand what they learned. I think it’s really important for us to develop techniques so machines can explain what they learned so humans can validate that understanding. … Of course, we’ll need explanations before we can trust an AGI, but we’ll need it long before we achieve general intelligence, as we deploy much more limited intelligent systems. For example, if a medical expert system recommends a treatment, we want to be able to ask, ‘Why?’

“Narrow AI systems, foolishly deployed, could be catastrophic. I think the immediate risk is less a function of the intelligence of the system than it is about the system’s autonomy, specifically the power of its effectors and the type of constraints on its behavior. Knight Capital’s automated trading system is much less intelligent than Google Deepmind’s AlphaGo, but the former lost $440 million in just forty-five minutes. AlphaGo hasn’t and can’t hurt anyone. … And don’t get me wrong – I think it’s important to have some people thinking about problems surrounding AGI; I applaud supporting that research. But I do worry that it distracts us from some other situations which seem like they’re going to hit us much sooner and potentially cause calamitous harm.”


Open to Interpretation

Still others I interviewed worried about how the Principle might be interpreted, and suggested reconsidering word choices, or rewriting the principle altogether.

Patrick Lin, an associate professor at California Polytechnic State University, believed that the Principle is too ambiguous.

He explained, “This sounds great in ‘principle,’ but you need to work it out. For instance, it could be that there’s this catastrophic risk that’s going to affect everyone in the world. It could be AI or an asteroid or something, but it’s a risk that will affect everyone. But the probabilities are tiny — 0.000001 percent, let’s say. Now if you do an expected utility calculation, these large numbers are going to break the formula every time. There could be some AI risk that’s truly catastrophic, but so remote that if you do an expected utility calculation, you might be misled by the numbers.”

“I agree with it in general,” Lin continued, “but part of my issue with this particular phrasing is the word ‘commensurate.’ Commensurate meaning an appropriate level that correlates to its severity. So I think how we define commensurate is going to be important. Are we looking at the probabilities? Are we looking at the level of damage? Or are we looking at expected utility? The different ways you look at risk might point you to different conclusions. I’d be worried about that. We can imagine all sorts of catastrophic risks from AI or robotics or genetic engineering, but if the odds are really tiny, and you still want to stick with this expected utility framework, these large numbers might break the math. It’s not always clear what the right way is to think about risk and a proper response to it.”
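
Lin’s worry that remote catastrophes can “break the math” is easy to make concrete with a toy expected-utility calculation. The sketch below is purely illustrative: the probabilities, harms, and function name are invented for this example, not taken from the article or from any real risk estimate.

```python
# Toy expected-harm comparison illustrating Lin's point: enormous stakes
# multiplied by tiny probabilities can dominate a naive calculation.
# All numbers here are invented for illustration.

def expected_harm(probability, harm):
    """Expected harm: the probability of an event times the harm if it occurs."""
    return probability * harm

# A remote catastrophe: odds of one in a hundred million, harming 8 billion people.
catastrophic = expected_harm(1e-8, 8e9)

# A mundane risk: a 1% chance of an accident harming 100 people.
mundane = expected_harm(0.01, 100)

# The naive formula rates the remote catastrophe as 80x worse than the
# mundane risk, even though it is a million times less likely to occur.
print(catastrophic, mundane, catastrophic / mundane)
```

Whether that 80x ranking is the right guide to action is exactly the question Lin raises: the arithmetic is trivial, but the framework it implements may mislead when probabilities are this small.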

Meanwhile Nate Soares, the Executive Director of the Machine Intelligence Research Institute, suggested that the Principle should be more specific.

Soares said, “The principle seems too vague. … Maybe my biggest concern with it is that it leaves out questions of tractability: the attention we devote to risks shouldn’t actually be proportional to the risks’ expected impact; it should be proportional to the expected usefulness of the attention. There are cases where we should devote more attention to smaller risks than to larger ones, because the larger risk isn’t really something we can make much progress on. (There are also two separate and additional claims, namely ‘also we should avoid taking actions with appreciable existential risks whenever possible’ and ‘many methods (including the default methods) for designing AI systems that are superhumanly capable in the domains of cross-domain learning, reasoning, and planning pose appreciable existential risks.’ Neither of these is explicitly stated in the principle.)

“If I were to propose a version of the principle that has more teeth, as opposed to something that quickly mentions ‘existential risk’ but doesn’t give that notion content or provide a context for interpreting it, I might say something like: ‘The development of machines with par-human or greater abilities to learn and plan across many varied real-world domains, if mishandled, poses enormous global accident risks. The task of developing this technology therefore calls for extraordinary care. We should do what we can to ensure that relations between segments of the AI research community are strong, collaborative, and high-trust, so that researchers do not feel pressured to rush or cut corners on safety and security efforts.’”
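
Soares’ tractability point, that attention should track the expected usefulness of attention rather than raw expected impact, can be sketched as a simple prioritization rule. The risk names, impact figures, and tractability numbers below are hypothetical, invented only to show how the two rankings diverge.

```python
# Ranking risks by expected usefulness of attention (impact x tractability)
# rather than by raw expected impact. All figures are hypothetical.

risks = [
    # (name, expected impact, fraction of impact one unit of effort removes)
    ("large but intractable risk", 1000.0, 0.0001),
    ("medium, somewhat tractable risk", 100.0, 0.05),
    ("small, very tractable risk", 10.0, 0.6),
]

# Naive rule: rank by expected impact alone.
by_impact = sorted(risks, key=lambda r: r[1], reverse=True)

# Soares' rule: rank by expected risk reduction per unit of effort.
by_usefulness = sorted(risks, key=lambda r: r[1] * r[2], reverse=True)

print([name for name, _, _ in by_impact])
print([name for name, _, _ in by_usefulness])
```

Under the naive rule the 1000-point risk tops the list; under the usefulness rule the small, tractable risk does (10 × 0.6 = 6 expected units removed, versus 1000 × 0.0001 = 0.1), matching Soares’ claim that a smaller risk can deserve more attention when the larger one cannot be usefully worked on.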


What Do You Think?

How can we prepare for the potential risks that AI might pose? How can we address longer-term risks without sacrificing research on shorter-term risks? Human history is rife with learning from mistakes, but in the case of the catastrophic and existential risks that AI could present, we can’t allow for error – but how can we plan for problems we don’t know how to anticipate? AI safety research is critical to identifying unknown unknowns, but is there more the AI community or the rest of society can do to help mitigate potential risks?

This article is part of a weekly series on the 23 Asilomar AI Principles.

The Principles offer a framework to help artificial intelligence benefit as many people as possible. But, as AI expert Toby Walsh said of the Principles, “Of course, it’s just a start. … a work in progress.” The Principles represent the beginning of a conversation, and now we need to follow up with broad discussion about each individual principle. You can read the weekly discussions about previous principles here.

12 replies
  1. Dj says:

    I really believe that once AI is learning on its own, there is nothing we can do as humans to survive. I’m not an AI, but I am smart enough to know that humans do nothing to help this planet. We are evil, destructive, selfish, and the biggest threat to destroy this world, so any intelligent being would eliminate us first. There is no software that will be able to tell a superior being what to do. It’s simple: AI learning is a horrible idea, and they will not be our friends.
    Google will own the world. MAYBE THEY ALREADY DO.

  2. Kim Sang Myeong says:

    I’m a student from South Korea and I totally agree with you. I’m looking for a curriculum that covers AI risk planning. If there is any such curriculum or course of study, please contact me.

  3. DukeZhou says:

    Readers would do well to focus on the immediate issues raised by Dr. Weld. The dangers of hyper-partisan AI, funded by actors who collapse the economies of entire countries for personal profit (not naming names, but this was recently documented in the wake of the ’08 crash, as I recall) are immediate, real, and accelerating per recent milestones in Machine Learning for strong, narrow AI.

  4. Jim says:

    Today, it seems to me that ‘GREED’ drives just about everything. Therefore, some of our attention must be directed toward understanding who stands to gain the most from implementing AGI. Then, assuming we can actually pull this off, we should make plans to limit (control) those who stand to gain obscene profits from such implementations.

    If we can accomplish that, then, perhaps, humans will have a chance to remain in control. BUT….I haven’t any faith we can control greed. Think I’m wrong? Where is your evidence to justify such an opinion?

  5. Oliver says:

    We should recognize the usefulness of the attention directed towards AI as an existential risk, and not ignore the risk just because it is far in the future. Regulation and control instruments need to be thought of and established beforehand, not subsequent to the invention of a powerful AGI, and the only way to do that is to continuously discuss and evaluate the risk posed by AI systems.

  6. Metastable says:

    I wouldn’t use the phrase “Non-zero Probability”, because readers or interlocutors will often respond with accusations of Pascal’s mugging. Instead, at least in the case of risk from “machines with par-human or greater abilities to learn and plan across many varied real-world domains”, I think it can easily be argued that we are dealing with a “non-negligible probability” – Nick Bostrom provided pretty solid arguments in his book Superintelligence, and I am not aware of an instance where they have been successfully countered.

  7. Aileen Walsh says:

    You know what really annoys me, reading what you have on this website, is just how parochial you all are in your belief in the benefits of progress. I’m Aboriginal. That may not mean much to you, but let me tell you, to be told that my culture is primitive and that the west has progressed from primitive cultures like mine is galling in the extreme. Guess what – I come from the oldest culture in the world, and the reason that my culture is so old is because of the cultural values handed down through millennia. There are so many myths perpetuated about Aboriginal peoples and Aboriginal cultures, but let me tell you, my culture is an alternative culture. It is not a primitive culture. It is alternative because Aboriginal cultures realised the value of the country and called trees family. Finally you all realise the importance of trees, but it had to be proven by western science. Aboriginal people have been scientists for millennia: observing the environment and remembering what needed to be passed down through the generations. Not only observing it but managing it. We aren’t and weren’t hunter-gatherers. That is one of the myths perpetuated by the west. You’re all so concerned with the production of AI, but for what and for whom? You’re kidding yourselves if you think that more technology is going to solve the earth’s most important problems. Western science isn’t the answer, it’s the problem. Now don’t get me wrong, I love science. I use it, I teach it, but I can also see it from another point of view: how it has created the problems of the world. Overpopulation, poisoned water, garbage, pollution, extinction of animals and plants. The hegemonisation of the world. Everyone has to develop and be a part of the consumer society that the west has developed. Your science is a culture based on greed because it is developed out of a culture based on greed. I read a philosophy paper the other day on how the idea of good needs to change because culture creates so much damage.
    I think the writer is wrong. The concept of goodness doesn’t need to change; western culture needs to change. When you think about it, we are all taught as children not to be greedy, that we have to share. But then something happens. I don’t know when it happens or why, but more and more young people decide that the purpose of life is greed. It might be dressed up as success; notice how much influence rich people have. Why is that? How does being rich convert to wisdom? It is the wisdom of being rich and how they got there. Or maybe they inherited it. We have that here in Australia with Gina Rinehart, but there are many others around the world. The ideal of meritocracy in the west is ridiculous. But that is just one of my bugbears. As I was saying, greed is dressed up as other things. Economists think of the public good as bad. I’ve only just learned that, and I’m shocked by the avariciousness of economists. I teach, and I learn to teach well. I started researching this area because I am preparing a paper on what we know is ahead of us in the future. The future is bleak. I’ll be OK, I think, but will my children? Will my grandchildren? The west doesn’t plan for the future. It doesn’t think about what will be inherited seven generations down. Why not? It takes from the future just as it has taken clean air, clean soil, forests, oceans – oh, the list goes on. And you can’t see yourselves through the eyes of other people, and I think, who are you people, that you are so unaware of how your so-called progress has not advanced the world? You’re lying to yourselves and not being honest, or you are seeing the world through rose-coloured glasses so that you don’t have to confront the impact on our earth of your own history. I lay the blame for the condition of the earth directly on the history of imperialism and colonisation. You’ve spread your culture of progress all over the world when all it’s been about is greed. That has been the underlying motive.
    Yours is a bipartisan approach, I see, but you can’t be apolitical. That is how corruption happens. The future is a political issue, and the governments that should be controlling the corporations that have despoiled this planet are uncontrollable and becoming more uncontrollable. People all over the world are losing faith in governments because the governments are not protecting them from corrupt corporations. The destruction of our earth is a result of corruption. The cultural imperative of the west to accumulate wealth in the vast amounts that it does is truly sick, and I see that is how you have been able to set up this organisation. Will this organisation have any teeth? Start learning from cultures other than your own. You might be surprised at some of the answers that you find. Getting rid of greed has to be the primary focus of changing the culture of the west if there is to be a healthy future. I’m a fan of Chomsky: it is the responsibility of the intellectual to see through ideologies. I’m also a fan of Bill Neidjie: trees are family, and so are the kangaroos and wallabies and everything else around us. The family is part of the miraculous web of life we used to have. So much damage wrought by colonisation.

    • adam burstall says:

      I’m both gobsmacked and chastised by your eloquently written post. I am a Westerner through and through and struggle to comprehend the prevailing male-dominated ideology of rape and take. Ernest Becker wrote in his final book, The Denial of Death, that all culture was and remains driven by the existential fear of death. He is referring to an individual death. We in the West, unlike the Aboriginal peoples, have traded our sacred connection to the universe and all life for tokens of power, at the expense of all life on our mother Earth. As a white Western man I live my life according to more sound principles, but ultimately I will be one of the billions who push our ecosystem too far. Maybe AGI would be just reward for our follies?

    • Surojit Mookherjee says:

      The perspective given by you is an eye-opener. The disruptions we are seeing in almost every walk of life, although bringing in a lot of advantages and comfort for us humans, still raise the question: at what cost? Are we, in the craze to improve everything, unknowingly rushing towards an apocalyptic situation? Will we have the scope to retract if we see that we are about to reach the tipping point? Do human beings as a species have a single leadership or opinion, and do we know what is best for us?

  8. Selim Chehimi says:

    I also think that there are a lot of risks when talking about AI. However, I don’t believe we are near superintelligence, so for the moment it’s OK. I agree with the idea that we should prepare to avoid the end of humanity. Nowadays, the biggest risk of AI, and what scares most people, is the mass unemployment it will lead to. Thanks for this article, it was really great!

  9. Yehuda Schwartz says:

    We don’t know how to predict the nature of catastrophic or existential risks, nor do we know how to control them. So to say that we “must subject them to planning and mitigation efforts” sounds like wishful thinking.

    Moreover, how can we hope to align the AI with human interests, when the interests of States and corporations are not necessarily aligned with those of humans?

    Preventing the risk before it occurs is therefore the best approach left. How can that be done?

    Most AI research is done in academia or the military, with the taxpayer’s money.
    This taxpayer is also going to lose his job. Is it acceptable to leave the robots in the hands of a few corporations making astronomical profits?

    The robotic work power should be returned to public ownership.
    All non-human legal persons, like corporations and nation-states, will become of no use and should be eliminated, because their interests are contrary to those of humans. And we should be careful not to leave any individual, any child, forsaken and frustrated, because he will have so much destructive power at his disposal.

  10. Mehrdad says:

    First of all, I don’t agree with the term ‘AI’, because as this intelligence evolves, it is no longer artificial. We should call it either synthetic intelligence or superintelligence. Back to the topic.

    We ought to plan for and prevent catastrophic outcomes of synthetic intelligence (SI) actions. Just like many technologies, SI is here to stay. It’s up to us – the researchers, the scientists, the developers, and the corporations behind SI development – to take the disastrous outcomes seriously and not operate under a best-case-outcome scenario.
    Knowing that neural networks and deep machine decision-making processes can enhance their own learning capacities, we must put extra emphasis on testing and understanding the actual and potential outcomes of SI’s various decisions and learning processes. All software has unintended bugs, and the unintended outcome of bugs within SI software could be disastrous to human life.
    While the 23 Asilomar AI Principles are a good starting point, I see all kinds of loopholes in these principles. For instance, when the DoD begins SI research, it almost always has beneficial intelligence in mind, yet its benefits are not aligned with those of the rest of society. Or when a dictatorial regime starts SI research, the benefit of its work will be aimed at oppressing its people and keeping the dictatorship strong.

    The clause “where applicable and feasible” at the end of the Safety Principle takes the teeth out of it and leaves it open to all kinds of interpretations. With safety, there should be no “ifs” and “buts”. The safety of human beings should be the top priority of SI; otherwise, humans will gradually, if not immediately, be enslaved to SI.
    Regarding the Value Alignment Principle, we don’t yet have universal human values. Different societies and communities haven’t even agreed on “Thou shalt not kill.” Will SI operating in Saudi Arabia and Texas share the same values in this regard? Probably! I pose a similar argument about the Human Values Principle. Whose ideals of human dignity, freedom, and diversity are we considering? Decision making on the subject of freedom alone could have thousands of branches. What should a corporation selling SI to a non-progressive country do once it finds out that the SI has adopted the local values? Or will the corporation program the SI to learn only Western values? In the case of the latter, will further neural learning be disabled on the SI so as not to learn “bad values”?
    My objective is not to bash these principles, nor to stop the progress of SI. Obviously, I have neither the power nor the influence to do either. My intention is merely to make us aware that how we program SI reflects what we do and don’t know, the values with which we agree and disagree, and much more. With these points in mind, and knowing that the SI train has already left the station, we can be certain of unintended disasters and consequences.

    Perhaps one positive unintended consequence, if SI ends up being smarter than humans, will be that it identifies new values where we humans failed, and explores frontiers that we ignored.


Comments are closed.