More in-depth background reading about risks and benefits of artificial intelligence

Superintelligence survey

Click here to see this page in other languages:  Chinese  French  German Japanese  Russian

The Future of AI – What Do You Think?

Max Tegmark’s new book on artificial intelligence, Life 3.0: Being Human in the Age of Artificial Intelligence, explores how AI will impact life as it grows increasingly advanced, perhaps even achieving superintelligence far beyond human level in all areas. For the book, Max surveys experts’ forecasts, and explores a broad spectrum of views on what will/should happen. But it’s time to expand the conversation. If we’re going to create a future that benefits as many people as possible, we need to include as many voices as possible. And that includes yours! Below are the answers from the first 14,866 people who have taken the survey that goes along with Max’s book. To join the conversation yourself, please take the survey here.

How soon, and should we welcome or fear it?

The first big controversy, dividing even leading AI researchers, involves forecasting what will happen. When, if ever, will AI outperform humans at all intellectual tasks, and will it be a good thing?

Do you want superintelligence?

Everything we love about civilization is arguably the product of intelligence, so we can potentially do even better by amplifying human intelligence with machine intelligence. But some worry that superintelligent machines would end up controlling us and wonder whether their goals would be aligned with ours. Do you want there to be superintelligent AI, i.e., general intelligence far beyond human level?

What Should the Future Look Like?

In his book, Tegmark argues that we shouldn’t passively ask “what will happen?” as if the future is predetermined, but instead ask what we want to happen and then try to create that future.  What sort of future do you want?

If superintelligence arrives, who should be in control?
If you one day get an AI helper, do you want it to be conscious, i.e., to have subjective experience (as opposed to being like a zombie which can at best pretend to be conscious)?
What should a future civilization strive for?
Do you want life spreading into the cosmos?

The Ideal Society?

In Life 3.0, Max explores 12 possible future scenarios, describing what might happen in the coming millennia if superintelligence is/isn’t developed. You can find a cheatsheet that quickly describes each here, but for a more detailed look at the positives and negatives of each possibility, check out chapter 5 of the book. Here’s a breakdown so far of the options people prefer:

You can learn a lot more about these possible future scenarios — along with fun explanations about what AI is, how it works, how it’s impacting us today, and what else the future might bring — when you order Max’s new book.

The results above will be updated regularly. Please add your voice by taking the survey here, and share your comments below!

When AI Journalism Goes Bad

Slate is currently running a feature called “Future Tense,” which claims to be the “citizens guide to the future.” Two of their recent articles, however, are full of inaccuracies about AI safety and the researchers studying it. While this is disappointing, it also represents a good opportunity to clear up some misconceptions about why AI safety research is necessary.

The first contested article was Let Artificial Intelligence Evolve, by Michael Chorost, which displays a poor understanding of the issues surrounding the evolution of artificial intelligence. The second, How to be Good, by Adam Elkus, got some of the concerns about developing safe AI correct, but, in the process, did great disservice to one of today’s most prominent AI safety researchers, as well as to scientific research in general.

We do not know if AI will evolve safely

In his article, Chorost defends the idea of simply letting artificial intelligence evolve, without interference from researchers worried about AI safety. Chorost first considers an example from Nick Bostrom’s book, Superintelligence, in which a superintelligent system might tile the Earth with some undesirable product, thus eliminating all biological life. Chorost argues this is impossible because “a superintelligent mind would need time and resources to invent humanity-destroying technologies.” Of course it would. The concern is that a superintelligent system, being smarter than us, would be able to achieve such goals without us realizing what it was up to. How? We don’t know. This is one of the reasons it’s so important to study AI safety now.

It’s quite probable that a superintelligent system would not attempt such a feat, but at the moment, no one can guarantee that. We don’t know yet how a superintelligent AI will behave. There’s no reason to expect a superintelligent system to “think” like humans do, yet somehow we need to try to anticipate what an advanced AI will do. We can’t just hope that advanced AI systems will evolve compatibly with human life: we need to do research now to try to ensure compatibility.

Chorost then goes on to claim that a superintelligent AI won’t tile the Earth with some undesirable object because it won’t want to. He says, “Until an A.I. has feelings, it’s going to be unable to want to do anything at all, let alone act counter to humanity’s interests and fight off human resistance. Wanting is essential to any kind of independent action.” This represents misplaced anthropromorphization and a misunderstanding of programming goals. What an AI wants to do is dependent on what it is programmed to do. Microsoft Office doesn’t want me to spell properly, yet it will mark all misspelled words because that’s what it was programmed to do. And that’s just software, not an advanced, superintelligent system, which would be infinitely more complex.

If a robot is given the task of following a path to reach some destination, but is programmed to recognize that reaching the destination is more important than sticking to the path, then if it encounters an obstacle, it will find another route in order to achieve its primary objective. This isn’t because it has an emotional attachment to reaching its destination, but rather, that’s what it was programmed to do. AlphaGo doesn’t want to beat the world’s top Go player: it’s just been programmed to win at Go. The list of examples of a system wanting to achieve some goal can go on and on, and it has nothing to do with how (or whether) the system feels.

Chorost continues this argument by claiming: “And the minute an A.I. wants anything, it will live in a universe with rewards and punishments—including punishments from us for behaving badly. In order to survive in a world dominated by humans, a nascent A.I. will have to develop a human-like moral sense that certain things are right and others are wrong.” Unless it’s smart enough to trick us into thinking it’s doing what we want while doing something completely different without us realizing it. Any child knows that one of the best ways to not get in trouble is to not get caught. Why would we think a superintelligent system couldn’t learn the same lesson? A punishment might just antagonize it or teach it to deceive us. There’s also the chance that the superintelligent agent will partake in some sort of action that is too complex for us to understand its ramifications; we can’t punish an agent if we don’t realize that what it’s doing is harmful.

The article then considers that for a superintelligent system to want something in the way that biological entities want something, it can’t be made purely with electronics. The reasoning is that since humans are biochemical in nature, if we want to create a superintelligent system with human wants and needs, that system must be made of similar stuff. Specifically, Chorost says, “To get a system that has sensations, you would have to let it recapitulate the evolutionary process in which sensations became valuable.”

First, it’s not clear why we need a superintelligent system that exhibits sensations, nor is there any reason that should be a goal of advanced AI. Chorost argues that we need this because it’s the only way a system can evolve to be moral, but his arguments seem limited to the idea that for a system to be superintelligent, it must be human-like.

Yet, consider the analogy of planes to birds. Planes are essentially electronics and metal – none of the biochemistry of a bird – yet they can fly higher, faster, longer, and farther than any bird. And while collisions between birds and planes can damage a plane, they’re a lot more damaging to the bird. Though planes are on the “dumber” end of the AI superintelligence spectrum, compared to birds, they could be considered “superflying” systems. There’s no reason to expect a superintelligent system to be any more similar to humans than planes are to birds.

Finally, Chorost concludes the article by arguing that history has shown that as humanity has evolved, it has become less and less violent. He argues, “A.I.s will have to step on the escalator of reason just like humans have, because they will need to bargain for goods in a human-dominated economy and they will face human resistance to bad behavior.” However, even if this is a completely accurate prediction, he doesn’t explain how we survive a superintelligent system as it transitions from its early violent stages to the more advanced social understanding we have today.

Again, it’s important to keep in mind that perhaps as AI evolves, everything truly will go smoothly, but we don’t know for certain that’s the case. As long as there are unknowns about the future of AI, we need beneficial AI research.

This leads to the problematic second article by Elkus. The premise of his article is reasonable: he believes it will be difficult to teach human values to an AI, given that human values aren’t consistent across all societies. However, his shoddy research and poor understanding of AI research turn this article into an example of a dangerous and damaging type of scientific journalism, both for AI and science in general.

Bad AI journalism can ruin the science

Elkus looks at a single interview that AI researcher Stuart Russell gave to Quanta Magazine. He then uses snippets of that interview, taken out of context, as his basis for arguing that AI researchers are not properly addressing concerns about developing AI with human-aligned values. He criticizes Russell for only focusing on the technical side of robotics values, saying, “The question is not whether machines can be made to obey human values but which humans ought to decide those values.” On the contrary, both are important questions that must be asked, and Russell asks both questions in all of his published talks. The values a robot takes on will have to be decided by societies, government officials, policy makers, the robot’s owners, etc. Russell argues that the learning process should involve the entire human race, to the extent possible, both now and throughout history. In this talk he gave at CERN in January of this year, Russell clearly enunciates that the “obvious difficulties” of value alignment include the fact that “values differ across individuals and cultures.” Elkus essentially fabricates a position that Russell does not take in order to provide a line of attack.

Elkus also argues that Russell needs to “brush up on his A.I. History” and learn from failed research in the past, without realizing that those lessons are already incorporated into Russell’s research (and apparently without realizing that Russell is the co-author of the seminal textbook on Artificial Intelligence, which, over 20 year later, is still the most influential and fundamental text on AI — the book is viewed by other AI history experts, such as Nils Nilsson, as perhaps the authoritative source on much of AI’s history). He also misunderstands the objectives of having a robot learn about human values from something like movies or books. Elkus inaccurately suggests that the AI would learn only from one movie, which is obviously problematic if the AI only “watches” the silent, racist movie, Birth of a Nation. Instead, the AI could look at all movies. Then it could look at all criticisms and reviews of each movie, as well as how public reactions to the movies change over the years. This is just one example of how an AI could learn values, but certainly not the only one.

Finally, Elkus suggests that Russell, as a “Western, well-off, white male cisgender scientist,” has no right to be working on the problem of ensuring that machines respect human values. For the sake of civil discourse, we will ignore the ad hominem nature of this argument and assume that it is merely a recommendation to draw on the expertise of multiple disciplines and viewpoints. Yet a simple Google search would reveal that not only is Russell one of the fiercest advocates for ensuring we keep AI safe and beneficial, but he is an equally strong advocate for bringing together a broad coalition of researchers and the broadest possible range of people to tackle the question of human values. In this talk at the World Economic Forum in 2015, Russell predicted that “in the future, moral philosophy will be a key industry sector,” and he suggests that machines will need to “engage in an extended conversation with the human race” to learn about human values.

Two days after Elkus’s article went live, Slate published an interview with Russell, written by another author, that does do a reasonable job of explaining Russell’s research and his concerns about AI safety. However, this is uncommon. Rarely do scientists have a chance to defend themselves. Plus, even when they are able to rebut an article, seeds of doubt have already been planted in the public’s mind.

From the perspective of beneficial AI research, articles like Elkus’s do more harm than good. Elkus describes an important problem that must be solved to achieve safe AI, but portrays one of the top AI safety researchers as someone who doesn’t know what he’s doing. This unnecessarily increases fears about the development of artificial intelligence, making researchers’ jobs that much more difficult. More generally, this type of journalism can be damaging not only to the researcher in question, but also to the overall field. If the general public develops a distaste for some scientific pursuit, then raising the money necessary to perform the research becomes that much more difficult.

For the sake of good science, journalists must maintain a higher standard and do their own due diligence when researching a particular topic or scientist: when it comes to science, there is most definitely such a thing as bad press.

Introductory Resources on AI Safety Research

Reading list to get up to speed on the main ideas in the field. The resources are selected for relevance and/or brevity, and the list is not meant to be comprehensive.


For a popular audience:

Cade Metz, 2017. New York Times: Teaching A.I. Systems to Behave Themselves

FLI. AI risk background and FAQ. At the bottom of the background page, there is a more extensive list of resources on AI safety.

Tim Urban, 2015. Wait But Why: The AI Revolution. An accessible introduction to AI risk forecasts and arguments (with cute hand-drawn diagrams, and a few corrections from Luke Muehlhauser).

OpenPhil, 2015. Potential risks from advanced artificial intelligence. An overview of AI risks and timelines, possible interventions, and current actors in this space.

For a more technical audience:

Stuart Russell:

  • The long-term future of AI (longer version), 2015. A video of Russell’s classic talk, discussing why it makes sense for AI researchers to think about AI safety, and going over various misconceptions about the issues.
  • Concerns of an AI pioneer, 2015. An interview with Russell on the importance of provably aligning AI with human values, and the challenges of value alignment research.
  • On Myths and Moonshine, 2014. Russell’s response to the “Myth of AI” question on, which draws an analogy between AI research and nuclear research, and points out some dangers of optimizing a misspecified utility function.

Scott Alexander, 2015. No time like the present for AI safety work. An overview of long-term AI safety challenges, e.g. preventing wireheading and formalizing ethics.

Victoria Krakovna, 2015. AI risk without an intelligence explosion. An overview of long-term AI risks besides the (overemphasized) intelligence explosion / hard takeoff scenario, arguing why intelligence explosion skeptics should still think about AI safety.

Stuart Armstrong, 2014. Smarter Than Us: The Rise Of Machine Intelligence. A short ebook discussing potential promises and challenges presented by advanced AI, and the interdisciplinary problems that need to be solved on the way there.

Technical overviews

Soares and Fallenstein, 2017. Aligning Superintelligence with Human Interests: A Technical Research Agenda

Amodei, Olah, et al, 2016. Concrete Problems in AI safety. Research agenda focusing on accident risks that apply to current ML systems as well as more advanced future AI systems.

Jessica Taylor et al, 2016. Alignment for Advanced Machine Learning Systems

FLI, 2015. A survey of research priorities for robust and beneficial AI

Jacob Steinhardt, 2015. Long-Term and Short-Term Challenges to Ensuring the Safety of AI Systems. A taxonomy of AI safety issues that require ordinary vs extraordinary engineering to address.

Nate Soares, 2015. Safety engineering, target selection, and alignment theory. Identifies and motivates three major areas of AI safety research.

Nick Bostrom, 2014. Superintelligence: Paths, Dangers, Strategies. A seminal book outlining long-term AI risk considerations.

Steve Omohundro, 2007. The basic AI drives. A classic paper arguing that sufficiently advanced AI systems are likely to develop drives such as self-preservation and resource acquisition independently of their assigned objectives.

Technical work

Value learning:

Smitha Milli et al. Should robots be obedient? Obedience to humans may sound like a great thing, but blind obedience can get in the way of learning human preferences.

William Saunders et al, 2017. Trial without Error: Towards Safe Reinforcement Learning via Human Intervention. (blog post)

Amin, Jiang, and Singh, 2017. Repeated Inverse Reinforcement Learning. Separates the reward function into a task-specific component and an intrinsic component. In a sequence of task, the agent learns the intrinsic component while trying to avoid surprising the human.

Dylan Hadfield-Menell et al, 2016. Cooperative inverse reinforcement learning. Defines value learning as a cooperative game where the human tries to teach the agent about their reward function, rather than giving optimal demonstrations like in standard IRL.

Owain Evans et al, 2016. Learning the Preferences of Ignorant, Inconsistent Agents.

Reward gaming / wireheading:

Tom Everitt et al, 2017. Reinforcement learning with a corrupted reward channel. A formalization of the reward misspecification problem in terms of true and corrupt reward, a proof that RL agents cannot overcome reward corruption, and a framework for giving the agent extra information to overcome reward corruption. (blog post)

Amodei and Clark, 2016. Faulty Reward Functions in the Wild. An example of reward function gaming in a boat racing game, where the agent gets a higher score by going in circles and hitting the same targets than by actually playing the game.

Everitt and Hutter, 2016. Avoiding Wireheading with Value Reinforcement Learning. An alternative to RL that reduces the incentive to wirehead.

Laurent Orseau, 2015. Wireheading. An investigation into how different types of artificial agents respond to opportunities to wirehead (unintended shortcuts to maximize their objective function).

Interruptibility / corrigibility:

Dylan Hadfield-Menell et al. The Off-Switch Game. This paper studies the interruptibility problem as a game between human and robot, and investigates which incentives the robot could have to allow itself to be switched off.

El Mahdi El Mhamdi et al, 2017. Dynamic Safe Interruptibility for Decentralized Multi-Agent Reinforcement Learning.

Orseau and Armstrong, 2016. Safely interruptible agents. Provides a formal definition of safe interruptibility and shows that off-policy RL agents are more interruptible than on-policy agents. (blog post)

Nate Soares et al, 2015. Corrigibility. Designing AI systems without incentives to resist corrective modifications by their creators.

Scalable oversight:

Christiano, Leike et al, 2017. Deep reinforcement learning from human preferences. Communicating complex goals to AI systems using human feedback (comparing pairs of agent trajectory segments).

David Abel et al. Agent-Agnostic Human-in-the-Loop Reinforcement Learning.


Armstrong and Levinstein, 2017. Low Impact Artificial Intelligences. An intractable but enlightening definition of low impact for AI systems.

Babcock, Kramar and Yampolskiy, 2017. Guidelines for Artificial Intelligence Containment.

Scott Garrabrant et al, 2016. Logical Induction. A computable algorithm for the logical induction problem.

Note: I did not include literature on less neglected areas of the field like safe exploration, distributional shift, adversarial examples, or interpretability (see e.g. Concrete Problems or the CHAI bibliography for extensive references on these topics).

Collections of technical works

CHAI bibliography

MIRI publications

FHI publications

FLI grantee publications (scroll down)

Paul Christiano. AI control. A blog on designing safe, efficient AI systems (approval-directed agents, aligned reinforcement learning agents, etc).

If there are any resources missing from this list that you think are a must-read, please let me know! If you want to go into AI safety research, check out these guidelines and the AI Safety Syllabus.

(Thanks to Ben Sancetta, Taymon Beal and Janos Kramar for their feedback on this post.)

This article was originally posted on Victoria Krakovna’s blog.


Frequently Asked Questions about Artificial Intelligence

Q: Who conceived of and wrote FLI’s open letter on robust and beneficial AI?

A: The open letter has been an initiative of the Future of Life Institute (especially the FLI founders and Berkeley AI researcher and FLI Advisory Board Member Stuart Russell) in collaboration with the AI research community (including a number of signatories).


Q: What sorts of AI systems is this letter addressing?

A: There is indeed a proliferation of meanings of the term “Artificial Intelligence”, largely because the intelligence we humans enjoy is actually comprised of many different capabilities. Some draw a distinction between “Narrow AI” (like solving CAPTCHAs or completing Google searches) and “General AI” that could replicate most or all human capabilities, roughly at or above human level. The open letter concerns both types of systems.


Q: What are the concerns behind FLI’s open letter on autonomous weapons?

A: Autonomous weapons are ideal for tasks such as assassinations, destabilizing nations, subduing populations and selectively killing a particular ethnic group. If any major military power pushes ahead with AI weapon development, a global arms race is virtually inevitable, and the endpoint of this technological trajectory is obvious: autonomous weapons will become the Kalashnikovs of tomorrow. Starting a military AI arms race is a bad idea, and should be prevented by a ban on offensive autonomous weapons beyond meaningful human control. Read more about the arguments against offensive autonomous weapons here.


Q: Why is the future of AI suddenly in the news? What has changed?

A: In previous decades, AI research had proceeded more slowly than some experts predicted. According to experts in the field, however, this trend has reversed in the past 5 years or so. AI researchers have been repeatedly surprised by, for example, the effectiveness of new visual and speech recognition systems. AI systems can solve CAPTCHAs that were specifically devised to foil AIs, translate spoken text on-the-fly, and teach themselves how to play games they have neither seen before nor been programmed to play. Moreover, the real-world value of this effectiveness has prompted massive investment by large tech firms such as Google, Facebook, and IBM, creating a positive feedback cycle that could dramatically speed progress.


Q: What are the potential benefits of AI as it grows increasingly sophisticated?

A: It’s difficult to tell at this stage, but AI will enable many developments that could be terrifically beneficial if managed with enough foresight and care. For example, menial tasks could be automated, which could give rise to a society of abundance, leisure, and flourishing, free of poverty and tedium. As another example, AI could also improve our ability to understand and manipulate complex biological systems, unlocking a path to drastically improved longevity and health, and to conquering disease.


Q: What is the general nature of the concern about AI safety?

A: The basic concern as AI systems become increasingly powerful is that they won’t do what we want them to do – perhaps because they aren’t correctly designed, perhaps because they are deliberately subverted, or perhaps because they do what we tell them to do rather than what we really want them to do (like in the classic stories of genies and wishes.) Many AI systems are programmed to have goals and to attain them as effectively as possible – for example, a trading algorithm has the goal of maximizing profit. Unless carefully designed to act in ways consistent with human values, a highly sophisticated AI trading system might exploit means that even the most ruthless financier would disavow. These are systems that literally have a mind of their own, and maintaining alignment between human interests and their choices and actions will be crucial.


Q: What is FLI’s position on AI being a threat to humanity?

A: FLI’s general position is represented well by the open letter on robust and beneficial AI. We believe it is currently unknown whether over the coming decades AI will be more like the internet (vast upside, relatively small risks), more like nuclear technologies (enormous risks relative to upside to date), or something else. We suspect that in the long-term the upsides and the risks will both be huge, but most strongly believe that research into this question is warranted.


Q: A lot of concern appears to focus on human-level or “superintelligent” AI. Is that a realistic prospect in the foreseeable future?

A: AI is already superhuman at some tasks, for example numerical computations, and will clearly surpass humans in others as time goes on. We don’t know when (or even if) machines will reach human-level ability in all cognitive tasks, but most of the AI researchers at FLI’s conference in Puerto Rico put the odds above 50% for this century, and many offered a significantly shorter timeline. Since the impact on humanity will be huge if it happens, it’s worthwhile to start research now on how to ensure that any impact is positive. Many researchers also believe that dealing with superintelligent AI will be qualitatively very different from more narrow AI systems, and will require very significant research effort to get right.


Q: Isn’t AI just a tool like any other? Won’t AI just do what we tell it to do?

A: It likely will – however, intelligence is, by many definitions, the ability to figure out how to accomplish goals. Even in today’s advanced AI systems, the builders assign the goal but don’t tell the AI exactly how to accomplish it, nor necessarily predict in detail how it will be done; indeed those systems often solve problems in creative, unpredictable ways. Thus the thing that makes such systems intelligent is precisely what can make them difficult to predict and control. They may therefore attain the goal we set them via means inconsistent with our preferences.


Q: Can you give an example of achieving a beneficial goal via inappropriate means?

A: Imagine, for example, that you are tasked with reducing traffic congestion in San Francisco at all costs, i.e. you do not take into account any other constraints. How would you do it? You might start by just timing traffic lights better. But wouldn’t there be less traffic if all the bridges closed down from 5 to 10AM, preventing all those cars from entering the city? Such a measure obviously violates common sense, and subverts the purpose of improving traffic, which is to help people get around – but it is consistent with the goal of “reducing traffic congestion”.


Q: Why should we prepare for human-level AI technology now rather than decades down the line when it’s closer?

A: First, even “narrow” AI systems, which approach or surpass human intelligence in a small set of capabilities (such as image or voice recognition) already raise important questions regarding their impact on society. Making autonomous vehicles safe, analyzing the strategic and ethical dimensions of autonomous weapons, and the effect of AI on the global employment and economic systems are three examples. Second, the longer-term implications of human or super-human artificial intelligence are dramatic, and there is no consensus on how quickly such capabilities will be developed. Many experts believe there is a chance it could happen rather soon, making it imperative to begin investigating long-term safety issues now, if only to get a better sense of how much early progress is actually possible.


Q: Is the concern that autonomous AI systems could become malevolent or self aware, or develop “volition”, and turn on us? And can’t we just unplug them?

A: One important concern is that some autonomous systems are designed to kill or destroy for military purposes. These systems would be designed so that they could not be “unplugged” easily. Whether further development of such systems is a favorable long-term direction is a question we urgently need to address. A separate concern is that high-quality decision-making systems could inadvertently be programmed with goals that do not fully capture what we want. Antisocial or destructive actions may result from logical steps in pursuit of seemingly benign or neutral goals. A number of researchers studying the problem have concluded that it is surprisingly difficult to completely guard against this effect, and that it may get even harder as the systems become more intelligent. They might, for example, consider our efforts to control them as being impediments to attaining their goals.


Q: Are robots the real problem? How can AI cause harm if it has no ability to directly manipulate the physical world?

A: What’s new and potentially risky is not the ability to build hinges, motors, etc., but the ability to build intelligence. A human-level AI could make money on financial markets, make scientific inventions, hack computer systems, manipulate or pay humans to do its bidding – all in pursuit of the goals it was initially programmed to achieve. None of that requires a physical robotic body, merely an internet connection.


Q: Are there types of advanced AI that would be safer than others?

A: We don’t yet know which AI architectures are safe; learning more about this is one of the goals of our grants program. AI researchers are generally very responsible people who want their work to better humanity. If there are certain AI designs that turn out to be unsafe, then AI researchers will want to know this so they can develop alternative AI systems.


Q: Can humans stay in control of the world if human- or superhuman-level AI is developed?

A: This is a big question that it would pay to start thinking about. Humans are in control of this planet not because we are stronger or faster than other animals, but because we are smarter! If we cede our position as smartest on our planet, it’s not obvious that we’ll retain control.


Q: Is the focus on the existential threat of superintelligent AI diverting too much attention from more pressing debates about AI in surveillance and the battlefield and its potential effects on the economy?

A: The near term and long term aspects of AI safety are both very important to work on. Research into superintelligence is an important part of the open letter, but the actual concern is very different from the Terminator-like scenarios that most media outlets round off this issue to. A much more likely scenario is a superintelligent system with neutral or benevolent goals that is misspecified in a dangerous way. Robust design of superintelligent systems is a complex interdisciplinary research challenge that will likely take decades, so it is very important to begin the research now, and a large part of the purpose of our research program is to make that happen. That said, the alarmist media framing of the issues is hardly useful for making progress in either the near term or long term domain.

Q: How did FLI get started?

A: It all started with Jaan Tallinn’s longstanding interest in AI risk. In 2011, Max and Anthony organized the Foundational Questions Institute (FQXi) conference and invited Jaan, who made strong arguments for the importance of AI risk. In 2013, Max and Meia decided to create a new organization for reducing existential risk, modeled structurally after FQXi, and got Jaan and Anthony on board. In late 2013, Jaan introduced Max to Vika, who was later invited to join the initiative and organize volunteer efforts. The first FLI meeting was held in March 2014, bringing together interested scientists and volunteers from the Boston area.

Hawking Reddit AMA on AI

Our Scientific Advisory Board member Stephen Hawking’s long-awaited Reddit AMA answers on Artificial Intelligence just came out, and was all over today’s world news, including MSNBCHuffington PostThe Independent and Time.

Read the Q&A below and visit the official Reddit page for the full discussion:

Question 1:

Professor Hawking- Whenever I teach AI, Machine Learning, or Intelligent Robotics, my class and I end up having what I call “The Terminator Conversation.” My point in this conversation is that the dangers from AI are overblown by media and non-understanding news, and the real danger is the same danger in any complex, less-than-fully-understood code: edge case unpredictability. In my opinion, this is different from “dangerous AI” as most people perceive it, in that the software has no motives, no sentience, and no evil morality, and is merely (ruthlessly) trying to optimize a function that we ourselves wrote and designed. Your viewpoints (and Elon Musk’s) are often presented by the media as a belief in “evil AI,” though of course that’s not what your signed letter says. Students that are aware of these reports challenge my view, and we always end up having a pretty enjoyable conversation. How would you represent your own beliefs to my class? Are our viewpoints reconcilable? Do you think my habit of discounting the layperson Terminator-style “evil AI” is naive? And finally, what morals do you think I should be reinforcing to my students interested in AI?

Answer 1:

You’re right: media often misrepresent what is actually said. The real risk with AI isn’t malice but competence. A superintelligent AI will be extremely good at accomplishing its goals, and if those goals aren’t aligned with ours, we’re in trouble. You’re probably not an evil ant-hater who steps on ants out of malice, but if you’re in charge of a hydroelectric green energy project and there’s an anthill in the region to be flooded, too bad for the ants. Let’s not place humanity in the position of those ants. Please encourage your students to think not only about how to create AI, but also about how to ensure its beneficial use.

Question 2:

Hello Doctor Hawking, thank you for doing this AMA. I am a student who has recently graduated with a degree in Artificial Intelligence and Cognitive Science. Having studied A.I., I have seen first hand the ethical issues we are having to deal with today concerning how quickly machines can learn the personal features and behaviours of people, as well as being able to identify them at frightening speeds. However, the idea of a “conscious” or actual intelligent system which could pose an existential threat to humans still seems very foreign to me, and does not seem to be something we are even close to cracking from a neurological and computational standpoint. What I wanted to ask was, in your message aimed at warning us about the threat of intelligent machines, are you talking about current developments and breakthroughs (in areas such as machine learning), or are you trying to say we should be preparing early for what will inevitably come in the distant future?

Answer 2:

The latter. There’s no consensus among AI researchers about how long it will take to build human-level AI and beyond, so please don’t trust anyone who claims to know for sure that it will happen in your lifetime or that it won’t happen in your lifetime. When it eventually does occur, it’s likely to be either the best or worst thing ever to happen to humanity, so there’s huge value in getting it right. We should shift the goal of AI from creating pure undirected artificial intelligence to creating beneficial intelligence. It might take decades to figure out how to do this, so let’s start researching this today rather than the night before the first strong AI is switched on.

Question 3:

Hello, Prof. Hawking. Thanks for doing this AMA! Earlier this year you, Elon Musk, and many other prominent science figures signed an open letter warning the society about the potential pitfalls of Artificial Intelligence. The letter stated: “We recommend expanded research aimed at ensuring that increasingly capable AI systems are robust and beneficial: our AI systems must do what we want them to do.” While being a seemingly reasonable expectation, this statement serves as a start point for the debate around the possibility of Artificial Intelligence ever surpassing the human race in intelligence.
My questions: 1. One might think it impossible for a creature to ever acquire a higher intelligence than its creator. Do you agree? If yes, then how do you think artificial intelligence can ever pose a threat to the human race (their creators)? 2. If it was possible for artificial intelligence to surpass humans in intelligence, where would you define the line of “It’s enough”? In other words, how smart do you think the human race can make AI, while ensuring that it doesn’t surpass them in intelligence?

Answer 3:

It’s clearly possible for a something to acquire higher intelligence than its ancestors: we evolved to be smarter than our ape-like ancestors, and Einstein was smarter than his parents. The line you ask about is where an AI becomes better than humans at AI design, so that it can recursively improve itself without human help. If this happens, we may face an intelligence explosion that ultimately results in machines whose intelligence exceeds ours by more than ours exceeds that of snails.

Question 4:

I’m rather late to the question-asking party, but I’ll ask anyway and hope. Have you thought about the possibility of technological unemployment, where we develop automated processes that ultimately cause large unemployment by performing jobs faster and/or cheaper than people can perform them? Some compare this thought to the thoughts of the Luddites, whose revolt was caused in part by perceived technological unemployment over 100 years ago. In particular, do you foresee a world where people work less because so much work is automated? Do you think people will always either find work or manufacture more work to be done? Thank you for your time and your contributions. I’ve found research to be a largely social endeavor, and you’ve been an inspiration to so many.

Answer 4:

If machines produce everything we need, the outcome will depend on how things are distributed. Everyone can enjoy a life of luxurious leisure if the machine-produced wealth is shared, or most people can end up miserably poor if the machine-owners successfully lobby against wealth redistribution. So far, the trend seems to be toward the second option, with technology driving ever-increasing inequality.

Question 5:

Hello Professor Hawking, thank you for doing this AMA! I’ve thought lately about biological organisms’ will to survive and reproduce, and how that drive evolved over millions of generations. Would an AI have these basic drives, and if not, would it be a threat to humankind? Also, what are two books you think every person should read?

Answer 5:

An AI that has been designed rather than evolved can in principle have any drives or goals. However, as emphasized by Steve Omohundro, an extremely intelligent future AI will probably develop a drive to survive and acquire more resources as a step toward accomplishing whatever goal it has, because surviving and having more resources will increase its chances of accomplishing that other goal. This can cause problems for humans whose resources get taken away.

Question 6:

Thanks for doing this AMA. I am a biologist. Your fear of AI appears to stem from the assumption that AI will act like a new biological species competing for the same resources or otherwise transforming the planet in ways incompatible with human (or other) life. But the reason that biological species compete like this is because they have undergone billions of years of selection for high reproduction. Essentially, biological organisms are optimized to ‘take over’ as much as they can. It’s basically their ‘purpose’. But I don’t think this is necessarily true of an AI. There is no reason to surmise that AI creatures would be ‘interested’ in reproducing at all. I don’t know what they’d be ‘interested’ in doing. I am interested in what you think an AI would be ‘interested’ in doing, and why that is necessarily a threat to humankind that outweighs the benefits of creating a sort of benevolent God.

Answer 6:

You’re right that we need to avoid the temptation to anthropomorphize and assume that AI’s will have the sort of goals that evolved creatures to. An AI that has been designed rather than evolved can in principle have any drives or goals. However, as emphasized by Steve Omohundro, an extremely intelligent future AI will probably develop a drive to survive and acquire more resources as a step toward accomplishing whatever goal it has, because surviving and having more resources will increase its chances of accomplishing that other goal. This can cause problems for humans whose resources get taken away.

Wait But Why: ‘The AI Revolution’

Tim Urban of Wait But Why has an engaging two-part series on the development of superintelligent AI and the dramatic consequences it would have on humanity. Equal parts exciting and sobering, this is a perfect primer for the layperson and thorough enough to be read-worthy to acquaintances of the topic as well.

Part 1: The Road to Superintelligence

Part 2: Our Immortality or Extinction