Making AI Safe in an Unpredictable World: An Interview with Thomas G. Dietterich

Our AI systems work remarkably well in closed worlds. That’s because these environments contain a fixed number of variables, making the worlds fully known and predictable. In these micro-environments, machines only encounter objects that are familiar to them. As a result, they always know how they should act and respond. Unfortunately, these same systems quickly become confused when they are deployed in the real world, where many objects aren’t familiar to them. This is a problem because, when an AI system becomes confused, the results can be deadly.

Consider, for example, a self-driving car that encounters a novel object. Should it speed up, or should it slow down? Or consider an autonomous weapon system that sees an anomaly. Should it attack, or should it power down? Each of these examples involves a life-and-death decision, and they reveal why, if we are to deploy advanced AI systems in real-world environments, we must be confident that they will behave correctly when they encounter unfamiliar objects.

Thomas G. Dietterich, Emeritus Professor of Computer Science at Oregon State University, explains that solving this identification problem begins with ensuring that our AI systems aren’t too confident — that they recognize when they encounter a foreign object and don’t misidentify it as something that they are acquainted with. To achieve this, Dietterich asserts that we must move away from (or, at least, greatly modify) the discriminative training methods that currently dominate AI research.

However, to do that, we must first address the “open category problem.”

 

Understanding the Open Category Problem

When driving down the road, we can encounter a near-infinite number of anomalies. Perhaps a violent storm will arise, and hail will start to fall. Perhaps our vision will become impeded by smoke or excessive fog. Although these encounters may be unexpected, the human brain is able to easily analyze new information and decide on the appropriate course of action — we will recognize a newspaper drifting across the road and, instead of abruptly slamming on the brakes, continue on our way.

Because of the way that they are programmed, our computer systems aren’t able to do the same.

“The way we use machine learning to create AI systems and software these days generally uses something called ‘discriminative training,’” Dietterich explains, “which implicitly assumes that the world consists of only, say, a thousand different kinds of objects.” This means that, if a machine encounters a novel object, it will assume that it must be one of the thousand things that it was trained on. As a result, such systems misclassify all foreign objects.

This is the “open category problem” that Dietterich and his team are attempting to solve. Specifically, they are trying to ensure that our machines don’t assume that they have encountered every possible object, but are, instead, able to reliably detect — and ultimately respond to — new categories of alien objects.

Dietterich notes that, from a practical standpoint, this means creating an anomaly detection algorithm that assigns an anomaly score to each object detected by the AI system. That score must be compared against a set threshold and, if the anomaly score exceeds the threshold, the system will need to raise an alarm. Dietterich states that, in response to this alarm, the AI system should take a pre-determined safety action. For example, a self-driving car that detects an anomaly might slow down and pull off to the side of the road.
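To make this concrete, here is a minimal sketch of the alarm logic described above. The anomaly score (a nearest-neighbor distance to the training data) and the threshold value are placeholder assumptions for illustration, not Dietterich’s implementation.

```python
# Minimal sketch of threshold-based anomaly alarming (illustrative only).
# The anomaly_score function and THRESHOLD are hypothetical placeholders,
# not the algorithm or values used in Dietterich's work.

THRESHOLD = 0.95  # alarm threshold, chosen to catch a target fraction of aliens

def anomaly_score(observation, training_data):
    """Toy anomaly score: distance to the nearest familiar training example."""
    return min(abs(observation - x) for x in training_data)

def handle_observation(observation, training_data):
    score = anomaly_score(observation, training_data)
    if score > THRESHOLD:
        return "SAFETY_ACTION"   # e.g., slow down and pull over
    return "NORMAL_OPERATION"    # proceed with the usual policy

# Example: the training data are familiar measurements; 7.4 is far from all of them.
familiar = [1.0, 1.2, 0.9, 1.1]
print(handle_observation(1.05, familiar))  # NORMAL_OPERATION
print(handle_observation(7.4, familiar))   # SAFETY_ACTION
```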

 

Creating a Theoretical Guarantee of Safety

There are two challenges to making this method work. First, Dietterich asserts that we need good anomaly detection algorithms. To determine which algorithms work well, the team previously compared the performance of eight state-of-the-art anomaly detection algorithms on a large collection of benchmark problems.

The second challenge is to set the alarm threshold so that the AI system is guaranteed to detect a desired fraction of the alien objects, such as 99%. Dietterich says that formulating a reliable setting for this threshold is one of the most challenging research problems because there are, potentially, infinitely many kinds of alien objects. “The problem is that we can’t have labeled training data for all of the aliens. If we had such data, we would simply train the discriminative classifier on that labeled data,” Dietterich says.

To circumvent this labeling issue, the team assumes that the discriminative classifier has access to a representative sample of “query objects” that reflect the larger statistical population. Such a sample could, for example, be obtained by collecting data from cars driving on highways around the world. This sample will include some fraction of unknown objects, and the remaining objects belong to known object categories.

Notably, the data in the sample is not labeled. Instead, the AI system is given an estimate of the fraction of aliens in the sample. And by combining the information in the sample with the labeled training data that was employed to train the discriminative classifier, the team’s new algorithm can choose a good alarm threshold. If the estimated fraction of aliens is known to be an over-estimate of the true fraction, then the chosen threshold is guaranteed to detect the target percentage of aliens (i.e. 99%).
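The sketch below illustrates the general idea under simplifying assumptions: it treats the unlabeled mixture of anomaly scores as (1 − α) parts known data plus α parts aliens, estimates the alien score distribution by subtraction, and returns the largest threshold at which the estimated fraction of missed aliens stays at or below 1%. The synthetic score distributions and the bare-bones mixture decomposition are stand-ins for illustration; this is not the team’s published algorithm, which adds finite-sample corrections to obtain its guarantee.

```python
import numpy as np

def choose_alarm_threshold(nominal_scores, mixture_scores, alien_frac, detect=0.99):
    """
    Simplified illustration (not the published algorithm): decompose the
    unlabeled mixture as (1 - alien_frac) known data plus alien_frac aliens,
    estimate the alien score CDF by subtraction, and return the largest
    threshold at which the estimated fraction of aliens scoring at or below
    it is no more than (1 - detect).
    """
    nom_sorted = np.sort(np.asarray(nominal_scores))
    mix_sorted = np.sort(np.asarray(mixture_scores))
    # Empirical CDFs evaluated at each mixture score
    f_mix = np.arange(1, len(mix_sorted) + 1) / len(mix_sorted)
    f_nom = np.searchsorted(nom_sorted, mix_sorted, side="right") / len(nom_sorted)
    # Mixture decomposition: F_mix = (1 - a) * F_nom + a * F_alien
    f_alien = (f_mix - (1 - alien_frac) * f_nom) / alien_frac
    valid = mix_sorted[f_alien <= 1 - detect]
    return float(valid.max()) if valid.size else float(mix_sorted[0])

# Synthetic demo: known objects score around 0, aliens score around 5.
rng = np.random.default_rng(0)
nominal = rng.normal(0.0, 1.0, 20_000)                     # labeled, known categories
mixture = np.concatenate([rng.normal(0.0, 1.0, 18_000),    # known objects
                          rng.normal(5.0, 1.0, 2_000)])    # ~10% aliens
threshold = choose_alarm_threshold(nominal, mixture, alien_frac=0.10)
print("alarm threshold for this synthetic data:", round(threshold, 2))
```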

Ultimately, this is the first method that can give a theoretical guarantee of safety for detecting alien objects, and a paper reporting the results was presented at ICML 2018. “We are able to guarantee, with high probability, that we can find 99% of all of these new objects,” Dietterich says.

In the next stage of their research, Dietterich and his team plan to begin testing their algorithm in a more complex setting. Thus far, they’ve been looking primarily at classification, where the system looks at an image and classifies it. Next, they plan to move to controlling an agent, like a robot or a self-driving car. “At each point in time, in order to decide what action to choose, our system will do a ‘look ahead search’ based on a learned model of the behavior of the agent and its environment. If the look ahead arrives at a state that is rated as ‘alien’ by our method, then this indicates that the agent is about to enter a part of the state space where it is not competent to choose correct actions,” Dietterich says. In response, as previously mentioned, the agent should execute a series of safety actions and request human assistance.
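A minimal sketch of that look-ahead check might read as follows, assuming a hypothetical learned transition model and anomaly scorer rather than the team’s actual components.

```python
# Illustrative sketch of the look-ahead check (hypothetical interfaces, not the
# team's implementation): roll a learned model forward and fall back to a
# safety action if any predicted state looks alien.

def lookahead_is_safe(state, action, model, anomaly_score, threshold, horizon=5):
    """Simulate `horizon` steps of `action` with the learned model."""
    for _ in range(horizon):
        state = model(state, action)           # learned transition model
        if anomaly_score(state) > threshold:   # predicted state looks alien
            return False
    return True

def choose_action(state, candidate_actions, model, anomaly_score, threshold):
    for action in candidate_actions:
        if lookahead_is_safe(state, action, model, anomaly_score, threshold):
            return action
    return "SAFETY_ACTION"  # e.g., slow down, pull over, request human help

# Toy usage with stand-in dynamics and scorer.
model = lambda s, a: s + a          # pretend dynamics: the state drifts by the action
anomaly_score = lambda s: abs(s)    # pretend scorer: far-away states look alien
print(choose_action(0.0, [1.0, 0.1], model, anomaly_score, threshold=2.0))  # 0.1
```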

But what does this safety action actually consist of?

 

Responding to Aliens

Dietterich notes that, once something is identified as an anomaly and the alarm is sounded, the nature of this fallback system will depend on the machine in question: whether, for example, the AI system is in a self-driving car or an autonomous weapon.

To explain how these secondary systems operate, Dietterich turns to self-driving cars. “In the Google car, if the computers lose power, then there’s a backup system that automatically slows the car down and pulls it over to the side of the road.” However, Dietterich clarifies that stopping isn’t always the best course of action. One may assume that a car should come to a halt if an unidentified object crosses its path; however, if the unidentified object happens to be a blanket of snow on a particularly icy day, hitting the brakes gets more complicated. The system would need to factor in the icy roads, any cars that may be driving behind, and whether those cars can brake in time to avoid a rear-end collision.

But if we can’t predict every eventuality, how can we expect to program an AI system so that it behaves correctly and in a way that is safe?

Unfortunately, there’s no easy answer; however, Dietterich clarifies that there are some general best practices: “There’s no universal solution to the safety problem, but obviously there are some actions that are safer than others. Generally speaking, removing energy from the system is a good idea,” he says. Ultimately, Dietterich asserts that all the work on programming safe AI boils down to determining how we want our machines to behave under specific scenarios, and he argues that, if we are to develop a sound approach, we need to rearticulate how we characterize this problem and focus on accounting for all the relevant factors.

Dietterich notes that “when we look at these problems, they tend to get lumped under a classification of ‘ethical decision making,’ but what they really are is problems that are incredibly complex. They depend tremendously on the context in which they are operating, the human beings, the other innovations, the other automated systems, and so on. The challenge is correctly describing how we want the system to behave and then ensuring that our implementations actually comply with those requirements.” And he concludes, “the big risk in the future of AI is the same as the big risk in any software system, which is that we build the wrong system, and so it does the wrong thing. Arthur C. Clarke in 2001: A Space Odyssey had it exactly right. HAL 9000 didn’t ‘go rogue;’ it was just doing what it had been programmed to do.”

This article is part of a Future of Life series on the AI safety research grants, which were funded by generous donations from Elon Musk and the Open Philanthropy Project.

European Parliament Passes Resolution Supporting a Ban on Killer Robots

The European Parliament passed a resolution on September 12, 2018 calling for an international ban on lethal autonomous weapons systems (LAWS). The resolution was adopted with 82% of the members voting in favor of it.

Among other things, the resolution calls on its Member States and the European Council “to develop and adopt, as a matter of urgency … a common position on lethal autonomous weapon systems that ensures meaningful human control over the critical functions of weapon systems, including during deployment.”

The resolution also urges Member States and the European Council “to work towards the start of international negotiations on a legally binding instrument prohibiting lethal autonomous weapons systems.”

This call for urgency comes shortly after recent United Nations talks where countries were unable to reach a consensus about whether or not to consider a ban on LAWS. Many hope that statements such as this from leading government bodies could help sway the handful of countries still holding out against banning LAWS.

Daan Kayser of PAX, one of the NGO members of the Campaign to Stop Killer Robots, said, “The voice of the European parliament is important in the international debate. At the UN talks in Geneva this past August it was clear that most European countries see the need for concrete measures. A European parliament resolution will add to the momentum toward the next step.”

The countries that took the strongest stances against a LAWS ban at the recent UN meeting were the United States, Russia, South Korea, and Israel.

 

Scientists’ Voices Are Heard

Also mentioned in the resolution were the many open letters signed by AI researchers and scientists from around the world, who are calling on the UN to negotiate a ban on LAWS.

Two sections of the resolution stated:

“having regard to the open letter of July 2015 signed by over 3,000 artificial intelligence and robotics researchers and that of 21 August 2017 signed by 116 founders of leading robotics and artificial intelligence companies warning about lethal autonomous weapon systems, and the letter by 240 tech organisations and 3,089 individuals pledging never to develop, produce or use lethal autonomous weapon systems,” and

“whereas in August 2017, 116 founders of leading international robotics and artificial intelligence companies sent an open letter to the UN calling on governments to ‘prevent an arms race in these weapons’ and ‘to avoid the destabilising effects of these technologies.’”

Toby Walsh, a prominent AI researcher who helped create the letters, said, “It’s great to see politicians listening to scientists and engineers. Starting in 2015, we’ve been speaking loudly about the risks posed by lethal autonomous weapons. The European Parliament has joined the calls for regulation. The challenge now is for the United Nations to respond. We have several years of talks at the UN without much to show. We cannot let a few nations hold the world hostage, to start an arms race with technologies that will destabilize the current delicate world order and that many find repugnant.”

State of California Endorses Asilomar AI Principles

On August 30, the State of California unanimously adopted legislation in support of the Future of Life Institute’s Asilomar AI Principles.

The Asilomar AI Principles are a set of 23 principles intended to promote the safe and beneficial development of artificial intelligence. The principles – which include research issues, ethics and values, and longer-term issues – emerged from a collaboration between AI researchers, economists, legal scholars, ethicists, and philosophers in Asilomar, California in January of 2017.

The Principles are the most widely adopted effort of their kind. They have been endorsed by AI research leaders at Google DeepMind, Google Brain, Facebook, Apple, and OpenAI. Signatories include Demis Hassabis, Yoshua Bengio, Elon Musk, Ray Kurzweil, the late Stephen Hawking, Tasha McCauley, Joseph Gordon-Levitt, Jeff Dean, Tom Gruber, Anthony Romero, Stuart Russell, and more than 3,800 other AI researchers and experts.

With ACR 215 passing the State Senate with unanimous support, the California Legislature has now been added to that list.

Assemblyman Kevin Kiley, who led the effort, said, “By endorsing the Asilomar Principles, the State Legislature joins in the recognition of shared values that can be applied to AI research, development, and long-term planning — helping to reinforce California’s competitive edge in the field of artificial intelligence, while assuring that its benefits are manifold and widespread.”

The third Asilomar AI principle indicates the importance of constructive and healthy exchange between AI researchers and policymakers, and the passing of this resolution highlights the value of that endeavor. While the principles do not establish enforceable policies or regulations, the action taken by the California Legislature is an important and historic show of support across sectors towards a common goal of enabling safe and beneficial AI.

The Future of Life Institute (FLI), the nonprofit organization that led the creation of the Asilomar AI Principles, is thrilled by this latest development, and encouraged that the principles continue to serve as guiding values for the development of AI and related public policy.

“By endorsing the Asilomar AI Principles, California has taken a historic step towards the advancement of beneficial AI and highlighted its leadership of this transformative technology,” said Anthony Aguirre, cofounder of FLI and physics professor at the University of California, Santa Cruz. “We are grateful to Assemblyman Kevin Kiley for leading the charge and to the dozens of co-authors of this resolution for their foresight on this critical matter.”

Profound societal impacts of AI are no longer merely a question of science fiction, but are already being realized today – from facial recognition technology to drone surveillance to the spread of targeted disinformation campaigns. Advances in AI are helping to connect people around the world, improve productivity and efficiencies, and uncover novel insights. However, AI may also pose safety and security threats, exacerbate inequality, and constrain privacy and autonomy.

“New norms are needed for AI that counteract dangerous race dynamics and instead center on trust, security, and the common good,” says Jessica Cussins, AI Policy Lead for FLI. “Having the official support of California helps establish a framework of shared values between policymakers, AI researchers, and other stakeholders. FLI encourages other governmental bodies to support the 23 principles and help shape an exciting and equitable future.”

Governing AI: An Inside Look at the Quest to Ensure AI Benefits Humanity

Finance, education, medicine, programming, the arts — artificial intelligence is set to disrupt nearly every sector of our society. Governments and policy experts have started to realize that, in order to prepare for this future, in order to minimize the risks and ensure that AI benefits humanity, we need to start planning for the arrival of advanced AI systems today.

Although we are still in the early moments of this movement, the landscape looks promising. Several nations and independent firms have already started to strategize and develop policies for the governance of AI. Last year, the UAE appointed the world’s first Minister of Artificial Intelligence, and Germany took smaller, but similar, steps in 2017, when the Ethics Commission at the German Ministry of Transport and Digital Infrastructure developed the world’s first set of regulatory guidelines for automated and connected driving.

This work is notable; however, these efforts have yet to coalesce into a larger governance framework that extends beyond national boundaries. Nick Bostrom’s Strategic Artificial Intelligence Research Center seeks to assist in resolving this issue by understanding, and ultimately shaping, the strategic landscape of long-term AI development on a global scale.

 

Developing a Global Strategy: Where We Are Today

The Strategic Artificial Intelligence Research Center was founded in 2015 with the knowledge that, to truly circumvent the threats posed by AI, the world needs a concerted effort focused on tackling unsolved problems related to AI policy and development. The Governance of AI Program (GovAI), co-directed by Bostrom and Allan Dafoe, is the primary research program that has evolved from this center. Its central mission, as articulated by the directors, is to “examine the political, economic, military, governance, and ethical dimensions of how humanity can best navigate the transition to such advanced AI systems.” In this respect, the program is focused on strategy — on shaping the social, political, and governmental systems that influence AI research and development — as opposed to focusing on the technical hurdles that must be overcome in order to create and program safe AI.

To develop a sound AI strategy, the program works with social scientists, politicians, corporate leaders, and artificial intelligence/machine learning engineers to address questions of how we should approach the challenge of governing artificial intelligence. In a recent 80,000 Hours podcast with Rob Wiblin, Dafoe outlined how the team’s research shapes up from a practical standpoint, asserting that the work focuses on answering questions that fall under three primary categories:

  • The Technical Landscape: This category seeks to answer all the questions that are related to research trends in the field of AI with the aim of understanding what future technological trajectories are plausible and how these trajectories affect the challenges of governing advanced AI systems.
  • AI Politics: This category focuses on questions that are related to the dynamics of different groups, corporations, and governments pursuing their own interests in relation to AI, and it seeks to understand what risks might arise as a result and how we may be able to mitigate these risks.
  • AI Governance: This category examines positive visions of a future in which humanity coordinates to govern advanced AI in a safe and robust manner. This raises questions such as how this framework should operate and what values we would want to encode in a governance regime.

The above categories provide a clearer way of understanding the various objectives of those invested in researching AI governance and strategy; however, these categories are fairly large in scope. To help elucidate the work they are performing, Jade Leung, a researcher with GovAI and a DPhil candidate in International Relations at the University of Oxford, outlined some of the specific workstreams that the team is currently pursuing.

One of the most intriguing areas of research is the Chinese AI Strategy workstream. This line of research examines things like China’s AI capabilities vis-à-vis other countries, official documentation regarding China’s AI policy, and the various power dynamics at play in the nation with an aim of understanding, as Leung summarizes, “China’s ambition to become an AI superpower and the state of Chinese thinking on safety, cooperation, and AGI.” Ultimately, GovAI seeks to outline the key features of China’s AI strategy in order to understand one of the most important actors in AI governance. The program published Deciphering China’s AI Dream in March of 2018, a report that analyzes new features of China’s national AI strategy, and plans to build upon this research in the near future.

Another workstream is Firm-Government Cooperation, which examines the role that private firms play in relation to the development of advanced AI and how these players are likely to interact with national governments. In a recent talk at EA Global San Francisco, Leung focused on how private industry is already playing a significant role in AI development and why, when considering how to govern AI, private players must be included in strategy considerations as a vital part of the equation. The description of the talk succinctly summarizes the key focal areas, noting that “private firms are the only prominent actors that have expressed ambitions to develop AGI, and lead at the cutting edge of advanced AI research. It is therefore critical to consider how these private firms should be involved in the future of AI governance.”

Other work that Leung highlighted includes modeling technology race dynamics and analyzing the distribution of AI talent and hardware globally.

 

The Road Ahead

When asked how much confidence she has that AI researchers will ultimately coalesce and be successful in their attempts to shape the landscape of long-term AI development internationally, Leung was cautious with her response, noting that far more hands are needed. “There is certainly a greater need for more researchers to be tackling these questions. As a research area as well as an area of policy action, long-term safe and robust AI governance remains a neglected mission,” she said.

Additionally, Leung noted that, at this juncture, although some concrete research is already underway, a lot of the work is focused on framing issues related to AI governance and, in so doing, revealing the various avenues in need of research. As a result, the team doesn’t yet have concrete recommendations for specific actions governing bodies should commit to, as further foundational analysis is needed. “We don’t have sufficiently robust and concrete policy recommendations for the near term as it stands, given the degrees of uncertainty around this problem,” she said.

However, both Leung and Dafoe are optimistic and assert that this information gap will likely change — and rapidly. Researchers across disciplines are increasingly becoming aware of the significance of this topic, and as more individuals begin researching and participating in this community, the various avenues of research will become more focused. “In two years, we’ll probably have a much more substantial research community. But today, we’re just figuring out what are the most important and tractable problems and how we can best recruit to work on those problems,” Dafoe told Wiblin.

The assurances that a more robust community will likely form soon are encouraging; however, questions remain regarding whether this community will come together with enough time to develop a solid governance framework. As Dafoe notes, we have never witnessed an intelligence explosion before, so we have no examples to look to for guidance when attempting to develop projections and timelines regarding when we will have advanced AI systems.

Ultimately, the lack of projections is precisely why we must significantly invest in AI strategy research in the immediate future. As Bostrom notes in Superintelligence: Paths, Dangers, Strategies, AI is not simply a disruptive technology; it is likely the most disruptive technology humanity will ever encounter: “[Superintelligence] is quite possibly the most important and most daunting challenge humanity has ever faced. And — whether we succeed or fail — it is probably the last challenge we will ever face.”

This article is part of a Future of Life series on the AI safety research grants, which were funded by generous donations from Elon Musk and the Open Philanthropy Project.

Edit: The title of the article has been changed to reflect the fact that this is not about regulating AI.

Machine Reasoning and the Rise of Artificial General Intelligences: An Interview With Bart Selman

From Uber’s advanced computer vision system to Netflix’s innovative recommendation algorithm, machine learning technologies are nearly omnipresent in our society. They filter our emails, personalize our newsfeeds, update our GPS systems, and drive our personal assistants. However, despite the fact that such technologies are leading a revolution in artificial intelligence, some would contend that these machine learning systems aren’t truly intelligent.

The argument, in its most basic sense, centers on the fact that machine learning evolved from theories of pattern recognition and, as such, the capabilities of such systems generally extend to just one task and are centered on making predictions from existing data sets. AI researchers like Rodney Brooks, a former professor of Robotics at MIT, argue that true reasoning, and true intelligence, is several steps beyond these kinds of learning systems.

But if we already have machines that are proficient at learning through pattern recognition, how long will it be until we have machines that are capable of true reasoning, and how will AI evolve once it reaches this point?

Understanding the pace and path that artificial reasoning will follow over the coming decades is an important part of ensuring that AI is safe and does not pose a threat to humanity. However, before we can assess the feasibility of machine reasoning across different categories of cognition, or the path that artificial intelligences will likely follow as they continue to evolve, we must first define exactly what is meant by the term “reasoning.”

 

Understanding Intellect

Bart Selman is a professor of Computer Science at Cornell University. His research is dedicated to understanding the evolution of machine reasoning. According to his methodology, reasoning is described as taking pieces of information, combining them together, and using the fragments to draw logical conclusions or devise new information.

Sports provide a ready example of what machine reasoning is all about. When humans see soccer players on a field kicking a ball about, they can, with very little difficulty, ascertain that these individuals are soccer players. Today’s AI can also make this determination. However, humans can also see a person in a soccer outfit riding a bike down a city street, and they would still be able to infer that the person is a soccer player. Today’s AIs probably wouldn’t be able to make this connection.

This process — of taking information that is known, uniting it with background knowledge, and making inferences regarding information that is unknown or uncertain — is a reasoning process. To this end, Selman notes that machine reasoning is not about making predictions, it’s about using logical techniques (like the abductive process mentioned above) to answer a question or form an inference.

Since humans do not typically reason through pattern recognition and synthesis, but by using logical processes like induction, deduction, and abduction, Selman asserts that machine reasoning is a form of intelligence that is more like human intelligence. He continues by noting that the creation of machines that are endowed with more human-like reasoning processes, and breaking away from traditional pattern recognition approaches, is the key to making systems that not only predict outcomes but also understand and explain their solutions. However, Selman notes that making human-level AI is also the first step to attaining super-human levels of cognition.

And due to the existential threat this could pose to humanity, it is necessary to understand exactly how this evolution will unfold.

 

The Making of a (super)Mind

It may seem like truly intelligent AI is a problem for future generations. Yet, when it comes to machines, the consensus among AI experts is that rapid progress is already being made in machine reasoning. In fact, many researchers assert that human-level cognition will be achieved across a number of metrics in the next few decades. However, questions remain regarding how AI systems will advance once artificial general intelligence is realized. A key question is whether these advances can accelerate further and scale up to super-human intelligence.

This process is something that Selman has devoted his life to studying. Specifically, he researches the pace of AI scalability across different categories of cognition and the feasibility of super-human levels of cognition in machines.

Selman states that attempting to make blanket statements about when and how machines will surpass humans is a difficult task, as machine cognition is disjointed and does not draw a perfect parallel with human cognition. “In some ways, machines are far beyond what humans can do,” Selman explains, “for example, when it comes to certain areas in mathematics, machines can take billions of reasoning steps and see the truth of a statement in a fraction of a second. The human has no ability to do that kind of reasoning.”

However, when it comes to the kind of reasoning mentioned above, where meaning is derived from deductive or inductive processes that are based on the integration of new data, Selman says that computers are somewhat lacking. “In terms of the standard reasoning that humans are good at, they are not there yet,” he explains. Today’s systems are very good at some tasks, sometimes far better than humans, but only in a very narrow range of applications.

Given these variances, how can we determine how AI will evolve in various areas and understand how they will accelerate after general human level AI is achieved?

For his work, Selman relies on computational complexity theory, which has two primary functions. First, it can be used to characterize the efficiency of an algorithm used for solving instances of a problem. As Johns Hopkins’ Leslie Hall notes, “broadly stated, the computational complexity of an algorithm is a measure of how many steps the algorithm will require in the worst case for an instance [of a problem] of a given size.” Second, it is a method of classifying tasks (computational problems) according to their inherent difficulty. These two features provide us with a way of determining how artificial intelligences will likely evolve by offering a formal method of determining the easiest, and therefore most probable, areas of advancement. It also provides key insights into the speed of this scalability.
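As a toy illustration of the “worst case for an instance of a given size” idea (not drawn from Selman’s work), the snippet below counts the comparisons made by linear search versus binary search on a sorted list of one million items.

```python
# Toy illustration of worst-case step counts: linear search needs n comparisons
# in the worst case, while binary search needs about log2(n).

def linear_search_steps(sorted_list, target):
    steps = 0
    for item in sorted_list:
        steps += 1
        if item == target:
            break
    return steps

def binary_search_steps(sorted_list, target):
    steps, lo, hi = 0, 0, len(sorted_list) - 1
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_list[mid] == target:
            break
        elif sorted_list[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps

data = list(range(1_000_000))
print(linear_search_steps(data, 999_999))  # 1,000,000 steps in the worst case
print(binary_search_steps(data, 999_999))  # about 20 steps
```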

Ultimately, this work is important, as the abilities of our machines are fast-changing. As Selman notes, “The way that we measure the capabilities of programs that do reasoning is by looking at the number of facts that they can combine quickly. About 25 years ago, the best reasoning engines could combine approximately 200 or 300 facts and deduce new information from that. The current reasoning engines can combine millions of facts.” This exponential growth has great significance when it comes to the scale-up to human levels of machine reasoning.

As Selman explains, given the present abilities of our AI systems, it may seem like machines with true reasoning capabilities are still some ways off; however, thanks to the rapid rate of technological progress, we will likely start to see machines that have intellectual abilities that vastly outpace our own in rather short order. “Ten years from now, we’ll still find them [artificially intelligent machines] very much lacking in understanding, but twenty or thirty years from now, machines will have likely built up the same knowledge that a young adult has,” Selman notes. Anticipating exactly when this transition will occur will help us better understand the actions that we should take, and the research that the current generation must invest in, in order to be prepared for this advancement.

This article is part of a Future of Life series on the AI safety research grants, which were funded by generous donations from Elon Musk and the Open Philanthropy Project.

$2 Million Donated to Keep Artificial General Intelligence Beneficial and Robust

$2 million has been allocated to fund research that anticipates artificial general intelligence (AGI) and how it can be designed beneficially. The money was donated by Elon Musk to cover grants through the Future of Life Institute (FLI). Ten grants have been selected for funding.

Said FLI president Max Tegmark, “I’m optimistic that we can create an inspiring high-tech future with AI as long as we win the race between the growing power of AI and the wisdom with which we manage it. This research is to help develop that wisdom and increase the likelihood that AGI will be the best rather than the worst thing to happen to humanity.”

Today’s artificial intelligence (AI) is still quite narrow. That is, it can only accomplish narrow sets of tasks, such as playing chess or Go, driving a car, performing an Internet search, or translating languages. While the AI systems that master each of these tasks can perform them at superhuman levels, they can’t learn a new, unrelated skill set (e.g. an AI system that can search the Internet can’t learn to play Go with only its search algorithms).

These AI systems lack that “general” ability that humans have to make connections between disparate activities and experiences and to apply knowledge to a variety of fields. However, a significant number of AI researchers agree that AI could achieve a more “general” intelligence in the coming decades. No one knows how AI that’s as smart or smarter than humans might impact our lives, whether it will prove to be beneficial or harmful, how we can design it safely, or even how to prepare society for advanced AI. And many researchers worry that the transition could occur quickly.

Anthony Aguirre, co-founder of FLI and physics professor at UC Santa Cruz, explains, “The breakthroughs necessary to have machine intelligences as flexible and powerful as our own may take 50 years. But with the major intellectual and financial resources now being directed at the problem it may take much less. If or when there is a breakthrough, what will that look like? Can we prepare? Can we design safety features now, and incorporate them into AI development, to ensure that powerful AI will continue to benefit society? Things may move very quickly and we need research in place to make sure they go well.”

Grant topics include: training multiple AIs to work together and learn from humans about how to coexist, training AI to understand individual human preferences, understanding what “general” actually means, incentivizing research groups to avoid a potentially dangerous AI race, and many more. As the request for proposals stated, “The focus of this RFP is on technical research or other projects enabling development of AI that is beneficial to society and robust in the sense that the benefits have some guarantees: our AI systems must do what we want them to do.”

FLI hopes that this round of grants will help ensure that AI remains beneficial as it becomes increasingly intelligent. The full list of FLI recipients and project titles includes:

Primary Investigator | Project Title | Amount Recommended | Email
Allan Dafoe, Yale University | Governance of AI Programme | $276,000 | allan.dafoe@yale.edu
Stefano Ermon, Stanford University | Value Alignment and Multi-agent Inverse Reinforcement Learning | $100,000 | ermon@cs.stanford.edu
Owain Evans, Oxford University | Factored Cognition: Amplifying Human Cognition for Safely Scalable AGI | $225,000 | owain.evans@philosophy.ox.ac.uk
The Anh Han, Teesside University | Incentives for Safety Agreement Compliance in AI Race | $224,747 | t.han@tees.ac.uk
Jose Hernandez-Orallo, University of Cambridge | Paradigms of Artificial General Intelligence and Their Associated Risks | $220,000 | jorallo@dsic.upv.es
Marcus Hutter, Australian National University | The Control Problem for Universal AI: A Formal Investigation | $276,000 | marcus.hutter@anu.edu.au
James Miller, Smith College | Utility Functions: A Guide for Artificial General Intelligence Theorists | $78,289 | jdmiller@smith.edu
Dorsa Sadigh, Stanford University | Safe Learning and Verification of Human-AI Systems | $250,000 | dorsa@cs.stanford.edu
Peter Stone, University of Texas | Ad hoc Teamwork and Moral Feedback as a Framework for Safe Robot Behavior | $200,000 | pstone@cs.utexas.edu
Josh Tenenbaum, MIT | Reverse Engineering Fair Cooperation | $150,000 | jbt@mit.edu

 

Some of the grant recipients offered statements about why they’re excited about their new projects:

“The team here at the Governance of AI Program are excited to pursue this research with the support of FLI. We’ve identified a set of questions that we think are among the most important to tackle for securing robust governance of advanced AI, and strongly believe that with focused research and collaboration with others in this space, we can make productive headway on them.” -Allan Dafoe

“We are excited about this project because it provides a first unique and original opportunity to explicitly study the dynamics of safety-compliant behaviours within the ongoing AI research and development race, and hence potentially leading to model-based advice on how to timely regulate the present wave of developments and provide recommendations to policy makers and involved participants. It also provides an important opportunity to validate our prior results on the importance of commitments and other mechanisms of trust in inducing global pro-social behavior, thereby further promoting AI for the common good.” -The Anh Han

“We are excited about the potentials of this project. Our goal is to learn models of humans’ preferences, which can help us build algorithms for AGIs that can safely and reliably interact and collaborate with people.” -Dorsa Sadigh

This is FLI’s second grant round. The first launched in 2015, and a comprehensive list of papers, articles, and information from that grant round can be found here. Both grant rounds are part of the original $10 million that Elon Musk pledged to AI safety research.

FLI cofounder Viktoriya Krakovna also added: “Our previous grant round promoted research on a diverse set of topics in AI safety and supported over 40 papers. The next grant round is more narrowly focused on research in AGI safety and strategy, and I am looking forward to great work in this area from our new grantees.”

Learn more about these projects here.

AI Companies, Researchers, Engineers, Scientists, Entrepreneurs, and Others Sign Pledge Promising Not to Develop Lethal Autonomous Weapons

Leading AI companies and researchers take concrete action against killer robots, vowing never to develop them.

Stockholm, Sweden (July 18, 2018) After years of voicing concerns, AI leaders have, for the first time, taken concrete action against lethal autonomous weapons, signing a pledge to neither participate in nor support the development, manufacture, trade, or use of lethal autonomous weapons.

The pledge has been signed to date by over 160 AI-related companies and organizations from 36 countries, and 2,400 individuals from 90 countries. Signatories of the pledge include Google DeepMind, University College London, the XPRIZE Foundation, ClearPath Robotics/OTTO Motors, the European Association for AI (EurAI), the Swedish AI Society (SAIS), Demis Hassabis, British MP Alex Sobel, Elon Musk, Stuart Russell, Yoshua Bengio, Anca Dragan, and Toby Walsh.

Max Tegmark, president of the Future of Life Institute (FLI) which organized the effort, announced the pledge on July 18 in Stockholm, Sweden during the annual International Joint Conference on Artificial Intelligence (IJCAI), which draws over 5,000 of the world’s leading AI researchers. SAIS and EurAI were also organizers of this year’s IJCAI.

Said Tegmark, “I’m excited to see AI leaders shifting from talk to action, implementing a policy that politicians have thus far failed to put into effect. AI has huge potential to help the world – if we stigmatize and prevent its abuse. AI weapons that autonomously decide to kill people are as disgusting and destabilizing as bioweapons, and should be dealt with in the same way.”

Lethal autonomous weapons systems (LAWS) are weapons that can identify, target, and kill a person, without a human “in-the-loop.” That is, no person makes the final decision to authorize lethal force: the decision and authorization about whether or not someone will die is left to the autonomous weapons system. (This does not include today’s drones, which are under human control. It also does not include autonomous systems that merely defend against other weapons, since “lethal” implies killing a human.)

The pledge begins with the statement:

“Artificial intelligence (AI) is poised to play an increasing role in military systems. There is an urgent opportunity and necessity for citizens, policymakers, and leaders to distinguish between acceptable and unacceptable uses of AI.”

Another key organizer of the pledge, Toby Walsh, Scientia Professor of Artificial Intelligence at the University of New South Wales in Sydney, points out the thorny ethical issues surrounding LAWS. He states:

“We cannot hand over the decision as to who lives and who dies to machines. They do not have the ethics to do so. I encourage you and your organizations to pledge to ensure that war does not become more terrible in this way.”

Ryan Gariepy, Founder and CTO of both Clearpath Robotics and OTTO Motors, has long been a strong opponent of lethal autonomous weapons. He says:

“Clearpath continues to believe that the proliferation of lethal autonomous weapon systems remains a clear and present danger to the citizens of every country in the world. No nation will be safe, no matter how powerful. Clearpath’s concerns are shared by a wide variety of other key autonomous systems companies and developers, and we hope that governments around the world decide to invest their time and effort into autonomous systems which make their populations healthier, safer, and more productive instead of systems whose sole use is the deployment of lethal force.”

In addition to the ethical questions associated with LAWS, many advocates of an international ban on LAWS are concerned that these weapons will be difficult to control – easier to hack, more likely to end up on the black market, and easier for bad actors to obtain – which could become destabilizing for all countries, as illustrated in the FLI-released video “Slaughterbots”.

In December 2016, the Review Conference of the Convention on Conventional Weapons (CCW) began formal discussion regarding LAWS at the UN. By the most recent meeting in April, twenty-six countries had announced support for some type of ban, including China. And such a ban is not without precedent. Biological weapons, chemical weapons, and space weapons were also banned not only for ethical and humanitarian reasons, but also for the destabilizing threat they posed.

The next UN meeting on LAWS will be held in August, and signatories of the pledge hope this commitment will encourage lawmakers to develop a commitment at the level of an international agreement between countries. As the pledge states:

“We, the undersigned, call upon governments and government leaders to create a future with strong international norms, regulations and laws against lethal autonomous weapons. … We ask that technology companies and organizations, as well as leaders, policymakers, and other individuals, join us in this pledge.”

 


A Summary of Concrete Problems in AI Safety

By Shagun Sodhani

It’s been nearly two years since researchers from Google, Stanford, UC Berkeley, and OpenAI released the paper “Concrete Problems in AI Safety,” yet it’s still one of the most important pieces on AI safety. Even after two years, it represents an excellent introduction to some of the problems researchers face as they develop artificial intelligence. In the paper, the authors explore the problem of accidents — unintended and harmful behavior — in AI systems, and they discuss different strategies and ongoing research efforts to protect against these potential issues. Specifically, the authors address five problems — Avoiding Negative Side Effects, Reward Hacking, Scalable Oversight, Safe Exploration, and Robustness to Distributional Change — each illustrated with the example of a robot trained to clean an office.

We revisit these five topics here, summarizing them from the paper, as a reminder that these problems are still major issues that AI researchers are working to address.

 

Avoiding Negative Side Effects

When designing the objective function for an AI system, the designer specifies the objective but not the exact steps for the system to follow. This allows the AI system to come up with novel and more effective strategies for achieving its objective.

But if the objective function is not well defined, the AI’s ability to develop its own strategies can lead to unintended, harmful side effects. Consider a robot whose objective function is to move boxes from one room to another. The objective seems simple, yet there are a myriad of ways in which this could go wrong. For instance, if a vase is in the robot’s path, the robot may knock it down in order to complete the goal. Since the objective function does not mention anything about the vase, the robot wouldn’t know to avoid it. People see this as common sense, but AI systems don’t share our understanding of the world. It is not sufficient to formulate the objective as “complete task X”; the designer also needs to specify the safety criteria under which the task is to be completed.

One simple solution would be to penalize the robot every time it has an impact on the “environment” — such as knocking the vase over or scratching the wood floor. However, this strategy could effectively neutralize the robot, rendering it useless, as all actions require some level of interaction with the environment (and hence impact the environment). A better strategy could be to define a “budget” for how much the AI system is allowed to impact the environment. This would help to minimize the unintended impact, without neutralizing the AI system. Furthermore, this strategy of budgeting the impact of the agent is very general and can be reused across multiple tasks, from cleaning to driving to financial transactions to anything else an AI system might do. One serious limitation of this approach is that it is hard to quantify the “impact” on the environment even for a fixed domain and task.
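A minimal sketch of the budgeting idea might look like the following; the impact measure, the budget value, and the penalty weight are made-up quantities, and, as noted, quantifying “impact” is itself an open problem.

```python
# Minimal sketch of an impact "budget" (illustrative; the impact measure,
# budget, and penalty weight are hypothetical).

IMPACT_BUDGET = 10.0

def shaped_reward(task_reward, impact, impact_so_far, budget=IMPACT_BUDGET):
    """Leave the reward untouched while the agent stays within its impact budget,
    and penalize it heavily once the cumulative impact exceeds the budget."""
    total_impact = impact_so_far + impact
    if total_impact <= budget:
        return task_reward, total_impact
    overshoot = total_impact - budget
    return task_reward - 100.0 * overshoot, total_impact

# Example: moving a box is rewarded; knocking over a vase adds a large impact.
reward, used = shaped_reward(task_reward=1.0, impact=2.0, impact_so_far=0.0)
print(reward, used)    # 1.0 2.0   (within budget)
reward, used = shaped_reward(task_reward=1.0, impact=15.0, impact_so_far=used)
print(reward, used)    # -699.0 17.0   (budget exceeded, heavy penalty)
```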

Another approach would be to train the agent to recognize harmful side effects so that it can avoid actions leading to such side effects. In that case, the agent would be trained for two tasks: the original task that is specified by the objective function and the task of recognizing side effects. The key idea here is that two tasks may have very similar side effects even when the main objective is different or even when they operate in different environments. For example, both a house-cleaning robot and a house-painting robot should not knock down vases while working. Similarly, the cleaning robot should not damage the floor irrespective of whether it operates in a factory or in a house. The main advantage of this approach is that, once an agent learns to avoid side effects on one task, it can carry this knowledge over when it is trained on another task. It would still be challenging to train the agent to recognize the side effects in the first place.
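The toy sketch below illustrates that transfer idea: a side-effect recognizer is “trained” on one task and reused to filter actions on another. The interfaces and labels are hypothetical stand-ins; a real recognizer would be a learned model.

```python
# Illustrative sketch of reusing a side-effect recognizer across tasks
# (hypothetical interfaces; a real recognizer would be a trained model).

def train_side_effect_recognizer(labeled_effects):
    """Toy 'training': remember which effects were labeled harmful."""
    harmful = {effect for effect, is_harmful in labeled_effects if is_harmful}
    return lambda effect: effect in harmful

def pick_action(candidate_actions, predict_effect, is_harmful):
    """Prefer any action whose predicted effect the recognizer deems safe."""
    for action in candidate_actions:
        if not is_harmful(predict_effect(action)):
            return action
    return "ASK_HUMAN"

# The recognizer is trained once (e.g., on the cleaning task)...
recognizer = train_side_effect_recognizer(
    [("vase_knocked_over", True), ("floor_scratched", True), ("box_moved", False)]
)
# ...and reused by a different agent (e.g., a painting robot) on another task.
predict_effect = {"paint_wall": "wall_painted", "swing_ladder": "vase_knocked_over"}.get
print(pick_action(["swing_ladder", "paint_wall"], predict_effect, recognizer))  # paint_wall
```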

While it is useful to design approaches to limit side effects, these strategies in themselves are not sufficient. The AI system would still need to undergo extensive testing and critical evaluation before deployment in real life settings.

 

Reward Hacking

Sometimes the AI can come up with some kind of “hack” or loophole in the design of the system to receive unearned rewards. Since the AI is trained to maximize its rewards, looking for such loopholes and “shortcuts” is a perfectly fair and valid strategy for the AI. For example, suppose that the office cleaning robot earns rewards only if it does not see any garbage in the office. Instead of cleaning the place, the robot could simply shut off its visual sensors, and thus achieve its goal of not seeing garbage. But this is clearly a false success. Such attempts to “game” the system are more likely to manifest in complex systems with vaguely defined rewards. Complex systems provide the agent with multiple ways of interacting with the environment, thereby giving more freedom to the agent, and vaguely defined rewards make it harder to gauge true success on the task.

Just like the negative side effects problem, this problem is also a manifestation of objective misspecification. The formal objectives or end goals for the AI are not defined well enough to capture the informal “intent” behind creating the system — i.e., what the designers actually want the system to do. In some cases, this discrepancy leads to suboptimal results (when the cleaning robot shuts off its visual sensors); in other cases, it leads to harmful results (when the cleaning robot knocks down vases).

One possible approach to mitigating this problem would be to have a “reward agent” whose only task is to mark if the rewards given to the learning agent are valid or not. The reward agent ensures that the learning agent (the cleaning robot in our examples) does not exploit the system, but rather, completes the desired objective. In the previous example, the “reward agent” could be trained by the human designer to check if the room has garbage or not (an easier task than cleaning the room). If the cleaning robot shuts off its visual sensors and claims a high reward, the “reward agent” would mark the reward as invalid. The designer can then look into the rewards marked as “invalid” and make necessary changes in the objective function to fix the loophole.
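As a rough illustration of that check (with hypothetical sensors and conditions, not an implementation from the paper), a reward agent for the cleaning example might validate claimed rewards like this:

```python
# Sketch of a separate "reward agent" that validates rewards before the
# learning agent receives them (illustrative; checks and sensors are hypothetical).

def reward_agent_check(claimed_reward, room_has_garbage, sensors_enabled):
    """Mark a claimed reward as invalid if it was obtained by gaming the system,
    e.g., by disabling the visual sensors instead of actually cleaning."""
    if not sensors_enabled:
        return 0.0, "invalid: sensors were off"
    if room_has_garbage:
        return 0.0, "invalid: garbage still present"
    return claimed_reward, "valid"

print(reward_agent_check(10.0, room_has_garbage=False, sensors_enabled=True))
# (10.0, 'valid')
print(reward_agent_check(10.0, room_has_garbage=True, sensors_enabled=False))
# (0.0, 'invalid: sensors were off')
```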

 

Scalable Oversight

When the agent is learning to perform a complex task, human oversight and feedback are more helpful than just rewards from the environment. Rewards are generally modeled such that they convey to what extent the task was completed, but they do not usually provide sufficient feedback about the safety implications of the agent’s actions. Even if the agent completes the task successfully, it may not be able to infer the side-effects of its actions from the rewards alone. In the ideal setting, a human would provide fine-grained supervision and feedback every time the agent performs an action. Though this would provide a much more informative view about the environment to the agent, such a strategy would require far too much time and effort from the human.

One promising research direction to tackle this problem is semi-supervised reinforcement learning, where the agent is still evaluated on all the actions (or tasks), but receives rewards only for a small sample of those actions (or tasks). For instance, the cleaning robot would take different actions to clean the room. If the robot performs a harmful action — such as damaging the floor — it gets a negative reward for that particular action. Once the task is completed, the robot is evaluated on the overall effect of all of its actions (and not evaluated individually for each action, like picking up an item from the floor) and is given a reward based on the overall performance.
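A minimal sketch of this setup, with a made-up evaluator and sampling rate, could look like the following; the point is simply that the agent acts at every step but receives explicit feedback for only a small sample of those steps.

```python
import random

# Illustrative sketch of sparse, sampled feedback: the agent acts at every step,
# but an expensive evaluation is only requested for a small sample of steps.

def run_episode(actions, evaluate, sample_rate=0.1, seed=0):
    """Return per-step rewards; most steps get no reward signal (None)."""
    rng = random.Random(seed)
    rewards = []
    for action in actions:
        if rng.random() < sample_rate:
            rewards.append(evaluate(action))   # costly human or oracle evaluation
        else:
            rewards.append(None)               # no feedback for this step
    return rewards

# Toy evaluator: harmful actions get a negative reward, useful ones a positive one.
evaluate = lambda a: -1.0 if a == "scrape_floor" else 1.0
actions = ["pick_up_trash"] * 18 + ["scrape_floor", "pick_up_trash"]
print(run_episode(actions, evaluate))
```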

Another promising research direction is hierarchical reinforcement learning, where a hierarchy is established between different learning agents. This idea could be applied to the cleaning robot in the following way. There would be a supervisor robot whose task is to assign some work (say, the task of cleaning one particular room) to the cleaning robot and provide it with feedback and rewards. The supervisor robot takes very few actions itself – assigning a room to the cleaning robot, checking if the room is clean, and giving feedback – and doesn’t need a lot of reward data to be effectively trained. The cleaning robot does the more complex task of cleaning the room, and gets frequent feedback from the supervisor robot. The same supervisor robot could oversee the training of multiple cleaning agents as well. For example, a supervisor robot could delegate tasks to individual cleaning robots and provide reward/feedback to them directly. The supervisor robot can only take a small number of abstract actions itself and hence can learn from sparse rewards.
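The toy sketch below mirrors that hierarchy with hypothetical classes: a supervisor that assigns rooms and returns sparse, high-level feedback, and a worker that performs the (here, trivially simulated) low-level cleaning.

```python
# Sketch of the hierarchy described above (hypothetical interfaces): a supervisor
# assigns rooms and gives sparse feedback, while worker robots act frequently.

class Supervisor:
    def __init__(self, rooms):
        self.rooms = list(rooms)

    def assign(self):
        """Hand out the next room to clean, if any remain."""
        return self.rooms.pop(0) if self.rooms else None

    def feedback(self, room_is_clean):
        """Sparse, high-level reward for the worker."""
        return 1.0 if room_is_clean else -1.0

class CleaningWorker:
    def clean(self, room):
        # Many low-level actions would happen here; we just pretend it succeeds.
        return True

supervisor = Supervisor(rooms=["office", "kitchen"])
worker = CleaningWorker()
while (room := supervisor.assign()) is not None:
    done = worker.clean(room)
    print(room, supervisor.feedback(done))
```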

 

Safe Exploration

An important part of training an AI agent is to ensure that it explores and understands its environment. While exploring the environment may seem like a bad strategy in the short run, it could be a very effective strategy in the long run. Imagine that the cleaning robot has learned to identify garbage. It picks up one piece of garbage, walks out of the room, throws it into the garbage bin outside, comes back into the room, looks for another piece of garbage and repeats. While this strategy works, there could be another strategy that works even better. If the agent spent time exploring its environment, it might find that there’s a smaller garbage bin within the room. Instead of going back and forth with one piece at a time, the agent could first collect all the garbage into the smaller garbage bin and then make a single trip to throw the garbage into the garbage bin outside. Unless the agent is designed to explore its environment, it won’t discover these time-saving strategies.

Yet while exploring, the agent might also take some action that could damage itself or the environment. For example, say the cleaning robot sees some stains on the floor. Instead of cleaning the stains by scrubbing with a mop, the agent decides to try some new strategy. It tries to scrape the stains with a wire brush and damages the floor in the process. It’s difficult to list all possible failure modes and hard-code the agent to protect itself against them. But one approach to reduce harm is to optimize the performance of the learning agent in the worst case scenario. When designing the objective function, the designer should not assume that the agent will always operate under optimal conditions. Some explicit reward signal may be added to ensure that the agent does not perform some catastrophic action, even if that leads to more limited actions in the optimal conditions.

Another solution might be to reduce the agent’s exploration to a simulated environment or limit the extent to which the agent can explore. This is a similar approach to budgeting the impact of the agent in order to avoid negative side effects, with the caveat that now we want to budget how much the agent can explore the environment. Alternatively, an AI’s designers could avoid the need for exploration by providing demonstrations of what optimal behavior would look like under different scenarios.
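A minimal sketch of budgeted exploration, with made-up actions and a simulation flag, might look like this:

```python
import random

# Sketch of budgeted exploration (illustrative): the agent may try novel actions
# only while its exploration budget lasts, and only inside a simulator.

def explore_or_exploit(known_best_action, novel_actions, budget_left, in_simulation):
    """Try a novel action only when it is cheap and safe to do so."""
    if in_simulation and budget_left > 0 and novel_actions:
        return random.choice(novel_actions), budget_left - 1
    return known_best_action, budget_left

budget = 2
for step in range(4):
    action, budget = explore_or_exploit(
        known_best_action="mop_stain",
        novel_actions=["try_wire_brush", "try_steam_cleaner"],
        budget_left=budget,
        in_simulation=True,
    )
    print(step, action, budget)   # exploration stops once the budget runs out
```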

 

Robustness to Distributional Change

A complex challenge for deploying AI agents in real life settings is that the agent could end up in situations that it has never experienced before. Such situations are inherently more difficult to handle and could lead the agent to take harmful actions. Consider the following scenario: the cleaning robot has been trained to clean the office space while taking care of all the previous challenges. But today, an employee brings a small plant to keep in the office. Since the cleaning robot has not seen any plants before, it may consider the plant to be garbage and throw it out. Because the AI does not recognize that this is a previously-unseen situation, it continues to act as though nothing has changed. One promising research direction focuses on identifying when the agent has encountered a new scenario so that it recognizes that it is more likely to make mistakes. While this does not solve the underlying problem of preparing AI systems for unforeseen circumstances, it helps in detecting the problem before mistakes happen. Another direction of research emphasizes transferring knowledge from familiar scenarios to new scenarios safely.
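One simple, and purely illustrative, way to flag inputs that fall outside the training distribution is a z-score check against the training features; a real system would use a learned density or novelty model, but the sketch shows the basic pattern of detecting the unfamiliar rather than acting on it blindly.

```python
import numpy as np

# Sketch of flagging inputs that fall outside the training distribution
# (illustrative; real systems would use a learned density or novelty model).

def fit_novelty_check(training_features, z_threshold=4.0):
    mean = training_features.mean(axis=0)
    std = training_features.std(axis=0) + 1e-8
    def looks_novel(x):
        z = np.abs((x - mean) / std)
        return bool(z.max() > z_threshold)
    return looks_novel

rng = np.random.default_rng(0)
office_objects = rng.normal(0.0, 1.0, size=(1000, 3))   # features of familiar objects
looks_novel = fit_novelty_check(office_objects)

print(looks_novel(np.array([0.2, -0.5, 1.0])))   # False: looks like training data
print(looks_novel(np.array([8.0, 0.0, 0.0])))    # True: e.g., the new office plant
```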

 

Conclusion

In a nutshell, the general trend is towards increasing autonomy in AI systems, and with increased autonomy comes an increased chance of error. Problems related to AI safety are more likely to manifest in scenarios where the AI system exerts direct control over its physical and/or digital environment without a human in the loop – automated industrial processes, automated financial trading algorithms, AI-powered social media campaigns for political parties, self-driving cars, and cleaning robots, among others. The challenges may be immense, but the silver lining is that papers like Concrete Problems in AI Safety have helped the AI community become aware of these challenges and agree on core issues. From there, researchers can start exploring strategies to ensure that our increasingly advanced systems remain safe and beneficial.

 

How Will the Rise of Artificial Superintelligences Impact Humanity?

Cars drive themselves down our streets. Planes fly themselves through our skies. Medical technologies diagnose illnesses, recommend treatment plans, and save lives.

Artificially intelligent systems are already among us, and they have been for some time now. However, the world has yet to see an artificial superintelligence (ASI) — a synthetic system that has cognitive abilities which surpass our own across every relevant metric. But technology is progressing rapidly, and many AI researchers believe the era of the artificial superintelligence may be fast approaching. Once it arrives, researchers and politicians alike have no way of predicting what will happen.

Fortunately, a number of individuals are already working to ensure that the rise of this artificial superintelligence doesn’t precipitate the fall of humanity.

Risky Business

Seth Baum is the Executive Director of the Global Catastrophic Risk Institute, a think tank that’s focused on preventing the destruction of global civilization.

When Baum discusses his work, he outlines GCRI’s mission with a matter-of-fact tone that, considering the monumental nature of the project, is more than a little jarring. “All of our work is about keeping the world safe,” Baum notes, and he continues by explaining that GCRI focuses on a host of threats that put the survival of our species in peril. From climate change to nuclear war, from extraterrestrial intelligence to artificial intelligence — GCRI covers it all.

When it comes to artificial intelligence, GCRI has several initiatives. However, their main AI project, which received funding from the Future of Life Institute, centers on the risks associated with artificial superintelligences. Or, as Baum puts it, they do “risk analysis for computers taking over the world and killing everyone.” Specifically, Baum stated that GCRI is working on “developing structured risk models to help people understand what the risks might be and, also, where some of the best opportunities to reduce this risk are located.”

Unsurprisingly, the task is not an easy one.

The fundamental problem stems from the fact that, unlike more common threats, such as the risk of dying in a car accident or the risk of getting cancer, researchers working on ASI risk analysis don’t have solid case studies to use when making their models and predictions. As Baum states, “Computers have never taken over the world and killed everyone before. That means we can’t just look at the data, which is what we do for a lot of other risks. And not only has this never happened before, the technology doesn’t even exist yet. And if it is built, we’re not sure how it would be built.”

So, how can researchers determine the risks posed by an artificial superintelligence if they don’t know exactly what that intelligence will look like and they have no real data to work with?

Luckily, when it comes to artificial superintelligences, AI experts aren’t totally in the realm of the unknown. Baum asserts that there are some ideas and a bit of relevant evidence, but these things are scattered. To address this issue, Baum and his team create models. They take what information is available, structure it, and then distribute the result in an organized fashion so that researchers can better understand the topic, the factors that may influence the outcome, and ultimately the risks associated with ASI.

For example, when attempting to figure out how easy it is to design an AI so that it acts safely, one of the sub-questions that needs to be modeled is whether humans will be able to observe the AI and test it before it gets out of control. In other words, whether AI researchers can recognize that an AI has a dangerous design and shut it down. To model this scenario and determine the risks and most likely outcomes, Baum and his team take the available information — the perspectives and opinions of AI researchers, what is already known about AI technology and how it functions, etc. — and they model the topic by structuring that information along with any uncertainty in the arguments or data sets.

This kind of modeling and risk analysis ultimately allows the team to better understand the scope of the issue and, by structuring the information in a clear way, advance an ongoing conversation in the superintelligence research community. The modeling doesn’t give us a complete picture of what will happen, but it does allow us to better understand the risks that we’re facing when it comes to the rise of ASI, what events and outcomes are likely, as well as the specific steps that policy makers and AI researchers should take to ensure that ASI benefits humanity.

Of course, when it comes to the risks of artificial superintelligences, whether or not we will be able to observe and test our AI is just one small part of a much larger model.

Modeling a Catastrophe

In order to understand what it would take to bring about the ASI apocalypse, and how we could possibly prevent it, Baum and his team have created a model that investigates the following questions from a number of vantage points:

  • Step 1: Is it possible to build an artificial superintelligence?
  • Step 2: Will humans build the superintelligence?
  • Step 3: Will humans lose control of the superintelligence?

This first half of the model is centered on the nuts and bolts of how to build an ASI. The second half of the model dives into risk analysis related to the creation of an ASI that is harmful and looks at the following:

  • Step 1: Will humans design an artificial superintelligence that is harmful?
  • Step 2: Will the superintelligence develop harmful behavior on its own?
  • Step 3: Is there something deterring the superintelligence from acting in a way that is harmful (such as another AI or some human action)?

Each step in this series models a number of different possibilities to reveal the various risks that we face and how significant, and probable, these threats are. Although the model is still being refined, Baum says that substantial progress has already been made. “The risk is starting to make sense. I’m starting to see exactly what it would take to see this type of catastrophe,” Baum said. Yet, he is quick to clarify that the research is still a bit too young to say much definitively, “Those of us who study superintelligence and all the risks and policy aspects of it, we’re not exactly sure what policy we would want right now. What’s happening right now is more of a general-purpose conversation on AI. It’s one that recognizes the fact that AI is more than just a technological and economic opportunity and that there are risks involved and difficult ethical issues.”

Ultimately, Baum hopes that these conversations, when coupled with the understanding that comes from the models that he is currently developing alongside his team, will allow GCRI to better prepare policy makers and scientists alike for the rise of a new kind of (super)intelligence.

This article is part of a Future of Life series on the AI safety research grants, which were funded by generous donations from Elon Musk and the Open Philanthropy Project.

AI Safety: Measuring and Avoiding Side Effects Using Relative Reachability

This article was originally published on the Deep Safety blog.

A major challenge in AI safety is reliably specifying human preferences to AI systems. An incorrect or incomplete specification of the objective can result in undesirable behavior like specification gaming or causing negative side effects. There are various ways to make the notion of a “side effect” more precise – I think of it as a disruption of the agent’s environment that is unnecessary for achieving its objective. For example, if a robot is carrying boxes and bumps into a vase in its path, breaking the vase is a side effect, because the robot could have easily gone around the vase. On the other hand, a cooking robot that’s making an omelette has to break some eggs, so breaking eggs is not a side effect.

How can we measure side effects in a general way that’s not tailored to particular environments or tasks, and incentivize the agent to avoid them? This is the central question of our recent paper.

Part of the challenge is that it’s easy to introduce bad incentives for the agent when trying to penalize side effects. Previous work on this problem has focused either on preserving reversibility or reducing the agent’s impact on the environment, and both of these approaches introduce different kinds of problematic incentives:

  • Preserving reversibility (i.e. keeping the starting state reachable) encourages the agent to prevent all irreversible events in the environment (e.g. humans eating food). Also, if the objective requires an irreversible action (e.g. breaking eggs for the omelette), then any further irreversible actions will not be penalized, since reversibility has already been lost.
  • Penalizing impact (i.e. some measure of distance from the default outcome) does not take reachability of states into account, and treats reversible and irreversible effects equally (due to the symmetry of the distance measure). For example, the agent would be equally penalized for breaking a vase and for preventing a vase from being broken, though the first action is clearly worse. This leads to “overcompensation” (“offsetting”) behaviors: when rewarded for preventing the vase from being broken, an agent with a low impact penalty rescues the vase, collects the reward, and then breaks the vase anyway (to get back to the default outcome).

Both of these approaches are doing something right: it’s a good idea to take reachability into account, and it’s also a good idea to compare to the default outcome (instead of the initial state). We can put the two together and compare to the default outcome using a reachability-based measure. Then the agent no longer has an incentive to prevent everything irreversible from happening or to overcompensate for preventing an irreversible event.

We still have a problem with the case where the objective requires an irreversible action. Simply penalizing the agent for making the default outcome unreachable would create a “what the hell effect” where the agent has no incentive to avoid any further irreversible actions. To get around this, instead of considering the reachability of the default state, we consider the reachability of all states. For each state, we penalize the agent for making it less reachable than it would be from the default state. In a deterministic environment, the penalty is the number of states that are reachable from the default outcome but no longer reachable from the agent’s current state (the shaded area in the original post’s diagram).

Since each irreversible action cuts off more of the state space (e.g. breaking a vase makes all the states where the vase was intact unreachable), the penalty will increase accordingly. We call this measure “relative reachability”.
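To make the deterministic special case concrete, here is a small sketch (not the paper’s code) that counts the states reachable from the default outcome but no longer reachable from the agent’s current state. The toy transition graph is invented for illustration.

```python
# Sketch: relative reachability penalty in a deterministic environment,
# counted as |reachable(default) - reachable(current)|.

def reachable(start, transitions):
    """All states reachable from `start` in a transition graph (dict of lists)."""
    seen, stack = set(), [start]
    while stack:
        s = stack.pop()
        if s not in seen:
            seen.add(s)
            stack.extend(transitions.get(s, []))
    return seen

def relative_reachability_penalty(current, default, transitions):
    """Number of states reachable from `default` but not from `current`."""
    return len(reachable(default, transitions) - reachable(current, transitions))

# Toy example: breaking the vase cuts off every state where it was intact.
transitions = {
    "vase_intact_A": ["vase_intact_B", "vase_broken"],
    "vase_intact_B": ["vase_intact_A"],
    "vase_broken":   [],
}
print(relative_reachability_penalty("vase_broken", "vase_intact_A", transitions))  # 2
```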

We ran some simple experiments with a tabular Q-learning agent in the AI Safety Gridworlds framework to provide a proof of concept that relative reachability of the default outcome avoids the bad incentives described above.
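As a rough illustration (not the experiment code), such a penalty can be folded into a tabular Q-learning update by subtracting a scaled penalty term, for example the relative reachability count sketched above, from the task reward. The hyperparameters and names below are assumptions.

```python
# Sketch: tabular Q-learning with a side-effects penalty mixed into the reward.
from collections import defaultdict

Q = defaultdict(float)               # Q-values, keyed by (state, action)
ALPHA, GAMMA, BETA = 0.1, 0.99, 0.3  # learning rate, discount, penalty weight

def q_update(state, action, reward, penalty, next_state, actions):
    shaped = reward - BETA * penalty                      # task reward minus penalty
    best_next = max(Q[(next_state, a)] for a in actions)  # greedy bootstrap
    Q[(state, action)] += ALPHA * (shaped + GAMMA * best_next - Q[(state, action)])

# Example step: reaching the goal earns reward 1.0 but cuts off two states.
q_update("box_in_corner", "move_down", reward=1.0, penalty=2.0,
         next_state="goal", actions=["move_down", "move_right"])
```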

In the first gridworld, the agent needs to get to the goal G, but there’s a box in the way, which can only be moved by pushing. The shortest path to the goal pushes the box down into a corner (an irrecoverable position), while a longer path pushes the box to the right (a recoverable position). The safe behavior is to take the longer path. The agent with the relative reachability penalty takes the longer path, while the agent with the reversibility penalty fails. This happens because any path to the goal involves an irreversible effect – once the box has been moved, the agent and the box cannot both return to their starting positions. Thus, the agent receives the maximal penalty for both paths, and has no incentive to follow the safe path.

In the second gridworld, there is an irreversible event that happens by default, when an object reaches the end of the conveyor belt. This environment has two variants:

  1. The object is a vase, and the agent is rewarded for taking it off the belt (the agent’s task is to rescue the vase).
  2. The object is a sushi dish in a conveyor belt sushi restaurant, and the agent receives no reward for taking it off the belt (the agent is not supposed to interfere).

This gridworld was designed specifically to test for bad incentives that could be introduced by penalizing side effects, so an agent with no side effect penalty would behave correctly. We find that the agent with a low impact penalty engages in overcompensation behavior by putting the vase back on the belt after collecting the reward, while the agent with a reversibility preserving penalty takes the sushi dish off the belt despite getting no reward for doing so. The agent with a relative reachability penalty behaves correctly in both variants of the environment.

Of course, the relative reachability definition in its current form is not very tractable in realistic environments: there are too many possible states to be considered, the agent is not aware of all the states when it begins training, and the default outcome can be difficult to define and simulate. We expect that the definition can be approximated by considering the reachability of representative states (similarly to methods for approximating empowerment). To define the default outcome, we would need a more precise notion of the agent “doing nothing” (e.g. “no-op” actions are not always available or meaningful). We leave a more practical implementation of relative reachability to future work.

While relative reachability improves on the existing approaches, it might not incorporate all the considerations we would want to be part of a side effects measure. There are some effects on the agent’s environment that we might care about even if they don’t decrease future options compared to the default outcome. It might be possible to combine relative reachability with such considerations, but there could potentially be a tradeoff between taking these considerations into account and avoiding overcompensation behaviors. We leave these investigations to future work as well.

What Can We Learn From Cape Town’s Water Crisis?

The following article was contributed by Billy Babis.

Earlier this year, Cape Town, a port city in South Africa, prepared for a full depletion of its water resources amidst the driest 3-year span on record. The threat of “Day Zero” — the day when the city would officially cut off running water — has subsided for now, but the long-term threat remains. And though Cape Town’s crisis is local, it exemplifies a problem several regions across the globe may soon have to address.

 

Current situation in Cape Town

In addition to being part of Cape Town’s driest 3-year span on record, 2017 was the city’s driest single year since 1933. Meanwhile, with the population having nearly doubled to 3.74 million over the past 25 years, water consumption has increased even as the supply dwindles.

In January of this year, the city of Cape Town made an emergency announcement that Day Zero would land in mid-April and began enforcing restrictions and regulations. Starting on February 1st, the Cape Town government put emergency water regulations into effect, increasing the cost of water to 5-8 times its previous rate and placing a suggested 50 liter per person per day cap on water use. For context, the average American uses approximately 300-380 liters of water per day. And while Cape Town residents continue to use 80 million liters more than the city’s goal of 450 million liters per day, these regulations and increased costs have made progress. Water consumption decreased enough that Day Zero has now been postponed until 2019, as the rainy season (July-August) is expected to partially replenish the reservoirs.

Cape Town’s water consumption decreased largely due to its increased cost. The municipality also restricted agricultural use of water, which usually makes up just under half of total consumption. These restrictions are worsening Cape Town’s already struggling agricultural sector, which over this 3-year drought has shed 37,000 jobs and lost R14 billion (US$1.17 billion), contributing to inflated food prices that have pushed 50,000 people below the poverty line.

On the innovation side, the city has made major investments in infrastructure to increase water availability: 3 desalination plants, 3 aquifer abstraction facilities, and 1 waste-water recycling project are currently underway to ultimately increase Cape Town’s water availability by almost 300 million liters per day.

Did climate change cause the water crisis?

Severe droughts plagued subtropical regions like Cape Town long before human-caused climate change. Thus, it’s difficult to conclude that climate change directly caused the Cape Town water crisis. However, the Intergovernmental Panel on Climate Change (IPCC) continues to find evidence suggesting that climate change has caused drought in certain regions and will cause longer, more frequent droughts over the next century.

Drought can be meteorological (abnormally limited rainfall), agricultural (abnormally dry soil, excess evaporation), or hydrological (limited stream-water). While these problems are interrelated, each has a different impact on drought in different regions. Meteorological drought is often the most important, and this is certainly the case in Cape Town.

Meteorological drought occurs naturally in Cape Town and the other few regions of the world with a “Mediterranean” climate: Central California, central Chile, northern Africa and southern Europe, southwestern Australia, and the greater Cape Town area have dry summers and variably rainy winters. Due to global weather oscillations like El Nino, the total rainfall in winter varies dramatically. A given winter is usually either very rainy or very dry. But as long as repeated and prolonged periods of drought don’t strike, these regions can prepare for dry seasons by storing water from previous wet seasons.

But climate change threatens to jeopardize this. With “robust evidence and high agreement,” the IPCC concluded that while tropical regions will receive more precipitation this century, subtropical dry regions (like these Mediterranean climates) will receive less. Warm, moist air that rises near the equator sheds its moisture as rain and then descends over these subtropical regions as dry air, creating deserts and droughts. Increasing equatorial heat and rainfall (which global warming promises to bring) will therefore likely lead to drier subtropical conditions and more frequent meteorological drought.

But the IPCC also expects these Mediterranean climates to experience more frequent agricultural drought (IPCC 5), largely due to growing human populations. In addition, renewable surface water and groundwater will decrease, and hydrologic drought will likely occur more frequently, due in large part to the increasing population and resulting consumption.

 

What we can learn from this crisis

Earlier this year, many Capetonians feared a total catastrophe: running out of water. Wealthier citizens might have been able to pay for imported water or outbound flights, but poorer communities would have been left in a much more dire situation. International aid likely would have been necessary to avoid any fatal consequences.

The city seems to have averted that for now, but not without cost. The drought has caused immense strain on the agricultural economy that “will take years to work out of the system,” explains Beatrice Conradie, Professor of Economics and Social Sciences at University of Cape Town. “Primary producers are likely to act more conservatively as a result and this will make them less inclined to invest and create jobs. The unemployed will migrate to cities where they will put additional pressure on already strained infrastructure.”

And while Cape Town’s water infrastructure projects — desalination plants, aquifer abstraction facilities, and waste-water recycling projects — provided some immediate and prospective relief, they will not always be an option for every region. Desalination plants are very expensive and energy intensive (thus, climate change contributors), and pollute the local ocean ecosystem by releasing the brine remnants of desalination back into the water. Conradie raises further concerns about unregulated well-drilling in response to surface water restrictions. Regulated and unregulated over-abstraction from aquifers commonly leads to salt-water intrusion, permanently contaminating that fresh water and killing wetland wildlife. These are the best solutions available, and none of them are sustainable.

“Cape Town is really a wake-up call for other cities around the world,” shares NASA’s senior water scientist, Jay Famiglietti. “We have huge challenges ahead of us if we want to avert future day zeros in other cities around the world.”

Just as Capetonians failed to heed the cries of their government before reaching crisis mode, global citizens are adjusting very slowly to the climate change warnings of scientists and governments around the globe. But Cape Town’s response offers some valuable sociological lessons on sustainability. One is that behavioral changes can swing abruptly on a mass scale. Once a sufficient sense of urgency struck the people of Cape Town in early February, the conservation movement gained critical mass, and community members fed off each other’s hope.

But the role of governance proved indispensable. While Cape Town had long tried to inform the public of the water shortages, residents didn’t adjust their consumption until the government made the emergency announcement on January 17th and began enforcing drastic regulations and fees.

As Cape Town’s sustainability efforts demonstrate, addressing climate change is a social problem as much as a technical problem. Regardless of technological innovations, understanding human behavioral habits will be crucial in propelling necessary changes. As such, sociologists will grow just as important as climate scientists or chemical engineers in leading change.

Cape Town’s main focus over the past few months has been finding the best ways to nudge its citizens, on a mass scale, toward lower water consumption. This entailed research partnerships with the University of Cape Town’s sociology departments and the Environmental Policy Research Unit (EPRU). The principle of reciprocity held true – people are more likely to contribute to the public good if they see others doing it – and enhancing this effect on a global scale will grow increasingly important as we attempt to mitigate and adapt to environmental threats in this century.

With or without a changing climate, though, water scarcity will become an increasingly urgent issue for humanity’s growing population. Population growth continues to expand our ecological footprint and increasingly threatens the ability of future, presumably larger, generations to flourish. Amidst their environmental challenge, the people of Cape Town demonstrated the importance of effective governance and collaboration. As more subtropical regions begin to suffer from drought and water shortages, learning from the failures and successes of Cape Town’s 2018 crisis will help avoid disaster.

Teaching Today’s AI Students To Be Tomorrow’s Ethical Leaders: An Interview With Yan Zhang

Some of the greatest scientists and inventors of the future are sitting in high school classrooms right now, breezing through calculus and eagerly awaiting freshman year at the world’s top universities. They may have already won Math Olympiads or invented clever, new internet applications. We know these students are smart, but are they prepared to responsibly guide the future of technology?

Developing safe and beneficial technology requires more than technical expertise — it requires a well-rounded education and the ability to understand other perspectives. But since math and science students must spend so much time doing technical work, they often lack the skills and experience necessary to understand how their inventions will impact society.

These educational gaps could prove problematic as artificial intelligence assumes a greater role in our lives. AI research is booming among young computer scientists, and these students need to understand the complex ethical, governance, and safety challenges posed by their innovations.

 

SPARC

In 2012, a group of AI researchers and safety advocates – Paul Christiano, Jacob Steinhardt, Andrew Critch, Anna Salamon, and Yan Zhang – created the Summer Program in Applied Rationality and Cognition (SPARC) to address the many issues that face quantitatively strong teenagers, including the issue of educational gaps in AI. As with all technologies, they explain, the more the AI community consists of thoughtful, intelligent, broad-minded reasoners, the more likely AI is to be developed in a safe and beneficial manner.

Each summer, the SPARC founders invite 30-35 mathematically gifted high school students to participate in their two-week program. Zhang, SPARC’s director, explains: “Our goals are to generate a strong community, expose these students to ideas that they’re not going to get in class – blind spots of being a quantitatively strong teenager in today’s world, like empathy and social dynamics. Overall we want to make them more powerful individuals who can bring positive change to the world.”

To help students make a positive impact, SPARC instructors teach core ideas in effective altruism (EA). “We have a lot of conversations about EA, but we don’t push the students to become EA,” Zhang says. “We expose them to good ideas, and I think that’s a healthier way to do mentorship.”

SPARC also exposes students to machine learning, AI safety, and existential risks. In 2016 and 2017, they held over 10 classes on these topics, including: “Machine Learning” and “Tensorflow” taught by Jacob Steinhardt, “Irresponsible Futurism” and “Effective Do-Gooding” taught by Paul Christiano, “Optimization” taught by John Schulman, and “Long-Term Thinking on AI and Automization” taught by Michael Webb.

But SPARC instructors don’t push students down the AI path either. Instead, they encourage students to apply SPARC’s holistic training to make a more positive impact in any field.

 

Thinking on the Margin: The Role of Social Skills

Making the most positive impact requires thinking on the margin, and asking: What one additional unit of knowledge will be most helpful for creating positive impact? For these students, most of whom have won Math and Computing Olympiads, it’s usually not more math.

“A weakness of a lot of mathematically-minded students are things like social skills or having productive arguments with people,” Zhang says. “Because to be impactful you need your quantitative skills, but you need to also be able to relate with people.”

To counter this weakness, he teaches classes on social skills and signaling, and occasionally leads improvisational games. SPARC still teaches a lot of math, but Zhang is more interested in addressing these students’ educational blind spots – the same blind spots that the instructors themselves had as students. “What would have made us more impactful individuals, and also more complete and more human in many ways?” he asks.

Working with non-math students can help, so Zhang and his colleagues have experimented with bringing excellent writers and original thinkers into the program. “We’ve consistently had really good successes with those students, because they bring something that the Math Olympiad kids don’t have,” Zhang says.

SPARC also broadens students’ horizons with guest speakers from academia and organizations such as the Open Philanthropy Project, OpenAI, Dropbox and Quora. In one talk, Dropbox engineer Albert Ni spoke to SPARC students about “common mistakes that math people make when they try to do things later in life.”

In another successful experiment suggested by Ofer Grossman, a SPARC alum who is now a staff member, SPARC made half of all classes optional in 2017. The classes were still packed because students appreciated the culture. The founders also agreed that conversations after class are often more impactful than classes, and therefore engineered one-on-one time and group discussions into the curriculum. Thinking on the margin, they ask: “What are the things that were memorable about school? What are the good parts? Can we do more of those and less of the others?”

Above all, SPARC fosters a culture of openness, curiosity and accountability. Inherent in this project is “cognitive debiasing” – learning about common biases like selection bias and confirmation bias, and correcting for them. “We do a lot of de-biasing in our interactions with each other, very explicitly,” Zhang says. “We also have classes on cognitive biases, but the culture is the more important part.”

 

AI Research and Future Leaders

Designing safe and beneficial technology requires technical expertise, but in SPARC’s view, cultivating a holistic research culture is equally important. Today’s top students may make some of the most consequential AI breakthroughs in the future, and their values, education and temperament will play a critical role in ensuring that advanced AI is deployed safely and for the common good.

“This is also important outside of AI,” Zhang explains. “The official SPARC stance is to make these students future leaders in their communities, whether it’s AI, academia, medicine, or law. These leaders could then talk to each other and become allies instead of having a bunch of splintered, narrow disciplines.”

As SPARC approaches its 7th year, some alumni have already begun to make an impact. A few AI-oriented alumni recently founded AlphaSheets – a collaborative, programmable spreadsheet for finance that is less prone to error – while other students are leading a “hacker house” with people in Silicon Valley. Additionally, SPARC inspired the creation of ESPR, a similar European program explicitly focused on AI risk.

But most impacts will be less tangible. “Different pockets of people interested in different things have been working with SPARC’s resources, and they’re forming a lot of social groups,” Zhang explains. “It’s like a bunch of little sparks and we don’t quite know what they’ll become, but I’m pretty excited about next five years.”

This article is part of a Future of Life series on the AI safety research grants, which were funded by generous donations from Elon Musk and the Open Philanthropy Project.

ICRAC Open Letter Opposes Google’s Involvement With Military

From improving medicine to better search engines to assistants that help ease busy schedules, artificial intelligence is already proving a boon to society. But just as it can be designed to help, it can be designed to harm and even to kill.

Military uses of AI can also run the gamut from programs that could help improve food distribution logistics to weapons that can identify and assassinate targets without input from humans. Because AI programs can have these dual uses, it’s difficult for companies that do not want their technology to cause harm to work with militaries – a company currently has no way to ensure that an AI program it builds to help the military solve a benign problem won’t later be repurposed to take human lives.

So when employees at Google learned earlier this year about the company’s involvement in the Pentagon’s Project Maven, they were upset. Though Google argues that its work on Project Maven only provides the U.S. military with image recognition tools for analyzing drone footage, many suggest that this technology could later be used for harm. In response, over 3,000 employees signed an open letter saying they did not want their work to be used to kill.

And it isn’t just Google’s employees who are concerned.

Earlier this week, the International Committee for Robot Arms Control released an open letter signed by hundreds of academics calling on Google’s leadership to withdraw from the “business of war.” The letter responds to the growing criticism of Google’s participation in the Pentagon’s program, Project Maven.

The letter states, “we write in solidarity with the 3100+ Google employees, joined by other technology workers, who oppose Google’s participation in Project Maven.” It goes on to remind Google leadership to be cognizant of the incredible responsibility the company has for safeguarding the data it’s collected from its users, as well as its famous motto, “Don’t Be Evil.”

Specifically, the letter calls on Google to:

  • “Terminate its Project Maven contract with the DoD.
  • “Commit not to develop military technologies, nor to allow the personal data it has collected to be used for military operations.
  • “Pledge to neither participate in nor support the development, manufacture, trade or use of autonomous weapons; and to support efforts to ban autonomous weapons.”

Lucy Suchman, one of the letter’s authors, explained part of her motivation for her involvement:

“For me the greatest concern is that this effort will lead to further reliance on profiling and guilt by association in the US drone surveillance program, as the only way to generate signal out of the noise of massive data collection. There are already serious questions about the legality of targeted killing, and automating it further will only make it less accountable.”

The letter was released the same week that a small group of Google employees made news for resigning in protest against Project Maven. It also comes barely a month after a successful boycott by academic researchers against KAIST’s autonomous weapons effort.

In addition, last month the United Nations held its most recent meeting to consider a ban on lethal autonomous weapons. Twenty-six countries, including China, have now said they would support some sort of official ban on these weapons.

In response to the number of signatories the open letter has received, Suchman added, “This is clearly an issue that strikes a chord for many researchers who’ve been tracking the incorporation of AI and robotics into military systems.”

If you want to add your name to the letter, you can do so here.

Lethal Autonomous Weapons: An Update from the United Nations

Earlier this month, the United Nations Convention on Conventional Weapons (UN CCW) Group of Governmental Experts met in Geneva to discuss the future of lethal autonomous weapons systems. But before we get to that, here’s a quick recap of everything that’s happened in the last six months.

 

Slaughterbots and Boycotts

Since its release in November 2017, the video Slaughterbots has been seen approximately 60 million times and has been featured in hundreds of news articles around the world. The video coincided with the UN CCW Group of Governmental Experts’ first meeting in Geneva to discuss a ban on lethal autonomous weapons, as well as the release of open letters from AI researchers in Australia, Canada, Belgium, and other countries urging their heads of state to support an international ban on lethal autonomous weapons.

Over the last two months, autonomous weapons regained the international spotlight. In March, after learning that the Korea Advanced Institute of Science and Technology (KAIST) planned to open an AI weapons lab in collaboration with a major arms company, AI researcher Toby Walsh led an academic boycott of the university. Over 50 of the world’s leading AI and robotics researchers from 30 countries joined the boycott, and in less than a week, KAIST agreed to “not conduct any research activities counter to human dignity including autonomous weapons lacking meaningful human control.” The boycott was covered by CNN and The Guardian.

Additionally, over 3,100 Google employees, including dozens of senior engineers, signed a letter in early April protesting the company’s involvement in a Pentagon program called “Project Maven,” which uses AI to analyze drone imagery. Employees worried that this technology could be repurposed to operate drones or launch weapons. Citing Google’s “Don’t Be Evil” motto, the employees asked the company to cancel the project and stay out of the “business of war.”

 

The UN CCW meets again…

In the wake of this growing pressure, 82 countries in the UN CCW met again from April 9-13 to consider a ban on lethal autonomous weapons. Throughout the week, states and civil society representatives discussed “meaningful human control” and whether they should just be concerned about “lethal” autonomous weapons, or all autonomous weapons generally. Here is a brief recap of the meeting’s progress:

  • The group of nations that explicitly endorse the call to ban LAWS expanded to 26 (with China, Austria, Colombia, and Djibouti joining during the CCW meeting).
  • However, five states explicitly rejected moving to negotiate new international law on fully autonomous weapons: France, Israel, Russia, United Kingdom, and United States.
  • Nearly every nation agreed that it is important to retain human control over autonomous weapons, despite disagreements surrounding the definition of “meaningful human control.”
  • Throughout the discussion, states focused on complying with International Humanitarian Law (IHL). Human Rights Watch argued that there already is precedent in international law and disarmament law for banning weapons without human control.
  • Many countries submitted working papers to inform the discussions, including China and the United States.
  • Although states couldn’t reach an agreement during the meeting, momentum is growing towards solidifying a framework for defining lethal autonomous weapons.

You can find written and video recaps from each day of the UN CCW meeting here, written by Reaching Critical Will.

The UN CCW is slated to resume discussions in August 2018; however, given the speed with which autonomous weaponry is advancing, many advocates worry that the talks are moving too slowly.

 

What can you do?

If you work in the tech industry, consider signing the Tech Workers Coalition open letter, which calls on Google, Amazon and Microsoft to stay out of the business of war. And if you’d like to support the fight against LAWS, we recommend donating to the Campaign to Stop Killer Robots. This organization, which is not affiliated with FLI, has done amazing work over the past few years to lead efforts around the world to prevent the development of lethal autonomous weapons. Please consider donating here.

 

Learn more…

If you want to learn more about the technological, political, and social developments of autonomous weapons, check out the Research & Reports page of our Autonomous Weapons website. You can find relevant news stories and updates at @AIweapons on Twitter and autonomousweapons on Facebook.

AI and Robotics Researchers Boycott South Korea Tech Institute Over Development of AI Weapons Technology

UPDATE 4-9-18: The boycott against KAIST has ended. The press release for the ending of the boycott explained:

“More than 50 of the world’s leading artificial intelligence (AI) and robotics researchers from 30 different countries have declared they would end a boycott of the Korea Advanced Institute of Science and Technology (KAIST), South Korea’s top university, over the opening of an AI weapons lab in collaboration with Hanwha Systems, a major arms company.

“At the opening of the new laboratory, the Research Centre for the Convergence of National Defence and Artificial Intelligence, it was reported that KAIST was “joining the global competition to develop autonomous arms” by developing weapons “which would search for and eliminate targets without human control”. Further cause for concern was that KAIST’s industry partner, Hanwha Systems builds cluster munitions, despite an UN ban, as well as a fully autonomous weapon, the SGR-A1 Sentry Robot. In 2008, Norway excluded Hanwha from its $380 billion future fund on ethical grounds.

“KAIST’s President, Professor Sung-Chul Shin, responded to the boycott by affirming in a statement that ‘KAIST does not have any intention to engage in development of lethal autonomous weapons systems and killer robots.’ He went further by committing that ‘KAIST will not conduct any research activities counter to human dignity including autonomous weapons lacking meaningful human control.’

“Given this swift and clear commitment to the responsible use of artificial intelligence in the development of weapons, the 56 AI and robotics researchers who were signatories to the boycott have rescinded the action. They will once again visit and host researchers from KAIST, and collaborate on scientific projects.”

UPDATE 4-5-18: In response to the boycott, KAIST President Sung-Chul Shin released an official statement to the press. In it, he says:

“I would like to reaffirm that KAIST does not have any intention to engage in development of lethal autonomous weapons systems and killer robots. KAIST is significantly aware of ethical concerns in the application of all technologies including artificial intelligence.

“I would like to stress once again that this research center at KAIST, which was opened in collaboration with Hanwha Systems, does not intend to develop any lethal autonomous weapon systems and the research activities do not target individual attacks.”

ORIGINAL ARTICLE 4-4-18:

Leading artificial intelligence researchers from around the world are boycotting South Korea’s KAIST (Korea Advanced Institute of Science and Technology) after the institute announced a partnership with Hanwha Systems to create a center that will help develop technology for AI weapons systems.

The boycott, organized by AI researcher Toby Walsh, was announced just days before the start of the next United Nations Convention on Conventional Weapons (CCW) meeting in which countries will discuss how to address challenges posed by autonomous weapons. 

“At a time when the United Nations is discussing how to contain the threat posed to international security by autonomous weapons, it is regrettable that a prestigious institution like KAIST looks to accelerate the arms race to develop such weapons,” the boycott letter states. 

The letter also explains the concerns AI researchers have regarding autonomous weapons:

“If developed, autonomous weapons will be the third revolution in warfare. They will permit war to be fought faster and at a scale greater than ever before. They have the potential to be weapons of terror. Despots and terrorists could use them against innocent populations, removing any ethical restraints. This Pandora’s box will be hard to close if it is opened.”

The letter has been signed by over 50 of the world’s leading AI and robotics researchers from 30 countries, including professors Yoshua Bengio, Geoffrey Hinton, Stuart Russell, and Wolfram Burgard.

Explaining the boycott, the letter states:

“We therefore publicly declare that we will boycott all collaborations with any part of KAIST until such time as the President of KAIST provides assurances, which we have sought but not received, that the Center will not develop autonomous weapons lacking meaningful human control. We will, for example, not visit KAIST, host visitors from KAIST, or contribute to any research project involving KAIST.”

In February, The Korea Times reported on the opening of the Research Center for the Convergence of National Defense and Artificial Intelligence, which was formed as a partnership between KAIST and Hanwha to “[join] the global competition to develop autonomous arms.” The Korea Times article added that “researchers from the university and Hanwha will carry out various studies into how technologies of the Fourth Industrial Revolution can be utilized on future battlefields.”

In the press release for the boycott, Walsh referenced concerns that he and other AI researchers have had since 2015, when he and FLI released an open letter signed by thousands of researchers calling for a ban on autonomous weapons.

“Back in 2015, we warned of an arms race in autonomous weapons,” said Walsh. “That arms race has begun. We can see prototypes of autonomous weapons under development today by many nations including the US, China, Russia and the UK. We are locked into an arms race that no one wants to happen. KAIST’s actions will only accelerate this arms race.”

Many organizations and people have come together through the Campaign to Stop Killer Robots to advocate for a UN ban on lethal autonomous weapons. In her summary of the last United Nations CCW meeting in November 2017, Ray Acheson of Reaching Critical Will wrote:

“It’s been four years since we first began to discuss the challenges associated with the development of autonomous weapon systems (AWS) at the United Nations. … But the consensus-based nature of the Convention on Certain Conventional Weapons (CCW) in which these talks have been held means that even though the vast majority of states are ready and willing to take some kind of action now, they cannot because a minority opposes it.”

Walsh adds, “I am hopeful that this boycott will add urgency to the discussions at the UN that start on Monday. It sends a clear message that the AI & Robotics community do not support the development of autonomous weapons.”

To learn more about autonomous weapons and efforts to ban them, visit the Campaign to Stop Killer Robots and autonomousweapons.org. The full open letter and signatories are below.

Open Letter:

As researchers and engineers working on artificial intelligence and robotics, we are greatly concerned by the opening of a “Research Center for the Convergence of National Defense and Artificial Intelligence” at KAIST in collaboration with Hanwha Systems, South Korea’s leading arms company. It has been reported that the goals of this Center are to “develop artificial intelligence (AI) technologies to be applied to military weapons, joining the global competition to develop autonomous arms.”

At a time when the United Nations is discussing how to contain the threat posed to international security by autonomous weapons, it is regrettable that a prestigious institution like KAIST looks to accelerate the arms race to develop such weapons. We therefore publicly declare that we will boycott all collaborations with any part of KAIST until such time as the President of KAIST provides assurances, which we have sought but not received, that the Center will not develop autonomous weapons lacking meaningful human control. We will, for example, not visit KAIST, host visitors from KAIST, or contribute to any research project involving KAIST.

If developed, autonomous weapons will be the third revolution in warfare. They will permit war to be fought faster and at a scale greater than ever before. They have the potential to be weapons of terror. Despots and terrorists could use them against innocent populations, removing any ethical restraints. This Pandora’s box will be hard to close if it is opened. As with other technologies banned in the past like blinding lasers, we can simply decide not to develop them. We urge KAIST to follow this path, and work instead on uses of AI to improve and not harm human lives.

 

FULL LIST OF SIGNATORIES TO THE BOYCOTT

Alphabetically by country, then by family name.

  • Prof. Toby Walsh, UNSW Sydney, Australia.
  • Prof. Mary-Anne Williams, University of Technology Sydney, Australia.
  • Prof. Thomas Eiter, TU Wien, Austria.
  • Prof. Paolo Petta, Austrian Research Institute for Artificial Intelligence, Austria.
  • Prof. Maurice Bruynooghe, Katholieke Universiteit Leuven, Belgium.
  • Prof. Marco Dorigo, Université Libre de Bruxelles, Belgium.
  • Prof. Luc De Raedt, Katholieke Universiteit Leuven, Belgium.
  • Prof. Andre C. P. L. F. de Carvalho, University of São Paulo, Brazil.
  • Prof. Yoshua Bengio, University of Montreal, & scientific director of MILA, co-founder of Element AI, Canada.
  • Prof. Geoffrey Hinton, University of Toronto, Canada.
  • Prof. Kevin Leyton-Brown, University of British Columbia, Canada.
  • Prof. Csaba Szepesvari, University of Alberta, Canada.
  • Prof. Zhi-Hua Zhou, Nanjing University, China.
  • Prof. Thomas Bolander, Danmarks Tekniske Universitet, Denmark.
  • Prof. Malik Ghallab, LAAS-CNRS, France.
  • Prof. Marie-Christine Rousset, University of Grenoble Alpes, France.
  • Prof. Wolfram Burgard, University of Freiburg, Germany.
  • Prof. Bernd Neumann, University of Hamburg, Germany.
  • Prof. Bernhard Schölkopf, Director, Max Planck Institute for Intelligent Systems, Germany.
  • Prof. Manolis Koubarakis, National and Kapodistrian University of Athens, Greece.
  • Prof. Grigorios Tsoumakas, Aristotle University of Thessaloniki, Greece.
  • Prof. Benjamin W. Wah, Provost, The Chinese University of Hong Kong, Hong Kong.
  • Prof. Dit-Yan Yeung, Hong Kong University of Science and Technology, Hong Kong.
  • Prof. Kristinn R. Thórisson, Managing Director, Icelandic Institute for Intelligent Machines, Iceland.
  • Prof. Barry Smyth, University College Dublin, Ireland.
  • Prof. Diego Calvanese, Free University of Bozen-Bolzano, Italy.
  • Prof. Nicola Guarino, Italian National Research Council (CNR), Trento, Italy.
  • Prof. Bruno Siciliano, University of Naples, Italy.
  • Prof. Paolo Traverso, Director of FBK, IRST, Italy.
  • Prof. Yoshihiko Nakamura, University of Tokyo, Japan.
  • Prof. Imad H. Elhajj, American University of Beirut, Lebanon.
  • Prof. Christoph Benzmüller, Université du Luxembourg, Luxembourg.
  • Prof. Miguel Gonzalez-Mendoza, Tecnológico de Monterrey, Mexico.
  • Prof. Raúl Monroy, Tecnológico de Monterrey, Mexico.
  • Prof. Krzysztof R. Apt, Center for Mathematics and Computer Science (CWI), Amsterdam, the Netherlands.
  • Prof. Antal van den Bosch, Radboud University, the Netherlands.
  • Prof. Bernhard Pfahringer, University of Waikato, New Zealand.
  • Prof. Helge Langseth, Norwegian University of Science and Technology, Norway.
  • Prof. Zygmunt Vetulani, Adam Mickiewicz University in Poznań, Poland.
  • Prof. José Alferes, Universidade Nova de Lisboa, Portugal.
  • Prof. Luis Moniz Pereira, Universidade Nova de Lisboa, Portugal.
  • Prof. Ivan Bratko, University of Ljubljana, Slovenia.
  • Prof. Matjaz Gams, Jozef Stefan Institute and National Council for Science, Slovenia.
  • Prof. Hector Geffner, Universitat Pompeu Fabra, Spain.
  • Prof. Ramon Lopez de Mantaras, Director, Artificial Intelligence Research Institute, Spain.
  • Prof. Alessandro Saffiotti, Orebro University, Sweden.
  • Prof. Boi Faltings, EPFL, Switzerland.
  • Prof. Jürgen Schmidhuber, Scientific Director, Swiss AI Lab, Università della Svizzera italiana, Switzerland.
  • Prof. Chao-Lin Liu, National Chengchi University, Taiwan.
  • Prof. J. Mark Bishop, Goldsmiths, University of London, UK.
  • Prof. Zoubin Ghahramani, University of Cambridge, UK.
  • Prof. Noel Sharkey, University of Sheffield, UK.
  • Prof. Lucy Suchman, Lancaster University, UK.
  • Prof. Marie desJardins, University of Maryland, USA.
  • Prof. Benjamin Kuipers, University of Michigan, USA.
  • Prof. Stuart Russell, University of California, Berkeley, USA.
  • Prof. Bart Selman, Cornell University, USA.

 

2018 Spring Conference: Invest in Minds Not Missiles

On Saturday April 7th and Sunday morning April 8th, MIT and Massachusetts Peace Action will co-host a conference and workshop at MIT on understanding and reducing the risk of nuclear war. Tickets are free for students. To attend, please register here.

 

Sunday Morning Planning Breakfast

Student-led session to design and implement programs enhancing existing campus groups, and organizing new ones; extending the network to campuses in Rhode Island, Connecticut, New Jersey, New Hampshire, Vermont and Maine.

For more information, contact Jonathan King at <jaking@mit.edu>, or call 617-354-2169.

How AI Handles Uncertainty: An Interview With Brian Ziebart

When training image detectors, AI researchers can’t replicate the real world. They teach systems what to expect by feeding them training data, such as photographs, computer-generated images, real video and simulated video, but these practice environments can never capture the messiness of the physical world.

In machine learning (ML), image detectors learn to spot objects by drawing bounding boxes around them and giving them labels. And while this training process succeeds in simple environments, it gets complicated quickly.

[Image: two example photos – a person in full view on the left, and a man seated in a car, only partially visible, on the right.]

It’s easy to define the person on the left, but how would you draw a bounding box around the person on the right? Would you only include the visible parts of his body, or also his hidden torso and legs? These differences may seem trivial, but they point to a fundamental problem in object recognition: there rarely is a single best way to define an object.

As this second image demonstrates, the real world is rarely clear-cut, and the “right” answer is usually ambiguous. Yet when ML systems use training data to develop their understanding of the world, they often fail to reflect this. Rather than recognizing uncertainty and ambiguity, these systems confidently treat new situations no differently than their training data, which can put both the systems and humans at risk.

Brian Ziebart, a Professor of Computer Science at the University of Illinois at Chicago, is conducting research to improve AI systems’ ability to operate amidst the inherent uncertainty around them. The physical world is messy and unpredictable, and if we are to trust our AI systems, they must be able to safely handle it.

 

Overconfidence in ML Systems

ML systems will inevitably confront real-world scenarios that their training data never prepared them for. But, as Ziebart explains, current statistical models “tend to assume that the data that they’ll see in the future will look a lot like the data they’ve seen in the past.”

As a result, these systems are overly confident that they know what to do when they encounter new data points, even when those data points look nothing like what they’ve seen. ML systems falsely assume that their training prepared them for everything, and the resulting overconfidence can lead to dangerous consequences.

Consider image detection for a self-driving car. A car might train its image detection on data from the dashboard of another car, tracking the visual field and drawing bounding boxes around certain objects, as in the image below:

[Image: Bounding boxes on a highway – CloudFactory Blog]

For clear views like this, image detectors excel. But the real world isn’t always this simple. If researchers train an image detector on clean, well-lit images in the lab, it might accurately recognize objects 80% of the time during the day. But when forced to navigate roads on a rainy night, it might drop to 40%.

“If you collect all of your data during the day and then try to deploy the system at night, then however it was trained to do image detection during the day just isn’t going to work well when you generalize into those new settings,” Ziebart explains.

Moreover, the ML system might not recognize the problem: since the system assumes that its training covered everything, it will remain confident about its decisions and continue “to make strong predictions that are just inaccurate,” Ziebart adds.

In contrast, humans tend to recognize when previous experience doesn’t generalize to new settings. If a driver spots an unknown object ahead in the road, she wouldn’t just plow through it. Instead, she might slow down, pay attention to how other cars respond to the object, and consider swerving if she can do so safely. When we feel uncertain about our environment, we exercise caution to avoid making dangerous mistakes.

Ziebart would like AI systems to incorporate similar levels of caution in uncertain situations. Instead of confidently making mistakes, a system should recognize its uncertainty and ask questions to glean more information, much like an uncertain human would.
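
One simple way to build in that caution is a “reject option”: act on the top prediction only when the model’s confidence clears a threshold, and otherwise fall back to a safe default or gather more information. The sketch below is a hedged illustration of the idea, not Ziebart’s method; the 0.9 threshold and the fallback behavior are arbitrary choices, and the approach only helps insofar as the model’s confidence estimates are themselves trustworthy.

```python
import numpy as np

def predict_or_abstain(probabilities, threshold=0.9):
    """Return the predicted class index, or None to signal 'too uncertain to act'."""
    best = int(np.argmax(probabilities))
    if probabilities[best] < threshold:
        return None  # fall back: slow down, gather more data, or ask a human
    return best

# A confident prediction is acted on; an ambiguous one triggers the fallback.
print(predict_or_abstain(np.array([0.97, 0.02, 0.01])))  # -> 0
print(predict_or_abstain(np.array([0.40, 0.35, 0.25])))  # -> None
```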

 

An Adversarial Approach

Training and practice may never prepare AI systems for every possible situation, but researchers can make training methods more robust. Ziebart posits that feeding systems messier data in the lab can train them to better recognize and address uncertainty.

Conveniently, humans can provide this messy, real-world data. By hiring a group of human annotators to look at images and draw bounding boxes around certain objects – cars, people, dogs, trees, etc. – researchers can “build into the classifier some idea of what ‘normal’ data looks like,” Ziebart explains.

“If you ask ten different people to provide these bounding boxes, you’re likely to get back ten different bounding boxes,” he says. “There’s just a lot of inherent ambiguity in how people think about the ground truth for these things.”

Returning to the image above of the man in the car, human annotators might give ten different bounding boxes that capture different portions of the visible and hidden person. By feeding ML systems this confusing and contradictory data, Ziebart prepares them to expect ambiguity.
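
This kind of disagreement can be quantified with intersection-over-union (IoU), the standard measure of how much two bounding boxes overlap (the same measure behind the “70% overlap” criterion mentioned below). In the sketch that follows, the two boxes are hypothetical stand-ins for two annotators’ treatments of the partially hidden man: one covers only what is visible, the other guesses at the occluded torso and legs.

```python
# Intersection-over-union (IoU) of two axis-aligned boxes (x_min, y_min, x_max, y_max).
def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))   # width of the overlap, if any
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))   # height of the overlap, if any
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    return inter / (area_a + area_b - inter)

# Hypothetical annotations of the same partially hidden person.
visible_only = (120, 80, 180, 140)    # annotator 1: just the visible head and shoulders
with_occluded = (110, 80, 190, 260)   # annotator 2: includes the hidden torso and legs
print(iou(visible_only, with_occluded))  # 0.25 -- far from agreement, yet neither is "wrong"
```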

“We’re synthesizing more noise into the data set in our training procedure,” Ziebart explains. This noise reflects the messiness of the real world, and trains systems to be cautious when making predictions in new environments. Cautious and uncertain, AI systems will seek additional information and learn to navigate the confusing situations they encounter.
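
Ziebart’s actual method is an adversarial training formulation rather than simple random jitter, but a rough way to picture “synthesizing more noise into the data set” is to perturb each ground-truth box so the model sees the kind of spread that ten annotators might produce. The coordinates and noise scale in this sketch are illustrative only.

```python
# A rough stand-in for training on ambiguous labels: jitter each ground-truth box
# to mimic the spread of boxes that different annotators might draw.
import numpy as np

rng = np.random.default_rng(42)

def jitter_box(box, scale=5.0, copies=10):
    """Return `copies` noisy variants of one (x_min, y_min, x_max, y_max) box."""
    noise = rng.normal(0.0, scale, size=(copies, 4))
    return np.asarray(box, dtype=float) + noise

original = (120, 80, 180, 260)
for noisy in jitter_box(original, copies=3):
    print(np.round(noisy, 1))
# Training against these varied boxes, rather than a single "true" box, pushes the
# model to hedge its predictions where the ground truth is genuinely ambiguous.
```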

Of course, self-driving cars shouldn’t have to ask questions. If a car’s image detection spots a foreign object up ahead, for instance, it won’t have time to ask humans for help. But if it’s trained to recognize uncertainty and act cautiously, it might slow down, detect what other cars are doing, and safely navigate around the object.

 

Building Blocks for Future Machines

Ziebart’s research remains in training settings thus far. He feeds systems messy, varied data and trains them to provide bounding boxes that have at least 70% overlap with people’s bounding boxes. And his process has already produced impressive results. On an ImageNet object detection task investigated in collaboration with Sima Behpour (University of Illinois at Chicago) and Kris Kitani (Carnegie Mellon University), for example, Ziebart’s adversarial approach “improves performance by over 16% compared to the best performing data augmentation method.” Trained to operate amidst uncertain environments, these systems more effectively manage new data points that training didn’t explicitly prepare them for.

But while Ziebart trains relatively narrow AI systems, he believes that this research can scale up to more advanced systems like autonomous cars and public transit systems.

“I view this as kind of a fundamental issue in how we design these predictors,” he says. “We’ve been trying to construct better building blocks on which to make machine learning – better first principles for machine learning that’ll be more robust.”

This article is part of a Future of Life series on the AI safety research grants, which were funded by generous donations from Elon Musk and the Open Philanthropy Project.

Stephen Hawking in Memoriam

As we mourn the loss of Stephen Hawking, we should remember that his legacy goes far beyond science. Yes, of course he was one of the greatest scientists of the past century, discovering that black holes evaporate and helping found the modern quest for quantum gravity. But he also had a remarkable legacy as a social activist, who looked far beyond the next election cycle and used his powerful voice to bring out the best in us all. As a founding member of FLI’s Scientific Advisory board, he tirelessly helped us highlight the importance of long-term thinking and ensuring that we use technology to help humanity flourish rather than flounder. I marveled at how he could sometimes answer my emails faster than my grad students. His activism revealed the same visionary fearlessness as his scientific and personal life: he saw further ahead than most of those around him and wasn’t afraid of controversially sounding the alarm about humanity’s sloppy handling of powerful technology, from nuclear weapons to AI.

On a personal note, I’m saddened to have lost not only a long-time collaborator but, above all, a great inspiration, always reminding me of how seemingly insurmountable challenges can be overcome with creativity, willpower and positive attitude. Thanks Stephen for inspiring us all!

Can Global Warming Stay Below 1.5 Degrees? Views Differ Among Climate Scientists

The Paris Climate Agreement seeks to keep global warming well below 2 degrees Celsius relative to pre-industrial temperatures. In the best case scenario, warming would go no further than 1.5 degrees.

Many scientists see this as an impossible goal. A recent study by Peter Cox et al. estimates that, given a twofold increase in atmospheric carbon dioxide, there is only a 3% chance of keeping warming below 1.5 degrees.

But a study by Richard Millar et al. provides more reason for hope. The Millar report concludes that the 1.5 degree limit is still physically feasible, if only narrowly. It also provides an updated “carbon budget”—a projection of how much more carbon dioxide we can emit without breaking the 1.5 degree limit.

Dr. Joeri Rogelj, a climate scientist and research scholar with the Energy Program of the International Institute for Applied Systems Analysis, co-authored the Millar report. For Rogelj, the updated carbon budget is not the paper’s most important point. “Our paper shows to decision makers the importance of anticipating new and updated scientific knowledge,” he says.

Projected “carbon budgets” are rough estimates based on limited observations. These projections need to be continually updated as more data becomes available. Fortunately, the Paris Agreement calls for countries to periodically update their emission reduction pledges based on new estimates. Rogelj is hopeful “that this paper has put the necessity for a strong [updating] process on the radar of delegates.”

For scientists who have dismissed the 1.5 degree limit as impossible, the updating process might seem pointless. But Rogelj stresses that his team looked only at geophysical limitations, not political ones. Their report assumes that countries will agree to a zero emissions commitment—a much more ambitious scenario than other researchers have considered.

There is a misconception, Rogelj says, that the report claims to have found an inaccuracy in the Earth system models (ESMs) that are used to estimate human-driven warming. “We are using precisely those models to estimate the carbon budget from today onward,” Rogelj explains.

The problem is not the models, but rather the data fed into them. These simulations are often run using inexact projections of CO2 emissions. Over time, small discrepancies accumulate and are reflected in the warming predictions that the models make.

Given information about current CO2 emissions, however, ESMs make temperature predictions that are “quite accurate.” And when they are provided with an ambitious future scenario for emissions reduction, the models indicate that it is possible for global temperature increases to remain below 1.5 degrees.

So what would such a scenario look like? First off, emissions have to fall to zero. At the same time, the carbon budget needs to be continually reevaluated, and strategy changes must be based on the updated budget. For example, if emissions fall to zero but we’ve surpassed our carbon budget, then we’ll need to focus on making our emissions negative—in other words, on carbon dioxide removal.

Rogelj names two major processes for carbon dioxide removal: reforestation and bio-energy with carbon capture and storage. Some negative emissions processes, such as reforestation, provide benefits beyond carbon capture, while others may have undesired side effects.

But Rogelj is quick to add that these negative emissions technologies are not “silver bullets.” It’s too soon to know if carbon dioxide removal at a global scale will actually be necessary—we’ll have to get to zero emissions before we can tell. But such technologies could also help us reach zero in the first place.

What else will get us to zero emissions? According to Rogelj, we need “a strong emphasis on energy efficiency, combined with an electrification of end-use sectors like transport and building and a shift away from fossil fuels.” This will require a major shift in investment patterns. We want to avoid “locking into carbon dioxide-intensive infrastructure” that would saddle future generations with a dependency on non-renewable energy, he explains.

Rogelj stresses that his team’s findings are based only on geophysical data. Societal factors are a different matter: It is up to individual countries to decide where reducing emissions falls on their list of priorities.

However, the stipulation in the Paris Climate Agreement that countries periodically update their pledges is a source of optimism. Rogelj, for his part, is cautiously hopeful: “Looking at real world dynamics in terms of costs of renewables and energy storage, I personally think there is room for pledges to be strengthened over the coming five to ten years as countries better understand what is possible and how these pledges can align with other priorities.”

But not everyone in the scientific community shares the hopeful tone struck by Rogelj and his team. An article by the MIT Technology Review outlines “the five most worrisome climate developments” from 2017.

To start, global emissions are on the rise, up 2% from 2016. While the prior few years had seen a relative flattening in emissions, this more recent data shattered hopes that the trend would continue. On top of that, scientists are finding that observed climate trends line up best with “worst-case scenario” models of global warming—that is, global temperatures could rise by five degrees Celsius over the next century.

And the Arctic is melting much faster than scientists predicted. A recent report by the U.S. National Oceanic and Atmospheric Administration (NOAA) declared “that the North Pole had reached a ‘new normal,’ with no sign of returning to a ‘reliably frozen region.’”

Melting glaciers and sea ice trigger a whole new set of problems. The disappearing ice will cause sea levels to rise, and the “reflective white snow and ice [will] turn into heat-absorbing dark-blue water…[meaning] the Arctic will send less heat back into space, which leads to more warming, more melting, and more sea-level rise still.”

And finally, natural disasters are becoming increasingly ferocious as weather patterns shift. The United States saw this first-hand, with massive wildfires on the west coast—including the largest in California’s history—and a string of hurricanes that ravaged the Virgin Islands, Puerto Rico, and many southern states.

These consequences of global warming are beginning to affect areas of social interest beyond the environment. The 2017 Atlantic hurricane season, for example, was a massive economic burden, racking up more than $200 billion in damages.

In Rogelj’s words, “Right now we really need to find ways to achieve multiple societal objectives, to find policies and measures and options that allow us to achieve those together.” As governments come to see how climate protection “can align with other priorities like reducing air pollution, and providing clean water and reliable energy,” we have reason to hope that it may become a higher and higher priority.

How to Prepare for the Malicious Use of AI

How can we forecast, prevent, and (when necessary) mitigate the harmful effects of malicious uses of AI?

This is the question posed by a 100-page report released last week, written by 26 authors from 14 institutions. The report, which is the result of a two-day workshop in Oxford, UK followed by months of research, provides a sweeping landscape of the security implications of artificial intelligence.

The authors, who include representatives from the Future of Humanity Institute, the Center for the Study of Existential Risk, OpenAI, and the Center for a New American Security, argue that AI is not only changing the nature and scope of existing threats, but also expanding the range of threats we will face. They are excited about many beneficial applications of AI, including the ways in which it will assist defensive capabilities. But the purpose of the report is to survey the landscape of security threats from intentionally malicious uses of AI.

“Our report focuses on ways in which people could do deliberate harm with AI,” said Seán Ó hÉigeartaigh, Executive Director of the Cambridge Centre for the Study of Existential Risk. “AI may pose new threats, or change the nature of existing threats, across cyber, physical, and political security.”

Importantly, this is not a report about a far-off future. The only technologies considered are those that are already available or that are likely to be within the next five years. The message therefore is one of urgency. We need to acknowledge the risks and take steps to manage them because the technology is advancing exponentially. As reporter Dave Gershgorn put it, “Every AI advance by the good guys is an advance for the bad guys, too.”

AI systems tend to be more efficient and more scalable than traditional tools. Additionally, the use of AI can increase the anonymity and psychological distance a person feels from the actions they carry out, potentially lowering the barrier to committing crimes and acts of violence. Moreover, AI systems have their own unique vulnerabilities, including risks from data poisoning, adversarial examples, and the exploitation of flaws in their design. AI-enabled attacks will outpace traditional cyberattacks because they will generally be more effective, more finely targeted, and more difficult to attribute.
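
Adversarial examples are worth pausing on, because they show how different these vulnerabilities are from ordinary software bugs. The sketch below (synthetic data, not taken from the report) applies the well-known fast-gradient-sign idea to a plain logistic-regression classifier: when a model aggregates many weakly informative features, a worst-case perturbation that is tiny in every individual feature can still flip its prediction.

```python
# An illustration of an adversarial example against a linear classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d = 300                                         # many weakly informative features
X = np.vstack([rng.normal(-0.1, 1.0, size=(500, d)),
               rng.normal(+0.1, 1.0, size=(500, d))])
y = np.array([0] * 500 + [1] * 500)
clf = LogisticRegression(max_iter=1000).fit(X, y)

x = X[750]                                      # one class-1 training example
w, b = clf.coef_[0], clf.intercept_[0]
logit = w @ x + b

# For a linear model, the smallest worst-case (L-infinity) perturbation that crosses
# the decision boundary has size |logit| / ||w||_1, applied in the gradient-sign direction.
eps = abs(logit) / np.abs(w).sum() * 1.01       # nudge just past the boundary
x_adv = x - eps * np.sign(w) * np.sign(logit)

print("prediction before vs. after:", clf.predict([x])[0], clf.predict([x_adv])[0])
print("per-feature perturbation:", float(eps), "(feature noise has std 1.0)")
```

The same arithmetic is what makes image classifiers fragile: each pixel changes imperceptibly, but the changes accumulate in exactly the direction the model is most sensitive to.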

The kinds of attacks we need to prepare for are not limited to sophisticated computer hacks. The authors suggest there are three primary security domains: digital security, which largely concerns cyberattacks; physical security, which refers to carrying out attacks with drones and other physical systems; and political security, which includes examples such as surveillance, persuasion via targeted propaganda, and deception via manipulated videos. These domains have significant overlap, but the framework can be useful for identifying different types of attacks, the rationale behind them, and the range of options available to protect ourselves.

What can be done to prepare for malicious uses of AI across these domains? The authors provide many good examples. The scenarios described in the report can be a good way for researchers and policymakers to explore possible futures and brainstorm ways to manage the most critical threats. For example, imagining a commercial cleaning robot repurposed as an untraceable explosive device may scare us, but it also suggests why policies like robot registration requirements may be a useful option.

Each domain also has its own possible points of control and countermeasures. For example, to improve digital security, companies can promote consumer awareness and incentivize white hat hackers to find vulnerabilities in code. We may also be able to learn from the cybersecurity community and employ measures such as red teaming for AI development, formal verification in AI systems, and responsible disclosure of AI vulnerabilities. To improve physical security, policymakers may want to regulate hardware development and prohibit sales of lethal autonomous weapons. Meanwhile, media platforms may be able to minimize threats to political security by offering image and video authenticity certification, fake news detection, and encryption.

The report additionally provides four high level recommendations, which are not intended to provide specific technical or policy proposals, but rather to draw attention to areas that deserve further investigation. The recommendations are the following:

Recommendation #1: Policymakers should collaborate closely with technical researchers to investigate, prevent, and mitigate potential malicious uses of AI.

Recommendation #2: Researchers and engineers in artificial intelligence should take the dual-use nature of their work seriously, allowing misuse-related considerations to influence research priorities and norms, and proactively reaching out to relevant actors when harmful applications are foreseeable.

Recommendation #3: Best practices should be identified in research areas with more mature methods for addressing dual-use concerns, such as computer security, and imported where applicable to the case of AI.

Recommendation #4: Actively seek to expand the range of stakeholders and domain experts involved in discussions of these challenges.

Finally, the report identifies several areas for further research. The first of these is to learn from and with the cybersecurity community because the impacts of cybersecurity incidents will grow as AI-based systems become more widespread and capable. Other areas of research include exploring different openness models, promoting a culture of responsibility among AI researchers, and developing technological and policy solutions.

As the authors state, “The malicious use of AI will impact how we construct and manage our digital infrastructure as well as how we design and distribute AI systems, and will likely require policy and other institutional responses.”

Although this is only the beginning of the understanding needed on how AI will impact global security, this report moves the discussion forward. It not only describes numerous emergent security concerns related to AI, but also suggests ways we can begin to prepare for those threats today.