Introductory Resources on AI Safety Research

Reading list to get up to speed on the main ideas in the field. The resources are selected for relevance and/or brevity, and the list is not meant to be comprehensive. [Updated on 15 August 2017.]


For a popular audience:

Cade Metz, 2017. New York Times: Teaching A.I. Systems to Behave Themselves

FLI. AI risk background and FAQ. At the bottom of the background page, there is a more extensive list of resources on AI safety.

Tim Urban, 2015. Wait But Why: The AI Revolution. An accessible introduction to AI risk forecasts and arguments (with cute hand-drawn diagrams, and a few corrections from Luke Muehlhauser).

OpenPhil, 2015. Potential risks from advanced artificial intelligence. An overview of AI risks and timelines, possible interventions, and current actors in this space.

For a more technical audience:

Stuart Russell:

  • The long-term future of AI (longer version), 2015. A video of Russell’s classic talk, discussing why it makes sense for AI researchers to think about AI safety, and going over various misconceptions about the issues.
  • Concerns of an AI pioneer, 2015. An interview with Russell on the importance of provably aligning AI with human values, and the challenges of value alignment research.
  • On Myths and Moonshine, 2014. Russell’s response to the “Myth of AI” question on Edge.org, which draws an analogy between AI research and nuclear research and points out some dangers of optimizing a misspecified utility function.

Scott Alexander, 2015. No time like the present for AI safety work. An overview of long-term AI safety challenges, e.g. preventing wireheading and formalizing ethics.

Victoria Krakovna, 2015. AI risk without an intelligence explosion. An overview of long-term AI risks besides the (overemphasized) intelligence explosion / hard takeoff scenario, arguing why intelligence explosion skeptics should still think about AI safety.

Stuart Armstrong, 2014. Smarter Than Us: The Rise Of Machine Intelligence. A short ebook discussing potential promises and challenges presented by advanced AI, and the interdisciplinary problems that need to be solved on the way there.

Technical overviews

Soares and Fallenstein, 2017. Aligning Superintelligence with Human Interests: A Technical Research Agenda

Amodei, Olah, et al, 2016. Concrete Problems in AI Safety. Research agenda focusing on accident risks that apply to current ML systems as well as more advanced future AI systems.

Jessica Taylor et al, 2016. Alignment for Advanced Machine Learning Systems

FLI, 2015. A survey of research priorities for robust and beneficial AI

Jacob Steinhardt, 2015. Long-Term and Short-Term Challenges to Ensuring the Safety of AI Systems. A taxonomy of AI safety issues that require ordinary vs extraordinary engineering to address.

Nate Soares, 2015. Safety engineering, target selection, and alignment theory. Identifies and motivates three major areas of AI safety research.

Nick Bostrom, 2014. Superintelligence: Paths, Dangers, Strategies. A seminal book outlining long-term AI risk considerations.

Steve Omohundro, 2007. The basic AI drives. A classic paper arguing that sufficiently advanced AI systems are likely to develop drives such as self-preservation and resource acquisition independently of their assigned objectives.

Technical work

Value learning:

Smitha Milli et al. Should robots be obedient? Obedience to humans may sound like a great thing, but blind obedience can get in the way of learning human preferences.

William Saunders et al, 2017. Trial without Error: Towards Safe Reinforcement Learning via Human Intervention. (blog post)

Amin, Jiang, and Singh, 2017. Repeated Inverse Reinforcement Learning. Separates the reward function into a task-specific component and an intrinsic component. In a sequence of tasks, the agent learns the intrinsic component while trying to avoid surprising the human.

Dylan Hadfield-Menell et al, 2016. Cooperative inverse reinforcement learning. Defines value learning as a cooperative game where the human tries to teach the agent about their reward function, rather than giving optimal demonstrations like in standard IRL.

Owain Evans et al, 2016. Learning the Preferences of Ignorant, Inconsistent Agents.

Reward gaming / wireheading:

Tom Everitt et al, 2017. Reinforcement learning with a corrupted reward channel. A formalization of the reward misspecification problem in terms of true and corrupt reward, a proof that RL agents cannot overcome reward corruption, and a framework for giving the agent extra information to overcome reward corruption. (blog post)

Amodei and Clark, 2016. Faulty Reward Functions in the Wild. An example of reward function gaming in a boat racing game, where the agent gets a higher score by going in circles and hitting the same targets than by actually playing the game.

Everitt and Hutter, 2016. Avoiding Wireheading with Value Reinforcement Learning. An alternative to RL that reduces the incentive to wirehead.

Laurent Orseau, 2015. Wireheading. An investigation into how different types of artificial agents respond to opportunities to wirehead (unintended shortcuts to maximize their objective function).

Interruptibility / corrigibility:

Dylan Hadfield-Menell et al. The Off-Switch Game. This paper studies the interruptibility problem as a game between human and robot, and investigates which incentives the robot could have to allow itself to be switched off.

El Mahdi El Mhamdi et al, 2017. Dynamic Safe Interruptibility for Decentralized Multi-Agent Reinforcement Learning.

Orseau and Armstrong, 2016. Safely interruptible agents. Provides a formal definition of safe interruptibility and shows that off-policy RL agents are more interruptible than on-policy agents. (blog post)

Nate Soares et al, 2015. Corrigibility. Designing AI systems without incentives to resist corrective modifications by their creators.

Scalable oversight:

Christiano, Leike et al, 2017. Deep reinforcement learning from human preferences. Communicating complex goals to AI systems using human feedback (comparing pairs of agent trajectory segments).
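As a rough illustration of the comparison-based idea (a toy sketch, not the paper's implementation; the reward function and segments below are invented), a Bradley-Terry model turns summed predicted rewards into a preference probability:

```python
import math

def preference_prob(r_hat, seg_a, seg_b):
    # P(seg_a preferred over seg_b) under a Bradley-Terry model
    # on the summed predicted reward of each trajectory segment.
    ra = sum(r_hat(s, a) for s, a in seg_a)
    rb = sum(r_hat(s, a) for s, a in seg_b)
    return math.exp(ra) / (math.exp(ra) + math.exp(rb))

# Hypothetical learned reward that simply favors larger state values.
r_hat = lambda s, a: float(s)
seg_good = [(2, 0), (3, 1)]  # summed reward 5
seg_bad = [(0, 0), (1, 1)]   # summed reward 1
p = preference_prob(r_hat, seg_good, seg_bad)  # close to 1
```

Training then fits the reward model by minimizing cross-entropy between these probabilities and the human's pairwise labels.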

David Abel et al. Agent-Agnostic Human-in-the-Loop Reinforcement Learning.


Armstrong and Levinstein, 2017. Low Impact Artificial Intelligences. An intractable but enlightening definition of low impact for AI systems.

Babcock, Kramar and Yampolskiy, 2017. Guidelines for Artificial Intelligence Containment.

Scott Garrabrant et al, 2016. Logical Induction. A computable algorithm for the logical induction problem.

Note: I did not include literature on less neglected areas of the field like safe exploration, distributional shift, adversarial examples, or interpretability (see e.g. Concrete Problems or the CHAI bibliography for extensive references on these topics).

Collections of technical works

CHAI bibliography

MIRI publications

FHI publications

FLI grantee publications (scroll down)

Paul Christiano. AI control. A blog on designing safe, efficient AI systems (approval-directed agents, aligned reinforcement learning agents, etc).

If there are any resources missing from this list that you think are a must-read, please let me know! If you want to go into AI safety research, check out these guidelines and the AI Safety Syllabus.

(Thanks to Ben Sancetta, Taymon Beal and Janos Kramar for their feedback on this post.)

This article was originally posted on Victoria Krakovna’s blog.

X-risk News of the Week: Ocean Warming and Nuclear Protests

X-risk = Existential Risk. The risk that we could accidentally (hopefully accidentally) wipe out all of humanity.
X-hope = Existential Hope. The hope that we will all flourish and live happily ever after.

Ocean warming and nuclear weapons are this week’s big x-risk news. It’s pretty clear why nuclear weapons could pose an existential risk. But ocean warming?

This is where cause and effect becomes the big issue.

Oceans are like a carbon dioxide sponge. In fact, the oceans have absorbed about 30% of all the carbon dioxide emitted since the start of the industrial revolution. That’s hundreds of billions of tons of carbon dioxide, and it’s taking a heavy toll on the health of the oceans.

As the news this week pointed out, acidification of the Great Barrier Reef (GBR) — something marine scientists have been warning the public about for decades — is occurring even faster than scientists previously thought. Thousands of species of marine life depend on the GBR and could be at risk. From a human standpoint, the GBR generates over $5.5 billion in revenue each year, employing nearly 70,000 people.

On a larger scale, more news came out this week about just how fast the ocean is changing as a result of climate change. In one article, Science News pointed out that in the last century, sea levels have risen faster than at any other time since the founding of Rome, 2800 years ago.

Meanwhile, the Guardian reported on a new study which found that even if we stem rising global temperatures, ocean levels will continue to rise at a rapid rate because the ice sheets will keep melting even at current temperatures. Even if we do everything perfectly and don’t let global temperatures rise more than two degrees, ocean levels are still expected to rise 30 feet. According to the Guardian:

“20% of the world’s population will eventually have to migrate away from coasts swamped by rising oceans. Cities including New York, London, Rio de Janeiro, Cairo, Calcutta, Jakarta and Shanghai would all be submerged.”

In the United States alone, work associated with the oceans contributes billions of dollars to the U.S. GDP, while providing employment to millions of people. Worldwide, the oceans provide employment for an estimated 10-12% of the global population.

Now, imagine the global impact of a mass migration from some of the largest cities around the world, while at the same time, huge percentages of the global population lose their means of food and income. This is a problem that could easily escalate.

As these types of issues get worse and likely increase political strain between countries, the risks associated with nuclear weapons could also increase. More and more reports have been coming out in recent months, indicating that the risk of a nuclear war could be reaching levels not seen since the Cold War.

This weekend, thousands of protesters took to the streets, marching through London in what has been called Britain’s largest anti-nuclear weapons rally in a generation. The march was organized by the Campaign for Nuclear Disarmament, and featured vocal nuclear weapons opponents like Britain’s Labour Party leader, Jeremy Corbyn.

Among the biggest risks we face with nuclear weapons is that they might be triggered accidentally or in response to a false alarm. This week we also launched a timeline of known close calls to show just how easily a nuclear war could be started inadvertently.

Our hope is that the more well informed people are, the more likely they are to take the steps necessary to mitigate these risks. Nuclear weapons may represent an existential risk, but people taking action is among the greatest of existential hopes.

Secretary William Perry Talks at Google: My Journey at the Nuclear Brink

Former Secretary of Defense William J. Perry was 14 years old when the Japanese attacked Pearl Harbor. As he humorously explained during his Talk at Google this week, in his 14-year-old brain, he was mostly upset because he was worried the war would be over before he could become an Army Air Corps pilot and fight in the war.

Sure enough, the war ended one month before his 18th birthday. He joined the Army Engineers anyway, and was sent to the Army occupation of Japan. That experience quickly altered his perception of war.

“What I saw in Tokyo and Okinawa changed my view altogether about the glamour and the glory of war,” he told the audience at Google.

Tokyo was in ruins — more devastated than Hiroshima — after two years and thousands of firebombs. He then went to Okinawa, the site of the last great battle of WWII. The battle had dragged on for nearly three months, during which roughly 100,000 Japanese fighters attempted to defend the island; by the end, 90,000 of them had perished. Perry described his shock upon arriving there to see the city completely demolished — not one building was left standing, and the people who had survived were living in the rubble.

“And then I reflected on Hiroshima,” he said. “This was what could be done with thousand pound bombs. In the case of Tokyo, thousands of them over a two-year period, and with thousands of bombers delivering them. The same result in Hiroshima, in an instant, with one airplane and one bomb. Just one bomb. Even at the tender age of 18, I understood: this changed everything.”

This experience helped shape his understanding of and approach to modern warfare.

Fast forward to the Cuban Missile Crisis. At the start of the crisis, he was working in California for a defense electronics company, but also doing pro-bono consulting work for the government. He was immediately called to Washington D.C. with other analysts to study the data coming in to try to understand the status of the Cuban missiles.

“Every day I went into that analysis center I believed would be my last day on earth. That’s how close we came to a nuclear catastrophe at that time,” he explained to the audience. He later added, “I still believe we avoided that nuclear catastrophe as much by good luck as good management.”

He then spoke of an episode, many years later, when he was overseeing research at the Pentagon. He got a 3 AM call from a general who said his computer was showing a couple hundred nuclear missiles launched from Russia and on their way to the U.S. The general had already determined that it was a false alarm, but he didn’t understand what was wrong with his computer. After two days studying the problem, they figured out that the sergeant responsible for putting in the operating tape had accidentally put in a training tape: the general’s computer was showing realistic simulations.

“It was human error. No matter how complex your systems are, they’re always subject to human error,” Perry said of the event.

He personally experienced two incidents – one of human error and one of system error – which could easily have escalated to the launch of our own nuclear missiles. His explanation for why the people involved were able to recognize these were false alarms was that “nothing bad was going on in the world at that time.” Ever since, he’s wondered what would have happened if these false alarms had occurred during a crisis while the U.S. was on high alert. Would the country have launched a retaliation that could have inadvertently started a nuclear war?

To this day, nuclear systems are still subject to these types of errors. If an ICBM launch officer gets a warning that an attack is imminent, s/he will notify the President, who will then have approximately 10 minutes to decide whether or not to launch the missiles before they’re destroyed. That’s 10 minutes for the President to assess the context of all problems in the world, combined with the technical information, and then decide whether or not to launch nuclear weapons.

In fact, one of Perry’s biggest concerns is that the ICBMs are susceptible to these kinds of false alarms. He acknowledges that the probability of an accidental nuclear war is very low.

“But,” he says, “why should we accept any probability?”

Adding to that concern is the Obama Administration’s decision to rebuild the nuclear arsenal, which will cost American taxpayers approximately $1 trillion over the next couple of decades. Yet there is very little discussion about this plan in the public arena. As Perry explains, “So far the public is not only not participating in that debate, they’re not even aware of what’s going on.”

Perry is also worried about nuclear terrorism. During the talk, he describes a hypothetical situation in which a terrorist could set off a strategically placed nuclear weapon in a city like Washington D.C. and use that to bring the United States and even the global economy to its knees. He explains that the one reason a scenario like this hasn’t played out yet is because fissile material is so hard to come by.

Throughout the discussion and the Q&A segment, North Korea, India, Pakistan, Iran, and China all came up. While commenting on North Korea, he said:

“The real danger of a missile is not the missile, it’s the fact that it could carry a nuclear warhead.”

That said, of all possible nuclear scenarios, he believes an intentional, regional nuclear war between India and Pakistan could be the most likely.

Perry served as Secretary of Defense from 1994 to 1997, and in more recent years, he’s become a strong advocate for reducing the risks of nuclear weapons. In addition to his many accomplishments and achievements, Perry was awarded the Presidential Medal of Freedom in 1997.

We highly recommend the Talks at Google interview with Perry. We also recommend his new book, My Journey at the Nuclear Brink. You can learn more about his efforts to decrease the risks of nuclear destruction at the William J. Perry Project.

While Perry mentioned two nuclear close calls, there have been many others over the years. We’ve put together a timeline of the close calls that we know about – there have likely been many more.




Who’s to Blame (Part 4): Who’s to Blame if an Autonomous Weapon Breaks the Law?


The previous entry in this series examined why it would be very difficult to ensure that autonomous weapon systems (AWSs) consistently comply with the laws of war.  So what would happen if an attack by an AWS resulted in the needless death of civilians or otherwise constituted a violation of the laws of war?  Who would be held legally responsible?

In that regard, AWSs’ ability to operate free of human direction, monitoring, and control would raise legal concerns not shared by drones and other earlier generations of military technology.  It is not clear who, if anyone, could be held accountable if and when AWS attacks result in illegal harm to civilians and their property.  This “accountability gap” was the focus of a 2015 Human Rights Watch report.  The HRW report ultimately concluded that there was no plausible way to resolve the accountability issue and therefore called for a complete ban on fully autonomous weapons.

Although some commentators have taken issue with this prescription, the diagnosis seems to be correct—it simply is not obvious who would be responsible if an AWS commits an illegal act.  This accountability gap exists because AWSs incorporate AI technology that can collect information and determine courses of action based on the conditions in which they operate.  It is unlikely that even the most careful human programmers could predict the nearly infinite on-the-ground circumstances that an AWS could face.  It would therefore be difficult for an AWS designer, to say nothing of its military operators, to foresee how the AWS would react in the fluid, fast-changing world of combat operations.  The inability to foresee an AWS’s actions would complicate the assignment of legal responsibility.

Read more

Davos 2016 – The State of Artificial Intelligence

An interesting discussion at Davos 2016 on the current state of artificial intelligence, featuring Stuart Russell, Matthew Grob, Andrew Moore, and Ya-Qin Zhang:

Dr. David Wright on North Korea’s Satellite

Earlier this month, Dr. David Wright, co-director of the Union of Concerned Scientists Global Security Program, wrote two posts about North Korea’s satellite launch. While North Korea isn’t currently thought to pose an existential risk with their weapons, any time nuclear weapons are involved, the situation has the potential to quickly escalate to something that could be catastrophic to the future of humanity. We’re grateful to Wright and the UCS for allowing us to share his posts here.

North Korea is Launching a Rocket Soon: What Do We Know About It?

North Korea has announced that it will launch a rocket sometime in the next two weeks to put a satellite in orbit for the second time. What do we know about it, and how worried should we be?


Fig. 1. The Unha-3 ready to launch in April 2012. (Source: Sungwon Baik / VOA)

What We Know

North Korea has been developing rockets—both satellite launchers and ballistic missiles—for more than 25 years. Developing rockets requires flight testing them in the atmosphere, and the United States has satellite-based sensors and ground-based radars that allow it to detect flight testing essentially worldwide. So despite North Korea being highly secretive, it can’t hide such tests, and we know what rockets it has flight tested.

North Korea’s military has short-range missiles that can reach most of South Korea, and a longer range missile—called Nodong in the West—that can reach parts of Japan. But it has yet to flight test any military missiles that can reach targets at a distance of greater than about 1,500 kilometers.

(It has two other ballistic missile designs—called the Musudan and KN-08 in the West—that it has exhibited in military parades on several occasions over the past few years, but has never flight tested. So we don’t know what their state of development is, but they can’t be considered operational without flight testing.)

North Korea’s Satellite Launcher

North Korea has attempted 5 satellite launches, starting in 1998, with only one success—in December 2012. While that launch put a small satellite into space, the satellite was apparently tumbling and North Korea was never able to communicate with it.

The rocket that launched the satellite in 2012 is called the Unha-3 (Galaxy-3) (Fig. 1). North Korea has announced locations of the splashdown zones for its upcoming launch, where the rocket stages will fall into the sea; since these are very similar to the locations of the zones for its 2012 launch, that suggests the launcher will also be very similar (Fig. 2).

Fig. 2. The planned trajectory of the upcoming launch. (Source: D Wright in Google Earth)

We know a lot about the Unha-3 from analyzing previous launches, especially after South Korea fished parts of the rocket out of the sea after the 2012 launch. It is about 30 m tall, has a launch mass of about 90 tons, and consists of 3 stages that use liquid fuel. A key point is that the two large lower stages rely on 1960s-era Scud-type engines and fuel, rather than the more advanced engines and fuel that countries such as Russia and China use. This is an important limitation on the capability of the rocket and suggests North Korea does not have access to, or has not mastered, more advanced technology.

(Some believe North Korea may have purchased a number of these more advanced engines from the Soviet Union. But it has never flight tested that technology, even in shorter range missiles.)

Because large rockets are highly complex technical systems, they are prone to failure. Just because North Korea was able to get everything to work in 2012, allowing it to orbit a satellite, that says very little about the reliability of the launcher, so it is unclear what the probability of a second successful launch is.

The Satellite

The satellite North Korea launched in 2012—the Kwangmyongsong-3, or “Bright Star 3”—is likely similar in size and capability (with a mass of about 100 kg) to the current satellite (also called Kwangmyongsong). The satellite is not designed to do much, since the goal of early satellite launches is learning to communicate with the satellite. It may send back photos from a small camera on board, but these would be too low resolution (probably hundreds of meters) to be useful for spying.

In 2012, North Korea launched its satellite into a “sun-synchronous orbit” (with an inclination of 97.4 degrees), which is an orbit commonly used for satellites that monitor the earth, such as for environmental monitoring. Its orbital altitude was about 550 km, which is twice as high as the Space Station, but lower than most satellites, which sit in higher orbits since atmospheric drag at low altitudes will slow a satellite and cause it to fall from orbit sooner. For North Korea, the altitude was limited by the capability of its launcher. We expect a similar orbit this time, although if the launcher has been modified to carry somewhat more fuel it might be able to carry the satellite to a higher altitude.
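That inclination is not arbitrary: a sun-synchronous orbit is one whose J2 nodal precession matches the Earth's roughly 360-degrees-per-year motion around the sun. A back-of-the-envelope check, using standard Earth constants and a circular-orbit approximation, recovers a value close to the reported 97.4 degrees:

```python
import math

MU = 398600.4418      # Earth's gravitational parameter, km^3/s^2
R_E = 6378.137        # Earth's equatorial radius, km
J2 = 1.08263e-3       # Earth's oblateness coefficient
YEAR = 365.2422 * 86400  # seconds in a year

def sun_sync_inclination(alt_km):
    # Inclination (deg) whose J2 nodal precession equals 360 deg/year,
    # for a circular orbit at the given altitude.
    a = R_E + alt_km
    n = math.sqrt(MU / a**3)          # mean motion, rad/s
    target = 2 * math.pi / YEAR       # required precession rate, rad/s
    cos_i = -target / (1.5 * J2 * (R_E / a)**2 * n)
    return math.degrees(math.acos(cos_i))

inc = sun_sync_inclination(550)  # about 97.6 degrees
```

The small difference from 97.4 degrees reflects the circular-orbit approximation and the satellite's actual altitude.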

The Launch Site and Flight Path

The launch will take place from the Sohae site near the western coast of North Korea (Fig. 2). It would be most efficient to launch due east so that the rocket gains speed from the rotation of the earth. North Korea launched its early flights in that direction but now launches south to avoid overflying Japan—threading the needle between South Korea, China, and the Philippines.

North Korea has modified the Sohae launch site since the 2012 launch. It has increased the height of the gantry that holds the rocket before launch, so that it can accommodate taller rockets, but I expect that extra height will not be needed for this rocket. It has also constructed a building on the launch pad that houses the rocket while it is being prepared for launch (which is a standard feature of modern launch sites around the world). This means we will not be able to watch the detailed launch preparations, which gave indications of the timing of the launch in 2012.

Satellite launch or ballistic missile?

So, is this really an attempt to launch a satellite, or could it be a ballistic missile launch in disguise? Can you tell the difference?

Fig. 3. Trajectories for a long-range ballistic missile (red) and Unha-3 satellite launch (blue).

The U.S. will likely have lots of sensors—on satellites, in the air, and on the ground and sea—watching the launch, and it will be able to quickly tell whether or not it is really a satellite launch because the trajectory of a satellite launch and ballistic missile are very different.

Figure 3 shows the early part of the trajectory of a typical liquid-fueled ballistic missile (ICBM) with a range of 12,000 km (red) and the Unha-3 launch trajectory from 2012 (blue). They differ in shape and in the length of time the rocket engines burn. In this example, the ICBM engines burn for 300 seconds and the Unha-3 engines burn for nearly twice that long. The ICBM gets up to high speed much faster and then goes much higher.

Interestingly, the Unha-3’s longer burn time suggests that its upper stages were designed for use in a satellite launcher, rather than a ballistic missile. So this rocket looks more like a satellite launcher than a ballistic missile.

Long-Range Missile Capability?

Of course, North Korea can still learn a lot from satellite launches about the technology it can use to build a ballistic missile, since the two types of rockets use the same basic technology. That is the source of the concern about these launches.

The range of a missile depends on the technology used and other factors. Whether the Unha-3 could carry a nuclear warhead depends in part on how heavy a North Korean nuclear weapon is, which is a topic of ongoing debate. If the Unha were modified to carry a 1,000 kg warhead rather than a light satellite, the missile could have enough range to reach Alaska and possibly Hawaii, but might not be able to reach the continental U.S. (Fig. 4). If instead North Korea could reduce the warhead mass to around 500 kg, the missile would likely be able to reach large parts of the continental U.S.
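The warhead-mass tradeoff follows from the Tsiolkovsky rocket equation: a lighter payload leaves more delta-v, and hence more range. The single-stage numbers below are invented purely for illustration (a real range estimate requires staging and trajectory modeling):

```python
import math

G0 = 9.81  # standard gravity, m/s^2

def delta_v(isp_s, m_dry, m_prop, payload):
    # Ideal Tsiolkovsky delta-v (m/s) for one stage plus payload.
    m0 = m_dry + m_prop + payload  # initial mass, kg
    mf = m_dry + payload           # final (burnout) mass, kg
    return isp_s * G0 * math.log(m0 / mf)

# Hypothetical stage: 8 t dry, 80 t propellant, Isp 230 s (Scud-class fuel)
dv_heavy = delta_v(230, 8000, 80000, 1000)  # 1,000 kg warhead
dv_light = delta_v(230, 8000, 80000, 500)   # 500 kg warhead
# The lighter payload buys extra delta-v, hence extra range.
```

Halving the payload here gains on the order of 100 m/s of delta-v, which at intercontinental speeds translates into a substantial range increase.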

North Korea has not flight tested a ballistic missile version of the Unha or a reentry heat shield that would be needed to protect the warhead as it reentered the atmosphere. Because of its large size, such a missile is unlikely to be mobile, and assembling and fueling it at the launch site would be difficult to hide. Its accuracy would likely be poor, with errors of many kilometers.

Fig. 4: Distances from North Korea. (Source: D Wright in Google Earth)

The bottom line is that North Korea is developing the technology it could use to build a ballistic missile with intercontinental range. Today it is not clear that it has a system capable of doing so or a nuclear weapon that is small enough to be delivered on it. It has shown, however, the capability to continue to make progress on both fronts.

The U.S. approach to dealing with North Korea in recent years through continued sanctions has not been effective in stopping this progress. It’s time for the U.S. to try a different approach, including direct U.S.-North Korean talks.

North Korea Successfully Puts Its Second Satellite in Orbit

North Korea launched earlier than expected, and successfully placed its second satellite into orbit.

The launch took place at 7:29 pm EST on Saturday, Feb. 6 (8:59 am Sunday local time in North Korea). North Korea had originally said its launch window would not start until Feb. 8; apparently the rocket was ready and the weather was good for a launch.

The U.S. office that tracks objects in space, the Joint Space Operations Center (JSPOC), announced a couple hours later that it was tracking two objects in orbit—the satellite and the third stage of the launcher. The satellite was in a nearly circular orbit (466 x 501 km). The final stage maneuvered to put it in a nearly polar, sun-synchronous orbit, with an inclination of 97.5 degrees.
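As a quick consistency check on those numbers (using standard values for Earth's mean radius and gravitational parameter), Kepler's third law gives the orbital period implied by the reported perigee and apogee:

```python
import math

MU = 398600.4418   # Earth's gravitational parameter, km^3/s^2
R_E = 6371.0       # Earth's mean radius, km

def period_minutes(perigee_km, apogee_km):
    # Orbital period from perigee/apogee altitudes via Kepler's third law.
    a = R_E + (perigee_km + apogee_km) / 2.0   # semi-major axis, km
    return 2 * math.pi * math.sqrt(a**3 / MU) / 60.0

t = period_minutes(466, 501)  # roughly 94 minutes
```

A period of about 94 minutes is typical for a low Earth orbit of this altitude.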

Because the satellite orbit and other details of the launch were similar to those of North Korea’s last launch, in December 2012, this implies that the launch vehicle was also very similar.

This post from December 2012 allows you to see the launch trajectory in 3D using Google Earth.

South Korea is reporting that after the first stage burned out and was dropped from the rocket, it exploded before reaching the sea. This may have been intended to prevent it from being recovered and studied, as was the first stage of its December 2012 launch.

The satellite, called the Kwangmyongsong-4, is likely very similar to the satellite launched three years ago. It will likely not be known for several days whether, unlike the 2012 satellite, it can stop tumbling in orbit and communicate with the ground. It is apparently intended to stay in orbit for about 4 years.

If it can communicate with the Kwangmyongsong-4, North Korea will learn about operating a satellite in space. Even if not, it gained experience with launching and learned more about the reliability of its rocket systems.

For more information about the launch, see my earlier post.

Note added: Feb. 7, 1:00 am

The two orbiting objects, the satellite and the third-stage rocket body, show up in the NORAD catalog of space objects as numbers 41332 for the satellite and 41333 for the rocket body. (Thanks to Jonathan McDowell for supplying these.)

X-risk News of the Week: AAAI, Beneficial AI Research, a $5M Contest, and Nuclear Risks

X-risk = Existential Risk. The risk that we could accidentally (hopefully accidentally) wipe out all of humanity.
X-hope = Existential Hope. The hope that we will all flourish and live happily ever after.

The highlights of this week’s news are all about research. And as is so often the case, research brings hope. Research can help us cure disease, solve global crises, find cost-effective solutions to any number of problems, and so on. The research news this week gives hope that we can continue to keep AI beneficial.

First up this week was the AAAI conference. As was mentioned in an earlier post, FLI participated in the AAAI workshop, AI, Ethics, and Safety. Eleven of our grant winners presented their research to date, for an afternoon of talks and discussion that focused on building ethics into AI systems, ensuring safety constraints are in place, understanding how and when things could go wrong, ensuring value alignment between humans and AI, and much more. There was also a lively panel discussion about new ideas for future AI research that could help ensure AI remains safe and beneficial.

The next day, AAAI President Tom Dietterich (also an FLI grant recipient) delivered his presidential address with a focus on enabling more research into robust AI. He began with a Marvin Minsky quote, in which Minsky explained that when a computer encounters an error, it fails, whereas when the human brain encounters an error, it tries another approach. With that example, Dietterich launched into his speech about the importance of robust AI and ensuring that an AI can address the various known and unknown problems it may encounter. While discussing areas in which AI development is controversial, he also made a point of mentioning his opposition to autonomous weapons, saying, “I share the concerns of many people that I think the development of autonomous offensive weapons, without a human in the loop, is a step that we should not take.”

AAAI also hosted a panel this week on the economic impact of AI, which included FLI Scientific Advisory Board members Nick Bostrom and Erik Brynjolfsson, as well as an unexpected appearance by FLI President Max Tegmark. As is typical of such discussions, there was a lot of concern about the future of jobs and how average workers will continue to make a living. However, TechRepublic noted that both Bostrom and Tegmark are hopeful that if we plan appropriately, increased automation could greatly improve our standard of living. As TechRepublic reported:

“‘Perhaps,’ Bostrom said, ‘we should strive for things outside the economic systems.’ Tegmark agreed. ‘Maybe we need to let go of the obsession that we all need jobs.’”

Also this week, IBM and the X Prize Foundation announced a $5 million collaboration, in which IBM is encouraging developers and researchers to use Watson as the base for creating “jaw-dropping, awe-inspiring” new technologies that will be presented during TED2020. There will be interim prizes for projects leading up to that event, while the final award will be presented after the TED2020 talks. As they explain on the X Prize page:

“IBM believes this competition can accelerate the creation of landmark breakthroughs that deliver new, positive impacts to peoples’ lives, and the transformation of industries and professions.

“We believe that cognitive technologies like Watson represent an entirely new era of computing, and that we are forging a new partnership between humans and technology that will enable us to address many of humanity’s most significant challenges — from climate change, to education, to healthcare.”

Of course, not all news can be good news, and so the week’s highlights end with a reminder about the increasing threat of nuclear weapons. Last week, the Union of Concerned Scientists published a worrisome report about the growing concern that a nuclear war is becoming more likely. Among other things, the report considers the deteriorating relationship between Russia and the U.S., as well as the possibility that China may soon implement a hair-trigger-alert policy for their own nuclear missiles.

David Wright, co-director of the UCS Global Security Program, recently wrote a blog post about the report. Referring to first the U.S.-Russia concern and then the Chinese nuclear policy, he wrote:

“A state of heightened tension changes the context of a false alarm, should one occur, and tends to increase the chance that the warning will be seen as real. […] Should China’s political leaders agree with this change, it would be a dangerous shift that would increase the chance of an accidental or mistaken launch at the United States.”

Update: Another FLI grant winner, Dr. Wendell Wallach, made news this week for his talk at the American Association for the Advancement of Science (AAAS) meeting, in which he put forth a compromise for addressing the issue of autonomous weapons. According to Defense One, Wallach laid out three ideas:

“1) An executive order from the president proclaiming that lethal autonomous weapons constitute a violation of existing international humanitarian law.”

“2) Create an oversight and governance coordinating committee for AI.”

“3) Direct 10 percent of the funding in artificial intelligence to studying, shaping, managing and helping people adapt to the ‘societal impacts of intelligent machines.’”

AAAI Safety Workshop Highlights: Debate, Discussion, and Future Research

The 30th annual Association for the Advancement of Artificial Intelligence (AAAI) conference kicked off on February 12 with two days of workshops, followed by the main conference, which is taking place this week. FLI is honored to have been a part of the AI, Ethics, and Safety Workshop that took place on Saturday, February 13.


Phoenix Convention Center where AAAI 2016 is taking place.

The workshop featured many fascinating talks and discussions, but perhaps the most contested and controversial was Toby Walsh’s talk, “Why the Technological Singularity May Never Happen.”

Walsh explained that, though general knowledge has increased, the human capacity for learning has remained relatively constant for a very long time. “Learning a new language is still just as hard as it’s always been,” he said by way of example. If we can’t teach ourselves how to learn faster, he doesn’t see any reason to believe that machines will be any more successful at the task.

He also argued that even if we assume intelligence can be improved, there’s no reason to assume it will increase exponentially and lead to an intelligence explosion. He believes it is just as possible that each generation of machines will improve on the previous one by only half as much as the generation before it did, so that intelligence would keep increasing but would converge to a finite limit rather than exploding.
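The arithmetic behind Walsh's diminishing-returns scenario is a simple geometric series. A minimal sketch (the numbers are purely illustrative, not anything Walsh presented) shows how halving gains bound total capability, while doubling gains explode:

```python
# Illustrative only: Walsh's diminishing-returns scenario vs. an
# intelligence explosion. If each generation's gain is half the
# previous one's, total capability is a geometric series that
# converges to twice the starting level.
def diminishing(generations, start=1.0):
    capability, gain = start, start
    for _ in range(generations):
        gain /= 2          # each improvement is half as large as the last
        capability += gain
    return capability

print(diminishing(50))   # approaches 2.0, no matter how many generations
print(1.0 * 2 ** 50)     # doubling each generation grows without bound
```

The contrast makes Walsh's point concrete: "keeps improving every generation" is compatible with both a hard ceiling and an explosion; everything depends on whether the per-generation gains shrink or grow.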

Walsh does anticipate superintelligent systems, but he’s just not convinced they will be the kind that can lead to an intelligence explosion. In fact, as one of the primary authors of the Autonomous Weapons Open Letter, Walsh is certainly concerned about aspects of advanced AI, and he ended his talk with concerns about both weapons and job loss.

Both during and after his talk, members of the audience vocally disagreed, providing various arguments about why an intelligence explosion could be likely. Max Tegmark drew laughter from the crowd when he pointed out that while Walsh was arguing that a singularity might not happen, the audience was arguing that it might happen, and these “are two perfectly consistent viewpoints.”

Tegmark added, “As long as one is not sure if it will happen or it won’t, it’s wise to simply do research and plan ahead and try to make sure that things go well.”

As Victoria Krakovna has also explained in a previous post, there are other risks associated with AI that can occur without an intelligence explosion.

The afternoon portion of the workshop was dedicated to technical research by current FLI grant winners, including Vincent Conitzer, Fuxin Li, Francesca Rossi, Bas Steunebrink, Manuela Veloso, Brian Ziebart, Jacob Steinhardt, Nate Soares, Paul Christiano, Stefano Ermon, and Benjamin Rubinstein. Topics ranged from ensuring value alignment between humans and AI to safety constraints and security evaluation, and much more.

While much of the research presented will apply to future AI designs and applications, Li and Rubinstein presented examples of research related to image recognition software that could potentially be used more immediately.

Li explained the risks associated with visual recognition software, including how someone could intentionally modify an image in a way imperceptible to humans and cause the software to identify it incorrectly. Current methods rely on machines learning from huge quantities of reference images, yet even the smallest perturbation of the input data can lead to large errors. Li’s own research looks at new ways for machines to recognize an image that limit such errors.
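The kind of attack Li describes can be sketched against a toy linear classifier: nudging each input feature by a tiny amount in the direction that lowers the model's score is enough to flip its prediction. This is a hypothetical numpy illustration of the general idea (not Li's method or models); the weights and inputs are made up:

```python
import numpy as np

# Toy linear classifier: predicts class 1 if w.x + b > 0, else class 0.
w = np.array([1.0, -1.0, 0.5])
b = 0.1

def predict(x):
    return int(np.dot(w, x) + b > 0)

x = np.array([0.2, 0.3, 0.1])   # original input, classified as class 1

# Adversarial perturbation: step each feature slightly against the
# score. For a linear model the gradient direction is just sign(w).
eps = 0.05
x_adv = x - eps * np.sign(w)    # at most 0.05 change per feature

print(predict(x))      # 1
print(predict(x_adv))  # 0 -- nearly identical input, different label
```

Deep image classifiers are far more complex, but the same logic scales up: a perturbation bounded to a few intensity levels per pixel, invisible to a human viewer, can be aimed precisely at the model's decision boundary.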

Rubinstein’s focus is geared more toward security. The research he presented at the workshop is similar to facial recognition, but goes a step further, examining how small changes made to one face can lead systems to confuse the image with that of someone else.

Fuxin Li


Ben Rubinstein




Future of beneficial AI research panel: Francesca Rossi, Nate Soares, Tom Dietterich, Roman Yampolskiy, Stefano Ermon, Vincent Conitzer, and Benjamin Rubinstein.

The day ended with a panel discussion on the next steps for AI safety research that also drew much debate between panelists and the audience. The panel included AAAI president, Tom Dietterich, as well as Rossi, Soares, Conitzer, Ermon, Rubinstein, and Roman Yampolskiy, who also spoke earlier in the day.

Among the prevailing themes were concerns about ensuring that AI is used ethically by its designers, as well as ensuring that a good AI can’t be hacked to do something bad. There were suggestions to build on the idea that AI can help a human be a better person, but again, concerns about abuse arose. For example, an AI could be designed to help voters determine which candidate would best serve their needs, but then how can we ensure that the AI isn’t secretly designed to promote a specific candidate?

Judy Goldsmith, sitting in the audience, encouraged the panel to consider whether or not an AI should be able to feel pain, which led to extensive discussion about the pros and cons of creating an entity that can suffer, as well as questions about whether such a thing could be created.


Francesca Rossi and Nate Soares


Tom Dietterich and Roman Yampolskiy

After an hour of discussion, many suggestions for new research ideas had come up, giving researchers plenty of fodder for the next round of beneficial-AI grants.

We’d also like to congratulate Stuart Russell and Peter Norvig who were awarded the 2016 AAAI/EAAI Outstanding Educator Award for their seminal text “Artificial Intelligence: A Modern Approach.” As was mentioned during the ceremony, their work “inspired a new generation of scientists and engineers throughout the world.”


Congratulations to Peter Norvig and Stuart Russell!

Who’s to Blame (Part 3): Could Autonomous Weapon Systems Navigate the Law of Armed Conflict?

“Robots won’t commit war crimes.  We just have to program them to follow the laws of war.”  This is a rather common response to the concerns surrounding autonomous weapons, and it has even been advanced as a reason that robot soldiers might be less prone to war crimes than human soldiers.  But designing such autonomous weapon systems (AWSs) is far easier said than done.  True, if we could design and program AWSs that always obeyed the international law of armed conflict (LOAC), then the concerns raised in the previous segment of this series, which suggested the need for human direction, monitoring, and control of AWSs, would be unfounded. But even if such programming prowess is possible, it seems unlikely to be achieved anytime soon. Instead, we need to be prepared for powerful AWSs that may not recognize where the line falls between what is legal and reasonable during combat and what is not.

While the basic LOAC principles seem straightforward at first glance, their application in any given military situation depends heavily on the specific circumstances in which combat takes place. And the difference between legal and illegal acts can be blurry and subjective.  It therefore would be difficult to reduce the laws and principles of armed conflict into a definite and programmable form that could be encoded into an AWS and from which the AWS could consistently make battlefield decisions that comply with the laws of war.

Four core principles guide LOAC: distinction, military necessity, unnecessary suffering, and proportionality.  Distinction means that participants in an armed conflict must distinguish between military and civilian personnel (and between military and civilian objects) and limit their attacks to military targets.  It follows that an attack must be justified by military necessity–i.e., the attack, if successful, must give the attacker some military advantage.  The next principle, as explained by the International Committee of the Red Cross, is that combatants must not “employ weapons, projectiles [or] material and methods of warfare of a nature to cause superfluous injury or unnecessary suffering.”  Unlike the other core principles, the principle of unnecessary suffering generally protects combatants to the same extent as civilians.  Finally, proportionality dictates that the harm done to civilians and civilian property must not be excessive in light of the military advantage expected to be gained by an attack.

For a number of reasons, it would be exceedingly difficult to ensure that an AWS would consistently comply with these requirements if it were permitted to select and engage targets without human input.  One reason is that it would be difficult for an AWS to gather all objective information relevant to making determinations of the core LOAC principles.  For example, intuition and experience might allow a human soldier to infer from observing minute details of his surroundings–such as seeing a well-maintained children’s bicycle or detecting the scent of recently cooked food–that civilians may be nearby.  It might be difficult to program an AWS to pick up on such subtle, seemingly insignificant clues, even though those clues might be critical to assessing whether a targeted structure contains civilians (relevant to distinction and necessity) or whether engaging nearby combatants might result in civilian casualties (relevant to proportionality).

But there is an even more fundamental and vexing challenge in ensuring that AWSs comply with LOAC: even if an AWS were somehow able to obtain all objective information relevant to the LOAC implications of a potential military engagement, all of the core LOAC principles are subjective to some degree.  For example, the operations manual of the US Air Force Judge Advocate General’s Office states that “[p]roportionality in attack is an inherently subjective determination that will be resolved on a case-by-case basis.”  This suggests that proportionality is not something that simply can be reduced to a formula or otherwise neatly encoded so that an AWS would never launch disproportionate attacks.  It would be even more difficult to formalize the concept of “military necessity,” which is fiendishly difficult to articulate without getting tautological and/or somehow incorporating the other LOAC principles.

The principle of distinction might seem fairly objective–soldiers are fair game, civilians are not.  But it can even be difficult–sometimes exceptionally so–to determine whether a particular individual is a combatant or a civilian.  The Geneva Conventions state that civilians are protected from attack “unless and for such time as they take a direct part in hostilities.”  But how “direct” must participation in hostilities be before a civilian loses his or her LOAC protection?  A civilian in an urban combat area who picks up a gun and aims it at an enemy soldier clearly has forfeited his civilian status.  But what about a civilian in the same combat zone who is acting as a spotter?  Who is transporting ammunition from a depot to the combatants’ posts?  Who is repairing an enemy Jeep?  Do these answers change if the combat zone is in a desert instead of a city?  Given that humans frequently disagree on where the boundary between civilians and combatants should lie, it would be difficult to agree on an objective framework that would allow an AWS to accurately distinguish between civilians and combatants in the myriad scenarios it might face on the battlefield.

Of course, humans can also have great difficulty in making such determinations–and humans have been known to intentionally violate LOAC’s core principles, a rather significant drawback to which AWSs might be more resistant.  But when a human commits a LOAC violation, that human being can be brought to justice and punished. Who would be held responsible if an AWS attack violates those same laws?  As of now, that is far from clear.  That accountability problem will be the subject of the next entry in this series.

X-risk News of the Week: Nuclear Winter and a Government Risk Report

X-risk = Existential Risk. The risk that we could accidentally (hopefully accidentally) wipe out all of humanity.
X-hope = Existential Hope. The hope that we will all flourish and live happily ever after.

The big news this week landed squarely in the x-risk end of the spectrum.

First up was a New York Times op-ed titled “Let’s End the Peril of a Nuclear Winter,” written by climate scientists Drs. Alan Robock and Owen Brian Toon. In it, they describe the horrors of nuclear winter — the frigid temperatures, the starvation, and the mass deaths — that could terrorize the entire world if even a small nuclear war broke out in one tiny corner of the globe.

Fear of nuclear winter was one of the driving forces that finally led leaders of Russia and the US to agree to reduce their nuclear arsenals, and concerns about nuclear war subsided once the Cold War ended. However, recently, leaders of both countries have sought to strengthen their arsenals, and the threat of a nuclear winter is growing again. While much of the world struggles to combat climate change, the biggest risk could actually be that of plummeting temperatures if a nuclear war were to break out.

In an email to FLI, Robock said:

“Nuclear weapons are the greatest threat that humans pose to humanity.  The current nuclear arsenal can still produce nuclear winter, with temperatures in the summer plummeting below freezing and the entire world facing famine.  Even a ‘small’ nuclear war, using less than 1% of the current arsenal, can produce starvation of a billion people.  We have to solve this problem so that we have the luxury of addressing global warming.”


Also this week, Director of National Intelligence James Clapper presented the Worldwide Threat Assessment of the US Intelligence Community for 2016 to the Senate Armed Services Committee. The document is 33 pages of potential problems the government is most concerned about in the coming year, a few of which fall into the category of existential risks:

  1. The Internet of Things (IoT). Though this doesn’t technically pose an existential risk, it does have the potential to impact quality of life and some of the freedoms we typically take for granted. The report states: “In the future, intelligence services might use the IoT for identification, surveillance, monitoring, location tracking, and targeting for recruitment, or to gain access to networks or user credentials.”
  2. Artificial Intelligence. Clapper’s concerns are broad in this field. He argues: “Implications of broader AI deployment include increased vulnerability to cyberattack, difficulty in ascertaining attribution, facilitation of advances in foreign weapon and intelligence systems, the risk of accidents and related liability issues, and unemployment. […] The increased reliance on AI for autonomous decision making is creating new vulnerabilities to cyberattacks and influence operations. […] AI systems are susceptible to a range of disruptive and deceptive tactics that might be difficult to anticipate or quickly understand. Efforts to mislead or compromise automated systems might create or enable further opportunities to disrupt or damage critical infrastructure or national security networks.”
  3. Nuclear. Under the category of Weapons of Mass Destruction (WMD), Clapper dedicated the most space to concerns about North Korea’s nuclear weapons. However, he also highlighted concerns about China’s work to modernize its nuclear weapons, and he argues that Russia violated the INF Treaty when it developed a ground-launched cruise missile.
  4. Genome Editing. Interestingly, gene editing was also listed in the WMD category. As Clapper explains, “Research in genome editing conducted by countries with different regulatory or ethical standards than those of Western countries probably increases the risk of the creation of potentially harmful biological agents or products.” Though he doesn’t explicitly refer to the CRISPR-Cas9 system, he does worry that the low cost and ease-of-use for new technologies will enable “deliberate or unintentional misuse” that could “lead to far reaching economic and national security implications.”

The report, though long, is an easy read, and it’s always worthwhile to understand what issues are motivating the government’s actions.


Given our new series by Matt Scherer about the legal complications of anticipated AI and autonomous weapons developments, the big news should have been the many headlines this week claiming that the federal government now considers AI drivers to be real drivers. Scherer, however, argues this is bad journalism. He provides his interpretation of the NHTSA letter in his recent blog post, “No, the NHTSA did not declare that AIs are legal drivers.”


While the headlines of the last few days may have veered toward x-risk, this week also marks the start of the 30th annual Association for the Advancement of Artificial Intelligence (AAAI) Conference. For almost a week, AI researchers will convene in Phoenix to discuss their developments and breakthroughs, and on Saturday, FLI grantees will present some of their research at the AI Ethics and Society Workshop. This is expected to be an event full of hope and excitement about the future!


Who’s to Blame (Part 2): What is an “autonomous” weapon?

The following is the second in a series about the limited legal oversight of autonomous weapons. The first segment can be found here.


Source: Peanuts by Charles Schulz, January 31, 2016 Via @GoComics

Before turning in greater detail to the legal challenges that autonomous weapon systems (AWSs) will present, it is essential to define what “autonomous” means in the weapons context.  It is, after all, the presence of “autonomy” that will distinguish AWSs from earlier weapon technologies.

Most dictionary definitions of “autonomy” focus on the presence of free will or freedom of action.  These are affirmative definitions, stating what autonomy is.  Some dictionary definitions approach autonomy from a different angle, defining it not by the presence of freedom of action, but rather by the absence of external constraints on that freedom (e.g., “the state of existing or acting separately from others”).  This latter approach is more useful in the context of weapon systems, since the existing literature on AWSs seems to use the term “autonomous” as referring to a weapon system’s ability to operate free from human influence and involvement.

Existing AWS commentaries seem to focus on three general methods by which humans can govern an AWS’s actions.  This essay will refer to those methods as direction, monitoring, and control.  A weapon system’s “autonomy” therefore refers to the degree to which the weapon system operates free from human direction, monitoring, and/or control.

Human direction, in this context, refers to the extent to which humans specify the parameters of a weapon system’s operation, from the initial design and programming of the system all the way to battlefield orders regarding the selection of targets and the timing and method of attack.  Monitoring refers to the degree to which humans actively observe and collect information on a weapon system’s operations, whether through a live source such as a video feed or through regular reviews of data regarding a weapon system’s operations.  And control is the degree to which humans can intervene in real time to change what a weapon system is currently doing, such as by actively controlling the system’s physical movement and combat functions or by shutting it down completely if the system malfunctions.  Existing commentaries on “autonomy” in weapon systems all seem to invoke at least one of these three concepts, though they may use different words to refer to those concepts.

The operation of modern military drones such as the MQ-1 Predator and MQ-9 Reaper illustrates how these concepts work in practice.  A Predator or Reaper will not take off, select a target, or launch a missile without direct human input.  Such drones thus are completely dependent on human direction.  While a drone, like a commercial airliner on auto-pilot, may steer itself during non-mission-critical phases of flight, human operators closely monitor the drone throughout each mission both through live video feeds from cameras mounted on the drone and through flight data transmitted by the drone in real time.  And, of course, humans directly (though remotely) control the drone during all mission-critical phases.  Indeed, if the communications link that allows the human operator to control the drone fails, “the drone is programmed to fly autonomously in circles, or return to base, until the link can be reconnected.”  The dominating presence of human direction, monitoring, and control means that a drone is, in effect, “little more than a super-fancy remote-controlled plane.”  The human-dependent nature of drones makes the task of piloting a drone highly stressful and labor-intensive–so much so that recruitment and retention of drone pilots has proven to be a major challenge for the U.S. Air Force.  That, of course, is part of why militaries might be tempted to design and deploy weapon systems that can direct themselves and/or that do not require constant human monitoring or control.

Direction, monitoring, and control are very much interrelated, with monitoring and control being especially intertwined.  During an active combat mission, human monitoring must be accompanied by human control (and vice versa) to act as an effective check on a weapon system’s operations.  (For that reason, commentators often seem to combine monitoring and control into a single broader concept, such as “oversight” or, my preferred term, “supervision.”)  Likewise, direction is closely related to control; an AWS could not be given new orders (i.e., direction) by a human commander if the AWS was not equipped with mechanisms allowing for human control of its operations.  Such an AWS would only be human-directed in terms of its initial programming.

Particularly strong human direction can also reduce the need for monitoring and control, and vice versa.  A weapon system that is subject to complete human direction in terms of the target, timing, and method of attack (and that has no ability to alter those parameters) has no more autonomy than fire-and-forget guided missiles, a technology that has been available for decades.  And a weapon system subject to constant real-time human monitoring and control may have no more practical autonomy than the remotely piloted drones that are already in widespread military use.

Consequently, the strongest concerns relate to weapon systems that are “fully autonomous”–that is, weapon systems that can select and engage targets without specific orders from a human commander and operate without real-time human supervision.  A 2015 Human Rights Watch (HRW) report, for instance, defines “fully autonomous weapons” as systems that lack meaningful human direction regarding the selection of targets and delivery of force and whose human supervision is so limited that humans are effectively “out-of-the-loop.”  A directive issued by the United States Department of Defense (DoD) in 2012 similarly defines an AWS as “a weapon system that, once activated, can select and engage targets without further intervention by a human operator.”

These sources also recognize the existence of weapon systems with lower levels of autonomy.  The DoD directive covers “semi-autonomous weapons systems” that are “intended to only engage individual targets or specific target groups that have been selected by a human operator.”  Such systems must be human-directed in terms of target selection, but could be largely free from human supervision and can even be self-directed with respect to the means and timing of attack.  The same directive discusses “human-supervised” AWSs that, while capable of fully autonomous operation, are “designed to provide human operators with the ability to intervene and terminate engagements.”  HRW similarly distinguishes fully autonomous weapons from those with a human “on the loop,” meaning AWSs that “can select targets and deliver force under the oversight of a human operator who can override the robots’ actions.”

In sum, “autonomy” in weapon systems refers to the degree to which the weapon system operates free from meaningful human direction, monitoring, and control.  Weapon systems that operate without those human checks on their autonomy would raise unique legal issues if those systems’ operations lead to violations of international law.  Those legal challenges will be the subject of the next post in this series.

This segment was originally posted on the blog, Law and AI.

MIRI’s February 2016 Newsletter

This post originally comes from MIRI’s website.

Research updates

General updates

  • Fundraiser and grant successes: MIRI will be working with AI pioneer Stuart Russell and a to-be-determined postdoctoral researcher on the problem of corrigibility, thanks to a $75,000 grant from the Center for Long-Term Cybersecurity.

News and links

X-risk News of the Week: Human Embryo Gene Editing

X-risk = Existential Risk. The risk that we could accidentally (hopefully accidentally) wipe out all of humanity.
X-hope = Existential Hope. The hope that we will all flourish and live happily ever after.

If you keep up with science news at all, then you saw the headlines splashed all over news sources on Monday: The UK has given researchers at the Francis Crick Institute permission to edit the genes of early-stage human embryos.

This is huge news, not only in genetics and biology fields, but for science as a whole. No other researcher has ever been granted permission to perform gene editing on viable human embryos before.

The usual fears of designer babies and slippery slopes popped up, but as most of the general news sources reported, those fears are relatively unwarranted for this research. In fact, this project, which is led by Dr. Kathy Niakan, could arguably be closer to the existential hope side of the spectrum.

Niakan’s objective is to understand the first seven days of embryo development, and she’ll do so by using CRISPR to systematically sweep through genes in embryos that were donated from in vitro fertilization (IVF) procedures. While research in mice and other animals has given researchers an idea of the roles different genes play at those early stages of development, there are many genes that are uniquely human and can’t be studied in other animals. Many causes of infertility and miscarriage are thought to arise in those genes during those very early stages of development, but we can only confirm that through this kind of research.

Niakan explained to the BBC, “We would really like to understand the genes needed for a human embryo to develop successfully into a healthy baby. The reason why it is so important is because miscarriages and infertility are extremely common, but they’re not very well understood.”

It may be hard to see how preventing miscarriages could be bad, but this is a controversial research technique under normal circumstances, and Niakan’s request for approval came on the heels of human embryo research that did upset the world.

Last year, outrage swept through the scientific community after scientists in China chose to skip proper approval processes to perform gene-editing research on nonviable human embryos. Many prominent scientists in the field, including FLI’s Scientific Advisory Board Member George Church, responded by calling for a temporary moratorium on using the CRISPR/Cas-9 gene-editing tool in human embryos that would be carried to term.

An important distinction to make here is that Dr. Niakan went through all of the proper approval channels to start her research. Though the UK’s approval process isn’t quite as stringent as that in the US – which prohibits all research on viable embryos – the Human Fertilisation and Embryology Authority, which is the approving body, is still quite strict, insisting, among other things, that the embryos be destroyed after 14 days to ensure they can’t ever be taken to term. The team will also only use embryos that were donated with full consent by the IVF patients.

Max Schubert, a doctoral candidate in Dr. George Church’s lab at Harvard, explained that one of the reasons for the temporary moratorium was to give researchers time to study the effects of CRISPR first, to understand how effective and safe it truly is. “I think [Niakan’s research] represents the kind of work that you need to do to understand the risks that those scientists are concerned about,” said Schubert.

John Min, also a PhD candidate in Dr. Church’s lab, pointed out that the knowledge we could gain from this research will very likely lead to medications and drugs that can be used to help prevent miscarriages, and that the final treatment could very possibly not involve any type of gene editing at all. This would eliminate, or at least limit, concerns about genetically modified humans.

Said Min, “This is a case that illustrates really well the potential of CRISPR technology … CRISPR will give us the answers to [Niakan’s] questions much more cheaply and much faster than any other existing technology.”

Who’s to Blame (Part 1): The Legal Vacuum Surrounding Autonomous Weapons

The year is 2020 and intense fighting has once again broken out between Israel and Hamas militants based in Gaza.  In response to a series of rocket attacks, Israel rolls out a new version of its Iron Dome air defense system.  Designed in a huge collaboration involving defense companies headquartered in the United States, Israel, and India, this third generation of the Iron Dome has the capability to act with unprecedented autonomy and has cutting-edge artificial intelligence technology that allows it to analyze a tactical situation by drawing from information gathered by an array of onboard sensors and a variety of external data sources.  Unlike prior generations of the system, the Iron Dome 3.0 is designed not only to intercept and destroy incoming missiles, but also to identify and automatically launch a precise, guided-missile counterattack against the site from where the incoming missile was launched.  The day after the new system is deployed, a missile launched by the system strikes a Gaza hospital far removed from any militant activity, killing scores of Palestinian civilians. Outrage swells within the international community, which demands that whoever is responsible for the atrocity be held accountable.  Unfortunately, no one can agree on who that is…

Much has been made in recent months and years about the risks associated with the emergence of artificial intelligence (AI) technologies and, with it, the automation of tasks that once were the exclusive province of humans.  But legal systems have not yet developed regulations governing the safe development and deployment of AI systems or clear rules governing the assignment of legal responsibility when autonomous AI systems cause harm.  Consequently, it is quite possible that many harms caused by autonomous machines will fall into a legal and regulatory vacuum.  The prospect of autonomous weapons systems (AWSs) throws these issues into especially sharp relief.  AWSs, like all military weapons, are specifically designed to cause harm to human beings—and lethal harm, at that.  But applying the laws of armed conflict to attacks initiated by machines is no simple matter.

The core principles of the laws of armed conflict are straightforward enough.  Those most important to the AWS debate are: attackers must distinguish between civilians and combatants; they must strike only when it is actually necessary to a legitimate military purpose; and they must refrain from an attack if the likely harm to civilians outweighs the military advantage that would be gained.  But what if the attacker is a machine?  How can a machine make the seemingly subjective determination regarding whether an attack is militarily necessary?  Can an AWS be programmed to quantify whether the anticipated harm to civilians would be “proportionate?”  Does the law permit anyone other than a human being to make that kind of determination?  Should it?

But the issue goes even deeper than simply determining whether the laws of war can be encoded into the AI components of an AWS.  Even if everyone agreed that a particular AWS attack constituted a war crime, would our sense of justice be satisfied by “punishing” that machine?  I suspect that most people would answer that question with a resounding “no.”  Human laws demand human accountability.  Unfortunately, as of right now, there are no laws at the national or international level that specifically address whether, when, or how AWSs can be deployed, much less who (if anyone) can be held legally responsible if an AWS commits an act that violates the laws of armed conflict.  This makes it difficult for those laws to have the deterrent effect that they are designed to have; if no one will be held accountable for violating the law, then no one will feel any particular need to ensure compliance with the law.  On the other hand, if there are humans with a clear legal responsibility to ensure that an AWS’s operations comply with the laws of war, then horrors such as the hospital bombing described in the intro to this essay would be much less likely to occur.

So how should the legal voids surrounding autonomous weapons–and for that matter, AI in general–be filled?  Over the coming weeks and months, that question–along with the other questions raised in this essay–will be examined in greater detail on the FLI website and on the Law and AI blog.  Stay tuned.

The next segment of this series is scheduled for February 10.

The original post can be found at Law and AI.

Nuclear Warmongering Is Back in Fashion

“We should not be surprised that the Air Force and Navy think about actually employing nuclear weapons rather than keeping them on the shelf and assuming that will be sufficient for deterrence.”

This statement was made by Adam Lowther, a research professor at the Air Force Research Institute, in an article for The National Interest, in which he attempts to convince readers that, as the title says, “America Still Needs Its Nukes.” The comment is strikingly similar to one made by Donald Trump’s spokesperson, who said, “What good does it do to have a good nuclear triad, if you’re afraid to use it?”

Lowther wrote this article as a rebuttal to people like former Defense Secretary William Perry, who have been calling for a reduction of our nuclear arsenal. However, his arguments in support of his pro-nuclear weapons stance — and of his frighteningly pro-nuclear war stance — do not take into account some of the greatest concerns about having such a large nuclear arsenal.

Among the biggest issues is simply that, yes, a nuclear war would be bad. First, it’s nearly impossible to launch a nuclear strike without killing innocent civilians – likely millions of innocent civilians. The two atomic bombs dropped on Japan in WWII killed approximately 100,000 people. Modern hydrogen bombs are 10 to 1000 times more powerful, and a single strategically targeted bomb can kill millions.

Then, we still have to worry about the aftermath. Recent climate models have shown that a full-scale nuclear war might put enough smoke into the upper atmosphere that it could spread around the globe and cause temperatures to plummet by as much as 40 degrees Fahrenheit for up to a decade. People around the world who survived the war – or who weren’t even a part of it – would likely succumb to starvation, hypothermia, disease, or desperate, armed gangs roving for food. Even in a small nuclear war — the kind that could potentially erupt between India and Pakistan — climate models predict that death tolls could reach 1 billion worldwide. Lowther insists that the military spends a significant amount of time studying war games, but how much of that time is spent considering the hundreds of millions of Americans who might die as a result of nuclear winter? Or, as Dr. Alan Robock calls it, self-assured destruction.

A nuclear war would be horrifying, and preventing one should be a constant goal.

This brings up another point that Max Tegmark mentions in the comments section of the article:

“To me, a key question is this, which he [Lowther] never addresses: What is the greatest military threat to the US? A deliberate nuclear attack by Russia/China, or a US-Russia nuclear war starting by accident, as has nearly happened many times in the past? If the latter, then downsizing our nuclear arsenal will make us all safer.”

Does upgrading our nuclear arsenal really make us safer, as Lowther argues? Many people, Perry and Tegmark included, argue that spending $1 trillion to upgrade our nuclear weapons arsenal would actually make us less safe, by inadvertently increasing our chances of nuclear war.

And apparently the scientists behind the Doomsday Clock agree. The Bulletin of the Atomic Scientists, which runs the Doomsday Clock, announced today that the clock would remain set at three minutes to midnight. In its statement about this decision, the Bulletin reminded readers that the clock is a metaphor for the existential risks that pose a threat to the planet. As the Bulletin said,

“Three minutes (to midnight) is too close. Far too close. We, the members of the Science and Security Board of the Bulletin of the Atomic Scientists, want to be clear about our decision not to move the hands of the Doomsday Clock in 2016: That decision is not good news, but an expression of dismay that world leaders continue to fail to focus their efforts and the world’s attention on reducing the extreme danger posed by nuclear weapons and climate change.

“When we call these dangers existential, that is exactly what we mean: They threaten the very existence of civilization and therefore should be the first order of business for leaders who care about their constituents and their countries.”

According to CNN, the Bulletin believes the best way to get the clock to move back would be to spend less on nuclear arms, re-energize the effort for disarmament, and engage more with North Korea.

In what one commenter criticizes as a “bait-and-switch,” Lowther refers to people who make these arguments as “abolitionists,” whom he treats as crusading for a total ban against all nuclear weapons. The truth is more nuanced and interesting. While some groups do indeed call for a ban on nuclear weapons, a large majority of experts are simply advocating for making the world a safer place by: 1) reducing the number of nuclear weapons to a number that will provide sufficient deterrence, and 2) eliminating hair-trigger alert — both in an effort to decrease the chances of an accidental nuclear war. Lowther insists that he and the military don’t maintain a Cold-War mindset because they’ve been so focused on Islamic militants. However, it’s his belief that we should not rule out the possibility of using nuclear weapons that is precisely the Cold-War mindset concerning most people.

As Dr. David Wright from the Union of Concerned Scientists told FLI in an earlier interview:

“Today, nuclear weapons are a liability. They don’t address the key problems that we’re facing, like terrorism … and by having large numbers of them around … you could have a very rapid cataclysm that people are … reeling from forever.”