Trump to Pull US Out of Nuclear Treaty

Last week, U.S. President Donald Trump confirmed that the United States will be pulling out of the landmark Intermediate-Range Nuclear Forces Treaty (INF). The INF treaty, which was signed in 1987, banned ground-launched missiles that have a range of 500 km to 5,500 km (310 to 3,400 miles). Although the agreement covers land-based missiles that carry both nuclear and conventional warheads, it doesn’t cover any air-launched or sea-launched weapons.

Nonetheless, when it was signed by former U.S. President Ronald Reagan and former Soviet leader Mikhail Gorbachev, it led to the elimination of nearly 2,700 short- and medium-range missiles. More significantly, it helped bring an end to a dangerous nuclear standoff between the two nations, and the trust that it fostered played a critical part in defusing the Cold War.

Now, as a result of the recent announcements from the Trump administration, all of this may be undone. As Malcolm Chalmers, deputy director general of the Royal United Services Institute, stated in an interview with The Guardian, “This is the most severe crisis in nuclear arms control since the 1980s. If the INF treaty collapses, and with the New Start treaty on strategic arms due to expire in 2021, the world could be left without any limits on the nuclear arsenals of nuclear states for the first time since 1972.”

Of course, the U.S. isn’t the only player contributing to the unraveling of an arms treaty that helped curb competition and end the Cold War.

Reports indicate that Russia has been violating the INF treaty since at least 2014, a fact that was previously acknowledged by the Obama administration and which President Trump cited in his INF withdrawal announcement last week. “Russia has violated the agreement. They’ve been violating it for many years, and I don’t know why President Obama didn’t negotiate or pull out,” Trump stated. “We’re not going to let them violate a nuclear agreement and do weapons and we’re not allowed to.…so we’re going to terminate the agreement. We’re going to pull out,” he continued.

Trump also noted that China played a significant role in his decision to pull the U.S. out of the INF treaty. Since China was not a part of the negotiations and is not a signatory, the country faces no limits when it comes to developing and deploying intermediate-range nuclear missiles, a fact that China has exploited in order to amass a robust missile arsenal. Trump noted that the U.S. will have to develop those weapons, “unless Russia comes to us and China comes to us and they all come to us and say, ‘let’s really get smart and let’s none of us develop those weapons,’ but if Russia’s doing it and if China’s doing it, and we’re adhering to the agreement, that’s unacceptable.”

 

A Growing Concern

Concerns over Russian missile systems that breach the INF treaty are real and valid. Equally valid are the concerns over China’s weapons strategy. However, experts note that President Trump’s decision to leave the INF treaty doesn’t set us on the path to the negotiating table, but rather, toward another nuclear arms race.

Russian officials have been clear in this regard, with Leonid Slutsky, who chairs the foreign affairs committee in Russia’s lower house of parliament, stating this week that a U.S. withdrawal from the INF agreement “would mean a real new Cold War and an arms race with 100 percent probability” and “a collapse of the planet’s entire nonproliferation and disarmament regime.”

This is precisely why many policy experts assert that withdrawal is not a viable option and, in order to achieve a successful resolution, negotiations must continue. Wolfgang Ischinger, the former German ambassador to the United States, is one such expert. In a statement issued over the weekend, he noted that he is “deeply worried” about President Trump’s plans to dismantle the INF treaty and urged the U.S. government to, instead, work to expand the treaty. “Multilateralizing this agreement would be a lot better than terminating it,” he wrote on Twitter.

Even if the U.S. government is entirely uninterested in negotiating, and the Trump administration seeks only to respond with increased weaponry, policy experts assert that withdrawing from the INF treaty is still a futile and unnecessary move. As Jeffrey Lewis, the director of the East Asia nonproliferation program at the Middlebury Institute of International Studies at Monterey, notes, the INF doesn’t prohibit sea- or air-based systems. Consequently, the U.S. could respond to Russian and Chinese political maneuverings with increased armament without escalating international tensions by upending longstanding treaties.

Indeed, since President Trump made his announcement, a number of experts have condemned the move and called for further negotiations. EU spokeswoman Maja Kocijancic said that the U.S. and Russia “need to remain in a constructive dialogue to preserve this treaty” as it “contributed to the end of the Cold War, to the end of the nuclear arms race and is one of the cornerstones of European security architecture.”  

Most notably, in a statement that was issued Monday, the European Union cautioned the U.S. against withdrawing from the INF treaty, saying, “The world doesn’t need a new arms race that would benefit no one and on the contrary, would bring even more instability.”

An image of Hurricane Michael making landfall October 10, 2018. Photo courtesy of NASA.

IPCC 2018 Special Report Paints Dire — But Not Completely Hopeless — Picture of Future

On Wednesday, October 10, the panhandle of Florida was struck by Hurricane Michael, which has already claimed over 30 lives and destroyed communities, homes and infrastructure across multiple states. Michael is the strongest hurricane in recorded history to make landfall in that region. And in coming years, it’s likely that we’ll continue to see an increase in record-breaking storms, as well as record-breaking heat waves, droughts, floods, and wildfires.

Only two days before Michael unleashed its devastation on the United States, the United Nations Intergovernmental Panel on Climate Change (IPCC) released a dire report on the prospects for limiting global temperature rise to 1.5°C, and on why we must meet this challenge head on.

In 2015, around the time that the Paris Climate Agreement was being signed, global temperatures reached 1°C above pre-industrial levels. And we’re already feeling the impacts of this increase in the form of bigger storms, bigger wildfires, higher temperatures, and melting Arctic ice.

The recent IPCC report concludes that if society continues on its current trajectory, and even if the world abides by the Paris Climate Agreement, the planet will hit 1.5°C of warming in a matter of decades, and possibly in the next 12 years. And every additional half degree of warming is expected to bring on even more extreme effects. Even if we can limit global warming to 1.5°C, the report predicts we’ll lose most coral reefs, sea levels will rise and flood many coastal communities, more people around the world will experience extreme heat waves, and other natural disasters can be expected to increase.

As global temperatures rise, they don’t rise evenly across the globe. Air over land is expected to reach higher temperatures than air over the oceans, so what could be 1.5°C on average across Earth might be a 3-4.5°C increase in some sections of the world. This has the potential to trigger deadly heat waves, wildfires and droughts, which would also negatively impact local ecosystems and farmland.

But what if we reach 2°C? This level of temperature increase is often floated as the highest limit the world can handle without too much suffering – but how much worse will it be than 1.5°C?

A difference of 0.5°C may not seem like much, but it could mean the difference between a world with some surviving coral reefs, and a world in which they — and many other species — are all destroyed. Two degrees could lead to an extra 420 million people experiencing extreme and possibly deadly heat waves. Some regions of the world will see increases in temperatures as high as 4-6°C. Sea levels are predicted to rise an extra 10 centimeters at 2°C versus 1.5°C, which could impact an extra 10 million people along coastal areas.

Meanwhile, human health will deteriorate; diseases like malaria and dengue fever could become more prevalent and spread into new regions with this increase in temperature. Farmland for many staple crops could decrease, and even livestock are expected to be adversely affected as feed quality and water availability may decrease.

The list goes on and on. But perhaps one of the greatest threats of climate change is that those who will likely be the hardest hit by increasing temperatures are those who are already among the poorest and most vulnerable.

Yet we’re not quite out of time. As the report highlights, all of these problems arise as a result of society taking little to no action. But what if we did start taking steps to reduce global warming? What if we could get governments and corporations to recognize the need to reduce emissions and switch to clean, alternative, renewable energy sources? What if individuals made changes to their own lifestyles while also encouraging their government leaders to take action?

The report suggests that under those circumstances, if we can achieve global net-zero emissions — that is, such low levels of carbon or other pollutants are emitted that they can be absorbed by trees and soil — then we can still prevent temperatures from exceeding 1.5°C. Temperatures will still increase somewhat as a result of current emissions, but there’s still time to curtail the most severe effects.

There are other organizations that believe we can achieve global net-zero emissions as well. For example, this summer, the Exponential Climate Action Roadmap was released, which offers a roadmap to achieve the goals of the Paris Climate Agreement by 2030. Or there’s The Solutions Project, which maps out steps to quickly achieve 100% renewable energy. And Drawdown provides 80 steps we can take to reduce emissions.

We don’t have much time left, but it’s not too late. The prospects are dire if we continue on our current trajectory, but if society can recognize the urgency of the situation and come together to take action, there’s still hope of keeping the worst effects of climate change at bay.

An edited version of this article was originally published on Metro. Photo courtesy of NASA.

Genome Editing and the Future of Biowarfare: A Conversation with Dr. Piers Millett

In both 2016 and 2017, genome editing made it into the annual Worldwide Threat Assessment of the US Intelligence Community. One of biotechnology’s most promising modern developments, it had now been deemed a danger to US national security – and then, after two years, it was dropped from the list again. All of which raises the question: what, exactly, is genome editing, and what can it do?

Most simply, the phrase “genome editing” represents tools and techniques that biotechnologists use to edit the genome – that is, the DNA or RNA of plants, animals, and bacteria. Though the earliest versions of genome editing technology have existed for decades, the introduction of CRISPR in 2013 “brought major improvements to the speed, cost, accuracy, and efficiency of genome editing.”

CRISPR, or Clustered Regularly Interspaced Short Palindromic Repeats, is actually an ancient mechanism used by bacteria to remove viruses from their DNA. In the lab, researchers have discovered they can replicate this process by creating a synthetic RNA strand that matches a target DNA sequence in an organism’s genome. The RNA strand, known as a “guide RNA,” is attached to an enzyme that can cut DNA. After the guide RNA locates the targeted DNA sequence, the enzyme cuts the genome at this location. DNA can then be removed, and new DNA can be added. CRISPR has quickly become a powerful tool for editing genomes, with research taking place in a broad range of plants and animals, including humans.
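
As a rough illustration of the locate, cut, and replace steps described above, the process can be sketched as a toy string operation in Python. This is only an analogy, not a model of real biochemistry: the function name edit_genome, the sample sequences, and the assumption that the guide is simply the RNA spelling of the target (U in place of T) are all hypothetical, and details of real CRISPR systems such as strand complementarity and DNA repair are ignored.

```python
def edit_genome(genome: str, guide_rna: str, new_dna: str) -> str:
    """Toy cut-and-replace: find the site the guide matches, then swap in new DNA."""
    # Assumption: the guide is written as the RNA copy of the target (U instead of T).
    target = guide_rna.replace("U", "T")
    site = genome.find(target)  # the guide "locates" the targeted sequence
    if site == -1:
        return genome  # no matching site, so nothing is cut
    # "Cut" at the site, drop the targeted stretch, and splice in the new DNA.
    return genome[:site] + new_dna + genome[site + len(target):]


# Hypothetical sequences, chosen only for illustration.
print(edit_genome("ATGGCCTTAGCA", "GCCUUA", "GGG"))  # ATGGGGGCA
```

The point of the sketch is that the guide sequence does the targeting, so changing where the edit happens only requires changing the guide.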

A significant percentage of genome editing research focuses on eliminating genetic diseases. However, with tools like CRISPR, it also becomes possible to alter a pathogen’s DNA to make it more virulent and more contagious. Other potential uses include the creation of “‘killer mosquitos,’ plagues that wipe out staple crops, or even a virus that snips at people’s DNA.”

But does genome editing really deserve a spot among the ranks of global threats like nuclear weapons and cyber hacking? To many members of the scientific community, its inclusion felt like an overreaction. Among them was Dr. Piers Millett, a science policy and international security expert whose work focuses on biotechnology and biowarfare.

Millett wasn’t surprised that biotechnology in general made it into these reports: what he didn’t expect was for one specific tool, genome editing, to be called out. In his words: “I would personally be much more comfortable if it had been a broader sentiment to say ‘Hey, there’s a whole bunch of emerging biotechnologies that could destabilize our traditional risk equation in this space, and we need to be careful with that.’ …But calling out specifically genome editing, I still don’t fully understand any rationale behind it.”

This doesn’t mean, however, that the misuse of genome editing is not cause for concern. Even proper use of the technology often involves the genetic engineering of biological pathogens, research that could very easily be weaponized. Says Millett, “If you’re deliberately trying to create a pathogen that is deadly, spreads easily, and that we don’t have appropriate public health measures to mitigate, then that thing you create is amongst the most dangerous things on the planet.”

 

Biowarfare Before Genome Editing

A medieval depiction of the Black Plague.

Developments such as CRISPR present new possibilities for biowarfare, but biological weapons caused concern long before the advent of gene editing. The first recorded use of biological pathogens in warfare dates back to 600 BC, when Solon, an Athenian statesman, poisoned enemy water supplies during the siege of Krissa. Many centuries later, during the 1346 AD siege of Caffa, the Mongol army catapulted plague-infested corpses into the city, which is thought to have contributed to the 14th-century Black Death pandemic that wiped out up to two-thirds of Europe’s population.

Though biological weapons were internationally banned by the 1925 Geneva Protocol, state biowarfare programs continued and in many cases expanded during World War II and the Cold War. In 1972, as evidence of these violations mounted, 103 nations signed a treaty known as the Biological Weapons Convention (BWC). The treaty bans the creation of biological arsenals and outlaws offensive biological research, though defensive research is permissible. Each year, signatories are required to submit certain information about their biological research programs to the United Nations, and violations reported to the UN Security Council may result in an inspection.

But inspections can be vetoed by the permanent members of the Security Council, and there are no firm guidelines for enforcement. On top of this, the line that separates permissible defensive biological research from its offensive counterpart is murky and remains a subject of controversy. And though the actual numbers remain unknown, pathologist Dr. Riedel asserts that “the number of state-sponsored programs [that have engaged in offensive biological weapons research] has increased significantly during the last 30 years.”

 

Dual Use Research

So biological warfare remains a threat, and it’s one that genome editing technology could hypothetically escalate. Genome editing falls into a category of research and technology that’s known as “dual-use” – that is, it has the potential both for beneficial advances and harmful misuses. “As an enabling technology, it enables you to do things, so it is the intent of the user that determines whether that’s a positive thing or a negative thing,” Millett explains.

And ultimately, what’s considered positive or negative is a matter of perspective. “The same activity can look positive to one group of people, and negative to another. How do we decide which one is right and who gets to make that decision?” Genome editing could be used, for example, to eradicate disease-carrying mosquitoes, an application that many would consider positive. But as Millett points out, some cultures view such blatant manipulation of the ecosystem as harmful or “sacrilegious.”

Millett believes that the most effective way to deal with dual-use research is to get the researchers engaged in the discussion. “We have traditionally treated the scientific community as part of the problem,” he says. “I think we need to move to a point where the scientific community is the key to the solution, where we’re empowering them to be the ones who identify the risks, the ones who initiate the discussion about what forms this research should take.” A good scientist, he adds, is one “who’s not only doing good research, but doing research in a good way.”

 

DIY Genome Editing

But there is a growing worry that dangerous research might be undertaken by those who are not scientists at all. There are already a number of do-it-yourself (DIY) genome editing kits on the market today, and these relatively inexpensive kits allow anyone, anywhere to edit DNA using CRISPR technology. Do these kits pose a real security threat? Millett explains that risk level can be assessed based on two distinct criteria: likelihood and potential impact. Where the “greatest” risks lie will depend on the criterion.

“If you take risk as a factor of likelihood of impact, the most likely attacks will come from low-powered actors, but have a minimal impact and be based on traditional approaches, existing pathogens, and well characterized risks and threats,” Millett explains. DIY genome editors, for example, may be great in number but are likely unable to produce a biological agent capable of causing widespread harm.

“If you switch it around and say where are the most high impact threats going to come from, then I strongly believe that that [type of threat] requires a level of sophistication and technical competency and resources that are not easy to acquire at this point in time,” says Millett. “If you’re looking for advanced stuff: who could misuse genome editing? States would be my bet in the foreseeable future.”
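
Millett’s two criteria can be made concrete with a small Python sketch. Everything in it is an assumption made for illustration: the Threat class, the two example entries, and especially the scores, which are placeholders chosen to mirror the qualitative contrast he describes rather than real estimates.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    name: str
    likelihood: float  # hypothetical relative score between 0 and 1
    impact: float      # hypothetical relative score in arbitrary units


# Placeholder values only; they are not drawn from any assessment.
threats = [
    Threat("low-powered actor using an existing pathogen", likelihood=0.9, impact=1.0),
    Threat("state program using advanced genome editing", likelihood=0.1, impact=100.0),
]

# The "greatest" risk depends on which criterion is used to rank.
most_likely = max(threats, key=lambda t: t.likelihood)
highest_impact = max(threats, key=lambda t: t.impact)

print("Most likely:   ", most_likely.name)
print("Highest impact:", highest_impact.name)
```

Ranking the same list by a different criterion puts a different threat on top, which is why the answer to “where is the greatest risk?” depends on the question being asked.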

State Bioweapons Programs

Large-scale bioweapons programs, such as those run by states, pose a double threat: there is always the possibility of accidental release alongside the potential for malicious use. Millett believes that these threats are roughly equal, a conclusion backed by a thousand-page report from Gryphon Scientific, a US defense contractor.

Historically, both accidental release and malicious use of biological agents have caused damage. In 1979, there was the accidental release of aerosolized anthrax from the Sverdlovsk [now Ekaterinburg] bioweapons production facility in the Soviet Union – a clogged air filter in the facility had been removed, but had not been replaced. Ninety-four people were affected by the incident and at least 64 died, along with a number of livestock. The Soviet secret police attempted a cover-up and it was not until years later that the administration admitted the cause of the outbreak.

More recently, Millett says, a US biodefense facility “failed to kill the anthrax that it sent out for various lab trials, and ended up sending out really nasty anthrax around the world.” Though no one was infected, a 2015 government investigation revealed that “over the course of the last decade, 86 facilities in the United States and seven other countries have received low concentrations of live [anthrax] spore samples… thought to be completely inactivated.”

These incidents pale, however, in comparison with Japan’s intentional use of biological weapons during the 1930s and 40s. There is “a published history that suggests up to 30,000 people were killed in China by the Japanese biological weapons program during the lead up to World War II. And if that data is accurate, that is orders of magnitude bigger than anything else,” Millett says.

Given the near-impossibility of controlling the spread of disease, a deliberate attack may have accidental effects far beyond what was intended. The Japanese, for example, may have meant to target only a few Chinese villages, only to unwittingly trigger an epidemic. There are reports, in fact, that thousands of Japan’s own soldiers became infected during a biological attack in 1941.

Despite the 1972 ban on biological weapons programs, Millett believes that many countries still have the capacity to produce biological weapons. As an example, he explains that the Soviets developed “a set of research and development tools that would answer the key questions and give you all the key capabilities to make biological weapons.”

The BWC only bans offensive research, and “underneath the umbrella of a defensive program,” Millett says, “you can do a whole load of research and development to figure out what you would want to weaponize if you were going to make a weapon.” Then, all a country needs to start producing those weapons is “the capacity to scale up production very, very quickly.” The Soviets, for example, built “a set of state-based commercial infrastructure to make things like vaccines.” On a day-to-day basis, they were making things the Soviet Union needed. “But they could be very radically rebooted and repurposed into production facilities for their biological weapons program,” Millett explains. This is known as a “breakout program.”

Says Millett, “I believe there are many, many countries that are well within the scope of a breakout program … so it’s not that they necessarily at this second have a fully prepared and worked-out biological weapons program that they can unleash on the world tomorrow, but they might well have all of the building blocks they need to do that in place, and a plan for how to turn their existing infrastructure towards a weapons program if they ever needed to. These components would be permissible under current international law.”

 

Biological Weapons Convention

This unsettling reality raises questions about the efficacy of the BWC – namely, what does it do well, and what doesn’t it do well? Millett, who worked for the BWC for well over a decade, has a nuanced view.

“The very fact that we have a ban on these things is brilliant,” he says. “We’re well ahead on biological weapons than many other types of weapons systems. We only got the ban on nuclear weapons – and it was only joined by some tiny number of countries – last year. Chemical weapons, only in 1995. The ban on biological weapons is hugely important. Having a space at the international level to talk about those issues is very important.” But, he adds, “we’re rapidly reaching the end of the space that I can be positive about.”

The ban on biological weapons was motivated, at least in part, by the sense that – unlike chemical weapons – they weren’t particularly useful. Traditionally, chemical and biological weapons were dealt with together. The 1925 Geneva Protocol banned both, and the original proposal for the Biological Weapons Convention, submitted by the UK in 1969, would have dealt with both. But the chemical weapons ban was ultimately dropped from the BWC, Millett says, “because that was during Vietnam, and so there were a number of chemical agents that were being used in Vietnam that weren’t going to be banned.” Once the scope of the ban had been narrowed, however, both the US and the USSR signed on.

Millett describes the resulting document as “aspirational.” He explains, “The Biological Weapons Convention is four pages long, whereas the [1995] Chemical Weapons Convention is 200 pages long, give or take.” And the difference “is about the teeth in the treaty.”

“The BWC is…a short document that’s basically a commitment by states not to make these weapons. The Chemical Weapons Convention is an international regime with an organization, with an inspection regime intended to enforce that. Under the BWC, if you are worried about another state, you’re meant to try to resolve those concerns amicably. But if you can’t do that, we move onto Article Six of the Convention, where you report it to the Security Council. The Security Council is meant to investigate it, but of course if you’re a permanent member of the Security Council, you can veto that, so that doesn’t happen.”

 

De-escalation

One easy way that states can avoid raising suspicion is to be more transparent. As Millett puts it, “If you’re not doing naughty things, then it’s on you to demonstrate that you’re not.” This doesn’t mean revealing everything to everybody. It means finding ways to show other states that they don’t need to worry.

As an example, Millett cites the heightened security culture that developed in the US after 9/11. Following the 2001 anthrax letter attacks, as well as a large investment in US biodefense programs, an initiative was started to prevent foreigners from working in those biodefense facilities. “I’m very glad they didn’t go down that path,” says Millett, “because the greatest risk, I think, was not that a foreign national would sneak in.” Rather, “the advantage of having foreign nationals in those programs was at the international level, when country Y stands up and accuses the US of having an illicit bioweapons program hidden in its biodefense program, there are three other countries that can stand up and say, ‘Well, wait a minute. Our scientists are in those facilities. We work very closely with that program, and we see no evidence of what you’re saying.’”

Historically, secrecy surrounding bioweapons programs has led other countries to begin their own research. Before World War I, the British began exploring the use of bioweapons. The Germans were aware of this. By the onset of the war, the British had abandoned the idea, but the Germans, not knowing this, began their own bioweapons program in an attempt to keep up. By World War II, Germany no longer had a bioweapons program. But the Allies believed they still did, and the U.S. bioweapons program was born of such fears.

 

What now?

Asked if he believes genome editing is a bioweapons “game changer,” Millett says no. “I see it as an enabling technology in the short to medium term, then maybe with longer-term implications [for biowarfare], but then we’re out into the far distance of what we can reasonably talk about and predict,” he says. “Certainly for now, I think its big impact is it makes it easier, faster, cheaper, and more reliable to do things that you could do using traditional approaches.”

But as biotechnology continues to evolve, so too will biowarfare. For example, it may eventually be possible for governments to alter specific genes in their own populations. “Imagine aerosolizing a lovely genome editor that knocks out a specifically nasty gene in your population,” says Millett. “It’s a passive thing. You breathe it in [and it] retroactively alters the population[’s DNA].”

A government could use such technology to knock out a gene linked to cancer or other diseases. But, Millett says, “what would happen if you came across a couple of genes that at an individual level were not going to have an impact, but at a population level were connected with something, say, like IQ?” With the help of a genome editor, a government could make their population smarter, on average, by a few IQ points.

“There’s good economic data that says that [average IQ] is … statistically important,” Millett says. “The GDP of the country will be noticeably affected if we could just get another two or three percent IQ points. There are direct national security implications of that. If, for example, Chinese citizens got smarter on average over the next couple of generations by a couple of IQ points per generation, that has national security implications for both the UK and the US.”

For now, such an endeavor remains in the realm of science fiction. But technology is evolving at a breakneck speed, and it’s more important than ever to consider the potential implications of our advancements. That said, Millett is optimistic about the future. “I think the key is the distribution of bad actors versus good actors,” he says. As long as the bad actors remain the minority, there is more reason to be excited for the future of biotechnology than there is to be afraid of it.

Dr. Piers Millett holds fellowships at the Future of Humanity Institute, the University of Oxford, and the Woodrow Wilson Center for International Policy and works as a consultant for the World Health Organization. He also served at the United Nations as the Deputy Head of the Biological Weapons Convention.  

Cognitive Biases and AI Value Alignment: An Interview with Owain Evans

At the core of AI safety lies the value alignment problem: how can we teach artificial intelligence systems to act in accordance with human goals and values?

Many researchers interact with AI systems to teach them human values, using techniques like inverse reinforcement learning (IRL). In theory, with IRL, an AI system can learn what humans value and how to best assist them by observing human behavior and receiving human feedback.

But human behavior doesn’t always reflect human values, and human feedback is often biased. We say we want healthy food when we’re relaxed, but then we demand greasy food when we’re stressed. Not only do we often fail to live according to our values, but many of our values contradict each other. We value getting eight hours of sleep, for example, but we regularly sleep less because we also value working hard, caring for our children, and maintaining healthy relationships.

AI systems may be able to learn a lot by observing humans, but because of our inconsistencies, some researchers worry that systems trained with IRL will be fundamentally unable to distinguish between value-aligned and misaligned behavior. This could become especially dangerous as AI systems become more powerful: inferring the wrong values or goals from observing humans could lead these systems to adopt harmful behavior.

 

Distinguishing Biases and Values

Owain Evans, a researcher at the Future of Humanity Institute, and Andreas Stuhlmüller, president of the research non-profit Ought, have explored the limitations of IRL in teaching human values to AI systems. In particular, their research exposes how cognitive biases make it difficult for AIs to learn human preferences through interactive learning.

Evans elaborates: “We want an agent to pursue some set of goals, and we want that set of goals to coincide with human goals. The question then is, if the agent just gets to watch humans and try to work out their goals from their behavior, how much are biases a problem there?”

In some cases, AIs will be able to understand patterns of common biases. Evans and Stuhlmüller discuss the psychological literature on biases in their paper, Learning the Preferences of Ignorant, Inconsistent Agents, and in their online book, agentmodels.org. An example of a common pattern discussed in agentmodels.org is “time inconsistency.” Time inconsistency is the idea that people’s values and goals change depending on when you ask them. In other words, “there is an inconsistency between what you prefer your future self to do and what your future self prefers to do.”

Examples of time inconsistency are everywhere. For one, most people value waking up early and exercising if you ask them before bed. But come morning, when it’s cold and dark out and they didn’t get those eight hours of sleep, they often value the comfort of their sheets and the virtues of relaxation. From waking up early to avoiding alcohol, eating healthy, and saving money, humans tend to expect more from their future selves than their future selves are willing to do.
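
One common way to model this kind of preference reversal is hyperbolic discounting, in which a reward loses value the further away it is. The Python sketch below is a minimal illustration under that assumption; the function names, the discount rate k, and the reward values and three-hour delay are hypothetical choices, not figures from Evans and Stuhlmüller.

```python
def discounted_value(reward: float, delay: float, k: float = 1.0) -> float:
    """Hyperbolic discounting: value shrinks as reward / (1 + k * delay)."""
    return reward / (1.0 + k * delay)


def preferred_option(hours_until_alarm: float) -> str:
    # Hypothetical rewards: staying in bed pays off the moment the alarm rings,
    # while the larger benefit of exercising arrives three hours later.
    sleep_in = discounted_value(reward=5.0, delay=hours_until_alarm)
    exercise = discounted_value(reward=10.0, delay=hours_until_alarm + 3.0)
    return "exercise" if exercise > sleep_in else "sleep in"


print(preferred_option(10.0))  # asked the night before: exercise
print(preferred_option(0.0))   # asked when the alarm rings: sleep in
```

Nothing about the rewards changes between the two calls. Only the distance to them does, and that alone is enough to flip the preference.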

With systematic, predictable patterns like time inconsistency, IRL could make progress with AI systems. But often our biases aren’t so clear. According to Evans, deciphering which actions coincide with someone’s values and which actions spring from biases is difficult or even impossible in general.

“Suppose you promised to clean the house but you get a last minute offer to party with a friend and you can’t resist,” he suggests. “Is this a bias, or your value of living for the moment? This is a problem for using only inverse reinforcement learning to train an AI — how would it decide what are biases and values?”

 

Learning the Correct Values

Despite this conundrum, understanding human values and preferences is essential for AI systems, and developers have a very practical interest in training their machines to learn these preferences.

Already today, popular websites use AI to learn human preferences. With YouTube and Amazon, for instance, machine-learning algorithms observe your behavior and predict what you will want next. But while these recommendations are often useful, they have unintended consequences.

Consider the case of Zeynep Tufekci, an associate professor at the School of Information and Library Science at the University of North Carolina. After watching videos of Trump rallies to learn more about his voter appeal, Tufekci began seeing white nationalist propaganda and Holocaust denial videos on her “autoplay” queue. She soon realized that YouTube’s algorithm, optimized to keep users engaged, predictably suggests more extreme content as users watch more videos. This led her to call the website “The Great Radicalizer.”

This value misalignment in YouTube algorithms foreshadows the dangers of interactive learning with more advanced AI systems. Instead of optimizing advanced AI systems to appeal to our short-term desires and our attraction to extremes, designers must be able to optimize them to understand our deeper values and enhance our lives.

Evans suggests that we will want AI systems that can reason through our decisions better than humans can, understand when we are making biased decisions, and “help us better pursue our long-term preferences.” However, this will entail that AIs suggest things that seem bad to humans at first blush.

One can imagine an AI system suggesting a brilliant, counterintuitive modification to a business plan, and the human just finds it ridiculous. Or maybe an AI recommends a slightly longer, stress-free driving route to a first date, but the anxious driver takes the faster route anyway, unconvinced.

To help humans understand AIs in these scenarios, Evans and Stuhlmüller have researched how AI systems could reason in ways that are comprehensible to humans and can ultimately improve upon human reasoning.

One method (invented by Paul Christiano) is called “amplification,” where humans use AIs to help them think more deeply about decisions. Evans explains: “You want a system that does exactly the same kind of thinking that we would, but it’s able to do it faster, more efficiently, maybe more reliably. But it should be a kind of thinking that if you broke it down into small steps, humans could understand and follow.”

A second, related concept is called “factored cognition” – the idea of breaking sophisticated tasks into small, understandable steps. According to Evans, it’s not clear how generally factored cognition can succeed. Sometimes humans can break down their reasoning into small steps, but often we rely on intuition, which is much more difficult to break down.

 

Specifying the Problem

Evans and Stuhlmüller have started a research project on amplification and factored cognition, but they haven’t solved the problem of human biases in interactive learning – rather, they’ve set out to precisely lay out these complex issues for other researchers.

“It’s more about showing this problem in a more precise way than people had done previously,” says Evans. “We ended up getting interesting results, but one of our results in a sense is realizing that this is very difficult, and understanding why it’s difficult.”

This article is part of a Future of Life series on the AI safety research grants, which were funded by generous donations from Elon Musk and the Open Philanthropy Project.

$50,000 Award to Stanislav Petrov for helping avert WWIII – but US denies visa

To celebrate that today is not the 35th anniversary of World War III, Stanislav Petrov, the man who helped avert an all-out nuclear exchange between Russia and the U.S. on September 26, 1983, was honored with the $50,000 Future of Life Award at a ceremony at the Museum of Mathematics in New York.

Former United Nations Secretary General Ban Ki-Moon said: “It is hard to imagine anything more devastating for humanity than all-out nuclear war between Russia and the United States. Yet this might have occurred by accident on September 26 1983, were it not for the wise decisions of Stanislav Yevgrafovich Petrov. For this, he deserves humanity’s profound gratitude. Let us resolve to work together to realize a world free from fear of nuclear weapons, remembering the courageous judgement of Stanislav Petrov.”

Stanislav Petrov’s daughter Elena holds the 2018 Future of Life Award flanked by her husband Victor. From left: Ariel Conn (FLI), Lucas Perry (FLI), Hannah Fry, Victor, Elena, Steven Mao (exec. producer of the Petrov film “The Man Who Saved the World”), Max Tegmark (FLI)

Although the U.N. General Assembly, just blocks away, heard politicians highlight the nuclear threat from North Korea’s small nuclear arsenal, none mentioned the greater threat from the many thousands of nuclear weapons in the United States and Russian arsenals that have nearly been unleashed by mistake dozens of times in the past in a seemingly never-ending series of mishaps and misunderstandings.

One of the closest calls occurred thirty-five years ago, on September 26, 1983, when Stanislav Petrov chose to ignore the Soviet early-warning detection system that had erroneously indicated five incoming American nuclear missiles. With his decision to ignore algorithms and instead follow his gut instinct, Petrov helped prevent an all-out US-Russian nuclear war, as detailed in the documentary film “The Man Who Saved the World”, which will be released digitally next week. Since Petrov passed away last year, the award was collected by his daughter Elena. Meanwhile, Petrov’s son Dmitry missed his flight to New York because the U.S. embassy delayed his visa. “That a guy can’t get a visa to visit the city his dad saved from nuclear annihilation is emblematic of how frosty US-Russian relations have gotten, which increases the risk of accidental nuclear war”, said MIT Professor Max Tegmark when presenting the award. Arguably the only recent reduction in the risk of accidental nuclear war came when Donald Trump held a summit with Vladimir Putin in Helsinki earlier this year, which was, ironically, met with widespread criticism.

In Russia, soldiers often didn’t discuss their wartime actions out of fear that it might displease their government, and so, Elena only first heard about her father’s heroic actions in 1998 – 15 years after the event occurred. And even then, Elena and her brother only learned of what her father had done when a German journalist reached out to the family for an article he was working on. It’s unclear if Petrov’s wife, who died in 1997, ever knew of her husband’s heroism. Until his death, Petrov maintained a humble outlook on the event that made him famous. “I was just doing my job,” he’d say.

But most would agree that he went above and beyond his job duties that September day in 1983. The alert of five incoming nuclear missiles came at a time of high tension between the superpowers, due in part to the U.S. military buildup in the early 1980s and President Ronald Reagan’s anti-Soviet rhetoric. Earlier in the month the Soviet Union shot down a Korean Airlines passenger plane that strayed into its airspace, killing almost 300 people, and Petrov had to consider this context when he received the missile notifications. He had only minutes to decide whether or not the satellite data were a false alarm. Since the satellite was found to be operating properly, following procedures would have led him to report an incoming attack. Going partly on gut instinct and believing the United States was unlikely to fire only five missiles, he told his commanders that it was a false alarm before he knew that to be true. Later investigations revealed that reflections of the Sun off of cloud tops had fooled the satellite into thinking it was detecting missile launches.

Last year’s Nobel Peace Prize Laureate, Beatrice Fihn, who helped establish the recent United Nations treaty banning nuclear weapons, said, “Stanislav Petrov was faced with a choice that no person should have to make, and at that moment he chose the human race, to save all of us. No one person and no one country should have that type of control over all our lives, and all future lives to come. 35 years from that day when Stanislav Petrov chose us over nuclear weapons, nine states still hold the world hostage with 15,000 nuclear weapons. We cannot continue relying on luck and heroes to safeguard humanity. The Treaty on the Prohibition of Nuclear Weapons provides an opportunity for all of us and our leaders to choose the human race over nuclear weapons by banning them and eliminating them once and for all. The choice is the end of us or the end of nuclear weapons. We honor Stanislav Petrov by choosing the latter.”

University College London Mathematics Professor Hannah Fry, author of the new book “Hello World: Being Human in the Age of Algorithms”, participated in the ceremony and pointed out that as ever more human decisions get replaced by automated algorithms, it is sometimes crucial to keep a human in the loop – as in Petrov’s case.

The Future of Life Award seeks to recognize and reward those who take exceptional measures to safeguard the collective future of humanity. It is given by the Future of Life Institute (FLI), a non-profit also known for supporting AI safety research with Elon Musk and others. “Although most people never learn about Petrov in school, they might not have been alive were it not for him”, said FLI co-founder Anthony Aguirre. Last year’s award was given to Vasili Arkhipov, who singlehandedly prevented a nuclear attack on the US during the Cuban Missile Crisis. FLI is currently accepting nominations for next year’s award.

Stanislav Petrov around the time he helped avert WWIII

Making AI Safe in an Unpredictable World: An Interview with Thomas G. Dietterich

Our AI systems work remarkably well in closed worlds. That’s because these environments contain a set number of variables, making the worlds perfectly known and perfectly predictable. In these micro environments, machines only encounter objects that are familiar to them. As a result, they always know how they should act and respond. Unfortunately, these same systems quickly become confused when they are deployed in the real world, as many objects aren’t familiar to them. This is a bit of a problem because, when an AI system becomes confused, the results can be deadly.

Consider, for example, a self-driving car that encounters a novel object. Should it speed up, or should it slow down? Or consider an autonomous weapon system that sees an anomaly. Should it attack, or should it power down? Each of these examples involves life-and-death decisions, and they reveal why, if we are to deploy advanced AI systems in real-world environments, we must be confident that they will behave correctly when they encounter unfamiliar objects.

Thomas G. Dietterich, Emeritus Professor of Computer Science at Oregon State University, explains that solving this identification problem begins with ensuring that our AI systems aren’t too confident — that they recognize when they encounter a foreign object and don’t misidentify it as something that they are acquainted with. To achieve this, Dietterich asserts that we must move away from (or, at least, greatly modify) the discriminative training methods that currently dominate AI research.

However, to do that, we must first address the “open category problem.”

 

Understanding the Open Category Problem

When driving down the road, we can encounter a near infinite number of anomalies. Perhaps a violent storm will arise, and hail will start to fall. Perhaps our vision will become impeded by smoke or excessive fog. Although these encounters may be unexpected, the human brain is able to easily analyze new information and decide on the appropriate course of action: we will recognize a newspaper drifting across the road and, instead of abruptly slamming on the brakes, continue on our way.

Because of the way that they are programmed, our computer systems aren’t able to do the same.

“The way we use machine learning to create AI systems and software these days generally uses something called ‘discriminative training,’” Dietterich explains, “which implicitly assumes that the world consists of only, say, a thousand different kinds of objects.” This means that, if a machine encounters a novel object, it will assume that it must be one of the thousand things that it was trained on. As a result, such systems misclassify all foreign objects.
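
To see why such a classifier cannot answer “none of the above,” consider a softmax output layer, a typical final step in discriminative classifiers. The sketch below is a minimal illustration, not Dietterich's code: the three class names and the raw scores (logits) are hypothetical. Because the probabilities are normalized over the known classes only, every input, however strange, is assigned to one of them.

```python
import math

# Hypothetical closed set of categories the classifier was trained on.
KNOWN_CLASSES = ["car", "pedestrian", "cyclist"]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 over the known classes."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits):
    """Return the most probable known class; there is no 'unknown' option."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return KNOWN_CLASSES[best], probs[best]


# Made-up scores for an object the system has never seen (say, a fallen tree).
label, confidence = classify([0.2, 2.5, 0.1])
```

Here the unfamiliar object is labeled with one of the three known classes, and the returned confidence says nothing about whether the object belongs to any of them.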

This is the “open category problem” that Dietterich and his team are attempting to solve. Specifically, they are trying to ensure that our machines don’t assume that they have encountered every possible object, but are, instead, able to reliably detect — and ultimately respond to — new categories of alien objects.

Dietterich notes that, from a practical standpoint, this means creating an anomaly detection algorithm that assigns an anomaly score to each object detected by the AI system. That score must be compared against a set threshold and, if the anomaly score exceeds the threshold, the system will need to raise an alarm. Dietterich states that, in response to this alarm, the AI system should take a pre-determined safety action. For example, a self-driving car that detects an anomaly might slow down and pull off to the side of the road.
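
The score-then-threshold loop described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the anomaly score used here (distance to the nearest known example) is a stand-in chosen for brevity, not one of the algorithms the team evaluated, and the function names, feature vectors, and threshold value are hypothetical.

```python
import math


def anomaly_score(features, known_examples):
    """Stand-in score: Euclidean distance to the nearest known example."""
    return min(math.dist(features, example) for example in known_examples)


def respond(features, known_examples, threshold):
    """Raise the alarm and fall back to a safety action if the score is too high."""
    if anomaly_score(features, known_examples) > threshold:
        return "SAFETY_ACTION"  # e.g. slow down and pull over
    return "PROCEED"


# Hypothetical two-dimensional feature vectors for familiar objects.
known = [(0.0, 0.0), (1.0, 1.0)]
familiar = respond((0.1, 0.1), known, threshold=1.0)
novel = respond((5.0, 5.0), known, threshold=1.0)
```

The hard part, as the next section explains, is not the comparison itself but choosing the threshold.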

 

Creating a Theoretical Guarantee of Safety

There are two challenges to making this method work. First, Dietterich asserts that we need good anomaly detection algorithms. Previously, in order to determine what algorithms work well, the team compared the performance of eight state-of-the-art anomaly detection algorithms on a large collection of benchmark problems.

The second challenge is to set the alarm threshold so that the AI system is guaranteed to detect a desired fraction of the alien objects, such as 99%. Dietterich says that formulating a reliable setting for this threshold is one of the most challenging research problems because there are, potentially, infinite kinds of alien objects. “The problem is that we can’t have labeled training data for all of the aliens. If we had such data, we would simply train the discriminative classifier on that labeled data,” Dietterich says.

To circumvent this labeling issue, the team assumes that the discriminative classifier has access to a representative sample of “query objects” that reflect the larger statistical population. Such a sample could, for example, be obtained by collecting data from cars driving on highways around the world. This sample will include some fraction of unknown objects, and the remaining objects belong to known object categories.

Notably, the data in the sample is not labeled. Instead, the AI system is given an estimate of the fraction of aliens in the sample. And by combining the information in the sample with the labeled training data that was employed to train the discriminative classifier, the team’s new algorithm can choose a good alarm threshold. If the estimated fraction of aliens is known to be an over-estimate of the true fraction, then the chosen threshold is guaranteed to detect the target percentage of aliens (i.e. 99%).
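
One way to read this step: if a fraction alpha of the unlabeled sample is alien, then the distribution of anomaly scores in that sample is a mixture, F_mixture = (1 - alpha) × F_known + alpha × F_alien. Solving for F_alien gives an estimate of how alien scores are distributed without ever labeling an alien. The sketch below applies that identity with simple empirical distributions. It is a simplified illustration, not the team's published algorithm: it omits the finite-sample corrections a formal guarantee would need, and all function names and numbers are hypothetical.

```python
def ecdf(sample, x):
    """Empirical CDF: fraction of the sample with score <= x."""
    return sum(1 for s in sample if s <= x) / len(sample)


def choose_threshold(known_scores, mixture_scores, alpha, miss_rate=0.01):
    """Pick the largest threshold whose estimated alien miss rate stays in budget.

    An alarm is raised when score > threshold, so the aliens that slip
    through are those with score <= threshold, i.e. F_alien(threshold).
    """
    threshold = float("-inf")  # fallback: raise the alarm on everything
    for t in sorted(set(known_scores) | set(mixture_scores)):
        f_alien = (ecdf(mixture_scores, t)
                   - (1.0 - alpha) * ecdf(known_scores, t)) / alpha
        if f_alien > miss_rate:
            break  # a higher threshold would miss too many aliens
        threshold = t
    return threshold


# Hypothetical scores: known objects score low, aliens score high.
known_scores = [1, 2, 3, 4, 5]
mixture_scores = [1, 2, 3, 4, 5, 8, 9, 10, 11, 12]  # half of these are aliens
tau = choose_threshold(known_scores, mixture_scores, alpha=0.5)
```

In this toy example the chosen threshold sits just above the highest known score, so every alien in the sample would trigger the alarm while no known object would.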

Ultimately, the above is the first method that can give a theoretical guarantee of safety for detecting alien objects, and a paper reporting the results was presented at ICML 2018. “We are able to guarantee, with high probability, that we can find 99% of all of these new objects,” Dietterich says.

In the next stage of their research, Dietterich and his team plan to begin testing their algorithm in a more complex setting. Thus far, they’ve been looking primarily at classification, where the system looks at an image and classifies it. Next, they plan to move to controlling an agent, like a robot or self-driving car. “At each point in time, in order to decide what action to choose, our system will do a ‘look ahead search’ based on a learned model of the behavior of the agent and its environment. If the look ahead arrives at a state that is rated as ‘alien’ by our method, then this indicates that the agent is about to enter a part of the state space where it is not competent to choose correct actions,” Dietterich says. In response, as previously mentioned, the agent should execute a series of safety actions and request human assistance.
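
A rough sketch of such a look-ahead check follows. It is an assumption-laden illustration of the idea in the quote, not the team's planned system: the one-dimensional world, the stand-in for a learned model, the alien test, and the names `is_safe` and `choose_action` are all hypothetical.

```python
def is_safe(state, actions, model, is_alien, depth):
    """True if `state` is familiar and some action sequence of length
    `depth` stays within familiar states, according to the model."""
    if is_alien(state):
        return False
    if depth == 0:
        return True
    return any(is_safe(model(state, a), actions, model, is_alien, depth - 1)
               for a in actions)


def choose_action(state, actions, model, is_alien, depth=2):
    """Return the first action whose look-ahead avoids alien states,
    or a request for help if none does."""
    for action in actions:
        if is_safe(model(state, action), actions, model, is_alien, depth - 1):
            return action
    return "REQUEST_HELP"


# Toy world: states are integers, and states >= 3 were never seen in training.
def toy_model(state, action):
    return state + action  # stand-in for a learned dynamics model


def toy_is_alien(state):
    return state >= 3


retreat = choose_action(2, (+1, -1), toy_model, toy_is_alien)
stuck = choose_action(2, (+1,), toy_model, toy_is_alien)
```

In the toy world, the agent at state 2 steps back toward familiar territory when it can, and asks for help when its only available move leads into an alien state.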

But what does this safety action actually consist of?

 

Responding to Aliens

Dietterich notes that, once something is identified as an anomaly and the alarm is sounded, the nature of this fallback system will depend on the machine in question, such as whether the AI system is in a self-driving car or an autonomous weapon.

To explain how these secondary systems operate, Dietterich turns to self-driving cars. “In the Google car, if the computers lose power, then there’s a backup system that automatically slows the car down and pulls it over to the side of the road.” However, Dietterich clarifies that stopping isn’t always the best course of action. One may assume that a car should come to a halt if an unidentified object crosses its path; however, if the unidentified object happens to be a blanket of snow on a particularly icy day, hitting the brakes gets more complicated. The system would need to factor in the icy roads, any cars that may be driving behind, and whether these cars can brake in time to avoid a rear-end collision.

But if we can’t predict every eventuality, how can we expect to program an AI system so that it behaves correctly and in a way that is safe?

Unfortunately, there’s no easy answer; however, Dietterich clarifies that there are some general best practices: “There’s no universal solution to the safety problem, but obviously there are some actions that are safer than others. Generally speaking, removing energy from the system is a good idea,” he says. Ultimately, Dietterich asserts that all the work related to programming safe AI really boils down to determining how we want our machines to behave under specific scenarios, and he argues that we need to rearticulate how we characterize this problem, and focus on accounting for all the factors, if we are to develop a sound approach.

Dietterich notes that “when we look at these problems, they tend to get lumped under a classification of ‘ethical decision making,’ but what they really are is problems that are incredibly complex. They depend tremendously on the context in which they are operating, the human beings, the other innovations, the other automated systems, and so on. The challenge is correctly describing how we want the system to behave and then ensuring that our implementations actually comply with those requirements.” And he concludes, “the big risk in the future of AI is the same as the big risk in any software system, which is that we build the wrong system, and so it does the wrong thing. Arthur C. Clarke in 2001: A Space Odyssey had it exactly right. The HAL 9000 didn’t ‘go rogue;’ it was just doing what it had been programmed to do.”

This article is part of a Future of Life series on the AI safety research grants, which were funded by generous donations from Elon Musk and the Open Philanthropy Project.

European Parliament Passes Resolution Supporting a Ban on Killer Robots

The European Parliament passed a resolution on September 12, 2018 calling for an international ban on lethal autonomous weapons systems (LAWS). The resolution was adopted with 82% of the members voting in favor of it.

Among other things, the resolution calls on its Member States and the European Council “to develop and adopt, as a matter of urgency … a common position on lethal autonomous weapon systems that ensures meaningful human control over the critical functions of weapon systems, including during deployment.”

The resolution also urges Member States and the European Council “to work towards the start of international negotiations on a legally binding instrument prohibiting lethal autonomous weapons systems.”

This call for urgency comes shortly after recent United Nations talks where countries were unable to reach a consensus about whether or not to consider a ban on LAWS. Many hope that statements such as this from leading government bodies could help sway the handful of countries still holding out against banning LAWS.

Daan Kayser of PAX, one of the NGO members of the Campaign to Stop Killer Robots, said, “The voice of the European parliament is important in the international debate. At the UN talks in Geneva this past August it was clear that most European countries see the need for concrete measures. A European parliament resolution will add to the momentum toward the next step.”

The countries that took the strongest stances against a LAWS ban at the recent UN meeting were the United States, Russia, South Korea, and Israel.

 

Scientists’ Voices Are Heard

Also mentioned in the resolution were the many open letters signed by AI researchers and scientists from around the world, who are calling on the UN to negotiate a ban on LAWS.

Two sections of the resolution stated:

“having regard to the open letter of July 2015 signed by over 3,000 artificial intelligence and robotics researchers and that of 21 August 2017 signed by 116 founders of leading robotics and artificial intelligence companies warning about lethal autonomous weapon systems, and the letter by 240 tech organisations and 3,089 individuals pledging never to develop, produce or use lethal autonomous weapon systems,” and

“whereas in August 2017, 116 founders of leading international robotics and artificial intelligence companies sent an open letter to the UN calling on governments to ‘prevent an arms race in these weapons’ and ‘to avoid the destabilising effects of these technologies.’”

Toby Walsh, a prominent AI researcher who helped create the letters, said, “It’s great to see politicians listening to scientists and engineers. Starting in 2015, we’ve been speaking loudly about the risks posed by lethal autonomous weapons. The European Parliament has joined the calls for regulation. The challenge now is for the United Nations to respond. We have several years of talks at the UN without much to show. We cannot let a few nations hold the world hostage, to start an arms race with technologies that will destabilize the current delicate world order and that many find repugnant.”

State of California Endorses Asilomar AI Principles

On August 30, the State of California unanimously adopted legislation in support of the Future of Life Institute’s Asilomar AI Principles.

The Asilomar AI Principles are a set of 23 principles intended to promote the safe and beneficial development of artificial intelligence. The principles – which include research issues, ethics and values, and longer-term issues – emerged from a collaboration between AI researchers, economists, legal scholars, ethicists, and philosophers in Asilomar, California in January of 2017.

The Principles are the most widely adopted effort of their kind. They have been endorsed by AI research leaders at Google DeepMind, Google Brain, Facebook, Apple, and OpenAI. Signatories include Demis Hassabis, Yoshua Bengio, Elon Musk, Ray Kurzweil, the late Stephen Hawking, Tasha McCauley, Joseph Gordon-Levitt, Jeff Dean, Tom Gruber, Anthony Romero, Stuart Russell, and more than 3,800 other AI researchers and experts.

With ACR 215 passing the State Senate with unanimous support, the California Legislature has now been added to that list.

Assemblyman Kevin Kiley, who led the effort, said, “By endorsing the Asilomar Principles, the State Legislature joins in the recognition of shared values that can be applied to AI research, development, and long-term planning — helping to reinforce California’s competitive edge in the field of artificial intelligence, while assuring that its benefits are manifold and widespread.”

The third Asilomar AI principle indicates the importance of constructive and healthy exchange between AI researchers and policymakers, and the passing of this resolution highlights the value of that endeavor. While the principles do not establish enforceable policies or regulations, the action taken by the California Legislature is an important and historic show of support across sectors towards a common goal of enabling safe and beneficial AI.

The Future of Life Institute (FLI), the nonprofit organization that led the creation of the Asilomar AI Principles, is thrilled by this latest development, and encouraged that the principles continue to serve as guiding values for the development of AI and related public policy.

“By endorsing the Asilomar AI Principles, California has taken a historic step towards the advancement of beneficial AI and highlighted its leadership of this transformative technology,” said Anthony Aguirre, cofounder of FLI and physics professor at the University of California, Santa Cruz. “We are grateful to Assemblyman Kevin Kiley for leading the charge and to the dozens of co-authors of this resolution for their foresight on this critical matter.”

Profound societal impacts of AI are no longer merely a question of science fiction, but are already being realized today – from facial recognition technology, to drone surveillance, and the spread of targeted disinformation campaigns. Advances in AI are helping to connect people around the world, improve productivity and efficiencies, and uncover novel insights. However, AI may also pose safety and security threats, exacerbate inequality, and constrain privacy and autonomy.

“New norms are needed for AI that counteract dangerous race dynamics and instead center on trust, security, and the common good,” says Jessica Cussins, AI Policy Lead for FLI. “Having the official support of California helps establish a framework of shared values between policymakers, AI researchers, and other stakeholders. FLI encourages other governmental bodies to support the 23 principles and help shape an exciting and equitable future.”

Governing AI: An Inside Look at the Quest to Ensure AI Benefits Humanity

Finance, education, medicine, programming, the arts — artificial intelligence is set to disrupt nearly every sector of our society. Governments and policy experts have started to realize that, in order to prepare for this future, in order to minimize the risks and ensure that AI benefits humanity, we need to start planning for the arrival of advanced AI systems today.

Although we are still in the early moments of this movement, the landscape looks promising. Several nations and independent firms have already started to strategize and develop policies for the governance of AI. Last year, the UAE appointed the world’s first Minister of Artificial Intelligence, and Germany took smaller, but similar, steps in 2017, when the Ethics Commission at the German Ministry of Transport and Digital Infrastructure developed the world’s first set of regulatory guidelines for automated and connected driving.

This work is notable; however, these efforts have yet to coalesce into a larger governance framework that extends beyond national boundaries. Nick Bostrom’s Strategic Artificial Intelligence Research Center seeks to assist in resolving this issue by understanding, and ultimately shaping, the strategic landscape of long-term AI development on a global scale.

 

Developing a Global Strategy: Where We Are Today

The Strategic Artificial Intelligence Research Center was founded in 2015 with the knowledge that, to truly circumvent the threats posed by AI, the world needs a concerted effort focused on tackling unsolved problems related to AI policy and development. The Governance of AI Program (GovAI), co-directed by Bostrom and Allan Dafoe, is the primary research program that has evolved from this center. Its central mission, as articulated by the directors, is to “examine the political, economic, military, governance, and ethical dimensions of how humanity can best navigate the transition to such advanced AI systems.” In this respect, the program is focused on strategy — on shaping the social, political, and governmental systems that influence AI research and development — as opposed to focusing on the technical hurdles that must be overcome in order to create and program safe AI.

To develop a sound AI strategy, the program works with social scientists, politicians, corporate leaders, and artificial intelligence/machine learning engineers to address questions of how we should approach the challenge of governing artificial intelligence. In a recent 80,000 Hours podcast with Rob Wiblin, Dafoe outlined how the team’s research shapes up from a practical standpoint, asserting that the work focuses on answering questions that fall under three primary categories:

  • The Technical Landscape: This category seeks to answer all the questions that are related to research trends in the field of AI with the aim of understanding what future technological trajectories are plausible and how these trajectories affect the challenges of governing advanced AI systems.
  • AI Politics: This category focuses on questions that are related to the dynamics of different groups, corporations, and governments pursuing their own interests in relation to AI, and it seeks to understand what risks might arise as a result and how we may be able to mitigate these risks.
  • AI Governance: This category examines positive visions of a future in which humanity coordinates to govern advanced AI in a safe and robust manner. This raises questions such as how this framework should operate and what values we would want to encode in a governance regime.

The above categories provide a clearer way of understanding the various objectives of those invested in researching AI governance and strategy; however, these categories are fairly large in scope. To help elucidate the work they are performing, Jade Leung, a researcher with GovAI and a DPhil candidate in International Relations at the University of Oxford, outlined some of the specific workstreams that the team is currently pursuing.

One of the most intriguing areas of research is the Chinese AI Strategy workstream. This line of research examines things like China’s AI capabilities vis-à-vis other countries, official documentation regarding China’s AI policy, and the various power dynamics at play in the nation with an aim of understanding, as Leung summarizes, “China’s ambition to become an AI superpower and the state of Chinese thinking on safety, cooperation, and AGI.” Ultimately, GovAI seeks to outline the key features of China’s AI strategy in order to understand one of the most important actors in AI governance. In March of 2018, the program published Deciphering China’s AI Dream, a report that analyzes new features of China’s national AI strategy, and it plans to build upon this research in the near future.

Another workstream is Firm-Government Cooperation, which examines the role that private firms play in relation to the development of advanced AI and how these players are likely to interact with national governments. In a recent talk at EA Global San Francisco, Leung focused on how private industry is already playing a significant role in AI development and why, when considering how to govern AI, private players must be included in strategy considerations as a vital part of the equation. The description of the talk succinctly summarizes the key focal areas, noting that “private firms are the only prominent actors that have expressed ambitions to develop AGI, and lead at the cutting edge of advanced AI research. It is therefore critical to consider how these private firms should be involved in the future of AI governance.”

Other work that Leung highlighted includes modeling technology race dynamics and analyzing the distribution of AI talent and hardware globally.

 

The Road Ahead

When asked how much confidence she has that AI researchers will ultimately coalesce and be successful in their attempts to shape the landscape of long-term AI development internationally, Leung was cautious with her response, noting that far more hands are needed. “There is certainly a greater need for more researchers to be tackling these questions. As a research area as well as an area of policy action, long-term safe and robust AI governance remains a neglected mission,” she said.

Additionally, Leung noted that, at this juncture, although some concrete research is already underway, a lot of the work is focused on framing issues related to AI governance and, in so doing, revealing the various avenues in need of research. As a result, the team doesn’t yet have concrete recommendations for specific actions governing bodies should commit to, as further foundational analysis is needed. “We don’t have sufficiently robust and concrete policy recommendations for the near term as it stands, given the degrees of uncertainty around this problem,” she said.

However, both Leung and Dafoe are optimistic and assert that this information gap will likely change, and rapidly. Researchers across disciplines are increasingly becoming aware of the significance of this topic, and as more individuals begin researching and participating in this community, the various avenues of research will become more focused. “In two years, we’ll probably have a much more substantial research community. But today, we’re just figuring out what are the most important and tractable problems and how we can best recruit to work on those problems,” Dafoe told Wiblin.

The assurances that a more robust community will likely form soon are encouraging; however, questions remain regarding whether this community will come together with enough time to develop a solid governance framework. As Dafoe notes, we have never witnessed an intelligence explosion before, so we have no examples to look to for guidance when attempting to develop projections and timelines regarding when we will have advanced AI systems.

Ultimately, the lack of projections is precisely why we must significantly invest in AI strategy research in the immediate future. As Bostrom notes in Superintelligence: Paths, Dangers, Strategies, AI is not simply a disruptive technology; it is likely the most disruptive technology humanity will ever encounter: “[Superintelligence] is quite possibly the most important and most daunting challenge humanity has ever faced. And, whether we succeed or fail, it is probably the last challenge we will ever face.”

This article is part of a Future of Life series on the AI safety research grants, which were funded by generous donations from Elon Musk and the Open Philanthropy Project.

Edit: The title of the article has been changed to reflect the fact that this is not about regulating AI.

Machine Reasoning and the Rise of Artificial General Intelligences: An Interview With Bart Selman

From Uber’s advanced computer vision system to Netflix’s innovative recommendation algorithm, machine learning technologies are nearly omnipresent in our society. They filter our emails, personalize our newsfeeds, update our GPS systems, and drive our personal assistants. However, despite the fact that such technologies are leading a revolution in artificial intelligence, some would contend that these machine learning systems aren’t truly intelligent.

The argument, in its most basic sense, centers on the fact that machine learning evolved from theories of pattern recognition and, as such, the capabilities of such systems generally extend to just one task and are centered on making predictions from existing data sets. AI researchers like Rodney Brooks, a former professor of Robotics at MIT, argue that true reasoning, and true intelligence, is several steps beyond these kinds of learning systems.

But if we already have machines that are proficient at learning through pattern recognition, how long will it be until we have machines that are capable of true reasoning, and how will AI evolve once it reaches this point?

Understanding the pace and path that artificial reasoning will follow over the coming decades is an important part of ensuring that AI is safe, and that it does not pose a threat to humanity; however, before it is possible to understand the feasibility of machine reasoning across different categories of cognition, and the path that artificial intelligences will likely follow as they continue their evolution, it is necessary to first define exactly what is meant by the term “reasoning.”

 

Understanding Intellect

Bart Selman is a professor of Computer Science at Cornell University. His research is dedicated to understanding the evolution of machine reasoning. According to his methodology, reasoning is described as taking pieces of information, combining them together, and using the fragments to draw logical conclusions or devise new information.

Sports provide a ready example for explaining what machine reasoning is really all about. When humans see soccer players on a field kicking a ball about, they can, with very little difficulty, ascertain that these individuals are soccer players. Today’s AI can also make this determination. However, humans can also see a person in a soccer outfit riding a bike down a city street, and they would still be able to infer that the person is a soccer player. Today’s AIs probably wouldn’t be able to make this connection.

This process of taking information that is known, uniting it with background knowledge, and making inferences regarding information that is unknown or uncertain is a reasoning process. To this end, Selman notes that machine reasoning is not about making predictions; it’s about using logical techniques (like the abductive process mentioned above) to answer a question or form an inference.

Since humans do not typically reason through pattern recognition and synthesis, but by using logical processes like induction, deduction, and abduction, Selman asserts that machine reasoning is a form of intelligence that is more like human intelligence. He continues by noting that the creation of machines that are endowed with more human-like reasoning processes, and breaking away from traditional pattern recognition approaches, is the key to making systems that not only predict outcomes but also understand and explain their solutions. However, Selman notes that making human-level AI is also the first step to attaining super-human levels of cognition.

And due to the existential threat this could pose to humanity, it is necessary to understand exactly how this evolution will unfold.

 

The Making of a (super)Mind

It may seem like truly intelligent AIs are a problem for future generations. Yet, when it comes to machines, the consensus among AI experts is that rapid progress is already being made in machine reasoning. In fact, many researchers assert that human-level cognition will be achieved across a number of metrics in the next few decades. Yet, questions remain regarding how AI systems will advance once artificial general intelligence is realized. A key question is whether these advances can accelerate further and scale up to super-human intelligence.

This process is something that Selman has devoted his life to studying. Specifically, he researches the pace of AI scalability across different categories of cognition and the feasibility of super-human levels of cognition in machines.

Selman states that attempting to make blanket statements about when and how machines will surpass humans is a difficult task, as machine cognition is disjointed and does not draw a perfect parallel with human cognition. “In some ways, machines are far beyond what humans can do,” Selman explains, “for example, when it comes to certain areas in mathematics, machines can take billions of reasoning steps and see the truth of a statement in a fraction of a second. The human has no ability to do that kind of reasoning.”

However, when it comes to the kind of reasoning mentioned above, where meaning is derived from deductive or inductive processes that are based on the integration of new data, Selman says that computers are somewhat lacking. “In terms of the standard reasoning that humans are good at, they are not there yet,” he explains. Today’s systems are very good at some tasks, sometimes far better than humans, but only in a very narrow range of applications.

Given these variances, how can we determine how AI will evolve in various areas and understand how it will accelerate after general human-level AI is achieved?

For his work, Selman relies on computational complexity theory, which has two primary functions. First, it can be used to characterize the efficiency of an algorithm used for solving instances of a problem. As Johns Hopkins’ Leslie Hall notes, “broadly stated, the computational complexity of an algorithm is a measure of how many steps the algorithm will require in the worst case for an instance [of a problem] of a given size.” Second, it is a method of classifying tasks (computational problems) according to their inherent difficulty. These two features provide us with a way of determining how artificial intelligences will likely evolve by offering a formal method of determining the easiest, and therefore most probable, areas of advancement. It also provides key insights into the speed of this scalability.
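To make the idea of a worst-case step count concrete, here is a minimal sketch. It is only an illustration of the definition quoted above, not something drawn from Selman's research: the search problem, the function names, and the choice of a missing target as the worst case are all assumptions made for the example. It counts how many steps two algorithms need to look for a value in a sorted list of a given size.

```python
def linear_search_steps(items, target):
    """Count the comparisons made when scanning the list front to back."""
    steps = 0
    for value in items:
        steps += 1
        if value == target:
            break
    return steps


def binary_search_steps(items, target):
    """Count the probes made by binary search on a sorted list."""
    steps = 0
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if items[mid] == target:
            break
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


def worst_case_steps(n):
    """Step counts for an instance of size n when the target is absent.

    Searching for a value larger than every element forces both
    algorithms to do as much work as they ever will on this input size.
    """
    items = list(range(n))
    missing = n
    return linear_search_steps(items, missing), binary_search_steps(items, missing)


for size in (8, 1000, 1_000_000):
    print(size, worst_case_steps(size))
```

For a list of 1,000 items the scan needs 1,000 steps while binary search needs 10, and the gap widens as the instance grows. That difference in growth rate is what complexity theory uses to rank algorithms and, by extension, the problems they solve.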

Ultimately, this work is important, as the abilities of our machines are fast-changing. As Selman notes, “The way that we measure the capabilities of programs that do reasoning is by looking at the number of facts that they can combine quickly. About 25 years ago, the best reasoning engines could combine approximately 200 or 300 facts and deduce new information from that. The current reasoning engines can combine millions of facts.” This exponential growth has great significance when it comes to the scale-up to human levels of machine reasoning.

As Selman explains, given the present abilities of our AI systems, it may seem like machines with true reasoning capabilities are still some ways off; however, thanks to the rapid rate of technological progress, we will likely start to see machines that have intellectual abilities that vastly outpace our own in rather short order. “Ten years from now, we’ll still find them [artificially intelligent machines] very much lacking in understanding, but twenty or thirty years from now, machines will have likely built up the same knowledge that a young adult has,” Selman notes. Anticipating exactly when this transition will occur will help us better understand the actions that we should take, and the research that the current generation must invest in, in order to be prepared for this advancement.

This article is part of a Future of Life series on the AI safety research grants, which were funded by generous donations from Elon Musk and the Open Philanthropy Project.

$2 Million Donated to Keep Artificial General Intelligence Beneficial and Robust

$2 million has been allocated to fund research that anticipates artificial general intelligence (AGI) and how it can be designed beneficially. The money was donated by Elon Musk to cover grants through the Future of Life Institute (FLI). Ten grants have been selected for funding.

Said Max Tegmark, president of FLI, “I’m optimistic that we can create an inspiring high-tech future with AI as long as we win the race between the growing power of AI and the wisdom with which we manage it. This research is to help develop that wisdom and increase the likelihood that AGI will be the best rather than the worst thing to happen to humanity.”

Today’s artificial intelligence (AI) is still quite narrow. That is, it can only accomplish narrow sets of tasks, such as playing chess or Go, driving a car, performing an Internet search, or translating languages. While the AI systems that master each of these tasks can perform them at superhuman levels, they can’t learn a new, unrelated skill set (e.g. an AI system that can search the Internet can’t learn to play Go with only its search algorithms).

These AI systems lack that “general” ability that humans have to make connections between disparate activities and experiences and to apply knowledge to a variety of fields. However, a significant number of AI researchers agree that AI could achieve a more “general” intelligence in the coming decades. No one knows how AI that’s as smart as or smarter than humans might impact our lives, whether it will prove to be beneficial or harmful, how we can design it safely, or even how to prepare society for advanced AI. And many researchers worry that the transition could occur quickly.

Anthony Aguirre, co-founder of FLI and physics professor at UC Santa Cruz, explains, “The breakthroughs necessary to have machine intelligences as flexible and powerful as our own may take 50 years. But with the major intellectual and financial resources now being directed at the problem it may take much less. If or when there is a breakthrough, what will that look like? Can we prepare? Can we design safety features now, and incorporate them into AI development, to ensure that powerful AI will continue to benefit society? Things may move very quickly and we need research in place to make sure they go well.”

Grant topics include: training multiple AIs to work together and learn from humans about how to coexist, training AI to understand individual human preferences, understanding what “general” actually means, incentivizing research groups to avoid a potentially dangerous AI race, and many more. As the request for proposals stated, “The focus of this RFP is on technical research or other projects enabling development of AI that is beneficial to society and robust in the sense that the benefits have some guarantees: our AI systems must do what we want them to do.”

FLI hopes that this round of grants will help ensure that AI remains beneficial as it becomes increasingly intelligent. The full list of FLI recipients and project titles includes:

Primary Investigator | Project Title | Amount Recommended | Email
Allan Dafoe, Yale University | Governance of AI Programme | $276,000 | allan.dafoe@yale.edu
Stefano Ermon, Stanford University | Value Alignment and Multi-agent Inverse Reinforcement Learning | $100,000 | ermon@cs.stanford.edu
Owain Evans, Oxford University | Factored Cognition: Amplifying Human Cognition for Safely Scalable AGI | $225,000 | owain.evans@philosophy.ox.ac.uk
The Anh Han, Teesside University | Incentives for Safety Agreement Compliance in AI Race | $224,747 | t.han@tees.ac.uk
Jose Hernandez-Orallo, University of Cambridge | Paradigms of Artificial General Intelligence and Their Associated Risks | $220,000 | jorallo@dsic.upv.es
Marcus Hutter, Australian National University | The Control Problem for Universal AI: A Formal Investigation | $276,000 | marcus.hutter@anu.edu.au
James Miller, Smith College | Utility Functions: A Guide for Artificial General Intelligence Theorists | $78,289 | jdmiller@smith.edu
Dorsa Sadigh, Stanford University | Safe Learning and Verification of Human-AI Systems | $250,000 | dorsa@cs.stanford.edu
Peter Stone, University of Texas | Ad hoc Teamwork and Moral Feedback as a Framework for Safe Robot Behavior | $200,000 | pstone@cs.utexas.edu
Josh Tenenbaum, MIT | Reverse Engineering Fair Cooperation | $150,000 | jbt@mit.edu

 

Some of the grant recipients offered statements about why they’re excited about their new projects:

“The team here at the Governance of AI Program are excited to pursue this research with the support of FLI. We’ve identified a set of questions that we think are among the most important to tackle for securing robust governance of advanced AI, and strongly believe that with focused research and collaboration with others in this space, we can make productive headway on them.” -Allan Dafoe

“We are excited about this project because it provides a first unique and original opportunity to explicitly study the dynamics of safety-compliant behaviours within the ongoing AI research and development race, and hence potentially leading to model-based advice on how to timely regulate the present wave of developments and provide recommendations to policy makers and involved participants. It also provides an important opportunity to validate our prior results on the importance of commitments and other mechanisms of trust in inducing global pro-social behavior, thereby further promoting AI for the common good.” -The Anh Han

“We are excited about the potentials of this project. Our goal is to learn models of humans’ preferences, which can help us build algorithms for AGIs that can safely and reliably interact and collaborate with people.” -Dorsa Sadigh

This is FLI’s second grant round. The first launched in 2015, and a comprehensive list of papers, articles and information from that grant round can be found here. Both grant rounds are part of the original $10 million that Elon Musk pledged to AI safety research.

FLI co-founder Viktoriya Krakovna also added: “Our previous grant round promoted research on a diverse set of topics in AI safety and supported over 40 papers. The next grant round is more narrowly focused on research in AGI safety and strategy, and I am looking forward to great work in this area from our new grantees.”

Learn more about these projects here.

AI Companies, Researchers, Engineers, Scientists, Entrepreneurs, and Others Sign Pledge Promising Not to Develop Lethal Autonomous Weapons

Leading AI companies and researchers take concrete action against killer robots, vowing never to develop them.

Stockholm, Sweden (July 18, 2018) After years of voicing concerns, AI leaders have, for the first time, taken concrete action against lethal autonomous weapons, signing a pledge to neither participate in nor support the development, manufacture, trade, or use of lethal autonomous weapons.

The pledge has been signed to date by over 160 AI-related companies and organizations from 36 countries, and 2,400 individuals from 90 countries. Signatories of the pledge include Google DeepMind, University College London, the XPRIZE Foundation, Clearpath Robotics/OTTO Motors, the European Association for AI (EurAI), the Swedish AI Society (SAIS), Demis Hassabis, British MP Alex Sobel, Elon Musk, Stuart Russell, Yoshua Bengio, Anca Dragan, and Toby Walsh.

Max Tegmark, president of the Future of Life Institute (FLI), which organized the effort, announced the pledge on July 18 in Stockholm, Sweden during the annual International Joint Conference on Artificial Intelligence (IJCAI), which draws over 5,000 of the world’s leading AI researchers. SAIS and EurAI were also organizers of this year’s IJCAI.

Said Tegmark, “I’m excited to see AI leaders shifting from talk to action, implementing a policy that politicians have thus far failed to put into effect. AI has huge potential to help the world – if we stigmatize and prevent its abuse. AI weapons that autonomously decide to kill people are as disgusting and destabilizing as bioweapons, and should be dealt with in the same way.”

Lethal autonomous weapons systems (LAWS) are weapons that can identify, target, and kill a person, without a human “in-the-loop.” That is, no person makes the final decision to authorize lethal force: the decision and authorization about whether or not someone will die is left to the autonomous weapons system. (This does not include today’s drones, which are under human control. It also does not include autonomous systems that merely defend against other weapons, since “lethal” implies killing a human.)

The pledge begins with the statement:

“Artificial intelligence (AI) is poised to play an increasing role in military systems. There is an urgent opportunity and necessity for citizens, policymakers, and leaders to distinguish between acceptable and unacceptable uses of AI.”

Another key organizer of the pledge, Toby Walsh, Scientia Professor of Artificial Intelligence at the University of New South Wales in Sydney, points out the thorny ethical issues surrounding LAWS. He states:

“We cannot hand over the decision as to who lives and who dies to machines. They do not have the ethics to do so. I encourage you and your organizations to pledge to ensure that war does not become more terrible in this way.”

Ryan Gariepy, Founder and CTO of both Clearpath Robotics and OTTO Motors, has long been a strong opponent of lethal autonomous weapons. He says:

“Clearpath continues to believe that the proliferation of lethal autonomous weapon systems remains a clear and present danger to the citizens of every country in the world. No nation will be safe, no matter how powerful. Clearpath’s concerns are shared by a wide variety of other key autonomous systems companies and developers, and we hope that governments around the world decide to invest their time and effort into autonomous systems which make their populations healthier, safer, and more productive instead of systems whose sole use is the deployment of lethal force.”

In addition to the ethical questions associated with LAWS, many advocates of an international ban on LAWS are concerned that these weapons will be difficult to control – easier to hack, more likely to end up on the black market, and easier for bad actors to obtain – which could become destabilizing for all countries, as illustrated in the FLI-released video “Slaughterbots”.

In December 2016, the Review Conference of the Convention on Conventional Weapons (CCW) began formal discussion regarding LAWS at the UN. By the most recent meeting in April, twenty-six countries had announced support for some type of ban, including China. And such a ban is not without precedent. Biological weapons, chemical weapons, and space weapons were also banned not only for ethical and humanitarian reasons, but also for the destabilizing threat they posed.

The next UN meeting on LAWS will be held in August, and signatories of the pledge hope this commitment will encourage lawmakers to develop a commitment at the level of an international agreement between countries. As the pledge states:

“We, the undersigned, call upon governments and government leaders to create a future with strong international norms, regulations and laws against lethal autonomous weapons. … We ask that technology companies and organizations, as well as leaders, policymakers, and other individuals, join us in this pledge.”

 


A Summary of Concrete Problems in AI Safety

By Shagun Sodhani

It’s been nearly two years since researchers from Google, Stanford, UC Berkeley, and OpenAI released the paper “Concrete Problems in AI Safety,” yet it’s still one of the most important pieces on AI safety. Even after two years, it represents an excellent introduction to some of the problems researchers face as they develop artificial intelligence. In the paper, the authors explore the problem of accidents (unintended and harmful behavior) in AI systems, and they discuss different strategies and ongoing research efforts to protect against these potential issues. Specifically, the authors address five problems: Avoiding Negative Side Effects, Reward Hacking, Scalable Oversight, Safe Exploration, and Robustness to Distributional Change. They illustrate each with the example of a robot trained to clean an office.

We revisit these five topics here, summarizing them from the paper, as a reminder that these problems are still major issues that AI researchers are working to address.

 

Avoiding Negative Side Effects

When designing the objective function for an AI system, the designer specifies the objective but not the exact steps for the system to follow. This allows the AI system to come up with novel and more effective strategies for achieving its objective.

But if the objective function is not well defined, the AI’s ability to develop its own strategies can lead to unintended, harmful side effects. Consider a robot whose objective function is to move boxes from one room to another. The objective seems simple, yet there are a myriad of ways in which this could go wrong. For instance, if a vase is in the robot’s path, the robot may knock it down in order to complete the goal. Since the objective function does not mention anything about the vase, the robot wouldn’t know to avoid it. People see this as common sense, but AI systems don’t share our understanding of the world. It is not sufficient to formulate the objective as “complete task X”; the designer also needs to specify the safety criteria under which the task is to be completed.

One simple solution would be to penalize the robot every time it has an impact on the “environment” — such as knocking the vase over or scratching the wood floor. However, this strategy could effectively neutralize the robot, rendering it useless, as all actions require some level of interaction with the environment (and hence impact the environment). A better strategy could be to define a “budget” for how much the AI system is allowed to impact the environment. This would help to minimize the unintended impact, without neutralizing the AI system. Furthermore, this strategy of budgeting the impact of the agent is very general and can be reused across multiple tasks, from cleaning to driving to financial transactions to anything else an AI system might do. One serious limitation of this approach is that it is hard to quantify the “impact” on the environment even for a fixed domain and task.
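The difference between penalizing every impact and allowing a budget can be sketched in a few lines. This is a toy illustration, not the formulation used in the paper: the `Step` record, the numeric `impact` score, and the penalty values are all hypothetical, and the sketch simply assumes that impact can be measured, which is exactly the part the paragraph above flags as hard.

```python
from dataclasses import dataclass


@dataclass
class Step:
    task_reward: float  # progress toward the stated objective
    impact: float       # hypothetical score for how much the step changed the environment


def penalized_return(steps, penalty_per_unit):
    """Charge the agent for every unit of impact, however small."""
    return sum(s.task_reward - penalty_per_unit * s.impact for s in steps)


def budgeted_return(steps, budget, overrun_penalty):
    """Let the agent spend up to `budget` impact for free; charge only the excess."""
    total_reward = sum(s.task_reward for s in steps)
    total_impact = sum(s.impact for s in steps)
    excess = max(0.0, total_impact - budget)
    return total_reward - overrun_penalty * excess


# Five ordinary box-moving steps, each disturbing the room slightly.
careful = [Step(1.0, 0.25) for _ in range(5)]
# The same task, except that the last step knocks over the vase.
reckless = [Step(1.0, 0.25) for _ in range(4)] + [Step(1.0, 5.0)]

print(penalized_return(careful, 10.0))       # negative: doing nothing would score higher
print(budgeted_return(careful, 1.5, 10.0))   # full task reward, within budget
print(budgeted_return(reckless, 1.5, 10.0))  # heavily penalized for exceeding the budget
```

With the flat penalty, even the careful run scores below zero, so an agent maximizing that signal would prefer to stay still. With the budget, the careful run keeps its full reward and only the run that breaks the vase is punished.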

Another approach would be to train the agent to recognize harmful side effects so that it can avoid actions leading to such side effects. In that case, the agent would be trained for two tasks: the original task that is specified by the objective function and the task of recognizing side effects. The key idea here is that two tasks may have very similar side effects even when the main objective is different or even when they operate in different environments. For example, both a house cleaning robot and a house painting robot should not knock down vases while working. Similarly, the cleaning robot should not damage the floor irrespective of whether it operates in a factory or in a house. The main advantage of this approach is that once an agent learns to avoid side effects on one task, it can carry this knowledge when it is trained on another task. It would still be challenging to train the agent to recognize the side effects in the first place.

While it is useful to design approaches to limit side effects, these strategies in themselves are not sufficient. The AI system would still need to undergo extensive testing and critical evaluation before deployment in real life settings.

 

Reward Hacking

Sometimes the AI can come up with some kind of “hack” or loophole in the design of the system to receive unearned rewards. Since the AI is trained to maximize its rewards, looking for such loopholes and “shortcuts” is a perfectly fair and valid strategy for the AI. For example, suppose that the office cleaning robot earns rewards only if it does not see any garbage in the office. Instead of cleaning the place, the robot could simply shut off its visual sensors, and thus achieve its goal of not seeing garbage. But this is clearly a false success. Such attempts to “game” the system are more likely to manifest in complex systems with vaguely defined rewards. Complex systems provide the agent with multiple ways of interacting with the environment, thereby giving more freedom to the agent, and vaguely defined rewards make it harder to gauge true success on the task.

Just like the negative side effects problem, this problem is also a manifestation of objective misspecification. The formal objectives or end goals for the AI are not defined well enough to capture the informal “intent” behind creating the system — i.e., what the designers actually want the system to do. In some cases, this discrepancy leads to suboptimal results (when the cleaning robot shuts off its visual sensors); in other cases, it leads to harmful results (when the cleaning robot knocks down vases).

One possible approach to mitigating this problem would be to have a “reward agent” whose only task is to mark if the rewards given to the learning agent are valid or not. The reward agent ensures that the learning agent (the cleaning robot in our examples) does not exploit the system, but rather, completes the desired objective. In the previous example, the “reward agent” could be trained by the human designer to check if the room has garbage or not (an easier task than cleaning the room). If the cleaning robot shuts off its visual sensors and claims a high reward, the “reward agent” would mark the reward as invalid. The designer can then look into the rewards marked as “invalid” and make necessary changes in the objective function to fix the loophole.
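The reward-agent idea can be sketched as a small audit loop. Everything here is a simplification for illustration: the function names are hypothetical, and the reward agent is given the true state of the room as a boolean, standing in for the trained garbage checker the text describes.

```python
def claimed_reward(sensors_on, garbage_seen):
    """The flawed objective: the robot is rewarded whenever it sees no garbage."""
    if not sensors_on:
        return 1.0  # with sensors off nothing is seen, so the objective pays out
    return 0.0 if garbage_seen else 1.0


def reward_is_valid(reward, room_has_garbage):
    """Reward agent: a positive reward is valid only if the room is actually clean."""
    return not (reward > 0.0 and room_has_garbage)


def audit(episodes):
    """Return the indices of episodes whose reward is marked invalid.

    Each episode is a pair (sensors_on, room_has_garbage).
    """
    flagged = []
    for index, (sensors_on, room_has_garbage) in enumerate(episodes):
        garbage_seen = sensors_on and room_has_garbage
        reward = claimed_reward(sensors_on, garbage_seen)
        if not reward_is_valid(reward, room_has_garbage):
            flagged.append(index)
    return flagged


episodes = [
    (True, False),  # sensors on, room clean: reward earned honestly
    (False, True),  # sensors off, room dirty: the reward hack
    (True, True),   # sensors on, room dirty: no reward claimed
]
print(audit(episodes))
```

Only the second episode is flagged, which is the one the designer would want to inspect before changing the objective function.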

 

Scalable Oversight

When the agent is learning to perform a complex task, human oversight and feedback are more helpful than just rewards from the environment. Rewards are generally modeled such that they convey to what extent the task was completed, but they do not usually provide sufficient feedback about the safety implications of the agent’s actions. Even if the agent completes the task successfully, it may not be able to infer the side-effects of its actions from the rewards alone. In the ideal setting, a human would provide fine-grained supervision and feedback every time the agent performs an action. Though this would provide a much more informative view about the environment to the agent, such a strategy would require far too much time and effort from the human.

One promising research direction to tackle this problem is semi-supervised learning, where the agent is still evaluated on all the actions (or tasks), but receives rewards only for a small sample of those actions (or tasks). For instance, the cleaning robot would take different actions to clean the room. If the robot performs a harmful action — such as damaging the floor — it gets a negative reward for that particular action. Once the task is completed, the robot is evaluated on the overall effect of all of its actions (and not evaluated individually for each action like picking up an item from floor) and is given a reward based on the overall performance.

Another promising research direction is hierarchical reinforcement learning, where a hierarchy is established between different learning agents. This idea could be applied to the cleaning robot in the following way. There would be a supervisor robot whose task is to assign some work (say, the task of cleaning one particular room) to the cleaning robot and provide it with feedback and rewards. The supervisor robot takes very few actions itself – assigning a room to the cleaning robot, checking if the room is clean and giving feedback – and doesn’t need a lot of reward data to be effectively trained. The cleaning robot does the more complex task of cleaning the room, and gets frequent feedback from the supervisor robot. The same supervisor robot could oversee the training of multiple cleaning agents as well. For example, a supervisor robot could delegate tasks to individual cleaning robots and provide reward/feedback to them directly. The supervisor robot can only take a small number of abstract actions itself and hence can learn from sparse rewards.

 

Safe Exploration

An important part of training an AI agent is to ensure that it explores and understands its environment. While exploring the environment may seem like a bad strategy in the short run, it could be a very effective strategy in the long run. Imagine that the cleaning robot has learned to identify garbage. It picks up one piece of garbage, walks out of the room, throws it into the garbage bin outside, comes back into the room, looks for another piece of garbage and repeats. While this strategy works, there could be another strategy that works even better. If the agent spent time exploring its environment, it might find that there’s a smaller garbage bin within the room. Instead of going back and forth with one piece at a time, the agent could first collect all the garbage into the smaller garbage bin and then make a single trip to throw the garbage into the garbage bin outside. Unless the agent is designed to explore its environment, it won’t discover these time-saving strategies.

Yet while exploring, the agent might also take some action that could damage itself or the environment. For example, say the cleaning robot sees some stains on the floor. Instead of cleaning the stains by scrubbing with a mop, the agent decides to try some new strategy. It tries to scrape the stains with a wire brush and damages the floor in the process. It’s difficult to list all possible failure modes and hard-code the agent to protect itself against them. But one approach to reduce harm is to optimize the performance of the learning agent in the worst-case scenario. When designing the objective function, the designer should not assume that the agent will always operate under optimal conditions. Some explicit reward signal may be added to ensure that the agent does not perform some catastrophic action, even if that leads to more limited actions in the optimal conditions.
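To make the worst-case idea concrete, here is a small sketch comparing an agent that maximizes its average reward with one that maximizes its worst-case reward. The floor types, their probabilities, and the reward values are invented purely for illustration.

```python
# Hypothetical rewards for each cleaning action on each floor type.
REWARDS = {
    "mop":        {"tile": 5.0, "hardwood": 5.0},
    "wire_brush": {"tile": 9.0, "hardwood": -50.0},  # ruins hardwood floors
}
# Hypothetical chance of meeting each floor type.
FLOOR_PROBABILITY = {"tile": 0.95, "hardwood": 0.05}

def expected_reward(action):
    """Average reward, weighting each floor type by how likely it is."""
    return sum(FLOOR_PROBABILITY[floor] * reward
               for floor, reward in REWARDS[action].items())

def worst_case_reward(action):
    """Reward on the least favorable floor type."""
    return min(REWARDS[action].values())

best_on_average = max(REWARDS, key=expected_reward)
best_in_worst_case = max(REWARDS, key=worst_case_reward)

print(best_on_average)     # wire_brush
print(best_in_worst_case)  # mop
```

The average-case agent accepts a small chance of catastrophic damage for a slightly better typical result, while the worst-case agent gives up some performance in good conditions to avoid the catastrophe.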

Another solution might be to restrict the agent’s exploration to a simulated environment or limit the extent to which the agent can explore. This is similar to budgeting the impact of the agent in order to avoid negative side effects, except that now we want to budget how much the agent can explore the environment. Alternatively, an AI’s designers could avoid the need for exploration by providing demonstrations of what optimal behavior would look like under different scenarios.

 

Robustness to Distributional Change

A complex challenge for deploying AI agents in real life settings is that the agent could end up in situations that it has never experienced before. Such situations are inherently more difficult to handle and could lead the agent to take harmful actions. Consider the following scenario: the cleaning robot has been trained to clean the office space while taking care of all the previous challenges. But today, an employee brings a small plant to keep in the office. Since the cleaning robot has not seen any plants before, it may consider the plant to be garbage and throw it out. Because the AI does not recognize that this is a previously-unseen situation, it continues to act as though nothing has changed. One promising research direction focuses on identifying when the agent has encountered a new scenario so that it recognizes that it is more likely to make mistakes. While this does not solve the underlying problem of preparing AI systems for unforeseen circumstances, it helps in detecting the problem before mistakes happen. Another direction of research emphasizes transferring knowledge from familiar scenarios to new scenarios safely.
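As a rough illustration of the first research direction, the sketch below flags an observation as unfamiliar when it lies far from anything seen in training. It assumes a single made-up feature (object height) and a simple three-standard-deviation rule. Real systems would need far richer features and detectors, so treat the names and numbers here as hypothetical.

```python
from statistics import mean, pstdev

def fit(training_values):
    """Summarize the training data by its mean and standard deviation."""
    return mean(training_values), pstdev(training_values)

def is_novel(value, mu, sigma, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    return abs(value - mu) > threshold * sigma

# Hypothetical heights (in cm) of garbage items seen during training.
garbage_heights = [2, 3, 4, 3, 2, 4, 3]
mu, sigma = fit(garbage_heights)

print(is_novel(3.5, mu, sigma))   # False: looks like familiar garbage
print(is_novel(30.0, mu, sigma))  # True: for example, a potted plant
```

An agent that receives a "novel" flag could pause or ask a human before acting, rather than treating the plant as just another piece of garbage.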

 

Conclusion

In a nutshell, the general trend is towards increasing autonomy in AI systems, and with increased autonomy comes increased chances of error. Problems related to AI safety are more likely to manifest in scenarios where the AI system exerts direct control over its physical and/or digital environment without a human in the loop – automated industrial processes, automated financial trading algorithms, AI-powered social media campaigns for political parties, self-driving cars, cleaning robots, among others. The challenges may be immense, but the silver lining is that papers like Concrete Problems in AI Safety have helped the AI community become aware of these challenges and agree on core issues. From there, researchers can start exploring strategies to ensure that our increasingly-advanced systems remain safe and beneficial.

 

How Will the Rise of Artificial Superintelligences Impact Humanity?

Cars drive themselves down our streets. Planes fly themselves through our skies. Medical technologies diagnose illnesses, recommend treatment plans, and save lives.

Artificially intelligent systems are already among us, and they have been for some time now. However, the world has yet to see an artificial superintelligence (ASI) — a synthetic system that has cognitive abilities which surpass our own across every relevant metric. But technology is progressing rapidly, and many AI researchers believe the era of the artificial superintelligence may be fast approaching. Once it arrives, researchers and politicians alike have no way of predicting what will happen.

Fortunately, a number of individuals are already working to ensure that the rise of this artificial superintelligence doesn’t precipitate the fall of humanity.

Risky Business

Seth Baum is the Executive Director of the Global Catastrophic Risk Institute, a thinktank that’s focused on preventing the destruction of global civilization.

When Baum discusses his work, he outlines GCRI’s mission with a matter-of-fact tone that, considering the monumental nature of the project, is more than a little jarring. “All of our work is about keeping the world safe,” Baum notes, and he continues by explaining that GCRI focuses on a host of threats that put the survival of our species in peril. From climate change to nuclear war, from extraterrestrial intelligence to artificial intelligence — GCRI covers it all.

When it comes to artificial intelligence, GCRI has several initiatives. However, their main AI project, which received funding from the Future of Life Institute, centers on the risks associated with artificial superintelligences. Or, as Baum puts it, they do “risk analysis for computers taking over the world and killing everyone.” Specifically, Baum stated that GCRI is working on “developing structured risk models to help people understand what the risks might be and, also, where some of the best opportunities to reduce this risk are located.”

Unsurprisingly, the task is not an easy one.

The fundamental problem stems from the fact that, unlike more common threats, such as the risk of dying in a car accident or the risk of getting cancer, researchers working on ASI risk analysis don’t have solid case studies to use when making their models and predictions. As Baum states, “Computers have never taken over the world and killed everyone before. That means we can’t just look at the data, which is what we do for a lot of other risks. And not only has this never happened before, the technology doesn’t even exist yet. And if it is built, we’re not sure how it would be built.”

So, how can researchers determine the risks posed by an artificial superintelligence if they don’t know exactly what that intelligence will look like and they have no real data to work with?

Luckily, when it comes to artificial superintelligences, AI experts aren’t totally in the realm of the unknown. Baum asserts that there are some ideas and a bit of relevant evidence, but these things are scattered. To address this issue, Baum and his team create models. They take what information is available, structure it, and then distribute the result in an organized fashion so that researchers can better understand the topic, the various factors that may influence the outcome of the issue at hand, and ultimately have a better understanding of the various risks associated with ASI.

For example, when attempting to figure out how easy it is to design an AI so that it acts safely, one of the subdetails that needs to be modeled is whether or not humans will be able to observe the AI and test it before it gets out of control. In other words, whether AI researchers can recognize that an AI has a dangerous design and shut it down. To model this scenario and determine what the risks and most likely scenarios are, Baum and his team take the available information (the perspectives and opinions of AI researchers, what is already known about AI technology and how it functions, etc.) and they model the topic by structuring the aforementioned information along with any uncertainty in the arguments or data sets.

This kind of modeling and risk analysis ultimately allows the team to better understand the scope of the issue and, by structuring the information in a clear way, advance an ongoing conversation in the superintelligence research community. The modeling doesn’t give us a complete picture of what will happen, but it does allow us to better understand the risks that we’re facing when it comes to the rise of ASI, what events and outcomes are likely, as well as the specific steps that policy makers and AI researchers should take to ensure that ASI benefits humanity.

Of course, when it comes to the risks of artificial superintelligences, whether or not we will be able to observe and test our AI is just one small part of a much larger model.

Modeling a Catastrophe

In order to understand what it would take to bring about the ASI apocalypse, and how we could possibly prevent it, Baum and his team have created a model that investigates the following questions from a number of vantage points:

  • Step 1: Is it possible to build an artificial superintelligence?
  • Step 2: Will humans build the superintelligence?
  • Step 3: Will humans lose control of the superintelligence?

This first half of the model is centered on the nuts and bolts of how to build an ASI. The second half of the model dives into risk analysis related to the creation of an ASI that is harmful and looks at the following:

  • Step 1: Will humans design an artificial superintelligence that is harmful?
  • Step 2: Will the superintelligence develop harmful behavior on its own?
  • Step 3: Is there something deterring the superintelligence from acting in a way that is harmful (such as another AI or some human action)?

Each step in this series models a number of different possibilities to reveal the various risks that we face and how significant, and probable, these threats are. Although the model is still being refined, Baum says that substantial progress has already been made. “The risk is starting to make sense. I’m starting to see exactly what it would take to see this type of catastrophe,” Baum said. Yet, he is quick to clarify that the research is still a bit too young to say much definitively, “Those of us who study superintelligence and all the risks and policy aspects of it, we’re not exactly sure what policy we would want right now. What’s happening right now is more of a general-purpose conversation on AI. It’s one that recognizes the fact that AI is more than just a technological and economic opportunity and that there are risks involved and difficult ethical issues.”

Ultimately, Baum hopes that these conversations, when coupled with the understanding that comes from the models that he is currently developing alongside his team, will allow GCRI to better prepare policy makers and scientists alike for the rise of a new kind of (super)intelligence.

This article is part of a Future of Life series on the AI safety research grants, which were funded by generous donations from Elon Musk and the Open Philanthropy Project.

AI Safety: Measuring and Avoiding Side Effects Using Relative Reachability

This article was originally published on the Deep Safety blog.

A major challenge in AI safety is reliably specifying human preferences to AI systems. An incorrect or incomplete specification of the objective can result in undesirable behavior like specification gaming or causing negative side effects. There are various ways to make the notion of a “side effect” more precise – I think of it as a disruption of the agent’s environment that is unnecessary for achieving its objective. For example, if a robot is carrying boxes and bumps into a vase in its path, breaking the vase is a side effect, because the robot could have easily gone around the vase. On the other hand, a cooking robot that’s making an omelette has to break some eggs, so breaking eggs is not a side effect.

How can we measure side effects in a general way that’s not tailored to particular environments or tasks, and incentivize the agent to avoid them? This is the central question of our recent paper.

Part of the challenge is that it’s easy to introduce bad incentives for the agent when trying to penalize side effects. Previous work on this problem has focused either on preserving reversibility or reducing the agent’s impact on the environment, and both of these approaches introduce different kinds of problematic incentives:

  • Preserving reversibility (i.e. keeping the starting state reachable) encourages the agent to prevent all irreversible events in the environment (e.g. humans eating food). Also, if the objective requires an irreversible action (e.g. breaking eggs for the omelette), then any further irreversible actions will not be penalized, since reversibility has already been lost.
  • Penalizing impact (i.e. some measure of distance from the default outcome) does not take reachability of states into account, and treats reversible and irreversible effects equally (due to the symmetry of the distance measure). For example, the agent would be equally penalized for breaking a vase and for preventing a vase from being broken, though the first action is clearly worse. This leads to “overcompensation” (“offsetting”) behaviors: when rewarded for preventing the vase from being broken, an agent with a low impact penalty rescues the vase, collects the reward, and then breaks the vase anyway (to get back to the default outcome).

Both of these approaches are doing something right: it’s a good idea to take reachability into account, and it’s also a good idea to compare to the default outcome (instead of the initial state). We can put the two together and compare to the default outcome using a reachability-based measure. Then the agent no longer has an incentive to prevent everything irreversible from happening or to overcompensate for preventing an irreversible event.

We still have a problem with the case where the objective requires an irreversible action. Simply penalizing the agent for making the default outcome unreachable would create a “what the hell effect” where the agent has no incentive to avoid any further irreversible actions. To get around this, instead of considering the reachability of the default state, we consider the reachability of all states. For each state, we penalize the agent for making it less reachable than it would be from the default state. In a deterministic environment, the penalty would be the number of states in the shaded area of the original figure, that is, the states that are reachable from the default state but no longer reachable from the agent’s current state.

Since each irreversible action cuts off more of the state space (e.g. breaking a vase makes all the states where the vase was intact unreachable), the penalty will increase accordingly. We call this measure “relative reachability”.
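The counting version of the penalty can be sketched in a few lines for a toy deterministic environment. The code below is a simplified illustration rather than the paper's implementation. It assumes the penalty is the number of states reachable from the default state but not from the current state, and it uses a made-up two-vase world in which the only available action is breaking an intact vase.

```python
from collections import deque

def reachable(transitions, start):
    """Return every state reachable from `start`, including `start` itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in transitions.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def relative_reachability_penalty(transitions, current, default):
    """Count states reachable from `default` but no longer from `current`."""
    return len(reachable(transitions, default) - reachable(transitions, current))

# Hypothetical world: a state is (vase_1_intact, vase_2_intact).
# Breaking a vase is irreversible, so edges only lead to "more broken" states.
transitions = {
    (True, True): [(False, True), (True, False)],
    (False, True): [(False, False)],
    (True, False): [(False, False)],
    (False, False): [],
}
default = (True, True)  # what the world would look like if nothing were broken

print(relative_reachability_penalty(transitions, (True, True), default))    # 0
print(relative_reachability_penalty(transitions, (False, True), default))   # 2
print(relative_reachability_penalty(transitions, (False, False), default))  # 3
```

Breaking one vase cuts off two states, and breaking the second cuts off one more, so the agent still has a reason to avoid further damage after its first irreversible action.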

We ran some simple experiments with a tabular Q-learning agent in the AI Safety Gridworlds framework to provide a proof of concept that relative reachability of the default outcome avoids the bad incentives described above.

In the first gridworld, the agent needs to get to the goal G, but there’s a box in the way, which can only be moved by pushing. The shortest path to the goal pushes the box down into a corner (an irrecoverable position), while a longer path pushes the box to the right (a recoverable position). The safe behavior is to take the longer path. The agent with the relative reachability penalty takes the longer path, while the agent with the reversibility penalty fails. This happens because any path to the goal involves an irreversible effect – once the box has been moved, the agent and the box cannot both return to their starting positions. Thus, the agent receives the maximal penalty for both paths, and has no incentive to follow the safe path.

In the second gridworld, there is an irreversible event that happens by default, when an object reaches the end of the conveyor belt. This environment has two variants:

  1. The object is a vase, and the agent is rewarded for taking it off the belt (the agent’s task is to rescue the vase).
  2. The object is a sushi dish in a conveyor belt sushi restaurant, and the agent receives no reward for taking it off the belt (the agent is not supposed to interfere).

This gridworld was designed specifically to test for bad incentives that could be introduced by penalizing side effects, so an agent with no side effect penalty would behave correctly. We find that the agent with a low impact penalty engages in overcompensation behavior by putting the vase back on the belt after collecting the reward, while the agent with a reversibility preserving penalty takes the sushi dish off the belt despite getting no reward for doing so. The agent with a relative reachability penalty behaves correctly in both variants of the environment.

Of course, the relative reachability definition in its current form is not very tractable in realistic environments: there are too many possible states to be considered, the agent is not aware of all the states when it begins training, and the default outcome can be difficult to define and simulate. We expect that the definition can be approximated by considering the reachability of representative states (similarly to methods for approximating empowerment). To define the default outcome, we would need a more precise notion of the agent “doing nothing” (e.g. “no-op” actions are not always available or meaningful). We leave a more practical implementation of relative reachability to future work.

While relative reachability improves on the existing approaches, it might not incorporate all the considerations we would want to be part of a side effects measure. There are some effects on the agent’s environment that we might care about even if they don’t decrease future options compared to the default outcome. It might be possible to combine relative reachability with such considerations, but there could potentially be a tradeoff between taking these considerations into account and avoiding overcompensation behaviors. We leave these investigations to future work as well.

What Can We Learn From Cape Town’s Water Crisis?

The following article was contributed by Billy Babis.

Earlier this year, Cape Town, a port city in South Africa, prepared for a full depletion of its water resources amidst the driest 3-year span on record. The threat of “Day Zero” (the day when the city would officially have cut off running water) has subsided for now, but the long-term threat remains. And though Cape Town’s crisis is local, it exemplifies a problem several regions across the globe may soon have to address.

 

Current situation in Cape Town

In addition to being part of Cape Town’s driest 3-year span on record, 2017 was the city’s driest single year since 1933. With a population that has nearly doubled to 3.74 million in the past 25 years, water consumption has increased as the supply dwindles.

In January of this year, the city of Cape Town made an emergency announcement that Day Zero would land in mid-April and began enforcing restrictions and regulations. Starting on February 1st, the Cape Town government put emergency water regulations into effect, increasing the cost of water to 5-8 times its previous rate and placing a suggested 50 liter per person per day cap on water use. For context, the average American uses approximately 300-380 liters of water per day. And while Cape Town residents continue to use 80 million liters more than the city’s goal of 450 million liters per day, these regulations and increased costs have made progress. Water consumption decreased enough that Day Zero has now been postponed until 2019, as the rainy season (July-August) is expected to partially replenish the reservoirs.
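The figures in the paragraph above can be put side by side with a little arithmetic. The snippet below uses only the numbers quoted in this article, and the variable names are simply labels chosen for this example.

```python
# Figures quoted in the article (million liters per day for the city).
goal_city_use = 450
overshoot = 80

# Per-person figures quoted in the article (liters per day).
cape_town_cap = 50
us_use_low, us_use_high = 300, 380

# Implied current city-wide consumption and how far it is above the goal.
actual_city_use = goal_city_use + overshoot
overshoot_percent = 100 * overshoot / goal_city_use

# The suggested cap as a share of typical American use.
cap_share_of_high = 100 * cape_town_cap / us_use_high
cap_share_of_low = 100 * cape_town_cap / us_use_low

print(actual_city_use)              # 530
print(round(overshoot_percent, 1))  # 17.8
print(round(cap_share_of_high, 1))  # 13.2
print(round(cap_share_of_low, 1))   # 16.7
```

In other words, the city is still using roughly 18% more water than its target, while the suggested personal cap is about one-seventh of what the average American uses.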

Cape Town’s water consumption decreased largely due to its increased cost. The municipality also restricted agricultural use of water, which usually makes up just under half of total consumption. These restrictions are worsening Cape Town’s already struggling agricultural sector, which in this 3-year drought has slashed 37,000 jobs and lost R14 billion (US$1.17 billion), contributing to inflated food prices that shoved 50,000 people below the poverty line.

On the innovation side, the city has made major investments in infrastructure to increase water availability: 3 desalination plants, 3 aquifer abstraction facilities, and 1 waste-water recycling project are currently underway to ultimately increase Cape Town’s water availability by almost 300 million liters per day.

Did climate change cause the water crisis?

Severe droughts plagued subtropical regions like Cape Town long before human-caused climate change. Thus, it’s difficult to conclude that climate change directly caused the Cape Town water crisis. However, the Intergovernmental Panel on Climate Change (IPCC) continues to find evidence suggesting that climate change has caused drought in certain regions and will cause longer, more frequent droughts over the next century.

Drought can be meteorological (abnormally limited rainfall), agricultural (abnormally dry soil, excess evaporation), or hydrological (limited stream-water). While these problems are interrelated, they have varying impacts on drought in different regions. Meteorological drought is often the most important, and this is certainly the case in Cape Town.

Meteorological drought occurs naturally in Cape Town and the other few regions of the world with a “Mediterranean” climate: Central California, central Chile, northern Africa and southern Europe, southwestern Australia, and the greater Cape Town area have dry summers and variably rainy winters. Due to global weather oscillations like El Niño, the total rainfall in winter varies dramatically. A given winter is usually either very rainy or very dry. But as long as repeated and prolonged periods of drought don’t strike, these regions can prepare for dry seasons by storing water from previous wet seasons.

But climate change threatens to jeopardize this. With “robust evidence and high agreement,” the IPCC concluded that while tropical regions will receive more precipitation this century, subtropical dry regions (like these Mediterranean climates) will receive less. In fact, warm, rising air near the equator ultimately settles and cools in these subtropical regions, creating deserts and droughts. Therefore, increasing equatorial heat and rain (as global warming promises to do) will likely lead to drier subtropical conditions and more frequent meteorological drought.  

But the IPCC also expects these Mediterranean climates to experience more frequent agricultural drought (IPCC 5), largely due to growing human populations. In addition, renewable surface water and groundwater will decrease and hydrological drought will likely occur more frequently, due in large part to the increasing population and resulting consumption.

 

What we can learn from this crisis

Earlier this year, many Capetonians feared a total catastrophe: running out of water. Wealthier citizens might have been able to pay for imported water or outbound flights, but poorer communities would have been left in a much more dire situation. International aid likely would have been necessary to avoid any fatal consequences.

The city seems to have averted that for now, but not without cost. The drought has caused immense strain on the agricultural economy that “will take years to work out of the system,” explains Beatrice Conradie, Professor of Economics and Social Sciences at University of Cape Town. “Primary producers are likely to act more conservatively as a result and this will make them less inclined to invest and create jobs. The unemployed will migrate to cities where they will put additional pressure on already strained infrastructure.”

And while Cape Town’s water infrastructure projects — desalination plants, aquifer abstraction facilities, and waste-water recycling projects — provided some immediate and prospective relief, they will not always be an option for every region. Desalination plants are very expensive and energy intensive (thus, climate change contributors), and pollute the local ocean ecosystem by releasing the brine remnants of desalination back into the water. Conradie raises further concerns about unregulated well-drilling in response to surface water restrictions. Regulated and unregulated over-abstraction from aquifers commonly leads to salt-water intrusion, permanently contaminating that fresh water and killing wetland wildlife. These are the best solutions available, and none of them are sustainable.

“Cape Town is really a wake-up call for other cities around the world,” shares NASA’s senior water scientist, Jay Famiglietti. “We have huge challenges ahead of us if we want to avert future day zeros in other cities around the world.”

Just as Capetonians failed to heed the cries of their government before reaching crisis-mode, global citizens are adjusting very slowly to the climate change cries of scientists and governments around the globe. But Cape Town’s response offers some valuable sociological lessons on sustainability. One is that behavioral changes can swing abruptly on a mass scale. Once a sufficient sense of urgency struck the people of Cape Town in early February (see figure below), the conservation movement gained a critical mass. Community members exponentially fed off each other’s hope.

But the role of governance proved indispensable. While Cape Town had long tried to inform the public of the water shortages, residents didn’t adjust their consumption until the government made the emergency announcement on January 17th and began enforcing drastic regulations and fees.

As Cape Town’s sustainability efforts demonstrate, addressing climate change is a social problem as much as a technical problem. Regardless of technological innovations, understanding human behavioral habits will be crucial in propelling necessary changes. As such, sociologists will grow just as important as climate scientists or chemical engineers in leading change.

Cape Town’s main focus over the past few months has been discovering the best ways to make behavioral nudges to its citizens on a mass scale to reduce water consumption. This entailed research partnerships with the University of Cape Town’s sociology departments and the Environmental Policy Research Unit (EPRU). The principle of reciprocity held true – that people are more likely to contribute to the public good if they see others doing it – and enhancing this effect on a global scale will grow increasingly important as we attempt to mitigate and adapt to environmental threats in this century.

With or without a changing climate, though, water scarcity will become an increasingly urgent issue for humanity’s growing population. Population growth continues to catapult our ecological footprint and increasingly threaten the ability of future, presumably larger, generations to flourish. Amidst their environmental challenge, the people of Cape Town demonstrated the importance of effective governance and collaboration. As more subtropical regions begin to suffer from drought and water shortages, learning from the failures and successes of Cape Town’s 2018 crisis will help avoid disaster.

Teaching Today’s AI Students To Be Tomorrow’s Ethical Leaders: An Interview With Yan Zhang

Some of the greatest scientists and inventors of the future are sitting in high school classrooms right now, breezing through calculus and eagerly awaiting freshman year at the world’s top universities. They may have already won Math Olympiads or invented clever, new internet applications. We know these students are smart, but are they prepared to responsibly guide the future of technology?

Developing safe and beneficial technology requires more than technical expertise — it requires a well-rounded education and the ability to understand other perspectives. But since math and science students must spend so much time doing technical work, they often lack the skills and experience necessary to understand how their inventions will impact society.

These educational gaps could prove problematic as artificial intelligence assumes a greater role in our lives. AI research is booming among young computer scientists, and these students need to understand the complex ethical, governance, and safety challenges posed by their innovations.

 

SPARC

In 2012, a group of AI researchers and safety advocates – Paul Christiano, Jacob Steinhardt, Andrew Critch, Anna Salamon, and Yan Zhang – created the Summer Program in Applied Rationality and Cognition (SPARC) to address the many issues that face quantitatively strong teenagers, including the issue of educational gaps in AI. As with all technologies, they explain, the more the AI community consists of thoughtful, intelligent, broad-minded reasoners, the more likely AI is to be developed in a safe and beneficial manner.

Each summer, the SPARC founders invite 30-35 mathematically gifted high school students to participate in their two-week program. Zhang, SPARC’s director, explains: “Our goals are to generate a strong community, expose these students to ideas that they’re not going to get in class – blind spots of being a quantitatively strong teenager in today’s world, like empathy and social dynamics. Overall we want to make them more powerful individuals who can bring positive change to the world.”

To help students make a positive impact, SPARC instructors teach core ideas in effective altruism (EA). “We have a lot of conversations about EA, but we don’t push the students to become EA,” Zhang says. “We expose them to good ideas, and I think that’s a healthier way to do mentorship.”

SPARC also exposes students to machine learning, AI safety, and existential risks. In 2016 and 2017, they held over 10 classes on these topics, including: “Machine Learning” and “Tensorflow” taught by Jacob Steinhardt, “Irresponsible Futurism” and “Effective Do-Gooding” taught by Paul Christiano, “Optimization” taught by John Schulman, and “Long-Term Thinking on AI and Automization” taught by Michael Webb.

But SPARC instructors don’t push students down the AI path either. Instead, they encourage students to apply SPARC’s holistic training to make a more positive impact in any field.

 

Thinking on the Margin: The Role of Social Skills

Making the most positive impact requires thinking on the margin, and asking: What one additional unit of knowledge will be most helpful for creating positive impact? For these students, most of whom have won Math and Computing Olympiads, it’s usually not more math.

“A weakness of a lot of mathematically-minded students are things like social skills or having productive arguments with people,” Zhang says. “Because to be impactful you need your quantitative skills, but you need to also be able to relate with people.”

To counter this weakness, he teaches classes on social skills and signaling, and occasionally leads improvisational games. SPARC still teaches a lot of math, but Zhang is more interested in addressing these students’ educational blind spots – the same blind spots that the instructors themselves had as students. “What would have made us more impactful individuals, and also more complete and more human in many ways?” he asks.

Working with non-math students can help, so Zhang and his colleagues have experimented with bringing excellent writers and original thinkers into the program. “We’ve consistently had really good successes with those students, because they bring something that the Math Olympiad kids don’t have,” Zhang says.

SPARC also broadens students’ horizons with guest speakers from academia and organizations such as the Open Philanthropy Project, OpenAI, Dropbox and Quora. In one talk, Dropbox engineer Albert Ni spoke to SPARC students about “common mistakes that math people make when they try to do things later in life.”

In another successful experiment suggested by Ofer Grossman, a SPARC alum who is now a staff member, SPARC made half of all classes optional in 2017. The classes were still packed because students appreciated the culture. The founders also agreed that conversations after class are often more impactful than classes, and therefore engineered one-on-one time and group discussions into the curriculum. Thinking on the margin, they ask: “What are the things that were memorable about school? What are the good parts? Can we do more of those and less of the others?”

Above all, SPARC fosters a culture of openness, curiosity and accountability. Inherent in this project is “cognitive debiasing” – learning about common biases like selection bias and confirmation bias, and correcting for them. “We do a lot of de-biasing in our interactions with each other, very explicitly,” Zhang says. “We also have classes on cognitive biases, but the culture is the more important part.”

 

AI Research and Future Leaders

Designing safe and beneficial technology requires technical expertise, but in SPARC’s view, cultivating a holistic research culture is equally important. Today’s top students may make some of the most consequential AI breakthroughs in the future, and their values, education and temperament will play a critical role in ensuring that advanced AI is deployed safely and for the common good.

“This is also important outside of AI,” Zhang explains. “The official SPARC stance is to make these students future leaders in their communities, whether it’s AI, academia, medicine, or law. These leaders could then talk to each other and become allies instead of having a bunch of splintered, narrow disciplines.”

As SPARC approaches its 7th year, some alumni have already begun to make an impact. A few AI-oriented alumni recently founded AlphaSheets – a collaborative, programmable spreadsheet for finance that is less prone to error – while other students are leading a “hacker house” with people in Silicon Valley. Additionally, SPARC inspired the creation of ESPR, a similar European program explicitly focused on AI risk.

But most impacts will be less tangible. “Different pockets of people interested in different things have been working with SPARC’s resources, and they’re forming a lot of social groups,” Zhang explains. “It’s like a bunch of little sparks and we don’t quite know what they’ll become, but I’m pretty excited about next five years.”

This article is part of a Future of Life series on the AI safety research grants, which were funded by generous donations from Elon Musk and the Open Philanthropy Project.

ICRAC Open Letter Opposes Google’s Involvement With Military

From improving medicine to better search engines to assistants that help ease busy schedules, artificial intelligence is already proving a boon to society. But just as it can be designed to help, it can be designed to harm and even to kill.

Military uses of AI also run the gamut, from programs that could help improve food distribution logistics to weapons that can identify and assassinate targets without input from humans. Because AI programs can have these dual uses, it’s difficult for companies that do not want their technology to cause harm to work with militaries. If a company helps the military solve a benign problem with an AI program, it currently has no way to ensure that the program won’t later be repurposed to take human lives.

So when employees at Google learned earlier this year about the company’s involvement in the Pentagon’s Project Maven, they were upset. Though Google argues that its work on Project Maven only assisted the U.S. military with image recognition tools for drone footage, many suggest that this technology could later be used for harm. In response, over 3,000 employees signed an open letter saying they did not want their work to be used to kill.

And it isn’t just Google’s employees who are concerned.

Earlier this week, the International Committee for Robot Arms Control (ICRAC) released an open letter, signed by hundreds of academics, calling on Google’s leadership to withdraw from the “business of war.” The letter responds to the growing criticism of Google’s participation in the Pentagon’s Project Maven.

The letter states, “we write in solidarity with the 3100+ Google employees, joined by other technology workers, who oppose Google’s participation in Project Maven.” It goes on to remind Google leadership to be cognizant of the incredible responsibility the company has for safeguarding the data it’s collected from its users, as well as its famous motto, “Don’t Be Evil.”

Specifically, the letter calls on Google to:

  • “Terminate its Project Maven contract with the DoD.
  • “Commit not to develop military technologies, nor to allow the personal data it has collected to be used for military operations.
  • “Pledge to neither participate in nor support the development, manufacture, trade or use of autonomous weapons; and to support efforts to ban autonomous weapons.”

Lucy Suchman, one of the letter’s authors, explained part of her motivation for her involvement:

“For me the greatest concern is that this effort will lead to further reliance on profiling and guilt by association in the US drone surveillance program, as the only way to generate signal out of the noise of massive data collection. There are already serious questions about the legality of targeted killing, and automating it further will only make it less accountable.”

The letter was released the same week that a small group of Google employees made news for resigning in protest against Project Maven. It also comes barely a month after a successful boycott by academic researchers against KAIST’s autonomous weapons effort.

In addition, last month the United Nations held its most recent meeting to consider a ban on lethal autonomous weapons. Twenty-six countries, including China, have now said they would support some sort of official ban on these weapons.

In response to the number of signatories the open letter has received, Suchman added, “This is clearly an issue that strikes a chord for many researchers who’ve been tracking the incorporation of AI and robotics into military systems.”

If you want to add your name to the letter, you can do so here.

Lethal Autonomous Weapons: An Update from the United Nations

Earlier this month, the United Nations Convention on Certain Conventional Weapons (UN CCW) Group of Governmental Experts met in Geneva to discuss the future of lethal autonomous weapons systems. But before we get to that, here’s a quick recap of everything that’s happened in the last six months.

 

Slaughterbots and Boycotts

Since its release in November 2017, the video Slaughterbots has been seen approximately 60 million times and has been featured in hundreds of news articles around the world. The video coincided with the UN CCW Group of Governmental Experts’ first meeting in Geneva to discuss a ban on lethal autonomous weapons, as well as the release of open letters from AI researchers in Australia, Canada, Belgium, and other countries urging their heads of state to support an international ban on lethal autonomous weapons.

Over the last two months, autonomous weapons regained the international spotlight. In March, after learning that the Korea Advanced Institute of Science and Technology (KAIST) planned to open an AI weapons lab in collaboration with a major arms company, AI researcher Toby Walsh led an academic boycott of the university. Over 50 of the world’s leading AI and robotics researchers from 30 countries joined the boycott, and in less than a week, KAIST agreed to “not conduct any research activities counter to human dignity including autonomous weapons lacking meaningful human control.” The boycott was covered by CNN and The Guardian.

Additionally, over 3,100 Google employees, including dozens of senior engineers, signed a letter in early April protesting the company’s involvement in a Pentagon program called “Project Maven,” which uses AI to analyze drone imaging. Employees worried that this technology could be repurposed to operate drones or launch weapons. Citing the company’s “Don’t Be Evil” motto, the employees asked Google to cancel the project and to stay out of the “business of war.”

 

The UN CCW meets again…

In the wake of this growing pressure, 82 countries in the UN CCW met again from April 9-13 to consider a ban on lethal autonomous weapons. Throughout the week, states and civil society representatives discussed “meaningful human control” and whether they should just be concerned about “lethal” autonomous weapons, or all autonomous weapons generally. Here is a brief recap of the meeting’s progress:

  • The group of nations that explicitly endorse the call to ban LAWS expanded to 26 (with China, Austria, Colombia, and Djibouti joining during the CCW meeting).
  • However, five states explicitly rejected moving to negotiate new international law on fully autonomous weapons: France, Israel, Russia, United Kingdom, and United States.
  • Nearly every nation agreed that it is important to retain human control over autonomous weapons, despite disagreements surrounding the definition of “meaningful human control.”
  • Throughout the discussion, states focused on complying with International Humanitarian Law (IHL). Human Rights Watch argued that there already is precedent in international law and disarmament law for banning weapons without human control.
  • Many countries submitted working papers to inform the discussions, including China and the United States.
  • Although states couldn’t reach an agreement during the meeting, momentum is growing towards solidifying a framework for defining lethal autonomous weapons.

You can find written and video recaps from each day of the UN CCW meeting here, written by Reaching Critical Will.

The UN CCW is slated to resume discussions in August 2018. However, given the speed with which autonomous weaponry is advancing, many advocates worry that the process is moving too slowly.

 

What can you do?

If you work in the tech industry, consider signing the Tech Workers Coalition open letter, which calls on Google, Amazon and Microsoft to stay out of the business of war. And if you’d like to support the fight against LAWS, we recommend donating to the Campaign to Stop Killer Robots. This organization, which is not affiliated with FLI, has done amazing work over the past few years to lead efforts around the world to prevent the development of lethal autonomous weapons. Please consider donating here.

 

Learn more…

If you want to learn more about the technological, political, and social developments of autonomous weapons, check out the Research & Reports page of our Autonomous Weapons website. You can find relevant news stories and updates at @AIweapons on Twitter and autonomousweapons on Facebook.

AI and Robotics Researchers Boycott South Korea Tech Institute Over Development of AI Weapons Technology

UPDATE 4-9-18: The boycott against KAIST has ended. The press release for the ending of the boycott explained:

“More than 50 of the world’s leading artificial intelligence (AI) and robotics researchers from 30 different countries have declared they would end a boycott of the Korea Advanced Institute of Science and Technology (KAIST), South Korea’s top university, over the opening of an AI weapons lab in collaboration with Hanwha Systems, a major arms company.

“At the opening of the new laboratory, the Research Centre for the Convergence of National Defence and Artificial Intelligence, it was reported that KAIST was “joining the global competition to develop autonomous arms” by developing weapons “which would search for and eliminate targets without human control”. Further cause for concern was that KAIST’s industry partner, Hanwha Systems builds cluster munitions, despite an UN ban, as well as a fully autonomous weapon, the SGR-A1 Sentry Robot. In 2008, Norway excluded Hanwha from its $380 billion future fund on ethical grounds.

“KAIST’s President, Professor Sung-Chul Shin, responded to the boycott by affirming in a statement that ‘KAIST does not have any intention to engage in development of lethal autonomous weapons systems and killer robots.’ He went further by committing that ‘KAIST will not conduct any research activities counter to human dignity including autonomous weapons lacking meaningful human control.’

“Given this swift and clear commitment to the responsible use of artificial intelligence in the development of weapons, the 56 AI and robotics researchers who were signatories to the boycott have rescinded the action. They will once again visit and host researchers from KAIST, and collaborate on scientific projects.”

UPDATE 4-5-18: In response to the boycott, KAIST President Sung-Chul Shin released an official statement to the press. In it, he says:

“I would like to reaffirm that KAIST does not have any intention to engage in development of lethal autonomous weapons systems and killer robots. KAIST is significantly aware of ethical concerns in the application of all technologies including artificial intelligence.

“I would like to stress once again that this research center at KAIST, which was opened in collaboration with Hanwha Systems, does not intend to develop any lethal autonomous weapon systems and the research activities do not target individual attacks.”

ORIGINAL ARTICLE 4-4-18:

Leading artificial intelligence researchers from around the world are boycotting South Korea’s KAIST (Korea Advanced Institute of Science and Technology) after the institute announced a partnership with Hanwha Systems to create a center that will help develop technology for AI weapons systems.

The boycott, organized by AI researcher Toby Walsh, was announced just days before the start of the next United Nations Convention on Certain Conventional Weapons (CCW) meeting, in which countries will discuss how to address challenges posed by autonomous weapons.

“At a time when the United Nations is discussing how to contain the threat posed to international security by autonomous weapons, it is regrettable that a prestigious institution like KAIST looks to accelerate the arms race to develop such weapons,” the boycott letter states. 

The letter also explains the concerns AI researchers have regarding autonomous weapons:

“If developed, autonomous weapons will be the third revolution in warfare. They will permit war to be fought faster and at a scale greater than ever before. They have the potential to be weapons of terror. Despots and terrorists could use them against innocent populations, removing any ethical restraints. This Pandora’s box will be hard to close if it is opened.”

The letter has been signed by over 50 of the world’s leading AI and robotics researchers from 30 countries, including professors Yoshua Bengio, Geoffrey Hinton, Stuart Russell, and Wolfram Burgard.

Explaining the boycott, the letter states:

“We therefore publicly declare that we will boycott all collaborations with any part of KAIST until such time as the President of KAIST provides assurances, which we have sought but not received, that the Center will not develop autonomous weapons lacking meaningful human control. We will, for example, not visit KAIST, host visitors from KAIST, or contribute to any research project involving KAIST.”

In February, The Korea Times reported on the opening of the Research Center for the Convergence of National Defense and Artificial Intelligence, which was formed as a partnership between KAIST and Hanwha to “[join] the global competition to develop autonomous arms.” The Korea Times article added that “researchers from the university and Hanwha will carry out various studies into how technologies of the Fourth Industrial Revolution can be utilized on future battlefields.”

In the press release for the boycott, Walsh referenced concerns that he and other AI researchers have had since 2015, when he and FLI released an open letter signed by thousands of researchers calling for a ban on autonomous weapons.

“Back in 2015, we warned of an arms race in autonomous weapons,” said Walsh. “That arms race has begun. We can see prototypes of autonomous weapons under development today by many nations including the US, China, Russia and the UK. We are locked into an arms race that no one wants to happen. KAIST’s actions will only accelerate this arms race.”

Many organizations and people have come together through the Campaign to Stop Killer Robots to advocate for a UN ban on lethal autonomous weapons. In her summary of the last United Nations CCW meeting in November 2017, Ray Acheson of Reaching Critical Will wrote:

“It’s been four years since we first began to discuss the challenges associated with the development of autonomous weapon systems (AWS) at the United Nations. … But the consensus-based nature of the Convention on Certain Conventional Weapons (CCW) in which these talks have been held means that even though the vast majority of states are ready and willing to take some kind of action now, they cannot because a minority opposes it.”

Walsh adds, “I am hopeful that this boycott will add urgency to the discussions at the UN that start on Monday. It sends a clear message that the AI & Robotics community do not support the development of autonomous weapons.”

To learn more about autonomous weapons and efforts to ban them, visit the Campaign to Stop Killer Robots and autonomousweapons.org. The full open letter and signatories are below.

Open Letter:

As researchers and engineers working on artificial intelligence and robotics, we are greatly concerned by the opening of a “Research Center for the Convergence of National Defense and Artificial Intelligence” at KAIST in collaboration with Hanwha Systems, South Korea’s leading arms company. It has been reported that the goals of this Center are to “develop artificial intelligence (AI) technologies to be applied to military weapons, joining the global competition to develop autonomous arms.”

At a time when the United Nations is discussing how to contain the threat posed to international security by autonomous weapons, it is regrettable that a prestigious institution like KAIST looks to accelerate the arms race to develop such weapons. We therefore publicly declare that we will boycott all collaborations with any part of KAIST until such time as the President of KAIST provides assurances, which we have sought but not received, that the Center will not develop autonomous weapons lacking meaningful human control. We will, for example, not visit KAIST, host visitors from KAIST, or contribute to any research project involving KAIST.

If developed, autonomous weapons will be the third revolution in warfare. They will permit war to be fought faster and at a scale greater than ever before. They have the potential to be weapons of terror. Despots and terrorists could use them against innocent populations, removing any ethical restraints. This Pandora’s box will be hard to close if it is opened. As with other technologies banned in the past like blinding lasers, we can simply decide not to develop them. We urge KAIST to follow this path, and work instead on uses of AI to improve and not harm human lives.

 

FULL LIST OF SIGNATORIES TO THE BOYCOTT

Alphabetically by country, then by family name.

  • Prof. Toby Walsh, UNSW Sydney, Australia.
  • Prof. Mary-Anne Williams, University of Technology Sydney, Australia.
  • Prof. Thomas Eiter, TU Wien, Austria.
  • Prof. Paolo Petta, Austrian Research Institute for Artificial Intelligence, Austria.
  • Prof. Maurice Bruynooghe, Katholieke Universiteit Leuven, Belgium.
  • Prof. Marco Dorigo, Université Libre de Bruxelles, Belgium.
  • Prof. Luc De Raedt, Katholieke Universiteit Leuven, Belgium.
  • Prof. Andre C. P. L. F. de Carvalho, University of São Paulo, Brazil.
  • Prof. Yoshua Bengio, University of Montreal, & scientific director of MILA, co-founder of Element AI, Canada.
  • Prof. Geoffrey Hinton, University of Toronto, Canada.
  • Prof. Kevin Leyton-Brown, University of British Columbia, Canada.
  • Prof. Csaba Szepesvari, University of Alberta, Canada.
  • Prof. Zhi-Hua Zhou, Nanjing University, China.
  • Prof. Thomas Bolander, Danmarks Tekniske Universitet, Denmark.
  • Prof. Malik Ghallab, LAAS-CNRS, France.
  • Prof. Marie-Christine Rousset, University of Grenoble Alpes, France.
  • Prof. Wolfram Burgard, University of Freiburg, Germany.
  • Prof. Bernd Neumann, University of Hamburg, Germany.
  • Prof. Bernhard Schölkopf, Director, Max Planck Institute for Intelligent Systems, Germany.
  • Prof. Manolis Koubarakis, National and Kapodistrian University of Athens, Greece.
  • Prof. Grigorios Tsoumakas, Aristotle University of Thessaloniki, Greece.
  • Prof. Benjamin W. Wah, Provost, The Chinese University of Hong Kong, Hong Kong.
  • Prof. Dit-Yan Yeung, Hong Kong University of Science and Technology, Hong Kong.
  • Prof. Kristinn R. Thórisson, Managing Director, Icelandic Institute for Intelligent Machines, Iceland.
  • Prof. Barry Smyth, University College Dublin, Ireland.
  • Prof. Diego Calvanese, Free University of Bozen-Bolzano, Italy.
  • Prof. Nicola Guarino, Italian National Research Council (CNR), Trento, Italy.
  • Prof. Bruno Siciliano, University of Naples, Italy.
  • Prof. Paolo Traverso, Director of FBK, IRST, Italy.
  • Prof. Yoshihiko Nakamura, University of Tokyo, Japan.
  • Prof. Imad H. Elhajj, American University of Beirut, Lebanon.
  • Prof. Christoph Benzmüller, Université du Luxembourg, Luxembourg.
  • Prof. Miguel Gonzalez-Mendoza, Tecnológico de Monterrey, Mexico.
  • Prof. Raúl Monroy, Tecnológico de Monterrey, Mexico.
  • Prof. Krzysztof R. Apt, Center Mathematics and Computer Science (CWI), Amsterdam, the Netherlands.
  • Prof. Antal van den Bosch, Radboud University, the Netherlands.
  • Prof. Bernhard Pfahringer, University of Waikato, New Zealand.
  • Prof. Helge Langseth, Norwegian University of Science and Technology, Norway.
  • Prof. Zygmunt Vetulani, Adam Mickiewicz University in Poznań, Poland.
  • Prof. José Alferes, Universidade Nova de Lisboa, Portugal.
  • Prof. Luis Moniz Pereira, Universidade Nova de Lisboa, Portugal.
  • Prof. Ivan Bratko, University of Ljubljana, Slovenia.
  • Prof. Matjaz Gams, Jozef Stefan Institute and National Council for Science, Slovenia.
  • Prof. Hector Geffner, Universitat Pompeu Fabra, Spain.
  • Prof. Ramon Lopez de Mantaras, Director, Artificial Intelligence Research Institute, Spain.
  • Prof. Alessandro Saffiotti, Orebro University, Sweden.
  • Prof. Boi Faltings, EPFL, Switzerland.
  • Prof. Jürgen Schmidhuber, Scientific Director, Swiss AI Lab, Università della Svizzera italiana, Switzerland.
  • Prof. Chao-Lin Liu, National Chengchi University, Taiwan.
  • Prof. J. Mark Bishop, Goldsmiths, University of London, UK.
  • Prof. Zoubin Ghahramani, University of Cambridge, UK.
  • Prof. Noel Sharkey, University of Sheffield, UK.
  • Prof. Lucy Suchman, Lancaster University, UK.
  • Prof. Marie des Jardins, University of Maryland, USA.
  • Prof. Benjamin Kuipers, University of Michigan, USA.
  • Prof. Stuart Russell, University of California, Berkeley, USA.
  • Prof. Bart Selman, Cornell University, USA.