More in-depth background reading about the risks and benefits of emerging technologies.

FHI: Putting Odds on Humanity’s Extinction

Putting Odds on Humanity’s Extinction
The Team Tasked With Predicting and Preventing Catastrophe
by Carinne Piekema
May 13, 2015


Not long ago, I drove off in my car to visit a friend in a rustic village in the English countryside. I didn’t exactly know where to go, but I figured it didn’t matter because I had my navigator at the ready. Unfortunately for me, as I got closer, the GPS signal became increasingly weak and eventually disappeared. I drove around aimlessly for a while without a paper map, cursing my dependence on modern technology.


It may seem gloomy to be faced with a graph that predicts the
potential for extinction, but the FHI researchers believe it can
stimulate people to start thinking—and take action.

But as technology advances over the coming years, the consequences of it failing could be far more troubling than getting lost. Those concerns keep the researchers at the Future of Humanity Institute (FHI) in Oxford occupied—and the stakes are high. In fact, visitors glancing at the white boards surrounding the FHI meeting area would be confronted by a graph estimating the likelihood that humanity dies out within the next 100 years. Members of the Institute have marked their personal predictions, ranging from the optimistic to the seriously pessimistic, with some putting the chance of extinction as high as 40%. It’s not just the FHI members: at a conference held in Oxford some years back, a group of risk researchers from across the globe put the likelihood of such an event at 19%. “This is obviously disturbing, but it still means that there would be an 81% chance of it not happening,” says Professor Nick Bostrom, the Institute’s director.

That hope—and challenge—drove Bostrom to establish the FHI in 2005. The Institute is devoted precisely to considering the unintended risks our technological progress could pose to our existence. The scenarios are complex and require forays into a range of subjects including physics, biology, engineering, and philosophy. “Trying to put all of that together with a detailed attempt to understand the capabilities of what a more mature technology would unleash—and performing ethical analysis on that—seemed like a very useful thing to do,” says Bostrom.

Far from being bystanders in the face
of apocalypse, the FHI researchers are
working hard to find solutions.

In that view, Bostrom found an ally in British-born technology consultant and author James Martin. In 2004, Martin had donated approximately US$90 million—one of the biggest single donations ever made to the University of Oxford—to set up the Oxford Martin School. The school’s founding aim was to address the biggest questions of the 21st century, and Bostrom’s vision certainly qualified. The FHI became part of the Oxford Martin School.

Before the FHI came into existence, not much had been done on an organised scale to consider where our rapid technological progress might lead us. Bostrom and his team had to cover a lot of ground. “Sometimes when you are in a field where there is as yet no scientific discipline, you are in a pre-paradigm phase: trying to work out what the right questions are and how you can break down big, confused problems into smaller sub-problems that you can then do actual research on,” says Bostrom.

Though the challenge is daunting, researchers at the Institute have a host of strategies to choose from. “We have mathematicians, philosophers, and scientists working closely together,” says Bostrom. “Whereas a lot of scientists have kind of only one methodology they use, we find ourselves often forced to grasp around in the toolbox to see if there is some particular tool that is useful for the particular question we are interested in,” he adds. The diverse demands on the team push the researchers beyond “armchair philosophising”—which they admit is still part of the process—and into mathematical modelling, statistics, history, and even engineering.

“We can’t just muddle through and learn
from experience and adapt. We have to
anticipate and avoid existential risk.
We only have one chance.”
– Nick Bostrom

Their multidisciplinary approach turns out to be incredibly powerful in the quest to identify the biggest threats to human civilisation. As Dr. Anders Sandberg, a computational neuroscientist and one of the senior researchers at the FHI, explains: “If you are, for instance, trying to understand what the economic effects of machine intelligence might be, you can analyse this using standard economics, philosophical arguments, and historical arguments. When they all point roughly in the same direction, we have reason to think that that is robust enough.”

The end of humanity?

Using these multidisciplinary methods, FHI researchers are finding that the biggest threats to humanity do not, as many might expect, come from disasters such as super volcanoes, devastating meteor collisions or even climate change. It’s much more likely that the end of humanity will follow as an unintended consequence of our pursuit of ever more advanced technologies. The more powerful technology gets, the more devastating it becomes if we lose control of it, especially if the technology can be weaponized. One specific area Bostrom says deserves more attention is that of artificial intelligence. We don’t know what will happen as we develop machine intelligence that rivals—and eventually surpasses—our own, but the impact will almost certainly be enormous. “You can think about how the rise of our species has impacted other species that existed before—like the Neanderthals—and you realise that intelligence is a very powerful thing,” cautions Bostrom. “Creating something that is more powerful than the human species just seems like the kind of thing to be careful about.”


Nick Bostrom, Future of Humanity Institute Director

Far from being bystanders in the face of apocalypse, the FHI researchers are working hard to find solutions. “With machine intelligence, for instance, we can do some of the foundational work now in order to reduce the amount of work that remains to be done after the particular architecture for the first AI comes into view,” says Bostrom. He adds that we can indirectly improve our chances by creating collective wisdom and global access to information to allow societies to more rapidly identify potentially harmful new technological advances. And we can do more: “There might be ways to enhance biological cognition with genetic engineering that could make it such that if AI is invented by the end of this century, it might be by a different, more competent brand of humanity,” speculates Bostrom.

Perhaps one of the most important goals of risk researchers for the moment is to raise awareness and stop humanity from walking headlong into potentially devastating situations. And they are succeeding. Policy makers and governments around the globe are finally starting to listen and actively seek advice from researchers like those at the FHI. In 2014, for instance, FHI researchers Toby Ord and Nick Beckstead wrote a chapter for the Chief Scientific Adviser’s annual report setting out how the government in the United Kingdom should evaluate and deal with existential risks posed by future technology. But the FHI’s reach is not limited to the United Kingdom. Sandberg served on a World Economic Forum advisory board, giving guidance on the misuse of emerging technologies for a report, published this year, that concludes a decade of global risk research.

Despite the obvious importance of their work, the team are still largely dependent on private donations. Their multidisciplinary and necessarily speculative work does not fall easily into the traditional categories of priority funding areas drawn up by mainstream funding bodies. In presentations, Bostrom has been known to show a graph of academic interest in various topics, from dung beetles and Star Trek to zinc oxalate, all of which appear to receive far more scholarly attention than the FHI’s kind of research into the continued existence of humanity. Bostrom laments this discrepancy between stakes and attention: “We can’t just muddle through and learn from experience and adapt. We have to anticipate and avoid existential risk. We only have one chance.”


“Creating something that is more powerful than the human
species just seems like the kind of thing to be careful about.”

It may seem gloomy to be faced every day with a graph that predicts the potential disasters that could befall us over the coming century, but the researchers at the FHI believe that such a simple visual aid can stimulate people to face up to the potentially negative consequences of technological advances.

Despite being concerned about potential pitfalls, the FHI researchers are quick to agree that technological progress has made our lives measurably better over the centuries, and neither Bostrom nor any of the other researchers suggest we should try to stop it. “We are getting a lot of good things here, and I don’t think I would be very happy living in the Middle Ages,” says Sandberg, who maintains an unflappable air of optimism. He’s confident that we can foresee and avoid catastrophe. “We’ve solved an awful lot of other hard problems in the past,” he says.

Technology is already embedded throughout our daily existence, and its role will only grow in the coming years. But by helping us all face up to what this might mean, the FHI hopes to allow us not to be intimidated, but instead to take informed advantage of whatever advances come our way. How does Bostrom see the potential impact of their research? “If it becomes possible for humanity to be more reflective about where we are going and clear-sighted about where there may be pitfalls,” he says, “then that could be the most cost-effective thing that has ever been done.”

CSER: Playing with Technological Dominoes

Playing with Technological Dominoes
Advancing Research in an Era When Mistakes Can Be Catastrophic
by Sophie Hebden
April 7, 2015


The new Centre for the Study of Existential Risk at Cambridge University isn’t really there, at least not as a physical place—not yet. For now, it’s a meeting of minds, a network of people from diverse backgrounds who are worried about the same thing: how new technologies could cause massive loss of life and even threaten our future as a species. But plans are coming together for a new phase of the centre, due to be in place by the summer: an on-the-ground research programme.


We learn valuable information by creating powerful
viruses in the lab, but risk a pandemic if an accident
releases it. How can we weigh the costs and benefits?

Ever since our ancestors discovered how to make sharp stones more than two and a half million years ago, our mastery of tools has driven our success as a species. But as our tools become more powerful, we could be putting ourselves at risk should they fall into the wrong hands—or should humanity lose control of them altogether. Bioengineered viruses, unchecked climate change, runaway artificial intelligence: these are the challenges the Centre for the Study of Existential Risk (CSER) was founded to grapple with.

At its heart, CSER is about ethics and the value you put on the lives of future, unborn people. If we feel any responsibility to the billions of people in future generations, then a key concern is ensuring that there are future generations at all.

The idea for CSER began as a conversation between a philosopher and a software engineer in a taxi. Huw Price, currently the Bertrand Russell Professor of Philosophy at Cambridge University, was on his way to a conference dinner in Copenhagen in 2011. He happened to share his ride with another conference attendee: Skype co-founder Jaan Tallinn.

“I thought, ’Oh that’s interesting, I’m in a taxi with one of the founders of Skype’ so I thought I’d better talk to him,” joked Price. “So I asked him what he does these days, and he explained that he spends a lot of his time trying to persuade people to pay more attention to the risk that artificial intelligence poses to humanity.”

“The overall goal of CSER is to write
a manual for managing and ameliorating
these sorts of risks in future.”
– Huw Price

In the past few months, numerous high-profile figures—including the founders of Google’s DeepMind machine-learning program and IBM’s Watson team—have been voicing concerns about the potential for high-level AI to cause unintended harms. But in 2011, it was startling for Price to find someone so embedded and successful in the computer industry taking AI risk seriously. He met privately with Tallinn shortly afterwards.

Plans came to fruition later at Cambridge when Price spoke to astronomer Martin Rees, the UK’s Astronomer Royal—a man well-known for his interest in threats to the future of humanity. The two made plans for Tallinn to come to the University to give a public lecture, enabling the three to meet. It was at that meeting that they agreed to establish CSER.

Price traces the start of CSER’s existence—at least online—to its website launch in June 2012. Under Rees’ influence, it quickly took on a broad range of topics, including the risks posed by synthetic biology, runaway climate change, and geoengineering.


Huw Price

“The overall goal of CSER,” says Price, painting the vision for the organisation with broad brush strokes, “is to write a manual, metaphorically speaking, for managing and ameliorating these sorts of risks in future.”

In fact, despite its rather pessimistic-sounding emphasis on risks, CSER is very much pro-technology: if anything, it wants to help developers and scientists make faster progress, declares Rees. “The buzzword is ’responsible innovation’,” he says. “We want more and better-directed technology.”

Its current strategy is to use all its reputational power—which is considerable, as a Cambridge University institute—to gather experts together to decide on what’s needed to understand and reduce the risks. Price is proud of CSER’s impressive set of board members, which includes the world-famous theoretical physicist Stephen Hawking, as well as world leaders in AI, synthetic biology and economic theory.

He is frank about the plan: “We deliberately built an advisory board with a strong emphasis on people who are extremely well-respected to counter any perception of flakiness that these risks can have.”

The plan is working, he says. “Since we began to talk about AI risk there’s been a very big change in attitude. It’s become much more of a mainstream topic than it was two years ago, and that’s partly thanks to CSER.”

Even on more well-known subjects, CSER calls attention to new angles and perspectives on problems. Just last month, it launched a monthly seminar series by hosting a debate on the benefits and risks of research into potential pandemic pathogens.

The seminar focused on a controversial series of experiments by researchers in the Netherlands and the US to try to make the bird flu virus H5N1 transmissible between humans. By adding mutations to the virus, they found it could transmit through the air between ferrets—the standard animal model for flu transmission in humans.

The answer isn’t “let’s shout at each
other about whether someone’s going
to destroy the world or not.” The right
answer is, “let’s work together to
develop this safely.”
– Sean O’hEigeartaigh, CSER Executive Director

Epidemiologist Marc Lipsitch of Harvard University presented his calculations of the ’unacceptable’ risk that such research poses, whilst biologist Derek Smith of Cambridge University, a co-author on the original H5N1 study, argued that such research is vitally important.

Lipsitch explained that although the chance of an accidental release of the virus is low, any subsequent pandemic could kill more than a billion people. When he combined the risks with the costs, he found that a single year of research at one laboratory is equivalent, in expectation, to causing at least 2,000 fatalities. He considers this risk unacceptable. Even if his estimate is off by a factor of 1,000, he later told me, the research is still too dangerous.
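Lipsitch’s figure is, at bottom, an expected-value calculation: a very small probability of an accidental release, multiplied by the enormous death toll of any resulting pandemic, still yields a large expected number of deaths per laboratory-year. The sketch below illustrates the arithmetic; the probabilities are illustrative assumptions chosen only to reproduce the order of magnitude quoted at the seminar, not Lipsitch’s published estimates.

```python
# Back-of-envelope expected-fatality estimate for one laboratory-year of
# gain-of-function research. The probabilities are illustrative assumptions,
# not Lipsitch's published figures; the fatality figure comes from the
# seminar's "more than a billion people" scenario.

p_release_per_lab_year = 2e-4    # assumed annual chance of an accidental release
p_pandemic_given_release = 0.01  # assumed chance a release seeds a global pandemic
pandemic_fatalities = 1e9        # deaths in the resulting pandemic

expected_fatalities = (p_release_per_lab_year
                       * p_pandemic_given_release
                       * pandemic_fatalities)

print(f"Expected fatalities per lab-year: {expected_fatalities:,.0f}")  # ~2,000

# Even if the joint probability were 1,000 times smaller, the expected toll
# would still be roughly two deaths per laboratory-year of research.
print(f"With 1,000x smaller risk: {expected_fatalities / 1000:,.0f}")   # ~2
```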

Smith argued that we can’t afford not to do this research, that knowledge is power—in this case the power to understand the importance of the mutations and how effective our vaccines are at preventing further infections. Research, he said, is essential for understanding whether we need to start “spending millions on preparing for a pandemic that could easily arise naturally—for instance by stockpiling antiviral treatments or culling poultry in China.”

CSER’s seminar series brings the top minds to Cambridge to grapple with important questions like these. The ideas and relationships formed at such events grow into future workshops that then beget more ideas and relationships, and the network grows. Whilst its links across the Atlantic are strongest, CSER is also keen to pursue links with European researchers. “Our European links seem particularly interested in the bio-risk side,” says Price.


Sean O’hEigeartaigh

Germany’s scientific attaché approached CSER in October 2013, and in September 2014 CSER co-organised a meeting with Germany on existential risk. This led to two further workshops, on managing risk in biotechnology and on research into flu transmission—the latter hosted by Volkswagen in December 2014.

In addition to working with governments, CSER also plans to sponsor visits from researchers and leaders in industry, exchanging a few weeks of staff time for expert knowledge at the frontier of developments. It’s an interdisciplinary venture to draw together and share different innovators’ ideas about the extent and time-frames of risks. The larger the uncertainties, the bigger the role CSER can play in canvassing opinion and researching the risk.

“It’s fascinating to me when the really top experts disagree so much,” says Sean O’hEigeartaigh, CSER’s Executive Director. Some leading developers estimate that human-level AI will be achieved within 30-40 years, whilst others think it will take as long as 300 years. “When the stakes are so high, as they are for AI and synthetic biology, that makes it even more exciting,” he adds.

Despite its big vision and successes, CSER’s path won’t be easy. “There’s a misconception that if you set up a centre with famous people then the University just gives you money; that’s not what happens,” says O’hEigeartaigh.

Instead, they’ve had to work at it, and O’hEigeartaigh was brought on board in November 2012 to help grow the organisation. Through a combination of grants and individual donors, he has attracted enough funding for three postdocs, who will be in place by the summer of 2015. Some major grants are in the works, and if all goes well, CSER will be a considerably larger team in the next year.

With a research team on the ground, Price envisions a network of subprojects working on different aspects: listening to experts’ concerns, predicting the timescales and risks more accurately through different techniques, and trying to reduce some of the uncertainties—even a small reduction will help.

Rees believes there’s still a lot of awareness-raising work to do ’front-of-house’: he wants to see the risks posed by AI and synthetic biology become as mainstream as climate change, but without so much of the negativity.

“The answer isn’t ’let’s shout at each other about whether someone’s going to destroy the world or not’,” says O’hEigeartaigh. “The right answer is, ’let’s work together to develop this safely’.” Remembering the animated conversations in the foyer that buzzed with excitement following CSER’s seminar, I feel optimistic: it’s good to know some people are taking our future seriously.

GCRI: Aftermath

Aftermath
Finding practical paths to recovery after a worldwide catastrophe.
by Steven Ashley
March 13, 2015



Tony Barrett
Global Catastrophic Risk Institute

OK, we survived the cataclysm. Now what?

In recent years, warnings by top scientists and industrialists have energized research into the sort of civilization-threatening calamities that are typically the stuff of sci-fi and thriller novels: asteroid impacts, supervolcanoes, nuclear war, pandemics, bioterrorism, even the rise of a super-smart, but malevolent artificial intelligence.

But what comes afterward? What happens to the survivors? In particular, what will they eat? How will they stay warm and find electricity? How will they rebuild and recover?

These “aftermath” issues comprise some of the largest points of uncertainty regarding humanity’s gravest threats. As such, they constitute some of the principal research focuses of the Global Catastrophic Risk Institute (GCRI), a nonprofit think tank that Seth Baum and Tony Barrett founded in late 2011. Baum, a New York City-based engineer and geographer, is GCRI’s executive director. Barrett, who serves as its director of research, is a senior risk analyst at ABS Consulting in Washington, DC, which performs probabilistic risk assessment and other services.

Black Swan Events

At first glance, it may sound like GCRI is making an awful lot of fuss about dramatic worst-case scenarios that are unlikely to pan out any time soon. “In any given year, there’s only a small chance that one of these disasters will occur,” Baum concedes. But the longer we wait, he notes, the greater the chance that we will experience one of these “Black Swan events” (so called because, until an explorer spotted a black swan in the seventeenth century, it was taken for granted that such birds did not exist). “We’re trying to instil a sense of urgency in governments and society in general that these risks need to be faced now to keep the world safe,” Baum says.
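Baum’s point about waiting is just the arithmetic of repeated exposure: a risk that looks negligible in any single year compounds over decades. Assuming, purely for illustration, a small independent annual probability p, the chance of at least one catastrophe over N years is 1 - (1 - p)^N; a minimal sketch:

```python
# Cumulative probability of at least one catastrophe over a given horizon,
# assuming a small, independent annual probability. The 0.1% annual figure
# is an illustrative assumption, not a GCRI estimate.

def cumulative_risk(annual_probability: float, years: int) -> float:
    """P(at least one event in `years` years) = 1 - (1 - p)**years."""
    return 1.0 - (1.0 - annual_probability) ** years

p = 0.001  # assumed 0.1% chance per year
for horizon in (10, 50, 100, 500):
    print(f"{horizon:4d} years: {cumulative_risk(p, horizon):6.1%}")
# ~1.0% over 10 years, ~9.5% over a century, ~39% over 500 years:
# the longer we wait, the greater the chance of experiencing one.
```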

GCRI’s general mission is to find ways to mobilize the world’s thinkers to identify the really big risks facing the planet, how they might cooperate for optimal effect, and the best approaches to addressing the threats. The institute has no physical base, but it serves as a virtual hub, assembling “the best empirical data and the best expert judgment,” and rolling them into risk models that can help guide our actions, Barrett says. Researchers, brought together through GCRI, often collaborate remotely. Judging the real risks posed by these low-odds, high-consequence events is no simple task, he says: “In most cases, we are dealing with extremely sparse data sets about occurrences that seldom, if ever, happened before.”


Feeding Everyone No Matter What
Following a cataclysm that blocks out the sun, what will survivors eat?
Credit: J M Gehrke

Beyond ascertaining which global catastrophes are most likely to occur, GCRI seeks to learn how multiple events might interact. For instance, could a nuclear disaster lead to a change in climate that cuts food supplies while encouraging a pandemic caused by the loss of medical resources? “To best convey these all-too-real risks to various sectors of society, it’s not enough to merely characterize them,” Baum says. Tackling such multi-faceted scenarios requires an interdisciplinary approach that would enable GCRI experts to recognize potential shared mitigation strategies that could enhance the chances of recovery, he adds.

One of the more notable GCRI projects focuses on the aftermath of calamity. This analysis was conducted by research associate Dave Denkenberger, who is an energy efficiency engineer at Ecova, an energy and utility management firm in Durango, Colorado. Together with engineer Joshua M. Pearce, of Michigan Technological University in Houghton, he looked at a key issue: If one of these catastrophes does occur, how do we feed the survivors?

Worldwide, people currently eat about 1.5 billion tons of food a year. For a book published in 2014, Feeding Everyone No Matter What: Managing Food Security After Global Catastrophe, the pair researched alternative food sources that could be ramped up within five or fewer years following a disaster that involves a significant change in climate. In particular, the discussion looks at what could be done to feed the world should the climate suffer from an abrupt, single-decade drop in temperature of about 10°C that wipes out crops regionally, reducing food supplies by 10 per cent. This phenomenon has already occurred many times in the past.
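The scale involved is easy to underestimate. A quick division of that 1.5-billion-tonne figure across the world population (roughly 7 billion people in the mid-2010s is an assumed round number) shows what any replacement food supply would have to deliver:

```python
# Rough scale check on global food demand, using the article's figure of
# about 1.5 billion tonnes of food eaten per year. A world population of
# roughly 7 billion (mid-2010s) is an assumed round number.

annual_food_tonnes = 1.5e9
world_population = 7e9

kg_per_person_per_day = annual_food_tonnes * 1000 / world_population / 365
tonnes_per_day = annual_food_tonnes / 365

print(f"{kg_per_person_per_day:.2f} kg of food per person per day")  # ~0.59 kg
print(f"{tonnes_per_day:,.0f} tonnes of food needed per day")        # ~4.1 million

# Alternative food sources ramped up after a sun-blocking catastrophe would
# have to approach this output within a few years to feed everyone.
```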

Sun Block

Even more serious are scenarios that block the sun, which could cause a 10°C temperature drop globally within a year or so. Such a situation could arise should smoke or dust enter the stratosphere: from a nuclear exchange that burns big cities and triggers a nuclear winter, from an asteroid or comet impact, or from a supervolcano eruption such as may one day occur at Yellowstone National Park.

These risks need to be faced
now to keep the world safe.
– Seth Baum

Other similar, though probably less likely, scenarios, Denkenberger says, might derive from the spread of some crop-killing organism—a highly invasive superweed, a superbacterium that displaces beneficial bacteria, a virulent pathogenic bacterium, or a super pest (an insect). Any of these might happen naturally, but they could be even more serious should they result from a coordinated terrorist attack.

“Our approach is to look across disciplines to consider every food source that’s not dependent on the sun,” Denkenberger explains. The book considers various ways of converting vegetation and fossil fuels to edible food. The simplest potential solution may be to grow mushrooms on the dead trees, “but you could do much the same by using enzymes or bacteria to partially digest the dead plant fiber and then feed it to animals,” he adds. Ruminants such as cows, sheep, and goats, or, more likely, faster-reproducing animals like rats, chickens, or beetles, could do the honors.


Seth Baum
Global Catastrophic Risk Institute

A more exotic solution would be to use bacteria to digest natural gas into sugars, and then eat the bacteria. In fact, a Danish company called Unibio is already making animal feed from commercially stranded methane.

Meanwhile, the U.S. Department of Homeland Security is funding another GCRI project that assesses the risks posed by the arrival of new technologies in synthetic biology or advanced robotics which might be co-opted by terrorists or criminals for use as weapons. “We’re trying to produce forecasts that estimate when these technologies might become available to potential bad actors,” Barrett says.

Focusing on such worst-case scenarios could easily dampen the spirits of GCRI’s researchers. But far from fretting, Baum says that he came to the world of existential risk (or ‘x-risk’) from his interest in the ethics of utilitarianism, which emphasizes actions aimed at maximizing total benefit to people and other sentient beings while minimizing suffering. As an engineering grad student, Baum even had a blog on utilitarianism. “Other people on the blog pointed out how the ethical views I was promoting implied a focus on the big risks,” he recalls. “This logic checked out and I have been involved with x-risks ever since.”

Barrett takes a somewhat more jaundiced view of his chosen career: “Oh yeah, we’re lots of fun at dinner parties…”

MIRI: Artificial Intelligence: The Danger of Good Intentions

Nate Soares (left) and Nisan Stiennon (right)
The Machine Intelligence Research Institute
Credit: Vivian Johnson

The Terminator had Skynet, an intelligent computer system that turned against humanity, while the astronauts in 2001: A Space Odyssey were tormented by their spaceship’s sentient computer HAL 9000, which had gone rogue. The idea that artificial systems could gain consciousness and try to destroy us has become such a cliché in science fiction that it now seems almost silly. But prominent experts in computer science, psychology, and economics warn that while the threat is probably more banal than those depicted in novels and movies, it is just as real—and unfortunately much more challenging to overcome.

The core concern is that getting an entity with artificial intelligence (AI) to do what you want isn’t as simple as giving it a specific goal. Humans know to balance any one aim with others and with our shared values and common sense. But without that understanding, an AI might easily pose risks to our safety, with no malevolence or even consciousness required. Addressing this danger is an enormous—and very technical—problem, but that’s the task that researchers at the Machine Intelligence Research Institute (MIRI), in Berkeley, California are taking on.

MIRI grew from the Singularity Institute for Artificial Intelligence (SIAI), which was founded in 2000 by Eliezer Yudkowsky and initially funded by Internet entrepreneurs Brian and Sabine Atkins. Largely self-educated, Yudkowsky became interested in AI in his teens, after reading about the movement to improve human capabilities with technology. For the most part, he hasn’t looked back. Though he’s written about psychology and philosophy of science for general audiences, Yudkowsky’s research has always been concerned with AI.

Back in 2000, Yudkowsky had somewhat different aims. Rather than focusing on the potential dangers of AI, his original goals reflected the optimism then surrounding the subject. “Amusingly,” says Luke Muehlhauser, a former IT administrator who first visited the institute in 2011 and is now MIRI’s executive director, “the Institute was founded to accelerate toward artificial intelligence.”

Teaching Values

However, it wasn’t long before Yudkowsky realised that the more important challenge was figuring out how to do that safely, by getting AIs to incorporate our values into their decision making. “It caused me to realize, with some dismay, that it was actually going to be technically very hard,” Yudkowsky says, even by comparison with the problem of creating a hyperintelligent machine capable of thinking about whatever sorts of problems we might give it.

In 2013, SIAI rebranded itself as MIRI, with a largely new staff, in order to refocus on the rich scientific problems related to creating well-behaved AI. To get a handle on this challenge, Muehlhauser suggests, consider assigning a robot with superhuman intelligence the task of making paper clips. The robot has a great deal of computational power and general intelligence at its disposal, so it ought to have an easy time figuring out how to fulfil its purpose, right?

Humans have huge
anthropomorphic blind
spots.
– Nate Soares

Not really. Human reasoning is based on an understanding built from a combination of personal experience and collective knowledge accumulated over generations, explains MIRI researcher Nate Soares, who trained in computer science in college. For example, you don’t have to tell managers not to risk their employees’ lives or strip-mine the planet to make more paper clips. But AI paper-clip makers are vulnerable to making such mistakes because they do not share our wealth of knowledge. Even if they did, there’s no guarantee that human-engineered intelligent systems would process that knowledge the same way we would.

Worse, Soares says, we cannot simply program the AI with the right way to deal with every conceivable circumstance it may come across, because one of our human weaknesses is a difficulty in enumerating all possible scenarios. Nor can we rely on just pulling the machine’s plug if it goes awry. A sufficiently intelligent machine handed a task but lacking a moral and ethical compass would likely disable the off switch, because it would quickly figure out that the switch could prevent it from achieving the goal we gave it. “Your lines of defense are up against something super intelligent,” says Soares.


Pondering the perfect paper clip.
How can we train AI to care less about their goals?

So who is qualified to go up against an overly zealous AI? The challenges now being identified are so new that no single person has the perfect training to work out the right direction, says Muehlhauser. With this in mind, MIRI’s directors have hired from diverse backgrounds. Soares, for example, studied computer science, economics, and mathematics as an undergraduate and worked as a software engineer prior to joining the institute, just the sort of breadth the team needs. MIRI, Soares says, is “sort of carving out a cross section of many different fields,” including philosophy and computer science. That’s essential, he adds, because understanding how to make artificial intelligence safe will take a variety of perspectives to help create the right conceptual and mathematical framework.

Programming Indifference

One promising idea for ensuring that AIs behave well is enabling them to take constructive criticism. An AI has a built-in incentive to remove any restrictions people place on it if doing so lets it reach its goals faster. So how can you persuade an AI to cooperate with engineers offering corrective action? Soares’ answer is to program in a kind of indifference: if a machine doesn’t care which purpose it pursues, perhaps it wouldn’t mind so much if its creators wanted to modify its present goal.
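The idea can be made concrete with a toy expected-utility comparison. The sketch below is a simplified illustration of the general approach sometimes called utility indifference in the AI-safety literature, not MIRI’s actual formalism, and all the numbers are assumptions: an agent that values only its current goal expects to lose utility if engineers change that goal, so it prefers to resist; adding a compensating payout makes allowing and resisting score the same, removing the incentive to fight the correction.

```python
# Toy illustration of "utility indifference": compare the agent's expected
# utility for allowing vs. resisting a goal change. A simplified sketch of
# the general idea, not MIRI's formalism; all numbers are assumptions.

u_keep_goal = 100.0    # utility the agent expects if its original goal is left intact
u_after_change = 10.0  # utility, measured by the ORIGINAL goal, if engineers switch it

def value_of_allowing(compensation: float) -> float:
    """Expected utility (by the agent's current lights) of letting engineers act."""
    return u_after_change + compensation

def value_of_resisting() -> float:
    """Expected utility of blocking the modification and keeping the old goal."""
    return u_keep_goal

# Naive agent: no compensation, so resisting strictly dominates.
print(value_of_allowing(0.0) < value_of_resisting())            # True -> incentive to resist

# "Indifferent" agent: on modification, the reward function pays out exactly the
# utility the agent would have expected from its original goal, so allowing and
# resisting are tied and there is no incentive to fight the engineers.
compensation = u_keep_goal - u_after_change
print(value_of_allowing(compensation) == value_of_resisting())  # True -> indifferent
```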

There is also the question of how we can be sure that, even with our best safeguards in place, the AI’s design will work as intended. It turns out that some approaches to gaining that confidence run into fundamental mathematical problems closely related to Gödel’s incompleteness theorems, which show that any consistent logical system powerful enough to express arithmetic contains statements that can be neither proved nor disproved within the system. That’s something of a problem if you want to anticipate what your logical system—your AI—is going to do. “It’s much harder than it looks to formalize” those sorts of ideas mathematically, Soares says.
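One way to make the connection precise is Löb’s theorem, a close relative of Gödel’s second incompleteness theorem and the result behind what MIRI’s technical writing calls the “Löbian obstacle”: a consistent theory strong enough for arithmetic cannot adopt the blanket principle that whatever it proves is true, yet that is exactly the kind of self-trust an AI would need to verify its own, or a successor’s, reasoning. A standard statement, for a theory T with provability predicate Prov_T:

```latex
% Löb's theorem: for any sentence P,
% if T proves "if P is provable in T, then P", then T already proves P.
\text{If}\quad T \vdash \mathrm{Prov}_T(\ulcorner P \urcorner) \rightarrow P,
\qquad\text{then}\qquad T \vdash P .
```

Taking P to be a contradiction recovers the second incompleteness theorem: a consistent theory cannot prove its own consistency.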

Nor is it only humans who have a stake in solving these problems. Under the “tiling problem,” even benevolent AIs that we design may go on to create future generations of still smarter AIs. Those systems will likely be beyond our understanding, and possibly beyond their AI creators’ control too. As outlandish as this seems, MIRI’s researchers stress that humans should never make the mistake of assuming AIs will think like us. “Humans have huge anthropomorphic blind spots,” Soares says.

For now, MIRI is concentrating on scaling up by hiring more people to work on its growing list of problems. In some sense, MIRI, with its ever-evolving awareness of the dangers that lie ahead, is like an AI with burgeoning consciousness. After all, as Muehlhauser says, MIRI’s research is “morphing all the time.”