Tay the Racist Chatbot: Who is responsible when a machine learns to be evil?


By far the most entertaining AI news of the past week was the rise and rapid fall of Microsoft’s teen-girl-imitation Twitter chatbot, Tay, whose Twitter tagline described her as “Microsoft’s AI fam* from the internet that’s got zero chill.”

(* Btw, I’m officially old–I had to consult Urban Dictionary to confirm that I was correctly understanding what “fam” and “zero chill” meant. “Fam” means “someone you consider family” and “no chill” means “being particularly reckless,” in case you were wondering.)

The remainder of the tagline declared: “The more you talk the smarter Tay gets.”

Or not.  Within 24 hours of going online, Tay started saying some weird stuff.  And then some offensive stuff.  And then some really offensive stuff.  Like calling Zoe Quinn a “stupid whore.”  And saying that the Holocaust was “made up.”  And saying that black people (she used a far more offensive term) should be put in concentration camps.  And that she supports a Mexican genocide.  The list goes on.

So what happened?  How could a chatbot go full Goebbels within a day of being switched on?  Basically, Tay was designed to develop its conversational skills by using machine learning, most notably by analyzing and incorporating the language of tweets sent to her by human social media users. What Microsoft apparently did not anticipate is that Twitter trolls would intentionally try to get Tay to say offensive or otherwise inappropriate things.  At first, Tay simply repeated the inappropriate things that the trolls said to her.  But before too long, Tay had “learned” to say inappropriate things without a human goading her to do so.  This was all but inevitable given that, as Tay’s tagline suggests, Microsoft designed her to have no chill.

Now, anyone who is familiar with the social media cyberworld should not be surprised that this happened–of course a chatbot designed with “zero chill” would learn to be racist and inappropriate because the Twitterverse is filled with people who say racist and inappropriate things.  But fascinatingly, the media has overwhelmingly focused on the people who interacted with Tay rather than on the people who designed Tay when examining why the Degradation of Tay happened.


News of the Week: AI Writes

The obvious news of the week is Microsoft’s debacle with TayTweets, the Twitter bot that, as @geraldmellor explained in his own popular tweet, “went from ‘humans are super cool’ to full nazi in <24 hrs.” But we have guest bloggers working on upcoming articles about Tay that will hopefully go live in the next few days.

Instead, I want to take a look at the AI that coauthored a short novel, which successfully passed through the first round of a science fiction literary competition in Japan.

Novels are arguably among mankind’s greatest achievements. Depending on who one asks, the novel format dates back to the 16th century, though Don Quixote by Cervantes, published in 1605, is often considered to be the first novel of the modern age. Since then, authors have stretched their imaginations, using fiction as a means of drawing attention to political issues, swaying public opinion, and simply entertaining.

While many reports have come out recently about massive job loss as a result of automation, most have anticipated the jobs lost would be lower-skilled work. When AI enthusiasts imagine a world in which general artificial intelligence has been achieved, many anticipate a world in which people will have more free time to follow creative pursuits.

What happens if AI can become as creative as humans? Massive job loss is a big problem we have to tackle, but in most cases, the loss of income is really the underlying issue, and that’s something that we can overcome if we choose to.

Creative pursuits are an entirely different story. Some people have spent their lives dreaming of becoming a novelist. How do we avoid creating AI that crushes people’s life-long dreams?

Happily, in the short term, this particular AI program could be a boon to busy writers. The program relied heavily on the input of the researchers to form the basis of the story, and, as Satoshi Hase, a Japanese science fiction novelist, noted:

“I was surprised at the work because it was a well-structured novel. But there are still some problems [to overcome] to win the prize, such as character descriptions.”

I can envision programs in the near future in which the novelist comes up with a story idea, inputs that into a program, and the program does the heavy lifting of putting just the first draft together. Then the human wordsmith can go back through the story to turn it into the literary magic that humans are so good at creating. Since getting that first draft together can be the hardest part of writing a book, this could be an incredibly helpful tool for struggling – or even successful – writers.

However, the goal of true AGI is to create intelligence that can mimic, replicate or improve upon all aspects of human intelligence, including imagination and creativity. As Hitoshi Matsubara, the creator of this novel-writing AI, said,

“So far, AI programs have often been used to solve problems that have answers, such as Go and shogi. In the future, I’d like to expand AI’s potential [so it resembles] human creativity.”

What would a future look like in which artificial intelligence can outperform humans in even the most creative of tasks?



AlphaGo and AI Fears

Two important pieces of AI news came out last week. The first, of course, was AlphaGo beating the world champion Go player, Lee Se-dol, 4-1 in their well-publicized matchup. The other was a survey by the British Science Association (BSA) that showed 60% of the public fear that AI will take over jobs, while 36% fear AI will destroy humanity.

The success of AlphaGo, though exciting and impressive, did not help to ease fears of an AI takeover. According to news reports, many in South Korea now fear artificial intelligence even more.

I spoke with a few people in AI, economics and anthropology to understand more about what the true implications of AlphaGo are and whether the public needs to be as worried as the BSA survey indicates.

AlphaGo is an interesting case because, as of a few months ago, most AI researchers would have said that a computer that could beat the world’s top humans at the game of Go was probably still a decade away. Yet, as AI expert Stuart Russell pointed out,

“From 10,000 feet there isn’t anything really radically new in the way that AlphaGo is designed.”

A variety of methods were employed by AlphaGo in order for it to learn to play Go as well as it does, but none of the methods stand out as new or even unique. They just hadn’t been put together like this before, and this level of computing power is greater than existed previously.

For example, the idea of using a minimax tree search, which essentially searches through all possible moves of a game (branches of the tree), was first considered for chess back in 1913. Evaluation functions were developed in the 1940s to take over when those tree searches got too big because too many moves were involved. Over the next twenty years, tree-pruning methods were developed and improved so programs could determine which sections of the tree represented moves that should be considered and which sections were unnecessary.
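To make the search ideas above concrete, here is a minimal sketch of a depth-limited minimax search with tree pruning (alpha-beta) and a fallback evaluation function. It is not AlphaGo's code. It assumes a toy game chosen purely for illustration: players alternately take 1 to 3 stones from a pile, and whoever takes the last stone wins. The names `legal_moves`, `evaluate`, `alphabeta` and `best_move` are hypothetical.

```python
from math import inf
from typing import List


def legal_moves(stones: int) -> List[int]:
    """Toy game (an assumption for illustration): take 1, 2 or 3 stones."""
    return [take for take in (1, 2, 3) if take <= stones]


def evaluate(stones: int) -> float:
    """Evaluation function used when the search is cut off early.

    A deliberately crude guess: 0.0 means "no idea who is winning".
    """
    return 0.0


def alphabeta(stones: int, depth: int, alpha: float, beta: float,
              maximizing: bool) -> float:
    """Minimax value of a position, skipping branches that cannot matter."""
    if stones == 0:
        # The previous player took the last stone and won.
        return -1.0 if maximizing else 1.0
    if depth == 0:
        return evaluate(stones)
    if maximizing:
        best = -inf
        for take in legal_moves(stones):
            best = max(best, alphabeta(stones - take, depth - 1,
                                       alpha, beta, False))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # prune: the opponent would never allow this line
        return best
    best = inf
    for take in legal_moves(stones):
        best = min(best, alphabeta(stones - take, depth - 1,
                                   alpha, beta, True))
        beta = min(beta, best)
        if alpha >= beta:
            break  # prune: we already have a better option elsewhere
    return best


def best_move(stones: int, depth: int = 10) -> int:
    """Pick the move whose resulting position has the highest minimax value."""
    return max(legal_moves(stones),
               key=lambda take: alphabeta(stones - take, depth - 1,
                                          -inf, inf, False))


if __name__ == "__main__":
    print(best_move(5), best_move(7))
```

In this toy game the search is small enough to run to the end, so the evaluation function is rarely consulted. In chess or Go the tree is far too large for that, which is why evaluation and pruning matter so much.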

The 1940s also saw the development of neural nets, another common AI technique, which were inspired by biological neural architecture. In the 1950s, reinforcement learning was established, which maximizes rewards over time and creates an improvement loop for the evaluation function. By 1959, machine learning pioneer Arthur Samuel used these techniques to develop the first computer program that learned to play checkers.

Russell explained that the newest technique was that of roll-outs, which have greatly improved the tree search capabilities. Even this technique, though, has been around for at least a decade, if not a couple decades, depending on who one asks.
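As a rough illustration of the roll-out idea, the sketch below scores each candidate move by playing many random games to the end and counting how often the mover wins. It assumes the same kind of toy stone-taking game (take 1 to 3 stones, last stone wins) rather than Go, and it shows plain Monte Carlo evaluation only, not the combination of roll-outs with tree search and neural networks that AlphaGo is described as using. All function names are hypothetical.

```python
import random


def random_playout(stones: int, rng: random.Random) -> bool:
    """Play uniformly random moves until the pile is empty.

    Returns True if the player who moves first in this playout wins.
    Requires stones >= 1.
    """
    first_player_to_move = True
    while True:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return first_player_to_move
        first_player_to_move = not first_player_to_move


def rollout_value(stones_left: int, rollouts: int, rng: random.Random) -> float:
    """Estimated win rate for us after leaving `stones_left` to the opponent."""
    if stones_left == 0:
        return 1.0  # we just took the last stone
    opponent_wins = sum(random_playout(stones_left, rng)
                        for _ in range(rollouts))
    return 1.0 - opponent_wins / rollouts


def rollout_best_move(stones: int, rollouts: int = 2000, seed: int = 0) -> int:
    """Choose the move with the best roll-out win rate."""
    rng = random.Random(seed)
    moves = [take for take in (1, 2, 3) if take <= stones]
    return max(moves,
               key=lambda take: rollout_value(stones - take, rollouts, rng))


if __name__ == "__main__":
    print(rollout_best_move(5))
```

The appeal of the approach is that it needs no hand-written evaluation function: the statistics of many cheap simulated games stand in for one.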

This combination of techniques allowed AlphaGo to learn the rules of Go and play against itself to continuously improve its strategy and ability.

If all of these techniques have been around for so long, why did AlphaGo come as such a surprise to the AI community?

Go is an incredibly large and complicated game, with more possible sequences of moves than there are atoms in the universe. According to Russell, the assumption until recently was that these more standard search and evaluation techniques wouldn’t be as effective for Go as they have been for chess. Researchers believed that in order for a computer program to successfully play Go, it would have to decompose the board game into parts, just as a human player does, and then pull each of those parts back together to analyze the quality of various moves. Research into Go AI hit a plateau for many years, and most AI experts expected that some new technique or capability would have to be developed in order for a program to perform this decomposition of possible outcomes.
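A quick back-of-the-envelope check gives a feel for the size claim. The sketch below uses figures that are assumptions, not numbers from the researchers quoted here: a 19x19 board where each point is empty, black or white (an upper bound that ignores legality), a commonly quoted rough branching factor of 250 and game length of 150 moves, and a commonly quoted rough estimate of 10^80 atoms in the observable universe.

```python
BOARD_POINTS = 19 * 19            # 361 intersections on a standard board
ATOMS_IN_UNIVERSE = 10 ** 80      # rough, commonly quoted estimate (assumption)

# Upper bound on board configurations: each point is empty, black or white.
position_upper_bound = 3 ** BOARD_POINTS

# Crude game-tree size: about 250 choices per move over about 150 moves.
# Both figures are rough assumptions used only for scale.
sequence_estimate = 250 ** 150


def order_of_magnitude(n: int) -> int:
    """Power of ten of a positive integer, e.g. 12345 -> 4."""
    return len(str(n)) - 1


if __name__ == "__main__":
    print("positions  ~ 10^%d" % order_of_magnitude(position_upper_bound))
    print("sequences  ~ 10^%d" % order_of_magnitude(sequence_estimate))
    print("atoms      ~ 10^%d" % order_of_magnitude(ATOMS_IN_UNIVERSE))
```

Even the count of board configurations dwarfs the atom estimate, and the count of move sequences is larger again by well over a hundred orders of magnitude.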

Russell, in fact, had been hoping for such a decomposition result. This method would be much more consistent with how an AI would likely have to interact with the real world.

The success of AlphaGo may at least provide a deeper understanding of just how capable current AI methods and techniques are. In response to AlphaGo’s wins, Russell said,

“It tells you that the techniques may be more powerful than we thought.” He also added, “I’m glad it wasn’t 5-0.”

Victoria Krakovna, a co-founder of FLI and a machine learning researcher, was also impressed by the capabilities exhibited by these known programming techniques.

“To me,” Krakovna said, “this is another indication that AI is progressing faster than even many experts anticipated. The hybrid architecture of AlphaGo suggests that large advances can be made by combining existing components in the right way, in this case deep reinforcement learning and tree search. It’s possible that fewer novel breakthroughs remain than previously believed to reach general AI.”

But does this mean advanced AI is something we should fear?

While AI progress is something we want to move forward with in a safe and robust manner, AI researchers aren’t worried that the development of AlphaGo would soon lead to the destruction of humanity. The subject of jobs, though, is another story.

Even prominent AI researchers who aren’t worrying about AI as an existential risk, like Baidu’s Andrew Ng, are worried about the impact AI will have on the job market. However, that doesn’t mean all hope is lost.

In an email, Erik Brynjolfsson, an MIT economist and coauthor of The Second Machine Age, explained,

“Technology using AI will surely make it possible to eliminate some jobs but it will also make it possible to create new ones. The big question is whether the people who lost the old jobs will be able to do the new jobs. The answer will depend on our choices as individuals and as a society.”

Madeleine Clare Elish is a cultural anthropologist at Columbia University who focuses on the evolving role of humans in large-scale automated and autonomous systems. Her reaction to the BSA survey was similar to Brynjolfsson’s. She said:

“When the media focuses on AI in this way, it appears as if the technology develops on its own when, in fact, it is the product of human choices and values. And that is what we need to be talking about: what are the values that are driving AI innovation, and how can we make better choices that protect the public interest?

“When we rile up public fear about AI in the future, we miss all the things that are happening today. We need more people talking about the challenges facing the implementation of AI technologies, like preventing bias or other forms of unfairness in systems and issues of security and privacy.”

Both Brynjolfsson and Elish made a point of noting the choices we have before us. If we as a society can come together, even just a little, we can steer our future in a direction with a much more optimistic outlook for AI and humanity.



Have Nuclear Weapons Kept the Peace?

Military.com published a piece this week about the Pentagon leaders who recently went to Congress to defend spending $1 trillion to overhaul the nuclear triad. Among the quotes mentioned, one by Army Chief of Staff Gen. Mark Milley stood out:

“I just want to be clear, I don’t have a part of the triad, but I can tell you that in my view … that nuclear triad has kept the peace since nuclear weapons were introduced and has sustained the test of time,” Milley said. “That is not unimportant and the system is deteriorating, Congressman, and it needs to be revamped. It needs to be overhauled.”

Yet, since 1945, when the U.S. dropped atomic bombs on Japan, America has been involved in the Korean War, the Vietnam War, the first Gulf War, the Afghanistan War, and the second Gulf War. Meanwhile, we’ve had a military presence in Cuba, the Dominican Republic, Lebanon, Grenada, Iran, Panama, Somalia, Haiti, Bosnia, Yemen, Kosovo, Libya, Pakistan, Syria, Nigeria, and others.

So, it’s not entirely clear what Milley means when he says the “nuclear triad has kept the peace.”

Perhaps he means more specifically that we haven’t had a direct war against Russia since 1945, but that’s not a particularly meaningful argument, given that we hadn’t gone to war with Russia before 1945 either. One could argue that conflict with Russia escalated as a result of nuclear weapons. According to The Atomic Bomb Website,

“On August 29th, the Soviet Union detonated its first atomic bomb, at the Semipalatinsk Test Site in Kazakhstan. This event ends America’s monopoly of atomic weaponry and launches the Cold War.”

In fact, a large number of the wars and conflicts mentioned above pitted the U.S. against the Soviet Union, making it hard to argue that nuclear weapons really kept the peace between the two countries, even if they didn’t declare all-out war.

Perhaps what Milley meant was that the U.S. hasn’t been directly attacked since the Japanese bombed Pearl Harbor. However, we hadn’t really been attacked prior to that either. It’s not clear that nuclear weapons have offered any more deterrence than the Atlantic and Pacific Oceans provide. After all, Canada and Mexico also haven’t been attacked. Plus, while no country has launched strikes that targeted U.S. soil, we have been attacked by terrorists.

Maybe Milley just meant that there haven’t been nuclear strikes against us because we could retaliate with equally strong force. That may be true. However, if it is, then perhaps the money would be better spent toward research and public education.

Just a small fraction of that $1 trillion could be used to help us better understand the impact of nuclear winter. Current weather models already predict that even a small nuclear war could cause worldwide temperatures to plummet, killing as many as 1 billion people globally. That includes people within the fighting countries.

Perhaps if countries involved understood how devastating a nuclear war would be for their own citizens, they’d prefer reducing rather than upgrading their nuclear arsenals.

Cheap Lasers and Bad Math: The Coming Revolution in Robot Perception

In 2011, I took a ride in Google’s self-driving car. It was an earlier prototype with a monitor mounted on the passenger side where you could see in real time what the car was seeing. It was set for a pretty aggressive test and was laying rubber on the road as we weaved through cones on the closed track. Plainly put, it drove better than any human I’ve ever driven with, and as it drove I could see on the monitor everything that it saw. The entire scene in 360 degrees was lit up. It saw everything around it in a way humans never could. This was the future of auto safety.

As I got out of the car and watched it take other excited passengers around the track I was convinced self-driving cars were right around the corner. Clearly that did not happen. The reason why was not something I would learn about until I got into computer vision work several years later. That beautiful 360 degree situational awareness was driven primarily by one sensor, the Velodyne LIDAR unit.

LIDAR is laser ranging. It uses lasers to measure the distance to objects. It has been used for years by the military and in industrial applications. But it hasn’t really been a consumer technology, experienced by the public. One major reason has been the cost. The LIDAR unit on top of Google’s car in 2011 cost roughly $75,000.
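For readers who want the arithmetic behind laser ranging, here is a minimal sketch. It assumes a pulsed time-of-flight sensor, where distance is inferred from how long a laser pulse takes to bounce back. That measurement principle is an assumption for illustration, not a description of any particular unit's internals, and the function name is hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # speed of light in vacuum


def distance_from_round_trip(round_trip_seconds: float) -> float:
    """Distance to the target in meters.

    The pulse travels out and back, so the one-way distance is half of
    (speed of light x round-trip time).
    """
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0


if __name__ == "__main__":
    # A pulse that returns after 100 nanoseconds hit something about 15 m away.
    print(round(distance_from_round_trip(100e-9), 2))
```

The tiny time scales involved are part of why precise laser ranging hardware has historically been expensive.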

In the years since I took that ride, the size and cost of LIDAR units have decreased. The smaller units can be seen on many robots including Boston Dynamics’ famous BigDog and Atlas robots. These cost about $30,000. In 2015 that cost came down to about $8,000 for a unit with fewer lasers. These units were small enough to comfortably be carried by small consumer drones and were used in robotics projects globally.

As exciting as that was, 2016 puts us on the precipice of an even greater breakthrough, both from established providers and new upstarts on Kickstarter. The sub-$500 LIDAR sensor is about to hit the market. Usually when prices for a technology fall that fast you see it start to pop up in new and interesting places. Along with advances in other vision technologies, such as structured light and structure from motion, LIDAR sensing is about to give robots the ability to navigate through the world around them in ways that were hitherto expensive and difficult.

There are some amazing feats of robot cooperation and precise movement in pre-mapped and controlled environments, from drone swarms navigating quickly through small openings, to high speed precision welding. But this capability of a robot learning its three-dimensional environments, known to roboticists as SLAM, Simultaneous Localization and Mapping, hasn’t been widely available for use in low-cost robots. That is about to change.

Unfortunately the availability of low-cost sensors is only half the problem. Processing power is also a limiting factor in three-dimensional SLAM. Vision processing takes a significant amount of processing power, a difficult challenge in smaller robots and drones. You can do some of this work with off-board computing, but in high speed challenges, like autonomous flight or driving, the risk of communications lag is too great. For safety, it is best to do this computing on board the robot.

A few years ago this presented difficulties, but now we find new hardware on the cusp of solving these problems as well. NVIDIA, long known for its graphics cards, has begun the drive towards greater GPU processing power in embedded systems and robots. The NVIDIA Jetson embedded platform is targeted at Deep Learning AI researchers, but holds great promise for computer vision processing problems.

NVIDIA is not alone. New chips are also being developed by startups with computer vision as a target. For example, Singular Computing is taking the interesting approach of designing chips that are imprecise. Inspired by human neurons, which only fire together when wired together about 90% of the time, Singular explored what could be done in chip architecture if you allowed up to a 1% error in mathematical results. The answer? You could pack about 1000 times the processors in the same space. For doing vision processing this means a picture that is nearly identical to one produced by a traditional computer, in a fraction of the time and for a fraction of the power. The ability for robots to both see and reason could eventually be powered by bad math.
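To see why a small error per operation can be tolerable for vision work, here is a toy simulation. It is not Singular Computing's architecture. It simply assumes a multiplier whose result may be off by up to 1% in either direction, and applies it to a 3x3 averaging blur over made-up pixel values. The names `approx_mul`, `blur_pixel` and `compare` are hypothetical.

```python
import random
from typing import Callable, List, Tuple


def approx_mul(a: float, b: float, rng: random.Random,
               max_rel_error: float = 0.01) -> float:
    """Multiply with a simulated relative error of at most max_rel_error."""
    return a * b * (1.0 + rng.uniform(-max_rel_error, max_rel_error))


def blur_pixel(window: List[float], weights: List[float],
               mul: Callable[[float, float], float]) -> float:
    """Weighted sum of a pixel neighborhood using the supplied multiplier."""
    return sum(mul(pixel, weight) for pixel, weight in zip(window, weights))


def compare(seed: int = 0) -> Tuple[float, float]:
    """Return (exact, approximate) blurred values for one random 3x3 window."""
    rng = random.Random(seed)
    window = [float(rng.randint(0, 255)) for _ in range(9)]  # made-up pixels
    weights = [1.0 / 9.0] * 9
    exact = blur_pixel(window, weights, lambda a, b: a * b)
    approx = blur_pixel(window, weights, lambda a, b: approx_mul(a, b, rng))
    return exact, approx


if __name__ == "__main__":
    exact, approx = compare()
    print(exact, approx, abs(approx - exact) / exact)
```

Because every term in the sum is non-negative, the blurred value can never be off by more than 1% either, and in practice the individual errors partly cancel. A brightness difference that small is hard to see in an image.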

To date most robots have either had a tie to an operator who controlled the robot, or they operated in a very controlled environment such as a factory floor. The idea of robots navigating freely and interacting with people is still very new, and reserved for a few test cases like the Google self-driving car. The coming reduction in cost and increase in capabilities in robotic perception in the next few years are about to forever alter the landscape of where we see robots in our daily lives. So, is the world ready for volumes of self-navigating robots with narrow AIs entering our streets, homes, workplaces, schools, or battlefields in the next five to ten years? This is a question we should all be considering.

*Photo credit Peter Haas.

Eliezer Yudkowsky on AlphaGo’s Wins

The following insightful article was written by Eliezer Yudkowsky and originally posted on his Facebook page.

(Note from the author about this post: Written after AlphaGo’s Game 2 win, posted as it was winning Game 3, and at least partially invalidated by the way AlphaGo lost Game 4. We may not know for several years what truly superhuman Go play against a human 9-pro looks like; and in particular, whether truly superhuman play shares the AlphaGo feature focused on here, of seeming to play games that look even to human players but that the machine player wins in the end. Some of the other points made do definitely go through for computer chess, in which unambiguously superhuman and non-buggy play has been around for years. Nonetheless, take with a grain of salt – I and a number of other people were surprised by AlphaGo’s loss in Game 4, and we don’t know as yet exactly how it happened and to what extent it’s a fundamental flaw versus a surface bug. What we might call the Kasparov Window – machine play that’s partially superhuman but still flawed enough for a human learner to understand, exploit, and sometimes defeat – may be generally wider than I thought.)



As I post this, AlphaGo seems almost sure to win the third game and the match.

At this point it seems likely that Sedol is actually far outclassed by a superhuman player. The suspicion is that since AlphaGo plays purely for probability of long-term victory rather than playing for points, the fight against Sedol generates boards that can falsely appear to a human to be balanced even as Sedol’s probability of victory diminishes. The 8p and 9p pros who analyzed games 1 and 2 and thought the flow of a seemingly Sedol-favoring game ‘eventually’ shifted to AlphaGo later, may simply have failed to read the board’s true state. The reality may be a slow, steady diminishment of Sedol’s win probability as the game goes on and Sedol makes subtly imperfect moves that humans think result in even-looking boards. (E.g., this analysis.)

For all we know from what we’ve seen, AlphaGo could win even if Sedol were allowed a one-stone handicap. But AlphaGo’s strength isn’t visible to us – because human pros don’t understand the meaning of AlphaGo’s moves; and because AlphaGo doesn’t care how many points it wins by, it just wants to be utterly certain of winning by at least 0.5 points.

IF that’s what was happening in those 3 games – and we’ll know for sure in a few years, when there’s multiple superhuman machine Go players to analyze the play – then the case of AlphaGo is a helpful concrete illustration of these concepts:

Edge instantiation.
Extremely optimized strategies often look to us like ‘weird’ edges of the possibility space, and may throw away what we think of as ‘typical’ features of a solution. In many different kinds of optimization problem, the maximizing solution will lie at a vertex of the possibility space (a corner, an edge-case).

In the case of AlphaGo, an extremely optimized strategy seems to have thrown away the ‘typical’ production of a visible point lead that characterizes human play. Maximizing win-probability in Go, at this level of play against a human 9p, is not strongly correlated with what a human can see as visible extra territory – so that gets thrown out even though it was previously associated with ‘trying to win’ in human play.
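The claim that maximizing solutions tend to sit at a vertex can be checked on a tiny example. The sketch below assumes a made-up linear objective and a made-up feasible region (none of this comes from AlphaGo), searches a grid of points inside the region by brute force, and reports where the maximum lands.

```python
from typing import List, Tuple

# Corners of the made-up feasible region: x >= 0, y >= 0, x <= 3, x + y <= 4.
VERTICES: List[Tuple[float, float]] = [(0.0, 0.0), (3.0, 0.0),
                                       (3.0, 1.0), (0.0, 4.0)]


def feasible(x: float, y: float) -> bool:
    """True if the point satisfies every constraint."""
    return x >= 0 and y >= 0 and x <= 3 and x + y <= 4


def objective(x: float, y: float) -> float:
    """Made-up linear objective to maximize."""
    return 2 * x + y


def grid_argmax(step: float = 0.25) -> Tuple[float, float]:
    """Brute-force search over a grid of feasible points."""
    n = int(4 / step)
    points = [(i * step, j * step) for i in range(n + 1) for j in range(n + 1)]
    candidates = [p for p in points if feasible(*p)]
    return max(candidates, key=lambda p: objective(*p))


if __name__ == "__main__":
    best = grid_argmax()
    print(best, best in VERTICES)
```

Nothing in the search knows about corners, yet the winner is one: a point in the interior can always be improved by sliding further in the direction the objective rewards, until a constraint stops it.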

Unforeseen maximum.
Humans thought that a strong opponent would have more visible territory earlier – building up a lead seemed like an obvious way to ensure a win. But ‘gain more territory’ wasn’t explicitly encoded into AlphaGo’s utility function, and turned out not to be a feature of the maximum of AlphaGo’s actual utility function of ‘win the game’, contrary to human expectations of where that maximum would lie.
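A toy calculation shows how the two objectives can come apart. The numbers below are invented for illustration and are not taken from any AlphaGo game: each candidate move is described by a small probability distribution over final point margins, and we compare a player that maximizes expected margin with one that maximizes probability of winning.

```python
from typing import Dict, List, Tuple

# Each move maps to (probability, final point margin) outcomes.
# These distributions are made up for illustration.
MOVES: Dict[str, List[Tuple[float, float]]] = {
    "aggressive": [(0.70, 20.0), (0.30, -5.0)],  # big lead, real risk
    "solid":      [(0.95, 0.5), (0.05, -0.5)],   # tiny lead, nearly certain
}


def expected_margin(outcomes: List[Tuple[float, float]]) -> float:
    """Average final margin over the outcome distribution."""
    return sum(prob * margin for prob, margin in outcomes)


def win_probability(outcomes: List[Tuple[float, float]]) -> float:
    """Total probability of finishing with a positive margin."""
    return sum(prob for prob, margin in outcomes if margin > 0)


def pick(score) -> str:
    """Name of the move that maximizes the given scoring function."""
    return max(MOVES, key=lambda name: score(MOVES[name]))


if __name__ == "__main__":
    print("territory-minded player picks:", pick(expected_margin))
    print("win-minded player picks:", pick(win_probability))
```

The territory-minded player chooses the move with the large visible lead, while the win-minded player chooses the move that leaves the board looking almost even. An observer judging by visible points would rate the second player's move as weaker, even though it wins more often.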

Instrumental efficiency.
The human pros thought AlphaGo was making mistakes. Ha ha.

AlphaGo doesn’t actually play God’s Hand. Similarly, liquid stock prices sometimes make big moves. But human pros can’t detect AlphaGo’s departures from God’s Hand, and you can’t personally predict the net direction of stock price moves.

If you think the best move is X and AlphaGo plays Y, we conclude that X had lower expected winningness than you thought, or that Y had higher expected winningness than you thought. We don’t conclude that AlphaGo made an inferior move.

Thinking you can spot AlphaGo’s mistakes is like thinking your naked eye can see an exploitable pattern in S&P 500 price moves – we start out with a very strong suspicion that you’re mistaken, overriding the surface appearance of reasonable arguments.

Convergence to apparent consequentialism / explanation by final causes.
Early chess-playing programs would do things that humans could interpret in terms of the chess-playing program having particular preferences or weaknesses, like “The program doesn’t understand center strategy very well” or (much earlier) “The program has a tendency to move its queen a lot.”

This ability to explain computer moves in ‘psychological’ terms vanished as computer chess improved. For a human master looking at a modern chess program, their immediate probability distribution on what move the chess algorithm outputs, should be the same as their probability distribution on the question “Which move will in fact lead to a future win?” That is, if there were a time machine that checked the (conditional) future and output a move such that it would in fact lead to a win for the chess program, then your probability distribution on the time machine’s next immediate move, and your probability distribution on the chess program’s next immediate move, would be the same.

Of course chess programs aren’t actually as powerful as time machines that output a Path to Victory; the actual moves output aren’t the same. But from a human perspective, there’s no difference in how we predict the next move, at least if we have to do it using our own intelligence without computer help. At this point in computer chess, a human might as well give up on every part of the psychological explanation for any move a chess program makes, like “It has trouble understanding the center” or “It likes moving its queen”, leaving only, “It output that move because that is the move that leads to a future win.”

This is particularly striking in the case of AlphaGo because of the stark degree to which “AlphaGo output that move because the board will later be in a winning state” sometimes doesn’t correlate with conventional Go goals like taking territory or being up on points. The meaning of AlphaGo’s moves – at least some of the moves – often only becomes apparent later in the game. We can best understand AlphaGo’s output in terms of the later futures to which it leads, treating it like a time machine that follows a Path to Victory.

Of course, in real life, there’s a way AlphaGo’s move was computed and teleological retrocausation was not involved. But you can’t relate the style of AlphaGo’s computation to the style of AlphaGo’s move in any way that systematically departs from just reiterating “that output happened because it will lead to a winning board later”. If you could forecast a systematic departure between what those two explanations predict in terms of immediate next moves, you would know an instrumental inefficiency in AlphaGo.

This is why the best way to think about a smart paperclip maximizer is to imagine a time machine whose output always happens to lead to the greatest number of paperclips. A real-world paperclip maximizer wouldn’t actually have that exactly optimal output, and you can expect that in the long run the real-world paperclip maximizer will get fewer paperclips than an actual time machine would get. But you can never forecast a systematic difference between your model of the paperclip maximizer’s strategy, and what you imagine the time machine would do – that’s postulating a publicly knowable instrumental inefficiency. So if we’re trying to imagine whether a smart paperclip maximizer would do X, we ask “Does X lead to the greatest possible number of expected paperclips, without there being any alternative Y that leads to more paperclips?” rather than imagining the paperclip maximizer as having a psychology.

And even then your expectation of the paperclip maximizer actually doing X should be no stronger than your belief that you can forecast AlphaGo’s exact next move, which by Vingean uncertainty cannot be very high. If you knew exactly where AlphaGo would move, you’d be that smart yourself. You should, however, expect the paperclip maximizer to get at least as many paperclips as you think could be gained from X, unless there’s some unknown-to-you flaw in X and there’s no better alternative.

Cognitive uncontainability.
Human pros can’t predict where AlphaGo will move because AlphaGo searches more possibilities than human pros have time to consider. It’s not just that AlphaGo estimates value differently, but that the solution AlphaGo finds that maximizes AlphaGo’s estimated value, is often outside the set of moves whose value you were calculating.

Strong cognitive uncontainability.
Even after the human pros saw AlphaGo’s exact moves, the humans couldn’t see those moves as powerful strategies, not in advance and sometimes not even after the fact, because the humans lacked the knowledge to forecast the move’s consequences.

Imagine someone in the 11th century trying to figure out how people in the 21st century might cool their houses. Suppose that they had enough computing power to search lots and lots of possible proposals, but had to use only their own 11th-century knowledge of how the universe worked to evaluate those proposals. Suppose they had so much computing power that at some point they randomly considered a proposal to construct an air conditioner. If instead they considered routing water through a home and evaporating the water, that might strike them as something that could possibly make the house cooler, if they saw the analogy to sweat. But if they randomly consider the mechanical diagram of an air conditioner as a possible solution, they’ll toss it off as a randomly generated arcane diagram. They can’t understand why this would be an effective strategy for cooling their house, because they don’t know enough about thermodynamics and the pressure-heat relation.

The gap between the 11th century and the 21st century isn’t just the computing power to consider more alternatives. Even if the 11th century saw the solutions we used, they wouldn’t understand why they’d work – lacking other reasons to trust us, they’d look at the air conditioner diagram and say “Well that looks stupid.”

Similarly, it’s not just that humans lack the computing power to search as many moves as AlphaGo, but that even after AlphaGo plays the move, we don’t understand its consequences. Sometimes later in the game we see the consequences of a good move earlier, but that’s only one possible way that Sedol played out the game, so we don’t understand the value of many other moves. We don’t realize how much expected utility is available to AlphaGo, not just because AlphaGo searches a wider space of possibilities, but because we lack the knowledge needed to understand what AlphaGo’s moves will do.

This is the kind of cognitive uncontainability that would apply if the 11th century was trying to forecast how much cooling would be produced by the best 21st-century solution for cooling a house. From an 11th-century perspective, the 21st century has ‘magic’ solutions that do better than their best imaginable solutions and that they wouldn’t understand even if they had enough computing power to consider them as possible actions.

Go is a domain much less rich than the real world, and it has rigid laws we understand in toto. So superhuman Go moves don’t contain the same level of sheer, qualitative magic that the 21st century has from the perspective of the 11th century. But Go is rich enough to demonstrate strong cognitive uncontainability on a small scale. In a rich and complicated domain whose rules aren’t fully known, we should expect even more magic from superhuman reasoning – solutions that are better than the best solution we could imagine, operating by causal pathways we wouldn’t be able to foresee even if we were told the AI’s exact actions.

For an example of an ultra-complicated poorly understood domain where we should reasonably expect that a smarter intelligence can deploy ‘magic’ in this sense, consider, say, the brain of a human gatekeeper trying to keep an AI in a box. Brains are very complicated, and we don’t understand them very well. So superhuman moves on that gameboard will look to us like magic to a much greater extent than AlphaGo’s superhuman Go moves.

– Patience.
A paperclip maximizer doesn’t bother to make paperclips until it’s finished doing all the technology research and has gained control of all matter in its vicinity, and only then does it switch to an exploitation strategy. Similarly, AlphaGo has no need to be “up on (visible) points” early. It simply sets up the thing it wants, win probability, to be gained at the time it wants it.
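As a toy illustration of the difference between optimizing for win probability and optimizing for visible points, here is a minimal sketch. The move names and all the numbers are hypothetical, made up purely for illustration; nothing here reflects AlphaGo's actual internals or evaluations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    win_probability: float  # estimated chance of winning the game after this move
    expected_margin: float  # estimated final lead in points after this move


def pick_by_win_probability(candidates):
    """Choose the move that maximizes the estimated chance of winning."""
    return max(candidates, key=lambda c: c.win_probability)


def pick_by_margin(candidates):
    """Choose the move that maximizes the estimated point lead."""
    return max(candidates, key=lambda c: c.expected_margin)


# Hypothetical candidate moves (illustrative numbers only).
CANDIDATES = [
    Candidate("greedy_invasion", win_probability=0.62, expected_margin=12.5),
    Candidate("quiet_reinforcement", win_probability=0.71, expected_margin=1.5),
]

if __name__ == "__main__":
    print("win-probability player picks:", pick_by_win_probability(CANDIDATES).name)
    print("point-margin player picks:", pick_by_margin(CANDIDATES).name)
```

A player of the first kind will happily look "behind on points" as long as its estimated chance of winning keeps going up.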

Context change and sudden turns.
By sheer accident of the structure of Go and the way human 9ps play against superior opponents – namely, giving away probability margins they don’t understand while preserving their apparent territory – we’ve ended up with an AI that is apparently not being superhumanly dangerous until, you know, it just happens to win at the end.

Now in this case, that’s happening because of a coincidence of the game structure, not because AlphaGo models human minds and hides how far it’s ahead. I mean, maybe DeepMind deliberately built this version of AlphaGo to exploit human opponents, or a similar pattern emerged from trial-and-error uncovering systems that fought particularly well against human players. But if the architecture is still basically like the October AlphaGo architecture, which seems more probable, then AlphaGo acts as if it’s playing another AlphaGo; that’s how all of the internal training worked and how all of its future forecasts worked in the October version. AlphaGo probably has no model of humans and no comprehension that this time it’s fighting Sedol instead of another computer. So AlphaGo’s underplayed strength isn’t deliberate… probably.

So this is not the same phenomenon as the expected convergent incentive, following a sufficiently cognitively powerful AI noticing a divergence between what it wants and what the programmers want, for that AI to deceive the programmers about how smart it is. Or the convergent instrumental incentive for that AI to not strike out, or even give any sign whatsoever that anything is wrong, until it’s ready to win with near certainty.

But AlphaGo is still a nice accidental illustration that when you’ve been placed in an adversarial relation to something smarter than you, you don’t always know that you’ve lost, or that anything is even wrong, until the end.

Rapid capability gain and upward-breaking curves.
“Oh, look,” I tweeted, “it only took 5 months to go from landing one person on Mars to Mars being overpopulated.” (In reference to Andrew Ng’s claim that worrying about AGI outcomes is like worrying about overpopulation on Mars.)

The case of AlphaGo serves as a possible rough illustration of what might happen later. Later on, there’s an exciting result in a more interesting algorithm that operates on a more general level (I’m not being very specific here, for the same reason I don’t talk about my ideas for building really great bioweapons). The company dumps in a ton of research effort and computing power. 5 months later, a more interesting outcome occurs.

Martian population growth doesn’t always work on smooth, predictable curves that everyone can see coming in advance. The more powerful the AI technology, the more it makes big jumps driven by big insights. As hardware progress goes on, those big insights can be applied over more existing hardware to produce bigger impacts. We’re not even in the recursive regime yet, and we’re still starting to enter the jumpy unpredictable phase where people are like “What just happened?”

Local capability gain.
So far as I can tell, if you look at everything that Robin Hanson said about distributed FOOM and everything I said about local FOOM in the Hanson-Yudkowsky FOOM debate, everything about AlphaGo worked out in a way that matches the “local” model of how things go.

One company with a big insight jumped way ahead of everyone else. This is true even though, since the world wasn’t at stake this time, DeepMind actually published their recipe for the October version of their AI.

AlphaGo’s core is built around a similar machine learning technology to DeepMind’s Atari-playing system – the single, untweaked program that was able to learn superhuman play on dozens of different Atari games just by looking at the pixels, without specialization for each particular game. In the Atari case, we didn’t see a bunch of different companies producing gameplayers for all the different varieties of game. The Atari case was an example of an event that Robin Hanson called “architecture” and doubted, and that I called “insight.” Because of their big architectural insight, DeepMind didn’t need to bring in lots of different human experts at all the different Atari games to train their universal Atari player. DeepMind just tossed all pre-existing expertise because it wasn’t formatted in a way their insightful AI system could absorb, and besides, it was a lot easier to just recreate all the expertise from scratch using their universal Atari-learning architecture.

The October version of AlphaGo did initially seed one of the key components by training it to predict a big human database of games. But Demis Hassabis has suggested that next up after this competition will be getting AlphaGo to train itself in Go entirely from scratch, tossing the 2500-year human tradition right out the window.

More importantly, so far as I know, AlphaGo wasn’t built in collaboration with any of the commercial companies that built their own Go-playing programs for sale. The October architecture was simple and, so far as I know, incorporated very little in the way of all the particular tweaks that had built up the power of the best open-source Go programs of the time. Judging by the October architecture, after their big architectural insight, DeepMind mostly started over in the details (though they did reuse the widely known core insight of Monte Carlo Tree Search). DeepMind didn’t need to trade with any other Go companies or be part of an economy that traded polished cognitive modules, because DeepMind’s big insight let them leapfrog over all the detail work of their competitors.

Frankly, this is just how things have always worked in the AI field and I’m not sure anyone except Hanson expects this to change. But it’s worth noting because Hanson’s original reply, when I pointed out that no modern AI companies were trading modules as of 2008, was “That’s because current AIs are terrible and we’ll see that changing as AI technology improves.” DeepMind’s current AI technology is less terrible. The relevant dynamics haven’t changed at all. This is worth observing.

– Human-equivalent competence is a small and undistinguished region in possibility-space.
As I tweeted early on when the first game still seemed in doubt, “Thing that would surprise me most about #alphago vs. #sedol: for either player to win by three games instead of four or five.”
Since DeepMind picked a particular challenge time in advance, rather than challenging at a point where their AI seemed just barely good enough, it was improbable that they’d make exactly enough progress to give Sedol a nearly even fight.

AI is either overwhelmingly stupider or overwhelmingly smarter than you. The more other AI progress and the greater the hardware overhang, the less time you spend in the narrow space between these regions. There was a time when AIs were roughly as good as the best human Go-players, and it was a week in late January.

[UPDATE]: AlphaGo’s strange, losing play in the 4th game suggests that playing seemingly-near-even games might possibly be a ‘psychological’ feature of the Monte Carlo algorithm rather than alien-efficient play. But again, we’ll know for sure in a few years when there are debugged, unambiguously superhuman machine Go players.

That doesn’t mean AlphaGo is only slightly above Lee Sedol, though. It probably means it’s “superhuman with bugs”. That’s one of the Interesting scenarios that MIRI hasn’t been trying to think through in much detail, because (1) it’s highly implementation-dependent and I haven’t thought of anything general to say, and (2) it only shows up in AGI scenarios with limited or no AI self-improvement, and we’re only just starting to turn our attention to those. As it stands, it seems AlphaGo plays a mix of mostly ‘stupid’ moves that are too smart for humans to comprehend, plus a few ‘stupid’ moves that are actually stupid. Let’s try to avoid this scenario in real-world AGI.

Note that machine chess has been out of the flawed superhuman regime and well into the pure superhuman regime for years now. So everything above about instrumental efficiency goes through for machine chess without amendment – we say ‘ha ha’ to any suggestion that the smartest human can see a flaw in play with their naked eye, if you think X is best and the machine player does Y then we conclude you were wrong, and so on.

Who’s to Blame (Part 6): Potential Legal Solutions to the AWS Accountability Problem

The law abhors a vacuum.  So it is all but certain that, sooner or later, international law will come up with mechanisms for fixing the autonomous weapon system (AWS) accountability problem.  How might the current AWS accountability gap be filled?

The simplest solution—and the one advanced by Human Rights Watch (HRW) and the not-so-subtly-named Campaign to Stop Killer Robots (CSKR)—is to ban “fully autonomous” weapon systems completely.  As noted in the second entry in this series, the HRW defines such an AWS as one that can select and engage targets without specific orders from a human commander (that is, without human direction) and operate without real-time human supervision (that is, monitoring and control). One route to such a ban would be adding an AWS-specific protocol to the Convention on Certain Conventional Weapons (CCW), which covers incendiary weapons, landmines, and a few other categories of conventional (i.e., not nuclear, biological, or chemical) weapons. The signatories to the CCW held informal meetings on AWSs in May 2014 and April 2015, but it does not appear that the addition of an AWS protocol to the CCW is under formal consideration.

In any event, there is ample reason to question whether the CCW would be an effective vehicle for regulating AWSs. The current CCW contains few outright bans on the weapons it covers (the CCW protocol on incendiary weapons does not bar the napalming of enemy forces) and has no mechanisms whatsoever for verification or enforcement.  The CCW’s limited impact on landmines is illustrated by the fact that the International Campaign to Ban Landmines (which, incidentally, seriously needs to hire someone to design a new logo) was created nine years after the CCW’s protocol covering landmines went into effect.

Moreover, even an outright ban on “fully” autonomous weapons does not adequately account for the fact that weapon systems can have varying types and degrees of autonomy.  Serious legal risks would still accompany the deployment of AWSs with only limited autonomy, but those risks would not be covered by a ban on fully autonomous weapons.

A more balanced solution might require continuous human monitoring and adequate means of control whenever an AWS is deployed in combat, with a presumption of negligence (and therefore command responsibility) attaching to the commander responsible for monitoring and controlling an AWS that commits an illegal act.  That presumption could only be overcome if the human being shows that he or she exercised reasonable care in monitoring and controlling the AWS.  This would ensure that at least one human being would always have a strong legal incentive to supervise an AWS that is engaged in combat operations.

An even stronger form of command responsibility based on strict liability might seem tempting at first, but applying a strict liability standard to command responsibility for AWSs would be problematic because, as noted in the previous entry in this series, multiple officers in the chain of command may play a role in deciding whether, when, where, and how to deploy an AWS during a particular operation (to say nothing of the personnel responsible for designing and programming the AWS).  It would be difficult to fairly determine how far up (or down) the chain of command and how far back in time criminal responsibility should attach.

Much, much more can and will be said about each of the above topics in the coming weeks and months.  For now, here are a few recommendations for deeper discussions on the legal accountability issues surrounding AWSs:

  • Human Rights Watch, Mind the Gap: The Lack of Accountability for Killer Robots (2015)
  • International Committee of the Red Cross, Autonomous weapon systems: Technical, military, legal and humanitarian aspects (2014)
  • Michael N. Schmitt & Jeffrey S. Thurnher, “Out of the Loop”: Autonomous Weapon Systems and the Law of Armed Conflict, 4 Harv. Nat’l Sec. J. 231 (2013)
  • Gary D. Solis, The Law of Armed Conflict: International Humanitarian Law in War (2015), chapters 10 (“Command Responsibility and Respondeat Superior”) and 16 (“The 1980 Certain Conventional Weapons Convention”)
  • U.S. Department of Defense Directive No. 3000.09 (“Autonomy in Weapon Systems”), issued Nov. 21, 2012
  • Wendell Wallach & Colin Allen, Framing Robot Arms Control, 15 Ethics and Information Technology 125 (2013)

The Increasing Risk of Nuclear War

On Saturday, February 27, thousands of activists marched through London in opposition to the country’s nuclear weapons policy. Meanwhile, in the United States, Russia, and now China, nuclear tensions may be escalating.

“It takes about 30 minutes for a missile to fly between the United States and Russia,” a recent report by the Union of Concerned Scientists (UCS) reminded readers. The U.S. and Russia keep their missiles on hair-trigger alerts which allow them to be launched within minutes, and both countries have been investing in upgrading their nuclear arsenals.

Another recent UCS report, China’s Military Calls for Putting Its Nuclear Forces on Alert, analyzes the risk that China’s military leaders may also soon call for putting their own nuclear weapons on high-alert. Gregory Kulacki, author of the China report, argues that this could be incredibly dangerous. He explains:

“The experience with U.S. and Soviet/Russian warning systems, especially early in their deployment and operation when hardware and procedures were not yet reliable, illustrates the dangers of maintaining the option to launch on warning. Such risks are especially acute in a crisis.”

In fact, there have been over two dozen known nuclear close calls involving the U.S. and Russia, and likely many more that haven’t been declassified. One of the scariest examples occurred in 1983, during a time of tension between Russia and the U.S. A Soviet satellite showed five land-based missiles heading straight for the Soviet Union, and Stanislav Petrov, the officer on duty, had only a few minutes to decide whether or not it was a false alarm. Fortunately, he disregarded all evidence to the contrary and concluded it was false. Investigations later found that the sun’s reflection off the clouds had tricked the satellite into detecting a missile launch. Hair-trigger alert policies only increase the chances that accidents like this could occur, inadvertently triggering a nuclear war.

These policies were originally put in place as an act of deterrence. However, as information about more of these close calls has been released, many are concerned about the Cold-War policies that are still in effect.

As David Wright with the UCS says, “Twenty-five years after the end of the Cold War, the United States and Russia continue to keep nearly 2,000 nuclear weapons constantly on high alert, ready to be launched in minutes.”

Given the concerns associated with the hair-trigger alert policies of the U.S. and Russia, it’s no surprise that adding a third country to the mix would only exacerbate the situation.

More surprising might be why China’s military leaders are reconsidering their policy. According to Kulacki,

“The nuclear weapons policies of the United States are the most prominent external factors influencing Chinese advocates for raising the alert level of China’s nuclear forces.”

The risks that could potentially arise if China changes its policy are hard to measure, but with the close calls already seen between the U.S. and Soviet Russia, a third country with similar policies would only further increase the risk of a devastating, accidental nuclear exchange.

When they originally began their nuclear program, Chinese leaders committed to a no-first-use policy, and they’ve stuck to that, keeping their nuclear warheads separate from their missiles. Only if they’re attacked first will the Chinese assemble their nuclear weapons and strike back. However, the Chinese are becoming increasingly concerned with what they perceive the U.S. nuclear stance against them to be. Kulacki explains:

“The authors of the 2013 Academy of Military Sciences’ textbook The Science of Military Strategy clearly believe U.S. actions are calling into question the credibility of China’s ability to retaliate after a U.S. nuclear attack, and that an effective way to respond would be to raise the alert level of China’s nuclear forces so they can be launched on warning of an incoming nuclear attack.”

With the recent developments of U.S. high-precision conventional and nuclear weapons, along with the Chinese belief that the U.S. is unwilling to recognize joint vulnerability, China’s leaders may decide to change their own policies to match those of the U.S. In the China report, Kulacki recommends five steps for the United States to take to improve its relationship with China:

“Acknowledge mutual vulnerability with China.”
“Reject rapid-launch options.”
“Adopt a ‘sole purpose’ nuclear doctrine.”
“Limit ballistic missile defenses.”
“Discuss impacts of new conventional capabilities.”

“U.S. officials have to realize that China is contemplating these changes because it believes the United States is unwilling to reduce the role of nuclear weapons in its national security strategy— what President Obama promised to do in his famous speech in Prague in 2009,” Kulacki added. “What the U.S. says and does regarding nuclear weapons has a profound effect on Chinese thinking. And right now, we’re pushing China in the wrong direction.”

The Human Dethroning Continues: AlphaGo Beats Go Champion Se-dol in First Match

March 10: AlphaGo again proved its prowess at the game of Go, beating world champion Lee Se-dol in game two of the five-game match up, which was aired live on YouTube. AlphaGo made some surprise moves during the game that many commentators suspected were errors, but which later turned out to give the program an advantage.

Speaking after the game, DeepMind founder Demis Hassabis said, “I think I’m a bit speechless actually. It was a really amazing game today, unbelievably exciting and incredibly tense … Quite surprising and quite beautiful moves, according to the commentators, which was pretty amazing to see.”

Lee Se-dol also expressed his awe of the program, saying, “Yesterday I was surprised, but today, more than that, I am quite speechless … From the very beginning of the game, there was not a moment in time that I felt that I was leading … Today, I really feel that AlphaGo had played a near perfect game.”

Michael Redmond, a 9-dan Go champion, has been the English-speaking commentator for the games. Yesterday, he expressed his hope that he would see more advanced moves from AlphaGo than what he saw in the October games against Fan Hui. After this second match, he said this was, in fact, “different from the games played in October.” He also added, “I was very, very impressed by the way AlphaGo did play some innovative and adventurous, dangerous looking moves, then actually made them work.”

When asked about DeepMind’s confidence in AlphaGo’s chances of winning as the game progressed, Hassabis explained, “AlphaGo has an estimate all the way through the game of how it thinks it’s doing. It’s not always correct though, of course.” While AlphaGo was showing confidence around the middle of the game, the commentators were much less certain about the program’s actions, and “the team wasn’t very confident.” However, he added, “AlphaGo seemed to know what was happening.”

Lee Se-dol chuckled at the question about AlphaGo’s weaknesses, saying, “Well, I guess I lost the game because I wasn’t able to find out what the weakness is.” Hassabis, though, expressed hope that games like these, against people as talented as Se-dol, would help expose weaknesses.

Se-dol ended the conference saying, “The third game is not going to be easy for me, but I am going to exert my best efforts so I can win.”


March 9: Over 1.2 million people have already tuned in to YouTube to see DeepMind’s AlphaGo beat world champion Lee Se-dol in their first of five Go matches. AI is growing stronger, but at the end of 2015, most AI experts had anticipated it would be at least a decade before a computer program could tackle the game of Go. Most AI experts, that is, except those at DeepMind.

This particular game comes after DeepMind’s publication in January about AlphaGo’s triumph over the European champion, Fan Hui, in October of 2015. After mastering old Atari video games, DeepMind set its deep learning skills on Go, which is considered to be the most complex board game ever invented – there are more possible board positions in Go than there are atoms in the observable universe.
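The comparison with atoms in the universe can be sanity-checked with a rough upper bound. The sketch below assumes a standard 19×19 board where each point is empty, black, or white, and takes 10^80 as a commonly quoted order-of-magnitude estimate for atoms in the observable universe (an assumption, not a figure from this article). The bound overcounts, since it includes arrangements that are not legal positions.

```python
# Each of the 19 x 19 = 361 points can be empty, black, or white.
BOARD_POINTS = 19 * 19
UPPER_BOUND_CONFIGURATIONS = 3 ** BOARD_POINTS

# Assumption: commonly quoted order-of-magnitude estimate, not an exact count.
ATOMS_ESTIMATE = 10 ** 80


def order_of_magnitude(n: int) -> int:
    """Return the power of ten of a positive integer (e.g. 1234 -> 3)."""
    return len(str(n)) - 1


if __name__ == "__main__":
    print("board configurations (upper bound): about 10 **",
          order_of_magnitude(UPPER_BOUND_CONFIGURATIONS))
    print("atoms (assumed estimate): about 10 **",
          order_of_magnitude(ATOMS_ESTIMATE))
    print("bound exceeds atom estimate:",
          UPPER_BOUND_CONFIGURATIONS > ATOMS_ESTIMATE)
```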

Traditional search trees, which are often used in AI to consider all possible options, were not feasible for the game of Go, so AlphaGo was designed to combine advanced search trees and deep neural networks. The result was a program that bested a regional champion initially, and then continued to improve its skill for five months to successfully challenge (at least for game one) the world champion.

Se-dol initially claimed he didn’t believe AlphaGo would be ready to beat him by March, though he acknowledged the program would be able to beat him eventually. However, many news outlets reported that after he learned how the AlphaGo algorithms worked he grew more concerned, changing his original prediction of beating AlphaGo 5-0 to 4-1.

It was a close match throughout. Commentator Michael Redmond, also a 9-dan (highest level) Go champion, repeatedly mentioned how much more aggressively AlphaGo played in this match up as compared to the games in October against Fan Hui. Redmond was uncertain as to whether AlphaGo had improved its technique, or if the program changed its playing style to match its opponent.

By the end of this first match, the Go community was in shock:

“A computer has just beaten a 9-dan professional,” said stunned commentator Chris Garlock.

“It’s happened once, it’s probably going to happen again now,” responded Redmond.

The AI community was equally impressed. In an article in January about AlphaGo’s first success, Bart Selman, Francesca Rossi, and Stuart Russell all weighed in on what this meant for AI progress. We followed up with them again to get their input on what this latest AlphaGo victory means for AI progress.

“AlphaGo’s win in its first game against Lee Sedol is very impressive. The win suggests that progress in AI is clearly accelerating, especially given that AlphaGo is a clear demonstration of the power of combining advances in deep learning with more traditional AI techniques such as randomized game tree search,” said Selman.

Russell explained, “This provides further evidence that the core techniques of deep learning are surprisingly powerful. Perhaps even more impressive than the victory is the fact that AlphaGo’s ability to evaluate board positions means that it’s better than all previous programs *even with search turned off*, i.e., when it’s not looking ahead at all. I would also imagine that there are much greater improvements yet to be found in its ability to direct its search process to get the best decision from the least amount of work.”

Rossi is also interested to see how programs like AlphaGo will work with the public in the future, rather than in competition. She explained, “It is certainly very exciting to follow the series of matches between AlphaGo and Lee Sedol. No matter who will win at the end of the 5 match series, this is an occasion for deep and fruitful discussion on innovative AI techniques, as well as on where AI should focus its efforts. Life is certainly more complex than Go, as my IBM Watson colleagues know from the work they are already doing in healthcare, education and other areas of great importance to society. I hope that the techniques used by AlphaGo to master this game eventually can be useful also to solve real life scenarios, where the key will be cooperation, rather than competition, between humans and intelligent machines.”

At the start of the game, Redmond pointed out:

“It’d be interesting to see if some computer program might come up with something different … I’m really interested to see a computer program, which will eventually not be influenced so much by humans, and could come up with something that’s brand new.”

Perhaps that’s next?

We’ll continue to update this article to cover the games, each of which can be viewed live on YouTube at 04:00 GMT (11:00 PM EST).

AlphaGo and AI Progress

Tomorrow, March 9, DeepMind’s AlphaGo begins its quest to beat the reigning world champion of Go, Lee Se-dol. In anticipation of the event, we’re pleased to feature this excellent overview of the impact of AlphaGo on the AI field, written by Miles Brundage. Don’t forget to tune in to YouTube March 9-15 for the full tournament!


AlphaGo’s victory over Fan Hui has gotten a lot of press attention, and relevant experts in AI and Go have generally agreed that it is a significant milestone. For example, Jon Diamond, President of the British Go Association, called the victory a “large, sudden jump in strength,” and AI researchers Francesca Rossi, Stuart Russell, and Bart Selman called it “important,” “impressive,” and “significant,” respectively.

How large/sudden and important/impressive/significant was AlphaGo’s victory? Here, I’ll try to at least partially answer this by putting it in a larger context of recent computer Go history, AI progress in general, and technological forecasting. In short, it’s an impressive achievement, but considering it in this larger context should cause us to at least slightly decrease our assessment of its size/suddenness/significance in isolation. Still, it is an enlightening episode in AI history in other ways, and merits some additional commentary/analysis beyond the brief snippets of praise in the news so far. So in addition to comparing the reality to the hype, I’ll try to distill some general lessons from AlphaGo’s first victory about the pace/nature of AI progress and how we should think about its upcoming match against Lee Sedol.

What happened

AlphaGo, a system designed by a team of 15-20 people[1] at Google DeepMind, beat Fan Hui, three-time European Go champion, in 5 out of 5 formal games of Go. Hui also won 2 out of 5 informal games with less time per move (for more interesting details often unreported in press accounts, see also the relevant Nature paper). The program is stronger at Go than all previous Go engines (more on the question of how much stronger below).

How it was done

AlphaGo was developed by a relatively large team (compared to those associated with other computer Go programs), using significant computing resources (more on this below). The program combines neural networks and Monte Carlo tree search (MCTS) in a novel way, and was trained in multiple phases involving both supervised learning and self-play. Notably from the perspective of evaluating its relation to AI progress, it was not trained end-to-end (though according to Demis Hassabis at AAAI 2016, they may try to do this in the future). It also used some hand-crafted features for the MCTS component (another point often missed by observers). The claimed contributions of the relevant paper are the ideas of value and policy networks, and the way they are integrated with MCTS. Data in the paper indicate that the system was stronger with these elements than without them.
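To make the idea of integrating policy and value estimates with tree search a bit more concrete, here is a minimal sketch of one selection step. It is not DeepMind's code and not the paper's exact rule: the names `EdgeStats`, `select_move`, `backup`, and the `exploration` constant are illustrative assumptions, and the scoring formula is a simplified variant in which a move's prior (standing in for a policy network output) earns an exploration bonus that shrinks with visits, while backed-up evaluations (standing in for value estimates) form the average value.

```python
import math
from dataclasses import dataclass


@dataclass
class EdgeStats:
    prior: float              # stand-in for the policy network's move probability
    visits: int = 0           # number of simulations that took this move
    total_value: float = 0.0  # sum of evaluations backed up through this move

    @property
    def mean_value(self) -> float:
        """Average backed-up evaluation; 0.0 for an unvisited move."""
        return self.total_value / self.visits if self.visits else 0.0


def select_move(edges: dict, exploration: float = 1.0) -> str:
    """Pick the move with the best mean value plus a prior-weighted bonus."""
    total_visits = sum(edge.visits for edge in edges.values())

    def score(item):
        _, edge = item
        bonus = exploration * edge.prior * math.sqrt(total_visits + 1) / (1 + edge.visits)
        return edge.mean_value + bonus

    return max(edges.items(), key=score)[0]


def backup(edge: EdgeStats, value: float) -> None:
    """Record one simulation result (a stand-in for a value estimate)."""
    edge.visits += 1
    edge.total_value += value


if __name__ == "__main__":
    edges = {"a": EdgeStats(prior=0.7), "b": EdgeStats(prior=0.3)}
    print("first pick:", select_move(edges))  # the prior dominates before any visits
    for _ in range(5):
        backup(edges["a"], -1.0)              # hypothetical poor evaluations of "a"
    print("later pick:", select_move(edges))
```

The point of the sketch is only that the prior steers early search toward plausible moves, while accumulated evaluations can override it as evidence comes in.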

Overall AI performance vs. algorithm-specific progress

Among other insights that can be gleaned from a careful study of the AlphaGo Nature paper, one is particularly relevant for assessing the broader significance of this result: the critical role that hardware played in improving AlphaGo’s performance. Consider the figures below, which I’ll try to contextualize.


This figure shows the estimated Elo rating and rank of a few different computer Go programs and Fan Hui. Elo ratings indicate the expected probability of defeating higher/lower ranking opponents – so, e.g. a player with 200 points more than her opponent is expected to win about three quarters of the time. Already, we can note some interesting things. Ignoring the pink bars (which indicate performance with the advantage of extra stones), we can see that AlphaGo, distributed or otherwise, is significantly stronger than Crazy Stone and Zen, previously among the best Go programs. AlphaGo is in the low professional range (“p” on the right hand side) and the others are in the high amateur range (“d” for “dan” on the right hand side). Also, we can see that while distributed AlphaGo is just barely above the range of estimated skill levels for Fan Hui, non-distributed AlphaGo is not (distributed AlphaGo is the one that actually played against Fan Hui). It looks like Fan Hui may have won at least some, if not all, games against non-distributed AlphaGo.
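For readers who want to check the "three quarters" figure, here is a short sketch using the standard logistic Elo formula with a 400-point scale. I'm assuming the ratings in the figure follow that convention; the function name and the example ratings are illustrative.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (roughly, win probability) of player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    # A 200-point advantage, at arbitrary illustrative rating levels.
    print(round(elo_expected_score(3000, 2800), 3))
    # Equal ratings give an even match.
    print(elo_expected_score(2800, 2800))
```

Only the rating difference matters, so the same 200-point gap gives the same expectation at any absolute level.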

I’ll say more about the differences between these two, and other AlphaGo variants, below, but for now, note one thing that’s missing from this figure: very recent Go programs. In the weeks and months leading up to AlphaGo’s victory, there was significant activity and enthusiasm in the Go community (though the work came from much smaller teams, e.g. 1-2 people at Facebook) about two Go engines – darkforest (and its variants, with the best being darkfmcts3) made by researchers at Facebook, and Zen19X, a new and experimental version of the highly ranked Zen program. Note that in January of this year, Zen19X was briefly ranked in the 7d range on the KGS Server (used for human and computer Go), reportedly due to the incorporation of neural networks. Darkfmcts3 achieved a solid 5d ranking, a 2-3 dan improvement over where it was just a few months earlier, and the researchers behind it indicated in papers that there were various readily available ways to improve it. Indeed, in the most recent KGS Computer Go tournament, according to the most recent version of their paper on these programs, Tian and Zhu said that they would have won against a Zen variant if not for a glitch (contra Hassabis who said darkfmcts3 lost to Zen – he may not have read the relevant footnote!). Computer Go, to summarize, was already seeing a lot of progress via the incorporation of deep learning prior to AlphaGo, and this would slightly reduce the delta in the figure above (which was probably produced a few months ago), but not eliminate it entirely.

So, back to the hardware issue. Silver and Huang et al. at DeepMind evaluated many variants of AlphaGo, summarized as AlphaGo and AlphaGo Distributed in the figure above. But this does not give a complete picture of the variation driven by hardware differences, which the next figure (also from the paper) sheds light on.



This figure shows the estimated Elo rating of several variants of AlphaGo. The 11 light blue bars are from “single machine” variants, and the dark blue ones involve distributing AlphaGo across multiple machines. But what is this machine exactly? The “threads” indicated here are search threads, and by looking in a later figure in the paper, we can find that the least computationally intensive AlphaGo version (the shortest bar shown here) used 48 CPUs and 1 GPU. For reference, Crazy Stone does not use any GPUs, and uses slightly fewer CPUs. After a brief search into the clusters currently used for different Go programs, I was unable to find any using more than 36 or so CPUs. Facebook’s darkfmcts3 is the only version I know of that definitely uses GPUs, and it uses 64 GPUs in the biggest version and 8 CPUs (so, more GPUs than single machine AlphaGo, but fewer CPUs).  The single machine AlphaGo bar used in the previous figure, which indicated a large delta over prior programs, was based on the 40 search thread/48 CPU/8 GPU variant. If it were to show the 48 CPU/1 GPU version, it would be only slightly higher than Crazy Stone and Zen – and possibly not even higher than the very latest Zen19X version, which may have improved since January.

Perhaps the best comparison to evaluate AlphaGo against would be darkfmcts3 on equivalent hardware, but they use different configurations of CPUs/GPUs and darkfmcts3 is currently offline following AlphaGo’s victory. It would also be interesting to try scaling up Crazy Stone or Zen19X to a cluster comparable to AlphaGo Distributed, to further parse the relative gains in hardware-adjusted performance discussed earlier. In short, it’s not clear how much of a gain in performance there was over earlier Go programs for equivalent hardware – probably some, but certainly not as great as between earlier Go programs on small clusters and AlphaGo on the massive cluster ultimately used, which we turn to next.

AlphaGo Distributed, in its largest variant, used 280 GPUs and 1920 CPUs. This is significantly more computational power than any prior reported Go program used, and a lot of hardware in absolute terms. The size of this cluster is noteworthy for two reasons. First, it calls into question the extent of the hardware-adjusted algorithmic progress that AlphaGo represents, and relatedly, the importance of the value and policy networks. If, as I’ve suggested in a recent AAAI workshop paper, “Modeling Progress in AI,” we should keep track of multiple states of the art in AI as opposed to a singular state of the art, then comparing AlphaGo Distributed to, e.g. Crazy Stone, is to compare two distinct states of the art – performance given small computational power (and a small team, for that matter) and performance given massive computational power and the efforts of over a dozen of the best AI researchers in the world.

Second, it is notable that hardware alone enabled AlphaGo to span a very large range of skill levels (in human terms) – from around an Elo score of 2200 at the lowest reported level up to well over 3000, which is the difference between amateur and pro level skills. This may suggest (an issue I’ll return to below) that in the space of possible skill levels, humans occupy a fairly small band. It seems possible that if this project had been carried out, say, 10 or 20 years from now, the skill level gap traversed thanks to hardware could have been from amateur to superhuman (beyond pro level) in one leap, with the same algorithmic foundation. Moreover, 10 or 20 years ago, it would likely not have been possible to develop a superhuman Go agent even with this same set of algorithms. Perhaps it was only around now that the AlphaGo project made sense to undertake, given progress in hardware (though other developments in recent years also made a difference, like neural network improvements and MCTS).

Additionally, as also discussed briefly in “Modeling Progress in AI,” we should take into account the relationship between AI performance and the data used for training when assessing the rate of progress. AlphaGo used a large game dataset from the KGS servers – I have not yet looked carefully at what data other comparable AIs have used to train on in the past, but it seems possible that this dataset, too, helped enable AlphaGo’s performance. Hassabis at AAAI indicated DeepMind’s intent to try to train AlphaGo entirely with self-play. This would be more impressive, but until that happens, we may not know how much of AlphaGo’s performance depended on the availability of this dataset, which DeepMind gathered on its own from the KGS servers.

Finally, in addition to adjusting for hardware and data, we should also adjust for effort in assessing how significant an AI milestone is. With Deep Blue, for example, significant domain expertise was used to develop the AI that beat Garry Kasparov, rather than a system learning from scratch and thus demonstrating domain-general intelligence. Hassabis at AAAI and elsewhere has argued that AlphaGo represents more general progress in AI than did Deep Blue, and that the techniques used were general purpose. However, the very development of the policy and value network ideas for this project, as well as the specific training regimen used (a sequence of supervised learning and self-play, rather than end-to-end learning), was itself informed by the domain-specific expertise of researchers like David Silver and Aja Huang, who have substantial computer Go and Go expertise. While AlphaGo ultimately exceeded their skill levels, the search for algorithms in this case was informed by this specific domain (and, as mentioned earlier, part of the algorithm encoded domain-specific knowledge – namely, the MCTS component). Also, the team was large – 15-20 people [1], significantly more than prior Go engines that I’m aware of, and more comparable to large projects like Deep Blue or Watson in terms of effort than anything else in computer Go history. So, if we should reasonably expect a large team of some of the smartest, most expert people in a given area working on a problem to yield progress on that problem, then the scale of this effort suggests we should slightly update downwards our impression of the significance of the AlphaGo milestone. This is in contrast to what we should have thought if, e.g. DeepMind had simply taken their existing DQN algorithm, applied it to Go, and achieved the same result. At the same time, innovations inspired by a specific domain may have broad relevance, and value/policy networks may be a case of this. It’s still a bit early to say.

In conclusion, while it may turn out that value and policy networks represent significant progress towards more general and powerful AI systems, we cannot necessarily infer that just from AlphaGo having performed well, without first adjusting for hardware, data, and effort. Also, regardless of whether we see the algorithmic innovations as particularly significant, we should still interpret these results as signs of the scalability of deep reinforcement learning to larger hardware and more data, as well as the tractability of previously-seen-as-difficult problems in the face of substantial AI expert effort, which themselves are important facts about the world to be aware of.

Expert judgment and forecasting in AI and Go

In the wake of AlphaGo’s victory against Fan Hui, much was made of the purported suddenness of this victory relative to expected computer Go progress. In particular, people at DeepMind and elsewhere have made comments to the effect that experts didn’t think this would happen for another decade or more. One person who said such a thing is Remi Coulom, designer of Crazy Stone, in a piece in Wired magazine. However, I’m aware of no rigorous effort to elicit expert opinion on the future of computer Go, and it was hardly unanimous that this milestone was that long off. I and others, well before AlphaGo’s victory was announced, said on Twitter and elsewhere that Coulom’s pessimism wasn’t justified. Alex Champandard noted that at a gathering of game AI experts a year or so ago, it was generally agreed that Go AI progress could be accelerated by a concerted effort by Google or others. At AAAI last year, I also asked Michael Bowling, who knows a thing or two about game AI milestones (having developed the AI that essentially solved heads-up limit Texas Hold’em), how long it would take before superhuman Go AI existed, and he gave it a maximum of five years. So, again, this victory being sudden was not unanimously agreed upon, and claims that it was long off are arguably based on cherry-picked and unscientific expert polls.

Still, it did in fact surprise some people, including AI experts, and people like Remi Coulom are hardly ignorant of Go AI. So, if this was a surprise to experts, should that itself be surprising? No. Expert opinion on the future of AI has long been known to be unreliable. I survey some relevant literatures on this issue in “Modeling Progress in AI,” but briefly, we already knew that model-based forecasts beat intuitive judgments, that quantitative technology forecasts generally beat qualitative ones, and various other things that should have led us to not take specific gut feelings (as opposed to formal models/extrapolations thereof) about the future of Go AI that seriously. And among the few actual empirical extrapolations that were made of this, they weren’t that far off.

Hiroshi Yamashita extrapolated the trend of computer Go progress as of 2011 into the future and predicted a crossover point to superhuman Go in 4 years, which was one year off. In recent years, there was a slowdown in the trend (based on highest KGS rank achieved) that probably would have led Yamashita or others to adjust their calculations if they had redone them, say, a year ago, but in the weeks leading up to AlphaGo’s victory, again, there was another burst of rapid computer Go progress. I haven’t done a close look at what such forecasts would have looked like at various points in time, but I doubt they would have suggested 10 years or more to a crossover point, especially taking into account developments in the last year. Perhaps AlphaGo’s victory was a few years ahead of schedule based on reported performance, but it should always have been possible to anticipate some improvement beyond the (small team/data/hardware-based) trend based on significant new effort, data, and hardware being thrown at the problem. Whether AlphaGo deviated from the appropriately-adjusted trend isn’t obvious, especially since there isn’t really much effort going into rigorously modeling such trends today. Until that changes and there are regular forecasts made of possible ranges of future progress in different domains given different effort/data/hardware levels, “breakthroughs” may seem more surprising than they really should be.

Lessons re: the nature/pace of AI progress in general

The above suggested that we should at least slightly downgrade our extent of surprise/impressedness regarding the AlphaGo victory. However, I still think it is an impressive achievement, even if it wasn’t sudden or shocking. Rather, it is yet another sign of all that has already been achieved in AI, and the power of various methods that are being used.

Neural networks play a key role in AlphaGo. That they are applicable to Go isn’t all that surprising, since they’re broadly applicable – a neural network can in principle represent any computable function. But AlphaGo is another sign that they can not only in principle learn to do a wide range of things, but can do so relatively efficiently, i.e. in a human-relevant amount of time, with the hardware that currently exists, on tasks that are often considered to require significant human intelligence. Moreover, they are able to not just do things commonly (and sometimes dismissively) referred to as “pattern recognition” but also represent high level strategies, like those required to excel at Go. This scalability of neural networks (not just to larger data/computational power but to different domains of cognition) is indicated by not just AlphaGo but various other recent AI results. Indeed, even without MCTS, AlphaGo outperformed all existing systems with MCTS, one of the most interesting findings here and one that has been omitted in some analyses of AlphaGo’s victory. AlphaGo is not alone in showing the potential of neural networks to do things generally agreed upon as being “cognitive” – another very recent paper showed neural networks being applied to other planning tasks.

It’s too soon to say whether AlphaGo can be trained just with self-play, or how much of its performance can be traced to the specific training regimen used. But the hardware scaling studies shown in the paper give us additional reason to think that AI can, with sufficient hardware and data, extend significantly beyond human performance. We already knew this from recent ImageNet computer vision results, where human level performance in some benchmarks has been exceeded, along with some measures of speech recognition and many other results. But AlphaGo is an important reminder that “human-level” is not a magical stopping point for intelligence, and that many existing AI techniques are highly scalable, perhaps especially the growing range of techniques researchers at DeepMind and elsewhere have branded as “deep reinforcement learning.”

I’ve also looked in some detail at progress in Atari AI (perhaps a topic for a future blog post), which has led me to a similar conclusion: there was only a very short period in time when Atari AI was roughly in the ballpark of human performance, namely around 2014/2015. Now, median human-scaled performance across games is well above 100%, and the mean is much higher – around 600%. There is only a small number of games in which human-level performance has not yet been shown, and in those where it has, super-human performance has usually followed soon after.
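
For readers unfamiliar with “human-scaled” scores, here is a minimal sketch of how such a figure is commonly computed, and of why the mean can sit far above the median. It assumes the usual convention of scaling each game so that random play is 0% and a human reference player is 100%. The per-game numbers are made up for illustration and are not real Atari results.

```python
from statistics import mean, median


def human_normalized(agent: float, random_play: float, human: float) -> float:
    """Score as a percentage, where random play = 0% and the human reference = 100%."""
    return 100.0 * (agent - random_play) / (human - random_play)


# Hypothetical (agent, random, human) raw scores for five imaginary games.
games = {
    "game_a": (50.0, 10.0, 110.0),     # 40%: below human level
    "game_b": (220.0, 20.0, 180.0),    # 125%
    "game_c": (900.0, 100.0, 600.0),   # 160%
    "game_d": (75.0, 5.0, 40.0),       # 200%
    "game_e": (5100.0, 100.0, 300.0),  # 2500%: one outlier game
}

scores = [human_normalized(*raw) for raw in games.values()]
print(f"median: {median(scores):.0f}%  mean: {mean(scores):.0f}%")
```

With these invented numbers the median is 160% while the mean is 605%, because a single game with a very large margin over the human reference dominates the average. That is one plausible reason a mean and median can diverge as widely as described above.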

In addition to lessons we may draw from AlphaGo’s victory, there are also some questions raised: e.g. what areas of cognition are not amenable to substantial gains in performance achieved through huge computational resources, data, and expert effort? Theories of what’s easy/hard to automate in the economy abound, but rarely do such theories look beyond the superficial question of where AI progress has already been, to the harder question of what we can say in a principled way about easy/hard cognitive problems in general. In addition, there’s the empirical question of which domains already have sufficient data/computational resources for (super)human level performance, or soon will. For example, should we be surprised if Google soon announced that they have a highly linguistically competent personal assistant, trained in part from their massive datasets and with the latest deep (reinforcement) learning techniques? That’s difficult to answer. These and other questions, including long-term AI safety, in my view, call for more rigorous modeling of AI progress across cognitive/economically-relevant domains.

The Lee Sedol match and other future updates



In the spirit of model-based extrapolation versus intuitive judgments, I made the above figure using the apparent relationship between GPUs and Elo scores in DeepMind’s scaling study (the graph for CPUs looks similar). I extended the trend out to the rough equivalent of 5 minutes of calculation per move, closer to what will be the case in the Lee Sedol match, as opposed to 2 seconds per move as used in the scaling study. This assumes returns to hardware remain about the same at higher levels of skill (which may not be the case, but as indicated in the technology forecasting literature, naive models often beat no models!). This projection indicates that just scaling up hardware/giving AlphaGo more time to think may be sufficient to reach Lee Sedol-like performance (in the upper right, around 3500). However, this is hardly the approach DeepMind is banking on – in addition to more time for AlphaGo to compute the best move than in their scaling study, there will also be significant algorithmic improvements. Hassabis said at AAAI that they are working on improving AlphaGo in every way. Indeed, they’ve hired Fan Hui to help them. These and other considerations such as Hassabis’s apparent confidence (and he has access to relevant data, like current-AlphaGo’s performance against October-AlphaGo) suggest AlphaGo has a very good chance of beating Lee Sedol. If this happens, we should further update our confidence regarding the scalability of deep reinforcement learning, and perhaps of value/policy networks. If not, it may suggest some aspects of cognition are less amenable to deep reinforcement learning and hardware scaling than we thought. Likewise if self-play is ever shown to be sufficient to enable comparable performance, and/or if value/policy networks enable superhuman performance in other games, we should similarly increase our assessment of the scalability and generality of modern AI techniques.
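
A projection of this kind can be sketched as a straight-line fit of Elo against the logarithm of compute, extended beyond the measured range. The data points below are placeholders chosen for illustration, not values read from DeepMind’s scaling study. Treating 150 times more thinking time (5 minutes versus 2 seconds) as equivalent to 150 times more hardware is the same naive assumption described above, and the function names are hypothetical.

```python
import math


def fit_log_linear(compute, elo):
    """Least-squares fit of elo = a + b * log2(compute); returns (a, b)."""
    xs = [math.log2(c) for c in compute]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(elo) / n
    slope_num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, elo))
    slope_den = sum((x - x_bar) ** 2 for x in xs)
    b = slope_num / slope_den
    a = y_bar - b * x_bar
    return a, b


def project(a, b, compute):
    """Extrapolate the fitted line to a new amount of compute."""
    return a + b * math.log2(compute)


# Placeholder data (NOT from the paper): Elo at 1, 2, 4, 8 GPUs, 2 s per move.
gpus = [1, 2, 4, 8]
elo = [2200.0, 2350.0, 2500.0, 2650.0]

a, b = fit_log_linear(gpus, elo)
time_factor = (5 * 60) / 2  # 5 minutes per move vs. 2 seconds per move
projected = project(a, b, 8 * time_factor)
print(f"gain per doubling: {b:.0f} Elo, projected Elo: {projected:.0f}")
```

The result is only as good as the assumption that returns to compute stay constant at higher skill levels, which is exactly the caveat attached to the figure.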

One final note on the question of “general AI.” As noted earlier, Hassabis emphasized the purported generality of value/policy networks over the purported narrowness of Deep Blue’s design. While the truth is more complex than this dichotomy (remember, AlphaGo used some hand-crafted features for MCTS), there is still the point above about the generality of deep reinforcement learning. Since DeepMind’s seminal 2013 paper on Atari, deep reinforcement learning has been applied to a wide range of tasks in real-world robotics as well as dialogue. There is reason to think that these methods are fairly general purpose, given the range of domains to which they have been successfully applied with minimal or no hand-tuning of the algorithms. However, in all the cases discussed here, progress so far has largely been toward demonstrating general approaches for building narrow systems rather than general approaches for building general systems. Progress toward the former does not entail substantial progress toward the latter. The latter, which requires transfer learning among other elements, has yet to have its Atari/AlphaGo moment, but is an important area to keep an eye on going forward, and may be especially relevant for economic/safety purposes. This suggests that an important element of rigorously modeling AI progress may be formalizing the idea of different levels of generality of operating AI systems (as opposed to the generality of the methods that produce them, though that is also important). This is something I’m interested in possibly investigating more in the future and I’d be curious to hear people’s thoughts on it and the other issues raised above.

This article was originally posted on milesbrundage.com.

[1] The 15 number comes from a remark by David Silver in one of the videos on DeepMind’s website. The 20 number comes from the number of authors on the relevant Nature paper.

MIRI March Newsletter

Research updates

General updates

  • MIRI and other Future of Life Institute (FLI) grantees participated in a AAAI workshop on AI safety this month.
  • MIRI researcher Eliezer Yudkowsky discusses Ray Kurzweil, the Bayesian brain hypothesis, and an eclectic mix of other topics in a new interview.
  • Alexei Andreev and Yudkowsky are seeking investors for Arbital, a new technology for explaining difficult topics in economics, mathematics, computer science, and other disciplines. As a demo, Yudkowsky has written a new and improved guide to Bayes’s Rule.

News and links

X-risk or X-hope? AI Learns From Books & an Autonomous Accident

X-risk = Existential Risk. The risk that we could accidentally (hopefully accidentally) wipe out all of humanity.
X-hope = Existential Hope. The hope that we will all flourish and live happily ever after.

The importance of STEM training and research in schools has been increasingly apparent in recent years, as the tech industry keeps growing. Yet news coming out of Google and Stanford this week gives reason to believe that hope for the future may be found in books.

In an effort to develop better language recognition software, researchers at Google’s Natural Language Understanding research group trained their deep neural network to predict the next sentence an author would write, given some input. The team used classic literature found on Project Gutenberg. Initially, they provided the program with sentences but no corresponding author ID, and the program was able to predict what the following sentence would be with a 12.8% error rate. When the author ID was given to the system, the error rate dropped to 11.1%.

Then the team upped the ante. Using the writing samples and author ID, the researchers had the program apply the Myers-Briggs personality test to determine personality characteristics about the authors. The program identified Shakespeare as a private person and Mark Twain as outgoing.

Ultimately, this type of machine learning can enable AI to better understand both language and human nature. Though for now, as the team explains on their blog, the program “could help provide more personalized response options for the recently introduced Smart Reply feature in Inbox by Gmail.”

Meanwhile, over at Stanford, another group of researchers is using modern books to help their AI program, Augur, understand everyday human activities.

Today, as the team explains in their paper, AI can’t anticipate daily needs without human input (e.g. when to brew a pot of coffee or when to silence a phone because we’re sleeping). They argue this is because there are simply too many little daily tasks and needs for any person to program manually. Instead, they “demonstrate it is possible to mine a broad knowledge base of human behavior by analyzing more than one billion words of modern fiction.”

In other news, perhaps the development of machines that learn human behavior from fiction could be applied to systems such as Google’s driverless car, which, for the first time, accepted partial responsibility for an accident this week. Google’s cars have logged over 1 million miles of driving and been in about 17 accidents; however, in every other case, the human driving the other car was at fault, or the accident occurred when one of Google’s employees was driving the Google car.

This particular accident occurred because the Google car didn’t properly anticipate the actions of a bus behind it. The car swerved slightly to avoid an obstruction on the road, and the bus side-swiped the car. According to CNN, “The company said the Google test driver who was behind the wheel thought the bus was going to yield, and the bus driver likely thought the Google car was going to yield to the bus.”

The accident raises an interesting question to consider: How tolerant will humans be of rare mistakes made by autonomous systems? As we’ve mentioned in the past, “If self-driving cars cut the 32000 annual US traffic fatalities in half, the car makers won’t get 16000 thank-you notes, but 16000 lawsuits.”

Who’s to Blame (Part 5): A Deeper Look at Predicting the Actions of Autonomous Weapons


Source: Dilbert Comic Strip on 2011-03-06 | Dilbert by Scott Adams

An autonomous weapon system (AWS) is designed and manufactured in a collaborative project between American and Indian defense contractors. It is sold to numerous countries around the world. This model of AWS is successfully deployed in conflicts in Latin America, the Caucasus, and Polynesia without violating the laws of war. An American Lt. General then orders that 50 of these units be deployed during a conflict in the Persian Gulf for use in ongoing urban combat in several cities. One of those units had previously seen action in urban combat in the Caucasus and desert combat during the same Persian Gulf conflict, all without incident. A Major makes the decision to deploy that AWS unit to assist a platoon engaged in block-to-block urban combat in Sana’a. Once the AWS unit is on the ground, a Lieutenant is responsible for telling the AWS where to go. The Lt. General, the Major, and the Lieutenant all had previous experience using this model of AWS and had given similar orders to these in prior combat situations without incident.

The Lieutenant has lost several men to enemy snipers over the past several weeks. He orders the AWS to accompany one of the squads under his command and preemptively strike any enemy sniper nests it detects–again, an order he had given to other AWS units before without incident. This time, the AWS unit misidentifies a nearby civilian house as containing a sniper nest, based on the fact that houses with similar features had frequently been used as sniper nests in the Caucasus conflict. It launches an RPG at the house. There are no snipers inside, but there are 10 civilians–all of whom are killed by the RPG. Human soldiers who had been fighting in the area would have known that that particular house likely did not contain a sniper’s nest because the glare from the sun off a nearby glass building reduces visibility on that side of the street at the times of day that American soldiers typically patrol the area–a fact that the human soldiers knew well from prior combat in the area, but a variable that the AWS had not been programmed to take into consideration.

In my most recent post for FLI on autonomous weapons, I noted that it would be difficult for humans to predict the actions of autonomous weapon systems (AWSs) programmed with machine learning capabilities.  If the military commanders responsible for deploying AWSs were unable to reliably foresee how the AWS would operate on the battlefield, it would be difficult to hold those commanders responsible if the AWS violates the law of armed conflict (LOAC).  And in the absence of command responsibility, it is not clear whether any human could be held responsible under the existing LOAC framework.

A side comment from a lawyer on Reddit made me realize that my reference to “foreseeability” requires a bit more explanation.  “Foreseeability” is one of those terms that makes lawyers’ ears perk up when they hear it because it’s a concept that every American law student encounters when learning the principles of negligence in their first-year class on Tort Law.
