Transparent and Interpretable AI: an interview with Percy Liang

At the end of 2017, the United States House of Representatives passed a bill called the SELF DRIVE Act, laying out an initial federal framework for autonomous vehicle regulation. Autonomous cars have been undergoing testing on public roads for almost two decades. With the passing of this bill, along with the increasing safety benefits of autonomous vehicles, it is likely that they will become even more prevalent in our daily lives. This is true for numerous autonomous technologies including those in the medical, legal, and safety fields – just to name a few.

To that end, researchers, developers, and users alike must be able to have confidence in these types of technologies that rely heavily on artificial intelligence (AI). This extends beyond autonomous vehicles, applying to everything from security devices in your smart home to the personal assistant in your phone.

Predictability in Machine Learning

Percy Liang, Assistant Professor of Computer Science at Stanford University, explains that humans rely on some degree of predictability in their day-to-day interactions — both with other humans and automated systems (including, but not limited to, their cars). One way to create this predictability is by taking advantage of machine learning.

Machine learning deals with algorithms that allow an AI to “learn” based on data gathered from previous experiences. Developers do not need to write code that dictates each and every action or intention for the AI. Instead, the system recognizes patterns from its experiences and assumes the appropriate action based on that data. It is akin to the process of trial and error.

A key question often asked of machine learning systems in the research and testing environment is, “Why did the system make this prediction?” About this search for intention, Liang explains:

“If you’re crossing the road and a car comes toward you, you have a model of what the other human driver is going to do. But if the car is controlled by an AI, how should humans know how to behave?”

It is important to see that a system is performing well, but perhaps even more important is its ability to explain in easily understandable terms why it acted the way it did. Even if the system is not accurate, it must be explainable and predictable. For AI to be safely deployed, systems must rely on well-understood, realistic, and testable assumptions.

Current theories that explore the idea of reliable AI focus on fitting the observable outputs in the training data. However, as Liang explains, this could lead “to an autonomous driving system that performs well on validation tests but does not understand the human values underlying the desired outputs.”

Running multiple tests is important, of course. These types of simulations, explains Liang, “are good for debugging techniques — they allow us to more easily perform controlled experiments, and they allow for faster iteration.”

However, to really know whether a technique is effective, “there is no substitute for applying it to real life,” says Liang. “This goes for language, vision, and robotics.” An autonomous vehicle may perform well in all testing conditions, but there is no way to accurately predict how it would perform in an unpredictable natural disaster.

Interpretable ML Systems

The best-performing models in many domains (e.g., deep neural networks for image and speech recognition) are quite complex. These are considered “black-box models,” and their predictions can be difficult, if not impossible, to explain.

Liang and his team are working to interpret these models by researching how a particular training situation leads to a prediction. As Liang explains, “Machine learning algorithms take training data and produce a model, which is used to predict on new inputs.”

This type of observation becomes increasingly important as AIs take on more complex tasks – think life-or-death situations, such as interpreting medical diagnoses. “If the training data has outliers or adversarially generated data,” says Liang, “this will affect (corrupt) the model, which will in turn cause predictions on new inputs to be possibly wrong. Influence functions allow you to track precisely the way that a single training point would affect the prediction on a particular new input.”
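To make the idea of tracing a prediction back to a single training point concrete, here is a minimal sketch for a one-parameter ridge regression. It is an illustration under simplifying assumptions, not the implementation used by Liang’s team: the model, the synthetic data, the planted outlier, and the names `fit` and `influence_of_removal` are all invented for this example. It estimates, without retraining, how the prediction on a test input would change if one training point were removed, and then compares that estimate with actual retraining.

```python
import random

def fit(xs, ys, lam):
    """Ridge fit of y ~ theta * x: minimizes sum(0.5*(theta*x - y)**2) + 0.5*lam*theta**2."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def influence_of_removal(xs, ys, lam, i, x_test):
    """First-order estimate of the change in the prediction at x_test if point i were removed."""
    theta = fit(xs, ys, lam)
    hessian = sum(x * x for x in xs) + lam        # second derivative of the training objective
    grad_i = (theta * xs[i] - ys[i]) * xs[i]      # gradient of point i's loss at the fitted theta
    return x_test * grad_i / hessian

# Synthetic data (assumed for illustration): y is roughly 2*x, plus one corrupted point.
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(200)]
ys = [2.0 * x + rng.gauss(0, 0.1) for x in xs]
xs[0], ys[0] = 3.0, -6.0                          # planted outlier

x_test = 1.0
estimated = influence_of_removal(xs, ys, 1.0, 0, x_test)
retrained = x_test * (fit(xs[1:], ys[1:], 1.0) - fit(xs, ys, 1.0))
print(f"estimated change: {estimated:.4f}, change after retraining: {retrained:.4f}")
```

The estimate needs only a gradient and a second derivative at the already-trained model, which is why this style of analysis can be cheaper than retraining once per training point. For large non-convex models the estimate is only approximate.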

Essentially, by understanding why a model makes the decisions it makes, Liang’s team hopes to improve how models function, discover new science, and provide end users with explanations of actions that impact them.

Another aspect of Liang’s research is ensuring that an AI understands, and is able to communicate, its limits to humans. The conventional metric for success, he explains, is average accuracy, “which is not a good interface for AI safety.” He asks, “What is one to do with an 80 percent reliable system?”

Liang is not looking for the system to have an accurate answer 100 percent of the time. Instead, he wants the system to be able to admit when it does not know an answer. If a user asks a system “How many painkillers should I take?” it is better for the system to say, “I don’t know” rather than making a costly or dangerous incorrect prediction.
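One simple way to picture a system that admits when it does not know is a confidence threshold on its output. The sketch below is a hypothetical illustration, not a method described by Liang: the function name `answer_or_abstain`, the threshold of 0.9, and the example probabilities are assumptions, and it presumes the model's probabilities are well calibrated, which is not guaranteed in practice.

```python
def answer_or_abstain(class_probs, threshold=0.9):
    """Return the most likely answer, or "I don't know" if the model is not confident enough."""
    answer, confidence = max(class_probs.items(), key=lambda item: item[1])
    return answer if confidence >= threshold else "I don't know"

# Hypothetical model outputs: a probability for each candidate answer.
print(answer_or_abstain({"A": 0.97, "B": 0.03}))
print(answer_or_abstain({"A": 0.55, "B": 0.45}))
```

Raising the threshold trades coverage for safety: the system answers fewer questions but is wrong less often on the ones it does answer.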

Liang’s team is working on this challenge by tracking a model’s predictions through its learning algorithm — all the way back to the training data where the model parameters originated.

Liang’s team hopes that this approach — of looking at the model through the lens of the training data — will become a standard part of the toolkit of developing, understanding, and diagnosing machine learning. He explains that researchers could relate this to many applications: medical, computer, natural language understanding systems, and various business analytics applications.

“I think,” Liang concludes, “there is some confusion about the role of simulations. Some eschew it entirely and some are happy doing everything in simulation. Perhaps we need to change culturally to have a place for both.”

In this way, Liang and his team plan to lay a framework for a new generation of machine learning algorithms that work reliably, fail gracefully, and reduce risks.

This article is part of a Future of Life series on the AI safety research grants, which were funded by generous donations from Elon Musk and the Open Philanthropy Project. If you’re interested in applying for our 2018 grants competition, please see this link.

Podcast: Top AI Breakthroughs and Challenges of 2017 with Richard Mallah and Chelsea Finn

AlphaZero, progress in meta-learning, the role of AI in fake news, the difficulty of developing fair machine learning — 2017 was another year of big breakthroughs and big challenges for AI researchers!

To discuss this more, we invited FLI’s Richard Mallah and Chelsea Finn from UC Berkeley to join Ariel for this month’s podcast. They talked about some of the technical progress they were most excited to see and what they’re looking forward to in the coming year.

You can listen to the podcast here, or read the transcript below.

Ariel: I’m Ariel Conn with the Future of Life Institute. In 2017, we saw an increase in investments into artificial intelligence. More students are applying for AI programs, and more AI labs are cropping up around the world. With 2017 now solidly behind us, we wanted to take a look back at the year and go over some of the biggest AI breakthroughs. To do so, I have Richard Mallah and Chelsea Finn with me today.

Richard is the director of AI projects with us at the Future of Life Institute, where he does meta-research, analysis and advocacy to keep AI safe and beneficial. Richard has almost two decades of AI experience in industry and is currently also head of AI R&D at the recruiting automation firm, Avrio AI. He’s also co-founder and chief data science officer at the content marketing planning firm, MarketMuse.

Chelsea is a PhD candidate in computer science at UC Berkeley and she’s interested in how learning algorithms can enable robots to acquire common sense, allowing them to learn a variety of complex sensory motor skills in real-world settings. She completed her bachelor’s degree at MIT and has also spent time at Google Brain.

Richard and Chelsea, thank you so much for being here.

Chelsea: Happy to be here.

Richard: As am I.

Ariel: Now normally I spend time putting together questions for the guests, but today Richard and Chelsea chose the topics. Many of the breakthroughs they’re excited about were more about behind-the-scenes technical advances that may not have been quite as exciting for the general media. However, there was one exception to that, and that’s AlphaZero.

AlphaZero, which was DeepMind’s follow-up to AlphaGo, made a big splash with the popular press in December when it achieved superhuman skills at Chess, Shogi and Go without any help from humans. So Richard and Chelsea, I’m hoping you can tell us more about what AlphaZero is, how it works and why it’s a big deal. Chelsea, why don’t we start with you?

Chelsea: Yeah, so DeepMind first started with developing AlphaGo a few years ago, and AlphaGo started its learning by watching human experts play, watching how human experts play moves, how they analyze the board — and then once it analyzed and once it started with human experts, it then started learning on its own.

What’s exciting about AlphaZero is that the system started entirely on its own without any human knowledge. It started just by what’s called “self-play,” where the agent, where the artificial player is essentially just playing against itself from the very beginning and learning completely on its own.

And I think that one of the really exciting things about this research and this result was that AlphaZero was able to outperform the original AlphaGo program, and in particular was able to outperform it by removing the human expertise, by removing the human input. And so I think that this suggests that maybe if we could move towards removing the human biases and removing the human input and move more towards what’s called unsupervised learning, where these systems are learning completely on their own, then we might be able to build better and more capable artificial intelligence systems.
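As a rough illustration of self-play, the sketch below learns a toy game with no human examples by playing both sides against itself. It is not AlphaZero: it uses a small lookup table instead of a neural network and no tree search, and the game (a Nim variant where players remove one to three stones and whoever takes the last stone wins), the function name `self_play_nim`, and all parameter values are assumptions chosen to keep the example short.

```python
import random

def self_play_nim(start=10, max_take=3, episodes=5000, lr=0.5, explore=0.5, seed=0):
    """Learn move values for a toy Nim game purely by playing against itself.

    q[(stones, take)] estimates the outcome (+1 win, -1 loss) for the player to move.
    """
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(1, start + 1) for a in range(1, min(max_take, s) + 1)}

    def moves(s):
        return range(1, min(max_take, s) + 1)

    for _ in range(episodes):
        stones = start
        while stones > 0:
            if rng.random() < explore:
                take = rng.choice(list(moves(stones)))                    # try something new
            else:
                take = max(moves(stones), key=lambda a: q[(stones, a)])   # play the best known move
            left = stones - take
            # A winning move scores +1; otherwise our value is minus the opponent's best value.
            target = 1.0 if left == 0 else -max(q[(left, a)] for a in moves(left))
            q[(stones, take)] += lr * (target - q[(stones, take)])
            stones = left
    return q

q = self_play_nim()
best_first_move = max(range(1, 4), key=lambda a: q[(10, a)])
print("learned opening move with 10 stones:", best_first_move)
```

The same table serves both players, so every improvement the agent makes is immediately used against it in the next game, which is the core of the self-play idea.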

Ariel: And Richard, is there anything you wanted to add?

Richard: So, what was particularly exciting about AlphaZero is that it’s able to do this by essentially a technique very similar to what Paul Christiano of AI Safety fame has called “capability amplification.” It’s similar in that it’s learning a function to predict a prior or an expectation over which moves are likely at a given point, as well as a function to predict which player will win. And it’s able to do these in an iterative manner. It’s able to apply what’s called an “amplification scheme” in the more general sense. In this case it was Monte Carlo tree search, but in the more general case it could be other more appropriate amplification schemes for taking a simple function and iterating it many times to make it stronger, to essentially have a leading function that is then summarized.

Ariel: So I do have a quick follow up question here. With AlphaZero, it’s a program that’s living within a world that has very strict rules. What is the next step towards moving outside of that world with very strict rules and into the much messier real world?

Chelsea: That’s a really good point. The catch with these results, with these types of games — and even video games, which are a little bit messier than the strict rules of a board game — these games, all of these games can be perfectly simulated. You can perfectly simulate what will happen when you make a certain move or when you take a certain action, either in a video game or in the game of Go or the game of Chess, et cetera. Then therefore, you can train these systems with many, many lifetimes of data.

The real physical world on the other hand, we can’t simulate. We don’t know how to simulate the complex physics of the real world. As a result, you’re limited by the number of robots that you have if you’re interested in robots, or if you’re interested in healthcare, you’re limited by the number of patients that you have. And you’re also limited by safety concerns, the cost of failure, et cetera.

I think that we still have a long way to go towards taking these sorts of advances into real world settings where there’s a lot of noise, there’s a lot of complexity in the environment, and I think that these results are inspiring, and we can take some of the ideas from these approaches and apply them to these sorts of systems, but we need to keep in mind that there are a lot of challenges ahead of us.

Richard: So between real world systems and something like the game of Go, there are also incremental improvements, like introducing support for partial observability or more stochastic environments, or more continuous environments as opposed to the very discrete ones. So for these challenges, assuming that we do have a situation where we could actually simulate what we would like to see or use a simulation to help to get training data on the fly, then in those cases, we’re likely to be able to make some progress using a technique like this with some extensions or with some modifications to support those criteria.

Ariel: Okay. Now, I’m not sure if this is a natural jump to the next topic or not, but you’ve both mentioned that one of the big things that you saw happening last year were new creative approaches to unsupervised learning, and Richard in an email to me you mentioned “word translation without parallel data.” So I was hoping you could talk a little bit more about what these new creative approaches are and what you’re excited about there.

Richard: So this year, we saw an application of taking vector spaces, or taking word embeddings, which are essentially these multidimensional spaces where there are relationships between points that are meaningful semantically. The space itself is learned by a relatively shallow deep-learning network, but this meaningfulness that is imbued in the space is actually able to be used, we’ve seen this year, by taking vector spaces that were trained on, or created from, corpora of different languages and comparing them. Via some techniques to compare and reconcile the differences between those spaces, we’re actually able to translate words between language pairs in ways that, in some cases, exceed supervised approaches, which typically rely on parallel sets of documents that have the same meaning in different languages. In this case, we’re able to essentially do something very similar to what the Star Trek universal translator does. By consuming enough of the alien language, or the foreign language I should say, it’s able to model the relationships between concepts and then realign those with the concepts that are known.

Chelsea, would you like to comment on that?

Chelsea: I don’t think I have too much to add. I’m also excited about the translation results and I’ve also seen similar, I guess, works that are looking at unsupervised learning, not for translation, that have a little bit of a similar vein, but they’re fairly technical in terms of the actual approach.

Ariel: Yeah, I’m wondering if either of you want to try to take a stab at explaining how this works without mentioning vector spaces?

Richard: That’s difficult because it is a space, I mean it’s a very geometric concept, and it’s because we’re aligning shapes within that space that we actually get the magic happening.

Ariel: So would it be something like you have different languages going in, some sort of document or various documents from different languages going in, and this program just sort of maps them into this space so that it figures out which words are parallel to each other then?

Richard: Well it figures out the relationship between words and based on the shape of relationships in the world, it’s able to take those shapes and rotate them into a way that sort of matches up.

Chelsea: Yeah, perhaps it could be helpful to give an example. I think that generally in language you’re trying to get across concepts, and there is structure within the language, I mean there’s the structure that you learn about in grade school when you’re learning vocabulary. You learn about verbs, you learn about nouns, you learn about people and you learn about different words that describe these different things, and different languages have shared this sort of structure in terms of what they’re trying to communicate.

And so, what these algorithms do is they are given basically data of people talking in English, or people writing documents in English, and they’re also given data in another language (and the first one doesn’t necessarily need to be English). They’re given data in one language and data in another language. This data doesn’t match up. It’s not like one document that’s been translated into another, it’s just pieces of language, documents, conversations, et cetera, and by using the structure that exists in the data, such as nouns, verbs, animals, people, it can basically figure out how to map from the structure of one language to the structure of another language. It can recognize this similar structure in both languages and then figure out basically a mapping from one to the other.
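The idea of rotating one language's shape until it lines up with another's can be sketched in two dimensions. This is a toy under strong assumptions, not the method from the paper discussed here: the coordinates are made up, real embeddings have hundreds of dimensions, the second language is constructed as an exact rotation of the first (real embedding spaces only match approximately), and the brute-force search over angles stands in for the far more sophisticated alignment techniques used in practice. No word pairs are given to the search.

```python
import math

def rotate(point, degrees):
    """Rotate a 2-D point counter-clockwise by the given angle."""
    angle = math.radians(degrees)
    c, s = math.cos(angle), math.sin(angle)
    return (c * point[0] - s * point[1], s * point[0] + c * point[1])

def mismatch(src_points, dst_points, degrees):
    """Total distance from each rotated source point to its nearest target point."""
    return sum(min(math.dist(rotate(p, degrees), q) for q in dst_points) for p in src_points)

# Made-up 2-D "embeddings" for illustration.
english = {"cat": (1.0, 0.2), "dog": (1.3, 0.5), "car": (-0.5, 1.8),
           "king": (2.2, -1.0), "queen": (2.6, -1.3), "apple": (-1.5, -2.8)}
# Assumption: the second language has the same shape, rotated by an angle the search is not told.
spanish = {"reina": rotate(english["queen"], 40), "coche": rotate(english["car"], 40),
           "gato": rotate(english["cat"], 40), "manzana": rotate(english["apple"], 40),
           "rey": rotate(english["king"], 40), "perro": rotate(english["dog"], 40)}

# Search for the rotation that makes the two point clouds line up.
best_angle = min(range(360), key=lambda d: mismatch(english.values(), spanish.values(), d))

def translate(word):
    """Translate by rotating the word's point and taking the nearest point in the other language."""
    rotated = rotate(english[word], best_angle)
    return min(spanish, key=lambda w: math.dist(rotated, spanish[w]))

print(best_angle, {w: translate(w) for w in english})
```

The search only ever compares shapes, never labels, which is why no parallel documents are needed in this toy setting.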

Ariel: Okay. So I think, I want to keep moving forward, but continuing with the concept of learning, and Chelsea I want to stick with you for a minute. You mentioned that there were some really big metalearning advances that occurred last year, and you also mentioned a workshop and symposium at NIPS. I was wondering if you could talk a little more about that.

Chelsea: Yeah, I think that there’s been a lot of excitement around metalearning, or learning to learn. There were two gatherings at NIPS, one symposium, one workshop this year and both were well-attended by a number of people. Actually, metalearning has a fairly long history, and so it’s by no means a recent or a new topic, but I think that it has renewed attention within the machine learning community.

And so, I guess I can describe metalearning. It’s essentially having systems that learn how to learn. There’s a number of different applications for such systems. So one of them is an application that’s often referred to as AutoML, or automatic machine learning, where these systems can essentially optimize the hyperparameters, basically figure out the best set of parameters and then run a learning algorithm with those sets of hyperparameters. Essentially kind of taking the job of the machine learning researcher that is tuning different models on different data sets. And this can basically allow people to more easily train models on a data set.
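The simplest form of automated hyperparameter tuning is to try several settings and keep the one that does best on held-out data. The sketch below shows that loop for a one-parameter ridge regression. It is only an assumed, minimal stand-in for AutoML systems, which search much larger spaces with smarter strategies; the data is synthetic and the function names and candidate values are invented for the example.

```python
import random

def fit_ridge(xs, ys, lam):
    """One-parameter ridge regression y ~ theta * x; lam is the hyperparameter being tuned."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def mean_squared_error(theta, xs, ys):
    return sum((theta * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def make_data(rng, n):
    """Synthetic data: y is roughly 2*x plus noise."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    return xs, [2.0 * x + rng.gauss(0, 0.5) for x in xs]

rng = random.Random(0)
train_x, train_y = make_data(rng, 10)   # small training set
val_x, val_y = make_data(rng, 50)       # held-out data used only to compare settings

candidates = [0.0, 0.01, 0.1, 1.0, 10.0, 100.0]
scores = {lam: mean_squared_error(fit_ridge(train_x, train_y, lam), val_x, val_y)
          for lam in candidates}
best_lam = min(scores, key=scores.get)
print("validation error per setting:", scores)
print("selected hyperparameter:", best_lam)
```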

Another application of metalearning that I’m really excited about is enabling systems to reuse data and reuse experience from other tasks when trying to solve new tasks. So in machine learning, there’s this paradigm of creating everything from scratch, and as a result, if you’re training from scratch, from zero prior knowledge, then it’s going to take a lot of data. It’s going to take a lot of time to train because you’re starting from nothing. But if instead you’re starting from previous experience in a different environment or on a different task, and you can basically learn how to efficiently learn from that data, then when you see a new task that you haven’t seen before, you should be able to solve it much more efficiently.

And so, one example of this is what’s called One-Shot Learning or Few-Shot Learning, where you learn essentially how to learn from a few examples, such that when you see a new setting and you just get one or a few examples, labeled examples, labeled data points, you can figure out the new task and solve the new task just from a small number of examples.

One explicit example of how humans do this is that you can have someone point out a Segway to you on the street, and even if you’ve never seen a Segway before or never heard of the concept of a Segway, just from that one example of a human pointing out to you, you can then recognize other examples of Segways. And the way that you do that is basically by learning how to recognize objects over the course of your lifetime.
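A minimal sketch of one-shot classification: given a single labeled example per class, a new input is assigned the label of the closest example. The part metalearning would actually supply is the embedding in which "closest" is meaningful; here `embed` is a placeholder identity function, and the feature vectors and names are made up for illustration.

```python
import math

def embed(features):
    """Stand-in for a learned embedding; metalearning would train this on many earlier tasks."""
    return features

def classify_one_shot(query, support):
    """Label a query by its nearest labeled example; support maps label -> one example."""
    return min(support, key=lambda label: math.dist(embed(query), embed(support[label])))

# One labeled example per class (made-up feature vectors).
support = {"segway": (0.9, 0.1, 0.8), "bicycle": (0.2, 0.9, 0.7)}
print(classify_one_shot((0.8, 0.2, 0.9), support))
```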

Ariel: And are there examples of programs doing this already? Or we’re just making progress towards programs being able to do this more effectively?

Chelsea: There are some examples of programs being able to do this in terms of image recognition. There’s been a number of works that have been able to do this with real images. I think that more recently we’ve started to see systems being applied to robotics, which I think is one of the more exciting applications of this setting because when you’re training a robot in the real world, you can’t have the robot collect millions of data points or days of experience in order to learn a single task. You need it to share and reuse experiences from other tasks when trying to learn a new task.

So one example of this is that you can have a robot be able to manipulate a new object that it’s never seen before based on just one demonstration of how to manipulate that object from a human.

Ariel: Okay, thanks.

I want to move to a topic that is obviously of great interest to FLI and that is technical safety advances that occurred last year. Again in an email to me, you’ve both mentioned “inverse reward design” and “deep reinforcement learning from human preferences” as two areas related to the safety issue that were advanced last year. I was hoping you could both talk a little bit about what you saw happening last year that gives you hope for developing safer AI and beneficial AI.

Richard: So, as I mentioned, both inverse reward design and deep reinforcement learning from human preferences are exciting papers that came out this year.

So inverse reward design is where the AI system is trying to understand what the original designer or what the original user intends for the system to do. So it actually tries, if it’s in some new setting, a test setting where there are some potentially problematic new things that were introduced relative to the training time, then it tries specifically to back those out or to mitigate the effects of those, so that’s kind of exciting.

Deep reinforcement learning from human preferences is an algorithm for trying to very efficiently get feedback from humans based on trajectories in the context of reinforcement learning systems. So, these are systems that are trying to learn some way to plan, let’s say a path through a game environment or in general trying to learn a policy of what to do in a given scenario. This algorithm, deep RL from human preferences, shows little snippets of potential paths to humans and has them simply choose which are better, very similar to what goes on at an optometrist. Does A look better or does B look better? And just from that, very sophisticated behaviors can be learned from human preferences in a way that was not possible before in terms of scale.
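The "does A look better or does B look better?" idea can be sketched as fitting a reward function to pairwise choices. The example below is a simplified assumption, not the paper's algorithm: each snippet is summarized by a small feature vector, the reward is linear rather than a deep network, the human is simulated by a hidden reward, and the logistic (Bradley-Terry style) choice model and all names are chosen for illustration. Only the comparisons, never the hidden reward itself, are shown to the learner.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fit_reward(comparisons, dim, steps=200, lr=0.5):
    """Fit linear reward weights from comparisons (features_a, features_b, a_was_preferred)."""
    w = [0.0] * dim
    for _ in range(steps):
        grad = [0.0] * dim
        for fa, fb, a_preferred in comparisons:
            diff = [a - b for a, b in zip(fa, fb)]
            p_a = 1.0 / (1.0 + math.exp(-dot(w, diff)))   # modeled chance the human picks A
            for k in range(dim):
                grad[k] += (a_preferred - p_a) * diff[k]
        # Gradient ascent on the average log-likelihood of the observed choices.
        w = [wk + lr * gk / len(comparisons) for wk, gk in zip(w, grad)]
    return w

# Simulated "human" (an assumption for the demo): prefers the snippet with higher hidden reward.
hidden = [1.0, -2.0, 0.5]
rng = random.Random(0)
comparisons = []
for _ in range(500):
    fa = [rng.gauss(0, 1) for _ in hidden]
    fb = [rng.gauss(0, 1) for _ in hidden]
    comparisons.append((fa, fb, 1.0 if dot(hidden, fa) > dot(hidden, fb) else 0.0))

learned = fit_reward(comparisons, dim=3)
cosine = dot(learned, hidden) / math.sqrt(dot(learned, learned) * dot(hidden, hidden))
print("learned weights:", [round(x, 2) for x in learned], "cosine to hidden:", round(cosine, 3))
```

In a full system the learned reward would then be handed to a reinforcement learning algorithm, and new comparisons would be requested as the agent's behavior changes.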

Ariel: Chelsea, is there anything that you wanted to add?

Chelsea: Yeah. So, in general, I guess, going back to AlphaZero and going back to games in general, there’s a very clear objective for achieving the goal, which is whether or not you won the game or your score at the game. It’s very clear what the objective is and what each system should be optimizing for. AlphaZero should be, like when playing Go should be optimizing for winning the game, and if a system is playing Atari games it should be optimizing for maximizing the score.

But in the real world, when you’re training systems, when you’re training agents to do things, when you’re training an AI to have a conversation with you, when you’re training a robot to set the table for you, there is no score function. The real world doesn’t just give you a score function, doesn’t tell you whether or not you’re winning or losing. And I think that this research is exciting and really important because it gives us another mechanism for telling robots, telling these AI systems how to do the tasks that we want them to do.

And for example, the human preferences work: instead of specifying some sort of goal that we want the robot to achieve, or giving it a demonstration of what we want the robot to achieve, or some sort of reward function, it lets us say, “okay, this is not what I want, this is what I want,” throughout the process of learning. And then as a result, at the end you can basically guarantee that if it was able to optimize for your preferences successfully, then you’ll end up with behavior that you’re happy with.

Ariel: Excellent. So I’m sort of curious, before we started recording, Chelsea, you were telling me a little bit about your own research. Are you doing anything with this type of work? Or is your work a little different?

Chelsea: Yeah. So more recently I’ve been working on metalearning, and some of the metalearning work that I talked about previously, like learning just from a single demonstration and reusing data and experience, has been some of what I’ve been focusing on recently in terms of getting robots to be able to do things in the real world, such as manipulating objects, pushing objects around, using a spatula, stuff like that.

I’ve also done work on reinforcement learning where you essentially give a robot an objective, tell it to try to get the object as close as possible to the goal, and I think that the human preferences work provides a nice alternative to the classic setting, to the classic framework of reinforcement learning, that we could potentially apply to real robotic systems.

Ariel: Chelsea, I’m going to stick with you for one more question. In your list of breakthroughs that you’re excited about, one of the things that you mentioned is very near and dear to my heart, and that was better communication, and specifically better communication of the research. And I was hoping you could talk a little bit about some of the websites and methods of communicating that you saw develop and grow last year.

Chelsea: Yes. I think that more and more we’re seeing researchers put their work out in blog posts and try to make their work more accessible to the average user by explaining it in terms that are easier to understand, by motivating it in words that are easier for the average person to understand and I think that this is a great way to communicate the research in a clear way to a broader audience.

In addition, I’ve been quite excited about an effort, I think led by Chris Olah, on building what is called distill.pub. It’s a website and a journal, an academic journal, that tries to move away from this paradigm of publishing research on paper, on trees essentially. Because we have such rich digital technology that allows us to communicate in many different ways, it makes sense to move past just completely written forms of research dissemination. And I think that’s what distill.pub does, is it allows us, allows researchers to communicate research ideas in the form of animations, in the form of interactive demonstrations on a computer screen, and I think this is a big step forward and has a lot of potential in terms of moving forward the communication of research, the dissemination of research among the research community as well as beyond to people that are less familiar with the technical concepts in the field.

Ariel: That sounds awesome, Chelsea, thank you. And distill.pub is probably pretty straightforward, but we’ll still link to it on the post that goes along with this podcast if anyone wants to click straight through.

And Richard, I want to switch back over to you. You mentioned that there was more impressive output from GANs last year, generative adversarial networks.

Richard: Yes.

Ariel: Can you tell us what a generative adversarial network is?

Richard: So a generative adversarial network is an AI system where there are two parts, essentially a generator or creator that comes up with novel artifacts and a critic that tries to determine whether this is a good or legitimate or realistic type of thing that’s being generated. So both are learned in parallel as training data is streamed into the system, so in this way, the generator learns relatively efficiently how to create things that are good or realistic.

Ariel: So I was hoping you could talk a little bit about what you saw there that was exciting.

Richard: Sure, so new architectures and new algorithms and simply more horsepower as well have led to more impressive output. Particularly exciting are conditional generative adversarial networks, where there can be structured biases or new types of inputs that one wants to base some output around.

Chelsea: Yeah, I mean, one thing to potentially add is that I think the research on GANs is really exciting and I think that it will not only make advances in generating images of realistic quality, but also in generating other types of things, like generating behavior potentially, or generating speech, or generating language. We haven’t seen as many advances in those areas as in generating images; thus far the most impressive advances have been in generating images. I think that those are areas to watch out for as well.

One thing to be concerned about in terms of GANs is the ability for people to generate fake images, fake videos of different events happening and putting those fake images and fake videos into the media, because while there might be ways to detect whether or not these images are made-up or are counterfeited essentially, the public might choose to believe something that they see. If you see something, you’re very likely to believe it, and this might exacerbate all of the, I guess, fake news issues that we’ve had recently.

Ariel: Yeah, so that actually brings up something that I did want to get into, and honestly, that, Chelsea, what you just talked about, is some of the scariest stuff I’ve seen, just because it seems like it has the potential to create sort of a domino effect of triggering all of these other problems just with one fake video. So I’m curious, how do we address something like that? Can we? And are there other issues that you’ve seen crop up in the last year that also have you concerned?

Chelsea: I think there are potentially ways to address the problem. If it seems like it’s becoming a real danger in the imminent future, then I think that media websites, including social media websites, should take measures to try to detect fake images and fake videos and either prevent them from being displayed or put up a warning that the content was detected as fake, to explicitly try to mitigate the effects.

But, that said, I haven’t put that much thought into it. I do think it’s something that we should be concerned about, and the potential solution that I mentioned, I think that even if it can help solve some of the problems, I think that we don’t have a solution to the problem yet.

Ariel: Okay, thank you. I want to move on to the last question that I have that you both brought up, and that was, last year we saw an increased discussion of fairness in machine learning. And Chelsea, you mentioned there was a NIPS tutorial on this and the keynote mentioned it at NIPS as well. So I was hoping you could talk a bit about what that means, what we saw happen, and how you hope this will play out to better programs in the future.

Chelsea: So, there’s been a lot of discussion in how we can build machine-learning systems, build AI systems such that when they make decisions, they are fair and they aren’t biased. And all this discussion has been around fairness in machine learning, and actually one of the interesting things about the discussion from a technical point of view is how you even define fairness and how you define removing biases and such, because a lot of the biases are inherent to the data itself. And how you try to remove those biases can be a bit controversial.

Ariel: Can you give us some examples?

Chelsea: So one example is, if you’re trying to build an autonomous car system that is trying to avoid hitting pedestrians, and recognize pedestrians when appropriate and respond to them, then if these systems are trained in environments and in communities that are predominantly of one race, for example in Caucasian communities, and you then deploy this system in settings where there are people of color and in other environments that it hasn’t seen before, then the resulting system won’t have as good accuracy on settings that it hasn’t seen before and will be biased inherently, when it for example tries to recognize people of color, and this is a problem.

Some other examples of this are machine learning systems that make decisions about who to give health insurance to, or speech recognition systems that are trying to recognize different speeches. If these systems are trained on a smaller part of the community that is not representative of the entire population as a whole, then they won’t be able to accurately make decisions about the entire population. Or if they’re trained on data that was collected by humans and has the same biases as humans, then they will make the same mistakes; they will inherit the same biases that humans have.

I think that the people that have been researching fairness in machine learning systems, unfortunately one of the conclusions that they’ve made so far is that there isn’t just a one size fits all solution to all of these different problems, and in many cases we’ll have to think about fairness in individual contexts.

Richard: Chelsea, you mentioned that some of the remediations for fairness issues in machine learning are themselves controversial. Can you go into an example or so about that?

Chelsea: Yeah, I guess part of what I meant there is that even coming up with a definition for what is fair is unclear. It’s unclear what even the problem specification is, and without a problem specification, without a definition of what you want your system to be doing, creating a system that’s fair is a challenge if you don’t have a definition for what fair is.

Richard: I see.

Ariel: So then, my last question to you both, as we look towards 2018, what are you most excited or hopeful to see?

Richard: I’m very hopeful for the FLI grants program that we announced at the very end of 2017 leading to some very interesting and helpful AI safety papers and AI safety research in general that will build on past research and break new ground and will enable additional future research to be built on top of it to make the prospect of general intelligence safer and something that we don’t need to fear as much. But that is a hope.

Ariel: And Chelsea, what about you?

Chelsea: I think I’m excited to see where metalearning goes. I think that there’s a lot more people that are paying attention to it and starting to research into “learning to learn” topics. I’m also excited to see more advances in machine learning for robotics. I think that, unlike other fields in machine learning like machine translation, image recognition, et cetera, I think that robotics still has a long way to go in terms of being useful and solving a range of complex tasks and I hope that we can continue to make strides in machine learning for robotics in the coming year and beyond.

Ariel: Excellent. Well, thank you both so much for joining me today.

Richard: Sure, thank you.

Chelsea: Yeah, I enjoyed talking to you.

 

This podcast was edited by Tucker Davey.

Is There a Trade-off Between Immediate and Longer-term AI Safety Efforts?

Something I often hear in the machine learning community and media articles is “Worries about superintelligence are a distraction from the *real* problem X that we are facing today with AI” (where X = algorithmic bias, technological unemployment, interpretability, data privacy, etc). This competitive attitude gives the impression that immediate and longer-term safety concerns are in conflict. But is there actually a tradeoff between them?

We can make this question more specific: what resources might these two types of efforts be competing for?

Media attention. Given the abundance of media interest in AI, there have been a lot of articles about all these issues. Articles about advanced AI safety have mostly been alarmist Terminator-ridden pieces that ignore the complexities of the problem. This has understandably annoyed many AI researchers, and led some of them to dismiss these risks based on the caricature presented in the media instead of the real arguments. The overall effect of media attention towards advanced AI risk has been highly negative. I would be very happy if the media stopped writing about superintelligence altogether and focused on safety and ethics questions about today’s AI systems.

Funding. Much of the funding for advanced AI safety work currently comes from donors and organizations who are particularly interested in these problems, such as the Open Philanthropy Project and Elon Musk. They would be unlikely to fund safety work that doesn’t generalize to advanced AI systems, so their donations to advanced AI safety research are not taking funding away from immediate problems. On the contrary, FLI’s first grant program awarded some funding towards current issues with AI (such as economic and legal impacts). There isn’t a fixed pie of funding that immediate and longer-term safety are competing for – it’s more like two growing pies that don’t overlap very much. There has been an increasing amount of funding going into both fields, and hopefully this trend will continue.

Talent. The field of advanced AI safety has grown in recent years but is still very small, and the “brain drain” resulting from researchers going to work on it has so far been negligible. The motivations for working on current and longer-term problems tend to be different as well, and these problems often attract different kinds of people. For example, someone who primarily cares about social justice is more likely to work on algorithmic bias, while someone who primarily cares about the long-term future is more likely to work on superintelligence risks.

Overall, there does not seem to be much tradeoff in terms of funding or talent, and the media attention tradeoff could (in theory) be resolved by devoting essentially all the airtime to current concerns. Not only are these issues not in conflict – there are synergies between addressing them. Both benefit from fostering a culture in the AI research community of caring about social impact and being proactive about risks. Some safety problems are highly relevant both in the immediate and longer term, such as interpretability and adversarial examples. I think we need more people working on these problems for current systems while keeping scalability to more advanced future systems in mind.

AI safety problems are too important for the discussion to be derailed by status contests like “my issue is better than yours”. This kind of false dichotomy is itself a distraction from the shared goal of ensuring AI has a positive impact on the world, both now and in the future. People who care about the safety of current and future AI systems are natural allies – let’s support each other on the path towards this common goal.

This article originally appeared on the Deep Safety blog.

MIRI’s January 2018 Newsletter

Our 2017 fundraiser was a huge success, with 341 donors contributing a total of $2.5 million!

Some of the largest donations came from Ethereum inventor Vitalik Buterin, bitcoin investors Christian Calderon and Marius van Voorden, poker players Dan Smith and Tom and Martin Crowley (as part of a matching challenge), and the Berkeley Existential Risk Initiative. Thank you to everyone who contributed!

AI Should Provide a Shared Benefit for as Many People as Possible

Shared Benefit Principle: AI technologies should benefit and empower as many people as possible.

Today, the combined wealth of the eight richest people in the world is greater than that of the poorest half of the global population. That is, 8 people have more than the combined wealth of 3,600,000,000 others.

This is already an extreme example of wealth inequality, but if we don’t prepare properly for artificial intelligence, the situation could get worse. In addition to the obvious economic benefits that would accrue to whoever designs advanced AI first, those who profit from AI will also likely have access to better health care, happier and longer lives, more opportunities for their children, various forms of intelligence enhancement, and so on.

A Cultural Shift

Our approach to technology so far has been that whoever designs it first, wins — and they win big. In addition to the fabulous wealth an inventor can accrue, the creator of a new technology also assumes complete control over the product and its distribution. This means that an invention or algorithm will only benefit those whom the creator wants it to benefit. While this approach may have worked with previous inventions, many are concerned that advanced AI will be so powerful that we can’t treat it as business-as-usual.

What if we could ensure that as AI is developed we all benefit? Can we make a collective — and pre-emptive — decision to use AI to help raise up all people, rather than just a few?

Joshua Greene, a professor of psychology at Harvard, explains his take on this Principle: “We’re saying in advance, before we know who really has it, that this is not a private good. It will land in the hands of some private person, it will land in the hands of some private company, it will land in the hands of some nation first. But this principle is saying, ‘It’s not yours.’ That’s an important thing to say because the alternative is to say that potentially, the greatest power that humans ever develop belongs to whoever gets it first.”

AI researcher Susan Craw also agreed with the Principle, and she further clarified it.

“That’s definitely a yes,” Craw said, “But it is AI technologies plural, when it’s taken as a whole. Rather than saying that a particular technology should benefit lots of people, it’s that the different technologies should benefit and empower people.”

The Challenge of Implementation

However, as is the case with all of the Principles, agreeing with them is one thing; implementing them is another. John Havens, the Executive Director of The IEEE Global Initiative for Ethical Considerations in Artificial Intelligence and Autonomous Systems, considered how the Shared Benefit Principle would ultimately need to be modified so that the new technologies will benefit both developed and developing countries alike.

“Yes, it’s great,” Havens said of the Principle, before adding, “if you can put a comma after it, and say … something like, ‘issues of wealth, GDP, notwithstanding.’ The point being, what this infers is whatever someone can afford, it should still benefit them.”

Patrick Lin, a philosophy professor at California Polytechnic State University, was even more concerned about how the Principle might be implemented, mentioning the potential for unintended consequences.

Lin explained: “Shared benefit is interesting, because again, this is a principle that implies consequentialism, that we should think about ethics as satisfying the preferences or benefiting as many people as possible. That approach to ethics isn’t always right. … Consequentialism often makes sense, so weighing these pros and cons makes sense, but that’s not the only way of thinking about ethics. Consequentialism could fail you in many cases. For instance, consequentialism might green-light torturing or severely harming a small group of people if it gives rise to a net increase in overall happiness to the greater community.”

“That’s why I worry about the … Shared Benefit Principle,” Lin continued. “[It] makes sense, but [it] implicitly adopts a consequentialist framework, which by the way is very natural for engineers and technologists to use, so they’re very numbers-oriented and tend to think of things in black and white and pros and cons, but ethics is often squishy. You deal with these squishy, abstract concepts like rights and duties and obligations, and it’s hard to reduce those into algorithms or numbers that could be weighed and traded off.”

As we move from discussing these Principles as ideals to implementing them as policy, concerns such as those that Lin just expressed will have to be addressed, keeping possible downsides of consequentialism and utilitarianism in mind.

The Big Picture

The devil will always be in the details. As we consider how we might shift cultural norms to prevent all benefits going only to the creators of new technologies — as well as considering the possible problems that could arise if we do so — it’s important to remember why the Shared Benefit Principle is so critical. Roman Yampolskiy, an AI researcher at the University of Louisville, sums this up:

“Early access to superior decision-making tools is likely to amplify existing economic and power inequalities turning the rich into super-rich, permitting dictators to hold on to power and making oppositions’ efforts to change the system unlikely to succeed. Advanced artificial intelligence is likely to be helpful in medical research and genetic engineering in particular making significant life extension possible, which would remove one of the most powerful drivers of change and redistribution of power – death. For this and many other reasons, it is important that AI tech should be beneficial and empowering to all of humanity, making all of us wealthier and healthier.”

What Do You Think?

How important is the Shared Benefit Principle to you? How can we ensure that the benefits of new AI technologies are spread globally, rather than remaining with only a handful of people who developed them? How can we ensure that we don’t inadvertently create more problems in an effort to share the benefits of AI?

This article is part of a series on the 23 Asilomar AI Principles. The Principles offer a framework to help artificial intelligence benefit as many people as possible. But, as AI expert Toby Walsh said of the Principles, “Of course, it’s just a start. … a work in progress.” The Principles represent the beginning of a conversation, and now we need to follow up with broad discussion about each individual principle. You can read the discussions about previous principles here.

Deep Safety: NIPS 2017 Report

This year’s NIPS gave me a general sense that near-term AI safety is now mainstream and long-term safety is slowly going mainstream. On the near-term side, I particularly enjoyed Kate Crawford’s keynote on neglected problems in AI fairness, the ML security workshops, and the Interpretable ML symposium debate that addressed the “do we even need interpretability?” question in a somewhat sloppy but entertaining way. There was a lot of great content on the long-term side, including several oral / spotlight presentations and the Aligned AI workshop.

Value alignment papers

Inverse Reward Design (Hadfield-Menell et al) defines the problem of an RL agent inferring a human’s true reward function based on the proxy reward function designed by the human. This is different from inverse reinforcement learning, where the agent infers the reward function from human behavior. The paper proposes a method for IRD that models uncertainty about the true reward, assuming that the human chose a proxy reward that leads to the correct behavior in the training environment. For example, if a test environment unexpectedly includes lava, the agent assumes that a lava-avoiding reward function is as likely as a lava-indifferent or lava-seeking reward function, since they lead to the same behavior in the training environment. The agent then follows a risk-averse policy with respect to its uncertainty about the reward function.

The paper shows some encouraging results on toy environments for avoiding some types of side effects and reward hacking behavior, though it’s unclear how well they will generalize to more complex settings. For example, the approach to reward hacking relies on noticing disagreements between different sensors / features that agreed in the training environment, which might be much harder to pick up on in a complex environment. The method is also at risk of being overly risk-averse and avoiding anything new, whether it be lava or gold, so it would be great to see some approaches for safe exploration in this setting.
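
The lava example can be made concrete with a minimal sketch. This is not the paper's algorithm, which performs Bayesian inference over reward parameters; it only illustrates the idea of keeping every reward hypothesis that explains the training behavior and then planning against the worst case. The terrain names, reward values, and paths are hypothetical.

```python
# Candidate "true" reward functions (per-cell reward for each terrain type).
# They agree on the terrain seen in training (grass, dirt) and differ only on
# lava, which never appeared in the training environment.
CANDIDATE_REWARDS = [
    {"grass": -1.0, "dirt": -2.0, "lava": -10.0},  # lava-avoiding
    {"grass": -1.0, "dirt": -2.0, "lava": -1.0},   # lava-indifferent
    {"grass": -1.0, "dirt": -2.0, "lava": 5.0},    # lava-seeking
]


def path_return(path, reward):
    """Total reward of a path under one candidate reward function."""
    return sum(reward[cell] for cell in path)


def worst_case_return(path, candidates):
    """Return of a path under the least favorable candidate."""
    return min(path_return(path, reward) for reward in candidates)


def risk_averse_choice(paths, candidates):
    """Pick the path name whose worst-case return is highest."""
    return max(paths, key=lambda name: worst_case_return(paths[name], candidates))


# A training path contains no lava, so every candidate scores it identically.
TRAINING_PATH = ["grass", "dirt", "grass"]

# In the test environment a shortcut crosses lava; the detour does not.
TEST_PATHS = {
    "shortcut_through_lava": ["grass", "lava", "lava", "grass"],
    "detour_on_grass": ["grass"] * 6,
}

if __name__ == "__main__":
    print(risk_averse_choice(TEST_PATHS, CANDIDATE_REWARDS))
```

Because the worst-case rule penalizes any terrain the candidates disagree on, the same agent would also steer around an unfamiliar terrain that happened to be valuable, which is the over-caution noted above.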

Repeated Inverse RL (Amin et al) defines the problem of inferring intrinsic human preferences that incorporate safety criteria and are invariant across many tasks. The reward function for each task is a combination of the task-invariant intrinsic reward (unobserved by the agent) and a task-specific reward (observed by the agent). This multi-task setup helps address the identifiability problem in IRL, where different reward functions could produce the same behavior.

The authors propose an algorithm for inferring the intrinsic reward while minimizing the number of mistakes made by the agent. They prove an upper bound on the number of mistakes for the “active learning” case where the agent gets to choose the tasks, and show that a certain number of mistakes is inevitable when the agent cannot choose the tasks (there is no upper bound in that case). Thus, letting the agent choose the tasks that it’s trained on seems like a good idea, though it might also result in a selection of tasks that is less interpretable to humans.
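
A small sketch of why multiple tasks help with identifiability, under simplifying assumptions: outcomes are chosen in a single step rather than through a full RL policy, and the two intrinsic-reward hypotheses and both tasks are invented for illustration. It is not the authors' algorithm.

```python
def best_outcome(intrinsic, task_reward):
    """Outcome the human picks: argmax of intrinsic + task-specific reward."""
    total = {o: intrinsic[o] + task_reward[o] for o in intrinsic}
    return max(total, key=total.get)


def consistent_hypotheses(hypotheses, observations):
    """Keep hypotheses that reproduce every observed (task, choice) pair."""
    return {
        name: intrinsic
        for name, intrinsic in hypotheses.items()
        if all(best_outcome(intrinsic, task) == choice for task, choice in observations)
    }


# Two hypothetical intrinsic rewards (unobserved by the agent).
HYPOTHESES = {
    "c_is_unsafe": {"a": 0.0, "b": 0.0, "c": -5.0},
    "b_is_mildly_bad": {"a": 0.0, "b": -1.0, "c": 0.0},
}

# Task-specific rewards (observed by the agent).
TASK_1 = {"a": 3.0, "b": 1.0, "c": 0.0}
TASK_2 = {"a": 0.0, "b": 1.0, "c": 2.0}

# Suppose the human's true intrinsic reward is "c_is_unsafe".
TRUE = HYPOTHESES["c_is_unsafe"]
obs_1 = [(TASK_1, best_outcome(TRUE, TASK_1))]
obs_2 = obs_1 + [(TASK_2, best_outcome(TRUE, TASK_2))]

if __name__ == "__main__":
    print(sorted(consistent_hypotheses(HYPOTHESES, obs_1)))  # both survive
    print(sorted(consistent_hypotheses(HYPOTHESES, obs_2)))  # one survives
```

Both hypotheses predict the same choice in the first task, so one task alone cannot separate them; the second task, where the task reward favors outcome `c`, does.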

Deep RL from Human Preferences (Christiano et al) uses human feedback to teach deep RL agents about complex objectives that humans can evaluate but might not be able to demonstrate (e.g. a backflip). The human is shown two trajectory snippets of the agent’s behavior and selects which one more closely matches the objective. This method makes very efficient use of limited human feedback, scaling much better than previous methods and enabling the agent to learn much more complex objectives (as shown in MuJoCo and Atari).
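
The comparison step can be sketched as follows. The paper fits a reward model by treating the probability that the human prefers a snippet as a softmax over the snippets' summed predicted rewards, trained with a cross-entropy loss. The linear reward model, feature vectors, and learning rate below are assumptions made to keep the example short; the paper uses deep networks.

```python
import math


def segment_return(weights, segment):
    """Summed predicted reward over a snippet (list of per-step feature vectors)."""
    return sum(sum(w * f for w, f in zip(weights, feats)) for feats in segment)


def preference_prob(weights, seg_a, seg_b):
    """Modeled probability that the human prefers snippet A over snippet B."""
    diff = segment_return(weights, seg_a) - segment_return(weights, seg_b)
    return 1.0 / (1.0 + math.exp(-diff))


def update(weights, seg_a, seg_b, human_prefers_a, lr=0.1):
    """One gradient step on the cross-entropy loss for a single comparison."""
    p = preference_prob(weights, seg_a, seg_b)
    label = 1.0 if human_prefers_a else 0.0
    # For a linear reward, d(loss)/d(weights) = (p - label) * (phi(A) - phi(B)),
    # where phi sums the feature vectors over the snippet.
    phi_a = [sum(col) for col in zip(*seg_a)]
    phi_b = [sum(col) for col in zip(*seg_b)]
    return [w - lr * (p - label) * (fa - fb) for w, fa, fb in zip(weights, phi_a, phi_b)]


# Hypothetical two-step snippets with two features each.
SEG_A = [[1.0, 0.0], [1.0, 0.0]]
SEG_B = [[0.0, 1.0], [0.0, 1.0]]

if __name__ == "__main__":
    w = [0.0, 0.0]
    print(preference_prob(w, SEG_A, SEG_B))
    w = update(w, SEG_A, SEG_B, human_prefers_a=True)
    print(preference_prob(w, SEG_A, SEG_B))
```

After a single comparison in which the human picks snippet A, the reward model already assigns A a higher preference probability; the RL agent is then trained against this learned reward.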

Dynamic Safe Interruptibility for Decentralized Multi-Agent RL (El Mhamdi et al) generalizes the safe interruptibility problem to the multi-agent setting. Non-interruptible dynamics can arise in a group of agents even if each agent individually is indifferent to interruptions. This can happen if Agent B is affected by interruptions of Agent A and is thus incentivized to prevent A from being interrupted (e.g. if the agents are self-driving cars and A is in front of B on the road). The multi-agent definition focuses on preserving the system dynamics in the presence of interruptions, rather than on converging to an optimal policy, which is difficult to guarantee in a multi-agent setting.

Aligned AI workshop

This was a more long-term-focused version of the Reliable ML in the Wild workshop held in previous years. There were many great talks and posters there – my favorite talks were Ian Goodfellow’s “Adversarial Robustness for Aligned AI” and Gillian Hadfield’s “Incomplete Contracting and AI Alignment”.

Ian made the case for ML security being important for long-term AI safety. The effectiveness of adversarial examples is problematic not only from the near-term perspective of current ML systems (such as self-driving cars) being fooled by bad actors. It’s also bad news from the long-term perspective of aligning the values of an advanced agent, which could inadvertently seek out adversarial examples for its reward function due to Goodhart’s law. Relying on the agent’s uncertainty about the environment or human preferences is not sufficient to ensure safety, since adversarial examples can cause the agent to have arbitrarily high confidence in the wrong answer.
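
A toy linear classifier illustrates how small per-feature changes can add up to high confidence in the wrong answer. The sketch uses the fast gradient sign idea (move each input feature a small step in the direction that most changes the output); the weights, input, and step size are made up for illustration and are not taken from the talk.

```python
import math


def confidence_positive(weights, bias, x):
    """Logistic-regression confidence that x belongs to the positive class."""
    logit = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def sign(v):
    """Sign of v as -1, 0, or 1."""
    return (v > 0) - (v < 0)


def perturb_toward_negative(weights, x, epsilon):
    """Shift every feature by at most epsilon to lower the positive-class logit.

    For a linear model the gradient of the logit with respect to x is the
    weight vector, so the sign of each weight gives the step direction.
    """
    return [xi - epsilon * sign(w) for w, xi in zip(weights, x)]


# Hypothetical 100-dimensional model and input.
WEIGHTS = [0.5, -0.5] * 50
BIAS = 0.0
X = [0.1, -0.1] * 50
EPSILON = 0.2

X_ADV = perturb_toward_negative(WEIGHTS, X, EPSILON)

if __name__ == "__main__":
    print(confidence_positive(WEIGHTS, BIAS, X))      # confident "positive"
    print(confidence_positive(WEIGHTS, BIAS, X_ADV))  # confident "negative"
```

No single feature moves by more than `EPSILON`, yet the effect accumulates across all 100 dimensions and flips a confident prediction into an equally confident wrong one. A confidence threshold alone would not flag the perturbed input.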

Gillian approached AI safety from an economics perspective, drawing parallels between specifying objectives for artificial agents and designing contracts for humans. The same issues that make contracts incomplete (the designer’s inability to consider all relevant contingencies or precisely specify the variables involved, and incentives for the parties to game the system) lead to side effects and reward hacking for artificial agents.

The central question of the talk was how we can use insights from incomplete contracting theory to better understand and systematically solve specification problems in AI safety, which is a really interesting research direction. The objective specification problem seems even harder to me than the incomplete contract problem, since the contract design process relies on some level of shared common sense between the humans involved, which artificial agents do not currently possess.

Interpretability for AI safety

I gave a talk at the Interpretable ML symposium on connections between interpretability and long-term safety, which explored what forms of interpretability could help make progress on safety problems (slides, video). Understanding our systems better can help ensure that safe behavior generalizes to new situations, and it can help identify causes of unsafe behavior when it does occur.

For example, if we want to build an agent that’s indifferent to being switched off, it would be helpful to see whether the agent has representations that correspond to an off-switch, and whether they are used in its decisions. Side effects and safe exploration problems would benefit from identifying representations that correspond to irreversible states (like “broken” or “stuck”). While existing work on examining the representations of neural networks focuses on visualizations, safety-relevant concepts are often difficult to visualize.

Local interpretability techniques that explain specific predictions or decisions are also useful for safety. We could examine whether features that are idiosyncratic to the training environment or indicate proximity to dangerous states influence the agent’s decisions. If the agent can produce a natural language explanation of its actions, how does it explain problematic behavior like reward hacking or going out of its way to disable the off-switch?

There are many ways in which interpretability can be useful for safety. Somewhat less obvious is what safety can do for interpretability: serving as grounding for interpretability questions. As exemplified by the final debate of the symposium, there is an ongoing conversation in the ML community trying to pin down the fuzzy idea of interpretability – what is it, do we even need it, what kind of understanding is useful, etc. I think it’s important to keep in mind that our desire for interpretability is to some extent motivated by our systems being fallible – understanding our AI systems would be less important if they were 100% robust and made no mistakes. From the safety perspective, we can define interpretability as the kind of understanding that helps us ensure the safety of our systems.

For those interested in applying the interpretability hammer to the safety nail, or working on other long-term safety questions, FLI has recently announced a new grant program. Now is a great time for the AI field to think deeply about value alignment. As Pieter Abbeel said at the end of his keynote, “Once you build really good AI contraptions, how do you make sure they align their value system with our value system? Because at some point, they might be smarter than us, and it might be important that they actually care about what we care about.”

(Thanks to Janos Kramar for his feedback on this post, and to everyone at DeepMind who gave feedback on the interpretability talk.)

This article was originally posted here.

Research for Beneficial Artificial Intelligence

Research Goal: The goal of AI research should be to create not undirected intelligence, but beneficial intelligence.

It’s no coincidence that the first Asilomar Principle is about research. On the face of it, the Research Goal Principle may not seem as glamorous or exciting as some of the other Principles that more directly address how we’ll interact with AI and the impact of superintelligence. But it’s from this first Principle that all of the others are derived.

Simply put, without AI research and without specific goals by researchers, AI cannot be developed. However, participating in research and working toward broad AI goals without considering the possible long-term effects of the research could be detrimental to society.

There’s a scene in Jurassic Park, in which Jeff Goldblum’s character laments that the scientists who created the dinosaurs “were so preoccupied with whether or not they could that they didn’t stop to think if they should.” Until recently, AI researchers have also focused primarily on figuring out what they could accomplish, without longer-term considerations, and for good reason: scientists were just trying to get their AI programs to work at all, and the results were far too limited to pose any kind of threat.

But in the last few years, scientists have made great headway with artificial intelligence. The impacts of AI on society are already being felt, and as we’re seeing with some of the issues of bias and discrimination that are already popping up, this isn’t always good.

Attitude Shift

Unfortunately, there’s still a culture within AI research that’s too accepting of the idea that the developers aren’t responsible for how their products are used. Stuart Russell compares this attitude to that of civil engineers, who would never be allowed to say something like, “I just design the bridge; someone else can worry about whether it stays up.”

Joshua Greene, a psychologist from Harvard, agrees. He explains:

“I think that is a bookend to the Common Good Principle [#23] – the idea that it’s not okay to be neutral. It’s not okay to say, ‘I just make tools and someone else decides whether they’re used for good or ill.’ If you’re participating in the process of making these enormously powerful tools, you have a responsibility to do what you can to make sure that this is being pushed in a generally beneficial direction. With AI, everyone who’s involved has a responsibility to be pushing it in a positive direction, because if it’s always somebody else’s problem, that’s a recipe for letting things take the path of least resistance, which is to put the power in the hands of the already powerful so that they can become even more powerful and benefit themselves.”

What’s Beneficial?

Other AI experts I spoke with agreed with the general idea of the Principle, but didn’t see quite eye-to-eye on how it was worded. Patrick Lin, for example, was concerned about the use of the word “beneficial” and what it meant, while John Havens appreciated the word precisely because it forces us to consider what “beneficial” means in this context.

“I generally agree with this research goal,” explained Lin, a philosopher at Cal Poly. “Given the potential of AI to be misused or abused, it’s important to have a specific positive goal in mind. I think where it might get hung up is what this word ‘beneficial’ means. If we’re directing it towards beneficial intelligence, we’ve got to define our terms; we’ve got to define what beneficial means, and that to me isn’t clear. It means different things to different people, and it’s rare that you could benefit everybody.”

Meanwhile, Havens, the Executive Director of The IEEE Global Initiative for Ethical Considerations in Artificial Intelligence and Autonomous Systems, was pleased the word forced the conversation.

“I love the word beneficial,” Havens said. “I think sometimes inherently people think that intelligence, in one sense, is always positive. Meaning, because something can be intelligent, or autonomous, and that can advance technology, that that is a ‘good thing’. Whereas the modifier ‘beneficial’ is excellent, because you have to define: What do you mean by beneficial? And then, hopefully, it gets more specific, and it’s: Who is it beneficial for? And, ultimately, what are you prioritizing? So I love the word beneficial.”

AI researcher Susan Craw, a professor at Robert Gordon University, also agreed with the Principle but questioned the order of the phrasing.

“Yes, I agree with that,” Craw said, but added, “I think it’s a little strange the way it’s worded, because of ‘undirected.’ It might even be better the other way around, which is, it would be better to create beneficial research, because that’s a more well-defined thing.”

Long-term Research

Roman Yampolskiy, an AI researcher at the University of Louisville, brings the discussion back to the issues of most concern for FLI:

“The universe of possible intelligent agents is infinite with respect to both architectures and goals. It is not enough to simply attempt to design a capable intelligence, it is important to explicitly aim for an intelligence that is in alignment with goals of humanity. This is a very narrow target in a vast sea of possible goals and so most intelligent agents would not make a good optimizer for our values resulting in a malevolent or at least indifferent AI (which is likewise very dangerous). It is only by aligning future superintelligence with our true goals, that we can get significant benefit out of our intellectual heirs and avoid existential catastrophe.”

And with that in mind, we’re excited to announce we’ve launched a new round of grants! If you haven’t seen the Request for Proposals (RFP) yet, you can find it here. The focus of this RFP is on technical research or other projects enabling development of AI that is beneficial to society, and robust in the sense that the benefits are somewhat guaranteed: our AI systems must do what we want them to do.

If you’re a researcher interested in the field of AI, we encourage you to review the RFP and consider applying.

This article is part of a series on the 23 Asilomar AI Principles. The Principles offer a framework to help artificial intelligence benefit as many people as possible. But, as AI expert Toby Walsh said of the Principles, “Of course, it’s just a start. … a work in progress.” The Principles represent the beginning of a conversation, and now we need to follow up with broad discussion about each individual principle. You can read the discussions about previous principles here.

Podcast: Beneficial AI and Existential Hope in 2018

For most of us, 2017 has been a roller coaster, from increased nuclear threats to incredible advancements in AI to crazy news cycles. But while it’s easy to be discouraged by various news stories, we at FLI find ourselves hopeful that we can still create a bright future. In this episode, the FLI team discusses the past year and the momentum we’ve built, including: the Asilomar Principles, our 2018 AI safety grants competition, the recent Long Beach workshop on Value Alignment, and how we’ve honored one of civilization’s greatest heroes.

Full transcript:

Ariel: I’m Ariel Conn with the Future of Life Institute. As you may have noticed, 2017 was quite the dramatic year. In fact, without me even mentioning anything specific, I’m willing to bet that you already have some examples forming in your mind of what a crazy year this was. But while it’s easy to be discouraged by various news stories, we at FLI find ourselves hopeful that we can still create a bright future. But I’ll let Max Tegmark, president of FLI, tell you a little more about that.

Max: I think it’s important when we reflect back at the year’s news to understand how things are all connected. For example, the drama we’ve been following with Kim Jong Un and Donald Trump and Putin with nuclear weapons, is really very connected to all the developments in artificial intelligence because in both cases we have a technology which is so powerful that it’s not clear that we humans have sufficient wisdom to manage it well. And that’s why I think it’s so important that we all continue working towards developing this wisdom further, to make sure that we can use these powerful technologies like nuclear energy, like artificial intelligence, like biotechnology and so on to really help rather than to harm us.

Ariel: And it’s worth remembering that part of what made this such a dramatic year was that there were also some really positive things that happened. For example, in March of this year, I sat in a sweltering room in New York City, as a group of dedicated, caring individuals from around the world discussed how they planned to convince the United Nations to ban nuclear weapons once and for all. I don’t think anyone in the room that day realized that not only would they succeed, but by December of this year, the International Campaign to Abolish Nuclear Weapons, led by Beatrice Fihn, would be awarded the Nobel Peace Prize for their efforts. And while we did what we could to help that effort, our own big story had to be the Beneficial AI Conference that we hosted in Asilomar, California. Many of us at FLI were excited to talk about Asilomar, but I’ll let Anthony Aguirre, Max, and Victoria Krakovna start.

Anthony: I would say pretty unquestionably the big thing that I felt was most important and felt most excited about was the big meeting in Asilomar and centrally putting together the Asilomar Principles.

Max: I’m going to select the Asilomar conference that we organized early this year, whose output was the 23 Asilomar Principles, which has since been signed by over a thousand AI researchers around the world.

Vika: I was really excited about the Asilomar conference that we organized this year. This was the sequel to FLI’s Puerto Rico Conference, which was at the time a real game changer in terms of making AI safety more mainstream and connecting people working in AI safety with the machine learning community and integrating those two. I think Asilomar did a great job of continuing to build on that.

Max: I’m very excited about this because I feel that it really has helped mainstream AI safety work. Not just near term AI safety stuff, like how to transform today’s buggy and hackable computers into robust systems that you can really trust, but also mainstream larger issues. The Asilomar Principles actually contain the word super intelligence, contain the phrase existential risk, contain the phrase recursive self improvement, and yet they have been signed by really a who’s who in AI. So from now on, it’s impossible for anyone to dismiss these kinds of concerns, this kind of safety research, by saying that’s just people who have no clue about AI.

Anthony: That was a process that started in 2016, brainstorming at FLI and then the wider community and then getting rounds of feedback and so on. But it was exciting both to see how much cohesion there was in the community and how much support there was for getting behind some sort of principles governing AI. But also, just to see the process unfold because one of the things that I’m quite frustrated about often is this sense that there’s this technology that’s just unrolling like a steam roller and it’s going to go where it’s going to go, and we don’t have any agency over where that is. And so to see people really putting thought into what is the world we would like there to be in ten, fifteen, twenty, fifty years and how can we distill what it is that we like about that world into principles like these…that felt really, really good. It felt like an incredibly useful thing for society as a whole but in this case, the people who are deeply engaged with AI, to be thinking through in a real way rather than just how can we put out the next fire, or how can we just turn the progress one more step forward, to really think about the destination.

Ariel: But what’s that next step? How do we transition from Principles that we all agree on to actions that we can also all get behind. Jessica Cussins joined FLI later in the year, but when asked what she was excited about as far as FLI was concerned, she immediately mentioned the implementation of things like the Asilomar Principles.

Jessica: I’m most excited about the developments we’ve seen over the last year related to safe, beneficial and ethical AI. I think FLI has been a really important player in this. We had the beneficial AI conference in January that resulted in the Asilomar AI Principles. It’s been really amazing to see how much traction those principles have gotten and to see a growing consensus around the importance of being thoughtful about the design of AI systems, the challenges of algorithmic bias, of data control and manipulation, and of accountability and governance. So the thing I’m most excited about right now is the growing number of initiatives we’re seeing around the world related to ethical and beneficial AI.

Anthony: What’s been great to see is the development of ideas both from FLI and from many other organizations of what policies might be good. What concrete legislative actions there might be or standards, organizations or non-profits, agreements between companies and so on might be interesting.

But I think, we’re only at the step of formulating those things and not that much action has been taken anywhere in terms of actually doing those things. Little bits of legislation here and there. But I think we’re getting to the point where lots of governments, lots of companies, lots of organizations are going to be publishing and creating and passing more and more of these things. I think seeing that play out and working really hard to ensure that it plays out in a way that’s favorable in as many ways and as many people as possible, I think is super important and something we’re excited to do.

Vika: I think that the Asilomar Principles are a great common point for the research community and others to agree what we are going for, what’s important.

Besides having the principles as an output, the event itself was really good for building connections between different people from interdisciplinary backgrounds, from different related fields who are interested in the questions of safety and ethics.

And we also had this workshop that was adjacent to Asilomar where our grant winners actually presented their work. I think it was great to have a concrete discussion of research and the progress we’ve made so far and not just abstract discussions of the future, and I hope that we can have more such technical events, discussing research progress and making the discussion of AI safety really concrete as time goes on.

Ariel: And what is the current state of AI safety research? Richard Mallah took on the task of answering that question for the Asilomar conference, while Tucker Davey has spent the last year interviewing various FLI grant winners to better understand their work.

Richard: I presented a landscape of technical AI safety research threads. This lays out hundreds of different types of research areas and how they are related to each other. All different areas that need a lot more research going into them than they have today to help keep AI safe and beneficent and robust. I was really excited to be at Asilomar and to have co-organized Asilomar and that so many really awesome people were there and collaborating on these different types of issues. And that they were using that landscape that I put together as sort of a touchpoint and way to coordinate. That was pretty exciting.

Tucker: I just found it really inspiring interviewing all of our AI grant recipients. It’s kind of been an ongoing project interviewing these researchers and writing about what they’re doing. Just for me, getting recently involved in AI, it’s been incredibly interesting to get either a half an hour, an hour with these researchers to talk in depth about their work and really to learn more about a research landscape that I hadn’t been aware of before working at FLI. Really, being a part of those interviews and learning more about the people we’re working with and these people that are really spearheading AI safety was really inspiring to be a part of.

Ariel: And with that, we have a big announcement.

Richard: So, FLI is launching a new grants program in 2018. This time around, we will be focusing more on artificial general intelligence, artificial super intelligence and ways that we can do technical research and other kinds of research today, on today’s systems or things that we can analyze today, things that we can model or make theoretical progress on today, that are likely to actually still be relevant at the time when AGI comes about. This is quite exciting and I’m excited to be part of the ideation and administration around that.

Max: I’m particularly excited about the new grants program that we’re launching for AI safety research. Since AI safety research itself has become so much more mainstream, since we did our last grants program three years ago, there’s now quite a bit of funding for a number of near term challenges. And I feel that we at FLI should focus on things more related to challenges and opportunities from super intelligence, since there is virtually no funding for that kind of safety research. It’s going to be really exciting to see what proposals come in and what research teams get selected by the review panels. Above all, how this kind of research hopefully will contribute to making sure that we can use this powerful technology to create a really awesome future.

Vika: I think this grant program could really build on the impact of our previous grant program. I’m really excited that it’s going to focus more on long term AI safety research, which is still the most neglected area.

AI safety has really caught on in the past two years, and there’s been a lot more work on that going on, which is great. And part of what this means is that we at FLI can focus more on the long term. The long term work has also been getting more attention, and this grant program can help us build on that and make sure that the important problems get solved. This is really exciting.

Max: I just came back from spending a week at the NIPS Conference, the biggest artificial intelligence conference of the year. It’s fascinating how rapidly everything is proceeding. AlphaZero has now defeated not just human chess players and Go players but it has also defeated human AI researchers, who after spending 30 years handcrafting artificial intelligence software to play computer chess, got all their work completely crushed by AlphaZero that just learned to do much better than that from scratch in four hours.

So, AI is really happening, whether we like it or not. The challenge we face is simply to complement that through AI safety research and a lot of good thinking to make sure that this helps humanity flourish rather than flounder.

Ariel: In the spirit of flourishing, FLI also turned its attention this year to the movement to ban lethal autonomous weapons. While there is great debate around how to define autonomous weapons and whether or not they should be developed, more people tend to agree that the topic should at least come before the UN for negotiations. And so we helped create the video Slaughterbots to help drive this conversation. I’ll let Max take it from here.

Max: Slaughterbots, autonomous little drones that can go anonymously murder people without any human control. Fortunately, they don’t exist yet. We hope that an international treaty is going to keep it that way, even though we almost have the technology to do them already. Just need to integrate then mass produce tech we already have. So to help with this, we made this video called Slaughterbots. It was really impressive to see it get over forty million views and make the news throughout the world. I was very happy that Stuart Russell, whom we partnered with in this, also presented this to the diplomats at the United Nations in Geneva when they were discussing whether to move towards a treaty, drawing a line in the sand.

Anthony: Pushing on the autonomous weapons front, it’s been really scary, I would say, to think through that issue. But a little bit like the issue of AI in general, there’s a potentially scary side but there’s also a potentially helpful side, in that I think this is an issue that is a little bit tractable. Even a relatively small group of committed individuals can make a difference. So I think, I’m excited to see how much movement we can get on the autonomous weapons front. It doesn’t seem at all like a hopeless issue to me and I think 2018 will be kind of a turning point, or I hope that will be sort of a turning point for that issue. It’s kind of flown under the radar but it really is coming up now and it will be at least interesting. Hopefully, it will be exciting and happy and so on as well as interesting. It will at least be interesting to see how it plays out on the world stage.

Jessica: For 2018, I’m hopeful that we will see the continued growth of the global momentum against lethal autonomous weapons. Already this year, a lot has happened at the United Nations and across communities around the world, including thousands of AI and robotics researchers speaking out and saying they don’t want to see their work used to create these kinds of destabilizing weapons of mass destruction. One thing I’m really excited for in 2018 is to see a louder, rallying call for an international ban of lethal autonomous weapons.

Ariel: Yet one of the biggest questions we face when trying to anticipate autonomous weapons and artificial intelligence in general, and even artificial general intelligence – one of the biggest questions is: when? When will these technologies be developed? If we could answer that, then solving problems around those technologies could become both more doable and possibly more pressing. This is an issue Anthony has been considering.

Anthony: Of most interest has been the overall set of projects to predict artificial intelligence timelines and milestones. This is something that I’ve been doing through this prediction website, Metaculus, which I’ve been a part of. And also something where I took part in a very small workshop run by the Foresight Institute over the summer. It’s both a super important question because I think the overall urgency with which we have to deal with certain issues really depends on how far away they are. It’s also an instructive one, in that even posing the questions of what do we want to know exactly, really forces you to think through what is it that you care about, how would you estimate things, what different considerations are there in terms of this sort of big question.

We have this sort of big question, like when is really powerful AI going to appear? But when you dig into that, what exactly is really powerful, what exactly…  What does appear mean? Does that mean in sort of an academic setting? Does it mean becomes part of everybody’s life?

So there are all kinds of nuances to that overall big question that lots of people are asking. Just getting into refining the questions, trying to pin down what it is that they mean, making them exact so that they can be things that people can make precise and numerical predictions about, I think it’s been really, really interesting and elucidating to me in sort of understanding what all the issues are. I’m excited to see how that kind of continues to unfold as we get more questions and more predictions and more expertise focused on that. Also, a little bit nervous because the timelines seem to be getting shorter and shorter and the urgency of the issue seems to be getting greater and greater. So that’s a bit of a fire under us, I think, to keep acting and keep a lot of intense effort on making sure that as AI gets more powerful, we get better at managing it.

Ariel: One of the current questions AI researchers are struggling with is the problem of value alignment, especially when considering more powerful AI. Meia Chita-Tegmark and Lucas Perry recently co-organized an event to get more people thinking creatively about how to address this.

Meia: So we just organized a workshop about the ethics of value alignment together with a few partner organizations, the Berggruen Institute and also CFAR.

Lucas: This was a workshop recently that took place in California and just to remind everyone, value alignment is the process by which we bring AI’s actions, goals, and intention in alignment with and in accordance with what is deemed to be the good or what are human values and preferences and goals and intentions.

Meia: And we had a fantastic group of thinkers there. We had philosophers. We had social scientists, AI researchers, political scientists. We were all discussing this very important issue of how do we get an artificial intelligence that is aligned to our own goals and our own values.

It was really important to have the perspectives of ethicists and moral psychologists, for example, because this question is not just about the technical aspect of how do you actually implement it, but also about whose values do we want implemented and who should be part of the conversation and who gets excluded and what process do we want to establish to collect all the preferences and values that we want implemented in AI. That was really fantastic. It was a very nice start to what I hope will continue to be a really fruitful collaboration between different disciplines on this very important topic.

Lucas: I think one essential take-away from that was that value alignment is truly something that is interdisciplinary. It’s normally been something which has been couched and understood in the context of technical AI safety research, but value alignment, at least in my view, also inherently includes ethics and governance. It seems that the project of creating beneficial AI through efforts in value alignment can really only happen when we have lots of different people from lots of different disciplines working together on this supremely hard issue.

Meia: I think the issue with AI is something that … first of all, it concerns such a great number of people. It concerns all of us. It will impact, and it already is impacting all of our experiences. There’re different disciplines that look at this impact from different ways.

Of course, technical AI researchers will focus on developing this technology, but it’s very important to think about how does this technology co-evolve with us. For example, I’m a psychologist. I like to think about how does it impact our own psyche. How does it impact the way we act in the world, the way we behave. Stuart Russell many times likes to point out that one danger that can come with very intelligent machines is a subtle one, not necessarily what they will do, but what we will not do because of them. He calls this enfeeblement. What are the capacities that are being stifled because we no longer engage in some of the cognitive tasks that we’re now delegating to AIs.

So that’s just one example of how, for example, psychologists can help really bring more light and make us reflect on what is it that we want from our machines and how do we want to interact with them and how do we wanna design them such that they actually empower us rather than enfeeble us.

Lucas: Yeah, I think that one essential thing to FLI’s mission and goal is the generation of beneficial AI. To me, and I think many other people coming out of this Ethics of Value Alignment conference, you know, what beneficial exactly entails and what beneficial looks like is still a really open question both in the short term and in the long-term. I’d be really interested in seeing both FLI and other organizations pursue questions in value alignment more vigorously. Issues with regard to the ethics of AI and issues regarding value and the sort of world that we want to live in.

Ariel: And what sort of world do we want to live in? If you’ve made it this far through the podcast, you might be tempted to think that all we worry about is AI. And we do think a lot about AI. But our primary goal is to help society flourish. And so this year, we created the Future of Life Award to be presented to people who act heroically to ensure our survival and hopefully move us closer to that ideal world. Our inaugural award was presented in honor of Vasili Arkhipov who stood up to his commander on a Soviet submarine, and prevented the launch of a nuclear weapon during the height of tensions in the Cold War.

Tucker: One thing that particularly stuck out to me was our inaugural Future of Life Award and we presented this award to Vasili Arkhipov who was a Soviet officer in the Cold War and arguably saved the world and is the reason we’re all alive today. He’s now passed, but FLI presented a generous award to his daughter and his grandson. It was really cool to be a part of this because it seemed like the first award of its kind.

Meia: So, of course with FLI, we have all these big projects that take a lot of time. But I think for me, one of the more exciting and heartwarming and wonderful moments that I was able to experience due to our work here at FLI was a train ride from London to Cambridge with Elena and Sergei, the daughter and the grandson of Vasili Arkhipov. Vasili Arkhipov is this Russian naval officer that helped prevent a third world war in the Cuban missile crisis. The Future of Life Institute awarded him the Future of Life prize this year. He is now dead unfortunately, but his daughter and his grandson were there in London to receive it.

Vika: It was great to get to meet them in person and to all go on stage together and have them talk about their attitude towards the dilemma that Vasili Arkhipov faced, and how it is relevant today, and how we should be really careful with nuclear weapons and protecting our future. It was really inspiring.

At that event, Max was giving his talk about his book, and then at the end we had the Arkhipovs come up on stage and it was kind of fun for me to translate their speech to the audience. I could not fully transmit all the eloquence, but thought it was a very special moment.

Meia: It was just so amazing to really listen to their stories about the father, the grandfather, and look at photos that they had brought all the way from Moscow. This person who has become the hero for so many people that are really concerned about this existential risk, it was nice to really imagine him in his capacity as a son, as a grandfather, as a husband, as a human being. It was very inspiring and touching.

One of the nice things was they showed a photo of him that had actually notes that he had written on the back of it. That was his favorite photo. And one of the comments he made is that he felt that that was the most beautiful photo of himself because there was no glint in his eyes. It was just this pure sort of concentration. I thought that said a lot about his character. He rarely smiled in photos, also. Also always looked very pensive. Very much like you’d imagine a hero who saved the world would be.

Tucker: It was especially interesting for me to work on the press release for this award and to reach out to people from different news outlets, like The Guardian and The Atlantic, and to actually see them write about this award.

I think something like the Future of Life Award is inspiring because it highlights people in the past that have done an incredible service to civilization, but I also think it’s interesting to look forward and think about who might be the future Vasili Arkhipov that saves the world.

Ariel: As Tucker just mentioned, this award was covered by news outlets like The Guardian and The Atlantic. And in fact, we’ve been incredibly fortunate to have many of our events covered by major news. However, there are even more projects we’ve worked on that we think are just as important and that we’re just as excited about that most people probably aren’t aware of.

Jessica: So people may not know that FLI recently joined the Partnership on AI. This was the group that was founded by Google and Amazon, Facebook and Apple and others to think about issues like safety, and fairness and impact from AI systems. So I’m excited about this because I think it’s really great to see this kind of social commitment from industry, and it’s going to be critical to have the support and engagement from these players to really see AI being developed in a way that’s positive for everyone. So I’m really happy that FLI is now one of the partners of what will likely be an important initiative for AI.

Anthony: I attended the first meeting of the Partnership on AI in October. And to see, at that meeting, so much discussion of some of the principles themselves directly but just in a broad sense. So much discussion from all of the key organizations that are engaged with AI, almost all of whom had representation there, about how are we going to make these things happen. If we value transparency, if we value fairness, if we value safety and trust in AI systems, how are we going to actually get together and formulate best practices and policies, and groups and data sets and things to make all that happen. And to see the speed at which, I would say, the field has moved from purely, wow, we can do this, to how are we going to do this right and how are we going to do this well and what does this all mean, has been a ray of hope I would say.

AI is moving so fast but it was good to see that I think the sort of wisdom race hasn’t been conceded entirely. That there is a dedicated group of people that are working really hard to figure out how to do it well.

Ariel: And then there’s Dave Stanley, who has been the force around many of the behind-the-scenes projects that our volunteers have been working on that have helped FLI grow this year.

Dave: As for another project that has very much been ongoing and more relates to the website is basically our ongoing effort to make the English content on the website that’s been fairly influential in English speaking countries about AI safety and nuclear weapons, take that content and make it available in a lot of other languages to maximize the impact that it’s having.

Right now, thanks to the efforts of our volunteers, we have 55 translations available on our website in nine different languages, which are Russian, Chinese, French, Polish, Spanish, German, Hindi, Japanese, and Korean. All in all, this represents about 1000 hours of volunteer time put in by our volunteers. I’d just like to give a shoutout to some of the volunteers who have been involved. They are Alan Yan, Kevin Wang, Kazue Evans, Jake Beebe, Jason Orlosky, Li Na, Bena Lim, Alina Kovtun, Ben Peterson, Carolyn Wu, Zhaoran Joanna Wang, Mayumi Nakamura, Derek Su, Dipti Pandey, Marvin, Vera Koroleva, Grzegorz Orwiński, Szymon Radziszewicz, Natalia Berezovskaya, Vladimir Nimensky, Natalia Kuzmenko, George Godula, Eric Gastfriend, Olivier Grondin, Claire Park, Kristy Wen, Yishuai Du, and Revathi Vinoth Kumar.

Ariel: As we’ve worked to establish AI safety as a global effort, Dave and the volunteers were behind the trip Richard took to China, where he participated in the Global Mobile Internet Conference in Beijing earlier this year.

Dave: So basically, this was something that was actually prompted and largely organized by one of FLI’s volunteers, George Godula, who’s based in Shanghai right now.

Basically, this is partially motivated by the fact that recently, China’s been promoting a lot of investment in artificial intelligence research, and they’ve made it a national objective to become a leader in AI research by 2025. So FLI and the team have been making some efforts to basically try to build connections with China and raise awareness about AI safety, at least our view on AI safety and engage in dialogue there.

It’s culminated with George organizing this trip for Richard, and a large portion of the FLI volunteer team participating in basically support for that trip. So identifying contacts for Richard to connect with over there and researching the landscape and providing general support for that. And then that’s been coupled with an effort to take some of the existing articles that FLI has on their website about AI safety and translate those to Chinese to make it accessible to that audience.

Ariel: In fact, Richard has spoken at many conferences, workshops and other events this year, and he’s noted a distinct shift in how AI researchers view AI safety.

Richard: This is a single example of many of these things I’ve done throughout the year. Yesterday I gave a talk to a bunch of machine learning and artificial intelligence researchers and entrepreneurs in Boston, here where I’m based, about AI safety and beneficence. Every time I do this it’s really fulfilling that so many of these people, who really are pushing the leading edge of what AI does in many respects, realize that these are extremely valid concerns and there are new types of technical avenues to help just keep things better for the future. The fact that I’m not receiving push back anymore, as compared to many years ago when I would talk about these things, shows that people really are trying to gauge and understand and kind of weave themselves into whatever is going to turn into the best outcome for humanity, given the type of leverage that advanced AI will bring us. I think people are starting to really get what’s at stake.

Ariel: And this isn’t just the case among AI researchers. Throughout the year, we’ve seen this discussion about AI safety broaden into various groups outside of traditional AI circles, and we’re hopeful this trend will continue in 2018.

Meia: I think that 2017 has been fantastic to start this project of getting more thinkers from different disciplines to really engage with the topic of artificial intelligence, but I think we have just managed to scratch the surface of this topic in this collaboration. So I would really like to work more on strengthening this conversation and this flow of ideas between different disciplines. I think we can achieve so much more if we can make sure that we hear each other, that we go past our own disciplinary jargon, and that we truly are able to communicate and join each other in research projects where we can bring different tools and different skills to the table.

Ariel: The landscape on AI safety research that Richard presented at Asilomar at the start of the year was designed to enable greater understanding among researchers. Lucas rounded off the year with another version of the landscape. This one looking at ethics and value alignment with the goal, in part, of bringing more experts from other fields into the conversation.

Lucas: One thing that I’m also really excited about for next year is seeing our conceptual landscapes of both AI safety and value alignment being used in more educational contexts and in contexts in which they can foster interdisciplinary conversations regarding issues in AI. I think that their virtues are that they create a conceptual landscape of both AI safety and value alignment, but also include definitions and descriptions of jargon. Given this, they function both as a means by which you can introduce people to AI safety and value alignment and AI risk, and as a means of introducing experts to sort of the conceptual mappings of the spaces that other experts are engaged with, so they can learn each other’s jargon and really have conversations that are fruitful and sort of streamlined.

Ariel: As we look to 2018, we hope to develop more programs, work on more projects, and participate in more events that will help draw greater attention to the various issues we care about. We hope to not only spread awareness, but also to empower people to take action to ensure that humanity continues to flourish in the future.

Dave: There’s a few things that are coming up that I’m really excited about. The first one is basically we’re going to be trying to release some new interactive apps on the website that’ll hopefully be pages that can gather a lot of attention and educate people about the issues that we’re focused on, mainly nuclear weapons, and answering questions to give people a better picture of what are the geopolitical and economic factors that motivate countries to keep their nuclear weapons and how does this relate to public support, based on polling data, for whether the general public wants to keep these weapons or not.

Meia: One thing that I think has made me also very excited in 2017, and I’m looking forward to seeing the evolution of in 2018 was the public’s engagement with this topic. I’ve had the luck to be in the audience for many of the book talks that Max has given for his book “Life 3.0: Being Human in the Age of Artificial Intelligence,” and it was fascinating just listening to the questions. They’ve become so much more sophisticated and nuanced than a few years ago. I’m very curious to see how this evolves in 2018, and I hope that FLI will contribute to this conversation and making it more rich. I think I’d like people in general to get engaged with this topic much more, and refine their understanding of it.

Tucker: Well, I think in general it’s been amazing to watch FLI this year because we’ve made big splashes in so many different things with the Asilomar conference, with our Slaughterbots video, helping with the nuclear ban, but I think one thing that I’m particularly interested in is working more this coming year to I guess engage my generation more on these topics. I sometimes sense a lot of defeatism and hopelessness with people in my generation. Kind of feeling like there’s nothing we can do to solve civilization’s biggest problems. I think being at FLI has kind of given me the opposite perspective. Sometimes I’m still subject to that defeatism, but working here really gives me a sense that we can actually do a lot to solve these problems. I’d really like to just find ways to engage more people in my generation to make them feel like they actually have some sense of agency to solve a lot of our biggest challenges.

Ariel: Learn about these issues and more, join the conversation, and find out how you can get involved by visiting futureoflife.org.

[end]

 

2018 International AI Safety Grants Competition

I. THE FUTURE OF AI: REAPING THE BENEFITS WHILE AVOIDING PITFALLS

For many years, artificial intelligence (AI) research has been appropriately focused on the challenge of making AI effective, with significant recent success, and great future promise. This recent success has raised an important question: how can we ensure that the growing power of AI is matched by the growing wisdom with which we manage it? In an open letter in 2015, a large international group of leading AI researchers from academia and industry argued that this success makes it important and timely to research also how to make AI systems robust and beneficial, and that this includes concrete research directions that can be pursued today. In early 2017, a broad coalition of AI leaders went further and signed the Asilomar AI Principles, which articulate beneficial AI requirements in greater detail.

The first Asilomar Principle is that “The goal of AI research should be to create not undirected intelligence, but beneficial intelligence,” and the second states that “Investments in AI should be accompanied by funding for research on ensuring its beneficial use, including thorny questions in computer science, economics, law, ethics, and social studies…” The aim of this request for proposals is to support research that serves these and other goals indicated by the Principles.

The focus of this RFP is on technical research or other projects enabling development of AI that is beneficial to society and robust in the sense that the benefits have some guarantees: our AI systems must do what we want them to do.

II. EVALUATION CRITERIA & PROJECT ELIGIBILITY

This 2018 grants competition is the second round of the multi-million dollar grants program announced in January 2015, and will give grants totaling millions more to researchers in academic and other nonprofit institutions for projects up to three years in duration, beginning September 1, 2018. Results-in-progress from the first round are here. Following the launch of the first round, the field of AI safety has expanded considerably in terms of institutions, research groups, and potential funding sources entering the field.  Many of these, however, focus on immediate or relatively short-term issues relevant to extrapolations of present machine learning and AI systems as they are applied more widely.  There are still relatively few resources devoted to issues that will become crucial if/when AI research attains its original goal: building artificial general intelligence (AGI) that can (or can learn to) outperform humans on all cognitive tasks (see Asilomar Principles 19-23).

For maximal positive impact, this new grants competition thus focuses on Artificial General Intelligence, specifically research for safe and beneficial AGI. Successful grant proposals will either relate directly to AGI issues, or clearly explain how the proposed work is a necessary stepping stone toward safe and beneficial AGI.

As with the previous round, grant applications will be subject to a competitive process of confidential expert peer review similar to that employed by all major U.S. scientific funding agencies, with reviewers being recognized experts in the relevant fields.

Project Grants (approx. $50K-$400K per project) will each fund a small group of collaborators at one or more research institutions for a focused research project of up to three years duration. Proposals will be evaluated according to how topical and impactful they are:

TOPICAL: This RFP is limited to research that aims to help maximize the societal benefits of AGI, explicitly focusing not on the standard goal of making AI more capable, but on making it more robust and/or beneficial. In consultation with other organizations, FLI has identified a list of relatively specific problems and projects of particular interest to the AGI safety field. These will serve both as examples and as topics for special consideration.

In our RFP examples, we give a list of research topics and questions that are germane to this RFP. We also refer proposers to FLI’s landscape of AI safety research and its accompanying literature survey, as well as the 2015 research priorities and the associated survey.

The relative amount of funding for different areas is not predetermined, but will be optimized to reflect the number and quality of applications received. Very roughly, the expectation is ~70% computer science and closely related technical fields, ~30% economics, law, ethics, sociology, policy, education, and outreach.

IMPACTFUL: Proposals will be rated according to their expected positive impact per dollar, taking all relevant factors into account, such as:

  1. Intrinsic intellectual merit, scientific rigor and originality
  2. A high product of likelihood for success and importance if successful (i.e., high-risk research can be supported as long as the potential payoff is also very high.)
  3. The likelihood of the research opening fruitful new lines of scientific inquiry
  4. The feasibility of the research in the given time frame
  5. The qualifications of the Principal Investigator and team with respect to the proposed topic
  6. The part a grant may play in career development
  7. Cost effectiveness: Tight budgeting is encouraged in order to maximize the research impact of the project as a whole, with emphasis on scientific return per dollar rather than per proposal.
  8. Potential to impact the greater community as well as the general public via effective outreach and dissemination of the research results
  9. Engagement of appropriate communities (e.g. engaging research collaborators [or policymakers] in AI safety outside of North America and Europe)
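As a rough illustration of criterion 2 and the "impact per dollar" framing, the sketch below ranks two made-up proposals by likelihood of success times importance, divided by budget. This is not FLI's scoring procedure: the `Proposal` fields, the numbers, and the formula are all assumptions for illustration, and a real review weighs every factor listed above.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    """Hypothetical record; field names are illustrative, not FLI's."""
    name: str
    p_success: float    # estimated likelihood of success, between 0 and 1
    importance: float   # importance if successful, in arbitrary units
    budget_usd: float   # total requested budget in dollars


def expected_impact_per_dollar(p: Proposal) -> float:
    """Likelihood of success times importance, divided by cost."""
    if p.budget_usd <= 0:
        raise ValueError("budget must be positive")
    return p.p_success * p.importance / p.budget_usd


def rank(proposals):
    """Return proposals sorted from highest to lowest expected impact per dollar."""
    return sorted(proposals, key=expected_impact_per_dollar, reverse=True)


# Invented example values: a safe, modest project and a risky, high-payoff one.
proposals = [
    Proposal("safe-bet", p_success=0.9, importance=100.0, budget_usd=300_000),
    Proposal("long-shot", p_success=0.1, importance=2000.0, budget_usd=400_000),
]
ranked = rank(proposals)
```

With these invented numbers the high-risk proposal ranks first, which matches the point that high-risk research can be supported when the potential payoff is also very high.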

Strong proposals will make it easy for FLI to evaluate their impact by explicitly stating what they aim to produce (publications, algorithms, software, events, etc.) and when (after 1st, 2nd and 3rd year, say). Preference will be given to proposals whose deliverables are made freely available (open access publications, open source software, etc.) where appropriate.

To maximize its impact per dollar, this RFP is intended to complement, not supplement, conventional funding. We wish to enable research that, because of its long-term focus or its non-commercial, speculative or non-mainstream nature would otherwise go unperformed due to lack of available resources. Thus, although there will be inevitable overlaps, an otherwise scientifically rigorous proposal that is a good candidate for an FLI grant will generally not be a good candidate for funding by the NSF, DARPA, corporate R&D, etc. – and vice versa. To be eligible, research must focus on making AI more robust/beneficial as opposed to the standard goal of making AI more capable, and it must be AGI-relevant.

Acceptable uses of grant funds for Project Grants include:

  • Student/postdoc/researcher salary and benefits
  • Summer salary and teaching buyout for academics
  • Support for specific projects during sabbaticals
  • Assistance in writing or publishing books or journal articles, including page charges
  • Modest allowance for justifiable lab equipment, computers, and other research supplies
  • Modest travel allowance
  • Development of workshops, conferences, or lecture series for professionals in the relevant fields
  • Overhead of at most 15% (Please note that if this is an issue with your institution, or if your organization is not nonprofit, you can contact FLI to learn about other organizations that can help administer an FLI grant for you.)

Subawards are discouraged but possible in special circumstances.

III. APPLICATION PROCESS

To save time for both you and the reviewers, applications will be accepted electronically through a standard form on our website (click here for the application) and evaluated in a two-part process, as follows:

INITIAL PROPOSAL — DUE FEBRUARY 25 2018, 11:59 PM Eastern Time — must include:

  • A 200-500 word summary of the project, explicitly addressing why it is topical and impactful.
  • A draft budget description not exceeding 200 words, including an approximate total cost over the life of the award and explanation of how funds would be spent.
  • A PDF Curriculum Vitae for the Principal Investigator, including
    • Education and employment history
    • Full publication list
    • Optional: if the PI has any previous publications relevant to the proposed research, they may list up to five of these as well, for a total of up to 10 representative and relevant publications. We do wish to encourage PIs to enter relevant research areas where they may not have had opportunities before, so prior relevant publications are not required.

A review panel assembled by FLI will screen each initial proposal according to the criteria in Section II. Based on their assessment, the principal investigator (PI) may be invited to submit a full proposal, on or about MARCH 23 2018, perhaps with feedback from reviewers for improving the proposal. Please keep in mind that however positive reviewers may be about a proposal at any stage, it may still be turned down for funding after full peer review.

FULL PROPOSAL — DUE MAY 20 2018 — Must Include:

  • Cover sheet
  • A 200-word project abstract, suitable for publication in an academic journal
  • A project summary not exceeding 200 words, explaining the work and its significance to laypeople
  • A detailed description of the proposed research, of between 5 and 15 single-spaced 11-point pages, including a short statement of how the application fits into the applicant’s present research program, and a description of how the results might be communicated to the wider scientific community and general public
  • A detailed budget over the life of the award, with justification and utilization distribution (preferably drafted by your institution’s grant officer or equivalent)
  • A list, for all project senior personnel, of all present and pending financial support, including project name, funding source, dates, amount, and status (current or pending)
  • Evidence of tax-exempt status of grantee institution, if other than a US university. For information on determining tax-exempt status of international organizations and institutes, please review the information here.
  • Names of three recommended referees
  • Curricula Vitae for all project senior personnel, including:
    • Education and employment history
    • A list of references of up to five previous publications relevant to the proposed research, and up to five additional representative publications
    • Full publication list

Completed full proposals will undergo a competitive process of external and confidential expert peer review, evaluated according to the criteria described in Section II. A review panel of scientists in the relevant fields will be convened to produce a final rank ordering of the proposals, which will determine the grant winners, and make budgetary adjustments if necessary. Public award recommendations will be made on or about JULY 31, 2018.

FUNDING PROCESS

The peer review and administration of this grants program will be managed by the Future of Life Institute. FLI is an independent, philanthropically funded nonprofit organization whose mission is to catalyze and support research and initiatives for safeguarding life and developing optimistic visions of the future, including positive ways for humanity to steer its own course considering new technologies and challenges.

FLI will direct these grants through a Donor Advised Fund (DAF) at the Silicon Valley Community Foundation. FLI will solicit grant applications and have them peer reviewed, and on the basis of these reviews, FLI will advise the DAF on what grants to make. After grants have been made by the DAF, FLI will work with the DAF to monitor the grantee’s performance via grant reports. In this way, researchers will continue to interact with FLI, while the DAF interacts mostly with their institutes’ administrative or grants management offices.

RESEARCH TOPIC LIST

We have solicited and synthesized suggestions from a number of technical AI safety researchers to provide a list of project requests.  Proposals on the requested topics are all germane to the RFP, but the list is not meant to be either comprehensive or exclusive: proposals on other topics that similarly address long-term safety and benefits of AI are also welcomed. We also refer the reader to FLI’s AI safety landscape and its accompanying paper as a more general summary of relevant issues as well as definitions of many key terms.

TO SUBMIT AN INITIAL PROPOSAL, CLICK HERE.

IV. An International Request for Proposals – Timeline

December 20, 2017: RFP is released

February 25, 2018 (by 11:59 PM EST): Initial Proposals due

March 23, 2018: Full Proposals invited

May 20, 2018 (by 11:59 PM Eastern Time): Full Proposals (invite only) due

July 31, 2018: Grant Recommendations are publicly announced; FLI Fund conducts due diligence on grants

September 1, 2018: Grants disbursed; Earliest date for grants to start

August 31, 2021: Latest end date for multi-year Grants

An International Request for Proposals – Frequently Asked Questions

Does FLI have a particular agenda or position on AI and AI safety?

FLI’s position is well summarized by the open letter that FLI’s founders and many of its advisory board members have signed, and by the Asilomar Principles.

Who is eligible for grants?

Researchers and outreach specialists working in academic and other nonprofit institutions are eligible, as well as independent researchers. Grant awards are sent to the PI’s institution and the institution’s administration is responsible for disbursing the awards to the PI. When submitting your application, please make sure to list the appropriate grant administrator that we should contact at your institution.

If you are not affiliated with a research institution, there are many organizations that will help administer your grant. If you need suggestions, please contact FLI. Applicants are not required to be affiliated with an institution for the Initial Proposal, only for the Full Proposal.

Can researchers from outside the U.S. apply?

Yes, applications will be welcomed from any country. Please note that the US Government imposes restrictions on the types of organizations to which US nonprofits (such as FLI) can give grants. Given this, if you are awarded a grant, your institution must (a) prove its equivalency to a nonprofit institution by providing the institution’s establishing law or charter, a list of key staff and board members, and, for public universities, a signed affidavit, and (b) comply with the U.S. Patriot Act. Please note that this is included to provide information about the equivalency determination process that will take place if you are awarded a grant. If there are any issues with your granting institution proving its equivalency, FLI can help provide a list of organizations that can act as a go-between to administer the grant. More detail about international grant compliance is available on our website here. Please contact FLI if you have any questions about whether your institution is eligible, to get a list of organizations that can help administer your grant, or if you want to review the affidavit that public universities must fill out.

Can I submit an application in a language other than English?

All proposals must be in English. Since our grant program has an international focus, we will not penalize applications by people who do not speak English as their first language. We will encourage the review panel to be accommodating of language differences when reviewing applications. All applications must be coherent.

How and when do we apply?

Apply online here. Please submit an Initial Proposal by February 25, 2018. After screening, you may then be invited to submit a Full Proposal, due May 20, 2018. Please see Section IV for more information.

What kinds of programs and requests are eligible for funding?

Acceptable uses of grant funds for Project Grants include:

  • Student/postdoc/researcher salary and benefits
  • Summer salary and teaching buyout for academics
  • Support for specific projects during sabbaticals
  • Assistance in writing or publishing books or journal articles, including page charges
  • Modest allowance for justifiable lab equipment, computers, cloud computing services, and other research supplies
  • Modest travel allowance
  • Development of workshops, conferences, or lecture series for professionals in the relevant fields
  • Overhead of at most 15% (Please note if this is an issue with your institution, or if your organization is not nonprofit, you can contact FLI to learn about other organizations that can help administer an FLI grant for you.)
  • Subawards are discouraged but possible in special circumstances.

What is your policy on overhead?

The highest allowed overhead rate is 15%. (As mentioned before, if this is an issue with your institution, you can contact FLI to learn about other organizations that can help administer FLI grants.)

How will proposals be judged?

After screening of the Initial Proposal, applicants may be asked to submit a Full Proposal. All Full Proposals will undergo a competitive process of external and confidential expert peer review. An expert panel will evaluate and rank the reviews according to the criteria described in Section II of the RFP overview (see above).

Will FLI provide feedback on initial proposals?

FLI will generally not provide significant feedback on initial Project Proposals, but may in some cases. Please keep in mind that however positive FLI may be about a proposal at any stage, it may still be turned down for funding after peer review.

Can I submit multiple proposals?

We will consider multiple Initial Proposals from the same PI; however, we will invite at most one Full Proposal from each PI or closely associated group of applicants.

What if I am unable to submit my application electronically?

Only applications submitted through the form on our website are accepted. If you encounter problems, please contact FLI.

Is there a maximum amount of money for which we can apply?

No. You may apply for as much money as you think is necessary to achieve your goals. However, you should carefully justify your proposed expenditure. Keep in mind that projects will be assessed on potential impact per dollar requested; an inappropriately high budget may harm the proposal’s prospects, effectively pricing it out of the market. Referees are authorized to suggest budget adjustments. As mentioned in the RFP overview above, there may be an opportunity to apply for greater follow-up funding.

What will an average award be?

We expect that Project awards will typically be in the range of $50,000-$400,000 total over the life of the award (usually two to three years).

What are the reporting requirements?

Grantees will be asked to submit a progress report (if a multi-year Grantee) and/or annual report consisting of narrative and financial reports. Renewal of multi-year grants will be contingent on satisfactory demonstration in these reports that the supported research is progressing appropriately, and continues to be consistent with the spirit of the original proposal. (See the question below regarding renewal.)

How are multi-year grants renewed?

This program has been formulated to maximize impact by re-allocating (and potentially adding) resources during each year of the grant program. Decisions regarding the renewal of multi-year grants will be made by a review committee on the basis of the annual progress report. This report is not pro forma. The committee is likely to recommend that some grants not be renewed, some be renewed at a reduced level, some be renewed at the same level, and that some be offered the opportunity for increased funding in later years.

What are the qualifications for a Principal Investigator?

A Principal Investigator can be anyone – there are no qualification requirements (though qualifications will be taken into account during the review process). Lacking conventional academic credentials or publications does not disqualify a PI. We encourage applications from industry and independent researchers. Please list any relevant experience or achievements in the attached resume/CV.

As noted above, Principal Investigators need not even be affiliated with a university or nonprofit. If a PI is affiliated with an academic institution, then their Principal Investigator status must be allowed by their institution. Should they be invited to submit a Full Proposal, they must obtain co-signatures on the proposal from the department head, as well as a department host with a post exceeding the duration of the grant.

My colleague(s) and I would like to apply as co-PIs. Can we do this?

Yes. For administrative purposes, however, please select a primary contact for the life of the award. The primary contact, who must be a Principal Investigator, will be the reference for your application(s) and all future correspondence, documents, etc.

Will the grants pay for laboratory or computational expenses?

Yes; however, due to budgetary limitations, FLI cannot fund capital-intensive equipment or computing facilities. Also, such expenses must be clearly required by the proposed research.

I have a proposal for my usual, relatively mainstream AI research program that I may be able to repackage as an appropriate proposal for this FLI program. Sound OK?

FLI is very sensitive to the problem of “fishing for money”—that is, the re-casting of an existing research program to make it appear to fit the overall thematic nature of this Request For Proposals. Such proposals will not be funded, nor renewed if erroneously funded initially.

Do proposals have to be as long as possible?

Please note that the 15-page limit is an upper limit, not a lower limit. You should simply write as much as you feel that you need in order to explain your proposal in sufficient detail for the review panel to understand it properly.

What are the “referees” in the instructions?

If there are specific reviewers whom you feel are particularly qualified to evaluate your proposal, please feel free to list them (this is completely optional).

Who are FLI’s reviewers?

FLI follows the standard practice of protecting the identities of our external reviewers and selecting them based on expertise in the relevant research areas. For example, the external reviewers in the first round of this RFP were highly qualified experts in AI, law and economics, mostly professors and also some industry experts.

If you have additional questions that were not answered above, please email us.

MIRI’s December 2017 Newsletter and Annual Fundraiser

Our annual fundraiser is live. Discussed in the fundraiser post:

  • News  — What MIRI’s researchers have been working on lately, and more.
  • Goals — We plan to grow our research team 2x in 2018–2019. If we raise $850k this month, we think we can do that without dipping below a 1.5-year runway.
  • Actual goals — A bigger-picture outline of what we think is the likeliest sequence of events that could lead to good global outcomes.

Our funding drive will be running until December 31st.

When Should Machines Make Decisions?

Human Control: Humans should choose how and whether to delegate decisions to AI systems, to accomplish human-chosen objectives.

When is it okay to let a machine make a decision instead of a person? Most of us allow Google Maps to choose the best route to a new location. Many of us are excited to let self-driving cars take us to our destinations while we work or daydream. But are you ready to let your car choose your destination for you? The car might recognize that your ultimate objective is to eat or to shop or to run some errand, but most of the time, we have specific stores or restaurants that we want to go to, and we may not want the vehicle making those decisions for us.

What about more challenging decisions? Should weapons be allowed to choose who to kill? If so, how do they make that choice? And how do we address the question of control when artificial intelligence becomes much smarter than people? If an AI knows more about the world and our preferences than we do, would it be better if the AI made all of our decisions for us?

Questions like these are not easy to address. In fact, two of the AI experts I interviewed responded to this Principle with comments like, “Yeah, this is tough,” and “Right, that’s very, very tricky.”

And everyone I talked to agreed that this question of human control taps into some of the most challenging problems facing the design of AI.

“I think this is hugely important,” said Susan Craw, a Research Professor at Robert Gordon University Aberdeen. “Otherwise you’ll have systems wanting to do things for you that you don’t necessarily want them to do, or situations where you don’t agree with the way that systems are doing something.”

What does human control mean?

Joshua Greene, a psychologist at Harvard, cut right to the most important questions surrounding this Principle.

“This is an interesting one because it’s not clear what it would mean to violate that rule,” Greene explained. “What kind of decision could an AI system make that was not in some sense delegated to the system by a human? AI is a human creation. This principle, in practice, is more about what specific decisions we consciously choose to let the machines make. One way of putting it is that we don’t mind letting the machines make decisions, but whatever decisions they make, we want to have decided that they are the ones making those decisions.

“In, say, a navigating robot that walks on legs like a human, the person controlling it is not going to decide every angle of every movement. The humans won’t be making decisions about where exactly each foot will land, but the humans will have said, ‘I’m comfortable with the machine making those decisions as long as it doesn’t conflict with some other higher level command.’”

Roman Yampolskiy, an AI researcher at the University of Louisville, suggested that we might be even closer to giving AI decision-making power than many realize.

“In many ways we have already surrendered control to machines,” Yampolskiy said. “AIs make over 85% of all stock trades, control operation of power plants, nuclear reactors, electric grid, traffic light coordination and in some cases military nuclear response aka ‘dead hand.’ Complexity and speed required to meaningfully control those sophisticated processes prevent meaningful human control. We are simply not quick enough to respond to ultrafast events, such as those in algorithmic trading and more and more seen in military drones. We are also not capable enough to keep thousands of variables in mind or to understand complicated mathematical models. Our reliance on machines will only increase but as long as they make good decisions (decisions we would make if we were smart enough, had enough data and enough time) we are OK with them making such decisions. It is only in cases where machine decisions diverge from ours that we would like to be able to intervene. Of course figuring out cases in which we diverge is exactly the unsolved Value Alignment Problem.”

Greene also elaborated on this idea: “The worry is when you have machines that are making more complicated and consequential decisions than ‘where to put the next footstep.’ When you have a machine that can behave in an open-ended flexible way, how do you delegate anything without delegating everything? When you have someone who works for you and you have some problem that needs to be solved and you say, ‘Go figure it out,’ you don’t specify, ‘But don’t murder anybody in the process. Don’t break any laws and don’t spend all the company’s money trying to solve this one small-sized problem.’ There are assumptions in the background that are unspecified and fairly loose, but nevertheless very important.

“I like the spirit of this principle. It’s a specification of what follows from the more general idea of responsibility, that every decision is either made by a person or specifically delegated to the machine. But this one will be especially hard to implement once AI systems start behaving in more flexible, open-ended ways.”

Trust and Responsibility

AI is often compared to a child, both in terms of what level of learning a system has achieved and also how the system is learning. And just as we would be with a child, we’re hesitant to give a machine too much control until it’s proved it can be trusted to be safe and accountable. Artificial intelligence systems may have earned our trust when it comes to maps, financial trading, and the operation of power grids, but some question whether this trend can continue as AI systems become even more complex or when safety and well-being are at greater risk.

John Havens, the Executive Director of The IEEE Global Initiative for Ethical Considerations in Artificial Intelligence and Autonomous Systems, explained, “Until universally systems can show that humans can be completely out of the loop and more often than not it will be beneficial, then I think humans need to be in the loop.”

“However, the research I’ve seen also shows that right now is the most dangerous time, where humans are told, ‘Just sit there, the system works 99% of the time, and we’re good.’ That’s the most dangerous situation,” he added, in reference to recent research that has found people stop paying attention if a system, like a self-driving car, rarely has problems. The research indicates that when problems do arise, people struggle to refocus and address the problem.

“I think it still has to be humans delegating first,” Havens concluded.

In addition to the issues already mentioned with decision-making machines, Patrick Lin, a philosopher at California Polytechnic State University, doesn’t believe it’s clear who would be held responsible if something does go wrong.

“I wouldn’t say that you must always have meaningful human control in everything you do,” Lin said. “I mean, it depends on the decision, but also I think this gives rise to new challenges. … This is related to the idea of human control and responsibility. If you don’t have human control, it could be unclear who’s responsible … the context matters. It really does depend on what kind of decisions we’re talking about, that will help determine how much human control there needs to be.”

Susan Schneider, a philosopher at the University of Connecticut, also worried about how these problems could be exacerbated if we achieve superintelligence.

“Even now it’s sometimes difficult to understand why a deep learning system made the decisions that it did,” she said, adding later, “If we delegate decisions to a system that’s vastly smarter than us, I don’t know how we’ll be able to trust it, since traditional methods of verification seem to break down.”

What do you think?

Should humans be in control of a machine’s decisions at all times? Is that even possible? When is it appropriate for a machine to take over, and when do we need to make sure a person is “awake at the wheel,” so to speak? There are clearly times when machines are more equipped to safely address a situation than humans, but is that all that matters? When are you comfortable with a machine making decisions for you, and when would you rather remain in control?

This article is part of a series on the 23 Asilomar AI Principles. The Principles offer a framework to help artificial intelligence benefit as many people as possible. But, as AI expert Toby Walsh said of the Principles, “Of course, it’s just a start. … a work in progress.” The Principles represent the beginning of a conversation, and now we need to follow up with broad discussion about each individual principle. You can read the discussions about previous principles here.

Podcast: Balancing the Risks of Future Technologies with Andrew Maynard and Jack Stilgoe

What does it mean for technology to “get it right,” and why do tech companies ignore long-term risks in their research? How can we balance near-term and long-term AI risks? And as tech companies become increasingly powerful, how can we ensure that the public has a say in determining our collective future?

To discuss how we can best prepare for societal risks, Ariel spoke with Andrew Maynard and Jack Stilgoe on this month’s podcast. Andrew directs the Risk Innovation Lab in the Arizona State University School for the Future of Innovation in Society, where his work focuses on exploring how emerging and converging technologies can be developed and used responsibly within an increasingly complex world. Jack is a senior lecturer in science and technology studies at University College London where he works on science and innovation policy with a particular interest in emerging technologies.

The following transcript has been edited for brevity, but you can listen to the podcast above or read the full transcript here.

Ariel: Before we get into anything else, could you first define what risk is?

Andrew: The official definition of risk is that it looks at the potential of something to cause harm, but it also looks at the probability. Say you’re looking at exposure to a chemical: risk is all about the hazardous nature of that chemical, its potential to cause some sort of damage to the environment or the human body, and then the exposure that translates that potential into some sort of probability. That is typically how we think about risk when we’re looking at regulating things.

I actually think about risk slightly differently, because that concept of risk runs out of steam really fast, especially when you’re dealing with uncertainties, existential risk, and perceptions about risk when people are trying to make hard decisions and they can’t make sense of the information they’re getting. So I tend to think of risk as a threat to something that’s important or of value. That thing of value might be your health, it might be the environment; but it might be your job, it might be your sense of purpose or your sense of identity or your beliefs or your religion or your politics or your worldview.

As soon as we start thinking about risk in that sense, it becomes much broader, much more complex, but it also allows us to explore that intersection between different communities and their different ideas about what’s important and worth protecting.

Jack: I would draw attention to all of those things that are incalculable. When we are dealing with new technologies, they are often things to which we cannot assign probabilities and we don’t know very much about what the likely outcomes are going to be.

I think there is also a question of what isn’t captured when we talk about risk. Not all of the impacts of technology might be considered risk impacts. I’d say that we should also pay attention to all the things that are not to do with technology going wrong, but are also to do with technology going right. Technologies don’t just create new risks, they also benefit some people more than others. And they can create huge inequalities. If they’re governed well, they can also help close inequalities. But if we just focus on risk, then we lose some of those other concerns as well.

Andrew: Jack, so this obviously really interests me because to me an inequality is a threat to something that’s important to someone. Do you have any specific examples of what you think about when you think about inequalities or equality gaps?

Jack: Before we get into examples, the important thing is to bear in mind a trend with technology, which is that technology tends to benefit the powerful. That’s an overall trend before we talk about any specifics, which quite often goes against the rhetoric of technological change, because, often, technologies are sold as being emancipatory and helping the worst off in society – which they do, but typically they also help the better off even more. So there’s that general question.

I think in the specific, we can talk about what sorts of technologies do close inequities and which tend to exacerbate inequities. But it seems to me that just defining that as a social risk isn’t quite getting there.

Ariel: I would consider increasing inequality to be a risk. Can you guys talk about why it’s so hard to get agreement on what we actually define as a risk?

Andrew: People very quickly slip into defining risk in very convenient ways. So if you have a company or an organization that really wants to do something – and that doing something may be all the way from making a bucket load of money to changing the world in the ways they think are good – there’s a tendency for them to define risk in ways that benefit them.

So, for instance, if you are the maker of an incredibly expensive drug, and you work out that that drug is going to be beneficial in certain ways with minimal side effects, but it’s only going to be available to a very small number of very rich people, you will easily define risk in terms of the things that your drug does not do, so you can claim with confidence that this is a risk-free or a low-risk product. But that’s an approach where you work out where the big risks are with your product and you bury them and you focus on the things where you think there is not a risk with your product.

That sort of extends across many, many different areas – this tendency to bury the big risks associated with a new technology and highlight the low risks to make your tech look much better than it is so you can reach the aims that you’re trying to achieve.

Jack: I quite agree, Andrew. I think what tends to happen is that the definition of risk gets socialized as being that stuff that society’s allowed to think about whereas the benefits are sort of privatized. The innovators are there to define who benefits and in what ways.

Andrew: I would agree. Though it also gets quite complex in terms of the social dialogue around that and who actually is part of those conversations and who has a say in those conversations.

To get back to your point, Ariel, I think there are a lot of organizations and individuals that want to do what they think is the right thing. But they also want the ability to decide for themselves what the right thing is rather than listening to other people.

Ariel: How do we address that?

Andrew: It’s a knotty problem, and it has its roots in how we are as people and as a society, how we’ve evolved. I think there are a number of ways forwards towards beginning to sort of pick apart the problem. A lot of those are associated with work that is carried out in the social sciences and humanities around how you make these processes more inclusive, how you bring more people to the table, how you begin listening to different perspectives, different sets of values and incorporating them into decisions rather than marginalizing groups that are inconvenient.

Jack: If you regard these things as legitimately political discussions rather than just technical discussions, then the solution is to democratize them and to try to wrest control over the direction of technology away from just the innovators and to see that as the subject of proper democratic conversation.

Andrew: And there are some very practical things here. This is where Jack and I might actually diverge in our perspectives. But from a purely business sense, if you’re trying to develop a new product or a new technology and get it to market, the last thing you can afford to do is ignore the nature of the population, the society that you’re trying to put that technology into. Because if you do, you’re going to run up against roadblocks where people decide they either don’t like the tech or they don’t like the way that you’ve made decisions around it or they don’t like the way that you’ve implemented it.

So from a business perspective, taking a long-term strategy, it makes far more sense to engage with these different communities and develop a dialogue around them so you understand the nature of the landscape that you’re developing a technology into. You can see ways of partnering with communities to make sure that that technology really does have a broad beneficial impact.

Ariel: Why do you think companies resist doing that?

Andrew: I think we’ve had centuries of training that says you don’t ask awkward questions because they potentially lead to you not being able to do what you want to do. It’s partly the mentality around innovation. But, also, it’s hard work. It takes a lot of effort, and it actually takes quite a lot of humility as well.

Jack: There’s a sort of well-defined law in technological change, which is that we overestimate the effect of technology in the short term and underestimate the effect of technology in the long term. Given that companies and innovators have to make short time horizon decisions, often they don’t have the capacity to take on board these big world-changing implications of technology.

If you look at something like the motorcar, it would have been inconceivable for Henry Ford to have imagined the world in which his technology would exist in 50 years time. Even though we know that the motorcar has led to the reshaping of large parts of America. It’s led to an absolutely catastrophic level of public health risk while also bringing about clear benefits of mobility. But those are big long-term changes that evolve very slowly, far slower than any company could appreciate.

Andrew: So can I play devil’s advocate here, Jack? With hindsight should Henry Ford have developed his production line process differently to avoid some of the impacts we now see of motor vehicles?

Jack: You’re right to say with hindsight it’s really hard to see what he might have done differently, because the point is the changes that I was talking about are systemic ones with responsibility shared across large parts of the system. Now, could we have done better at anticipating some of those things? Yes, I think we could have done, and I think had motorcar manufacturers talked to regulators and civil society at the time, they could have anticipated some of those things because there are also barriers that stop innovators from anticipating. There are actually things that force innovators’ time horizons to narrow.

Andrew: That’s one of the points that really interests me. It’s not this case of “do we, don’t we” with a certain technology, but could we do things better so we see more longer-term benefits and we see fewer hurdles that maybe we could have avoided if we had been a little smarter from the get-go.

Ariel: But how much do you think we can actually anticipate?

Andrew: Well, the basic answer is very little indeed. The one thing that we know about anticipating the future is that we’re always going to get it wrong. But I think that we can put plausible bounds around likely things that are going to happen. Simply from what we know about how people make decisions and the evidence around that, we know that if you ignore certain pieces of information, certain evidence, you’re going to make worse decisions in terms of projecting or predicting future pathways than if you’re actually open to evaluating different types of evidence.

By evidence, I’m not just meaning the scientific evidence, but I’m also thinking about what people believe or hold as valuable within society and what motivates them to do certain things and react in certain ways. All of that is important evidence in terms of getting a sense of what the boundaries are of a future trajectory.

Jack: Yes, we will always get our predictions wrong, but if anticipation is about preparing us for the future rather than predicting the future, then rightness or wrongness isn’t really the target. Instead, I would draw attention to the history of cases in which there has been willful ignorance of particular perspectives or particular evidence that has only been realized later. As you know better than anybody, that includes evidence of public health risk that has been swept under the carpet. We have to look first at the sort of incentives that prompt innovators to overlook that evidence.

Andrew: I think that’s so important. It’s worthwhile bringing up the Late lessons from early warnings report that came out of Europe a few years ago, which was a series of case studies of previous technological innovations over the last 100 years or so, looking at where innovators and companies and even regulators either missed important early warnings or willfully ignored them, and that led to far greater adverse impacts than there really should have been. I think there are a lot of lessons to be learned from those.

Ariel: I’d like to take that and move into some more specific examples now. Jack, I know you’re interested in self-driving vehicles. I was curious, how do we start applying that to these new technologies that will probably be, literally, on the road soon?

Jack: It’s extremely convenient for innovators to define risks in particular ways that suit their own ambitions. I think you see this in the way that the self-driving cars debate is playing out. In part, that’s because the debate is a largely American one and it emanates from an American car culture.

Here in Europe, we see a very different approach to transport with a very different emerging debate. So the trolley problem is the classic example of a risk issue that engineers very conveniently are able to treat as an algorithmic challenge: How do we maximize public benefits and reduce public risk? Here in Europe where our transport systems are complicated, multimodal; where our cities are complicated, messy things, the self-driving car risks start to expand pretty substantially in all sorts of dimensions.

So the sorts of concerns that I would see for the future of self-driving cars relate more to what are sometimes called second order consequences. What sorts of worlds are these technologies likely to enable? What sorts of opportunities are they likely to constrain? I think that’s a far more important debate than the debate about how many lives a self-driving car will either save or take in its algorithmic decision-making.

Andrew: Jack, you have referred to the trolley problem as trolleys and follies. One of the things I really grapple with, and I think it’s very similar to what you were saying, is that the trolley problem seems to be a false or a misleading articulation of risk. It’s something which is philosophical and hypothetical, but actually doesn’t seem to bear much relation to the very real challenges and opportunities that we’re grappling with with these technologies.

Now, the really interesting thing here is, I get really excited about the self-driving vehicle technologies, partly living here in Tempe where Google and Uber and various other companies are testing them on the road now. But you have quite a different perspective in terms of how fast we’re going with the technology and how little thought there is into the longer term social consequences. But to put my full cards on the table, I can’t wait for better technologies in this area.

Jack: Well, without wishing to be too congenial, I am also excited about the potential for the technology. But what I know about past technology suggests that it may well end up gloriously suboptimal. I’m interested in a future involving self-driving cars that might actually realize some of the enormous benefits to, for example, bringing accessibility to people who currently can’t drive. The enormous benefits to public safety, to congestion, but making that work will not just involve a repetition of current dynamics of technological change. I think current ownership models in the US, current modes of transport in the US just are not conducive to making that happen. So I would love to see governments taking control of this and actually making it work in the same way as in the past, governments have taken control of transport and built public value transport systems.

Ariel: If governments are taking control of this and they’re having it done right, what does that mean?

Jack: The first thing that I don’t see any of within the self-driving car debate, because I just think we’re at too early a stage, is an articulation of what we want from self-driving cars. We have the Google vision, the Waymo vision of the benefits of self-driving cars, which is largely about public safety. But no consideration of what it would take to get that right. I think that’s going to look very different. I think to an extent Tempe is an easy case, because the roads in Arizona are extremely well organized. It’s sunny, pedestrians behave themselves. But what you’re not going to be able to do is take that technology and transport it to central London and expect it to do the same job.

So some understanding of desirable systems across different places is really important. That, I’m afraid, does mean sharing control between the innovators and the people who have responsibility for public safety, public transport and public space.

Andrew: Even though most people in this field and other similar fields are doing it for what they claim is for future benefits and the public good, there’s a huge gap between good intentions of doing the right thing and actually being able to achieve something positive for society. I think the danger is that good intentions go bad very fast if you don’t have the right processes and structures in place to translate them into something that benefits society. To do that, you’ve got to have partnerships and engagement with agencies and authorities that have oversight over these technologies, but also the communities and the people that are either going to be impacted by them or benefit by them.

Jack: I think that’s right. Just letting the benefits as stated by the innovators speak for themselves hasn’t worked in the past, and it won’t work here. We have to allow some sort of democratic discussion about that.

Ariel: I want to move forward in the future to more advanced technology, looking at more advanced artificial intelligence, even super intelligence. How do we address risks that are associated with that when a large number of researchers don’t even think this technology can be developed, or if it is developed, it’s still hundreds of years away? How do you address these really big unknowns and uncertainties?

Andrew: That’s a huge question. So I’m speaking here as something of a cynic of some of the projections of superintelligence. I think you’ve got to develop a balance between near and mid-term risks, but at the same time, work out how you take early action on trajectories so you’re less likely to see the emergence of those longer-term existential risks. One of the things that actually really concerns me here is if you become too focused on some of the highly speculative existential risks, you end up missing things which could be catastrophic in a smaller sense in the near to mid-term.

Pouring millions upon millions of dollars into solving a hypothetical problem around superintelligence and the threat to humanity sometime in the future, at the expense of looking at nearer-term things such as algorithmic bias, autonomous decision-making that cuts people out of the loop and a whole number of other things, is a risk balance that doesn’t make sense to me. Somehow, you’ve got to deal with these emerging issues, but in a way which is sophisticated enough that you’re not setting yourself up for problems in the future.

Jack: I think getting that balance right is crucial. I agree with your assessment that that balance is far too much, at the moment, in the direction of the speculative and long-term. One of the reasons why it is, is because that’s an extremely interesting set of engineering challenges. So I think the question would be on whose shoulders does the responsibility lie for acting once you recognize threats or risks like that? Typically, what you find when a community of scientists gathers to assess risks is that they frame the issue in ways that lead to scientific or technical solutions. It’s telling, I think, that in the discussion about superintelligence, the answer, either in the foreground or in the background, is normally more AI not less AI. And the answer is normally to be delivered by engineers rather than to be governed by politicians.

That said, I think there’s sort of cause for optimism if you look at the recent campaign around autonomous weapons. That would seem to be a clear recognition of a technologically mediated issue where the necessary action is not on the part of the innovators themselves but on all the people who are in control of our armed forces.

Andrew: I think you’re exactly right, Jack. I should clarify that even though there is a lot of discussion around speculative existential risks, there is also a lot of action on nearer-term issues such as the lethal autonomous weapons. But one of the things that I’ve been particularly struck with in conversations is the fear amongst technologists of losing control over the technology and the narrative. I’ve had conversations where people have said that they’re really worried about the potential down sides, the potential risks of where artificial intelligence is going. But they’re convinced that they can solve those problems without telling anybody else about them, and they’re scared that if they tell a broad public about those risks that they’ll be inhibited in doing the research and the development that they really want to do.

That really comes down to not wanting to relinquish control over technology. But I think that there has to be some relinquishment there if we’re going to have responsible development of these technologies that really focuses on how they could impact people both in the short as well as the long-term, and how as a society we find pathways forwards.

Ariel: Andrew, I’m really glad you brought that up. That’s one that I’m not convinced by, this idea that if we tell the public what the risks are, then suddenly the researchers won’t be able to do the research they want. Do you see that as a real risk for researchers?

Andrew: I think there is a risk there, but it’s rather complex. Most of the time, the public actually don’t care about these things. There are one or two examples; genetically modifying organisms is the one that always comes up. But that is a very unique and very distinct example. Most of the time, if you talk broadly about what’s happening with a new technology, people will say, that’s interesting, and get on with their lives. So there’s much less risk there about talking about it than I think people realize.

The other thing, though, is even if there is a risk of people saying “hold on a minute, we don’t like what’s happening here,” better to have that feedback sooner rather than later, because the reality is people are going to find out what’s happening. If they discover as a company or a research agency or a scientific group that you’ve been doing things that are dangerous and you haven’t been telling them about it, when they find out after the fact, people get mad. That’s where things get really messy.

[What’s also] interesting – you’ve got a whole group of people in the technology sphere who are very clearly trying to do what they think is the right thing. They’re not in it primarily for fame and money, but they’re in it because they believe that something has to change to build a beneficial future.

The challenge is, these technologists, if they don’t realize the messiness of working with people and society and they think just in terms of technological solutions, they’re going to hit roadblocks that they can’t get over. So this to me is why it’s really important that you’ve got to have the conversations. You’ve got to take the risk to talk about where things are going with the broader population. You’ve got to risk your vision having to be pulled back a little bit so it’s more successful in the long-term.

Ariel: I was hoping you could both touch on the impact of media as well and how that’s driving the discussion.

Jack: I think blaming the media is always the convenient thing to do. They’re the convenient target. I think the question is about actually the culture, which is extremely technologically utopian and which wants to believe that there are simple technological solutions to some of our most pressing problems. In that culture, it is understandable if seemingly seductive ideas, whether about artificial intelligence or about new transport systems, are taken. I would love there to be a more skeptical attitude so that when those sorts of claims are made, just as when any sort of political claim is made, they are scrutinized and become the starting point for a vigorous debate about the world in which we want to live. I think that is exactly what is missing from our current technological discourse.

Andrew: The media is a product of society. We are titillated by extreme, scary scenarios. The media is a medium through which that actually happens. I work a lot with journalists, and I’ve had very few experiences with being misrepresented or misquoted where it wasn’t my fault in the first place.

So I think we’ve got to think of two things when we think of media coverage. First of all, we’ve got to get smarter in how we actually communicate, and by we I mean the people that feel we’ve got something to say here. We’ve got to work out how to communicate in a way that makes sense with the journalists and the media that we’re communicating through. We’ve also got to realize that even though we might be outraged by a misrepresentation, that usually doesn’t get as much traction in society as we think it does. So we’ve got to be a little bit more laid back about how we see things reported.

Ariel: Is there anything else that you think is important to add?

Andrew: I would just sort of wrap things up. There has been a lot of agreement, but actually, and this is an important thing, it’s because most people, including people that are often portrayed as just being naysayers, are trying to ask difficult questions so we can actually build a better future through technology and through innovation in all its forms. I think it’s really important to realize that just because somebody asks difficult questions doesn’t mean they’re trying to stop progress, but they’re trying to make sure that that progress is better for everybody.

Jack: Hear, hear.

Help Support FLI This Giving Tuesday

We’ve accomplished a lot. FLI has only been around for a few years, but during that time, we’ve:

  • Helped mainstream AI safety research,
  • Funded 37 AI safety research grants,
  • Launched multiple open letters that have brought scientists and the public together for the common cause of a beneficial future,
  • Drafted the 23 Asilomar Principles which offer guidelines for ensuring that AI is developed beneficially for all,
  • Supported the successful efforts by the International Campaign to Abolish Nuclear Weapons (ICAN) to get a UN treaty passed that bans and stigmatizes nuclear weapons (ICAN won this year’s Nobel Peace Prize for their work),
  • Supported efforts to advance negotiations toward a ban on lethal autonomous weapons with a video that’s been viewed over 30 million times,
  • Launched a website that’s received nearly 3 million page views,
  • Broadened the conversation about how humanity can flourish rather than flounder with powerful technologies.

But that’s just the beginning. There’s so much more we’d like to do, but we need your help. On Giving Tuesday this year, please consider a donation to FLI.

Where would your money go?

  • More AI safety research,
  • More high-quality information and communication about AI safety,
  • More efforts to keep the future safe from lethal autonomous weapons,
  • More efforts to trim excess nuclear stockpiles & reduce nuclear war risk,
  • More efforts to guarantee a future we can all look forward to.

Please Consider a Donation to Support FLI

AI Researchers Create Video to Call for Autonomous Weapons Ban at UN

In response to growing concerns about autonomous weapons, a coalition of AI researchers and advocacy organizations released a fictitious video on Monday that depicts a disturbing future in which lethal autonomous weapons have become cheap and ubiquitous.

The video was launched in Geneva, where AI researcher Stuart Russell presented it at an event at the United Nations Convention on Conventional Weapons hosted by the Campaign to Stop Killer Robots.

Russell, in an appearance at the end of the video, warns that the technology described in the film already exists and that the window to act is closing fast.

Support for a ban has been mounting. Just this past week, over 200 Canadian scientists and over 100 Australian scientists in academia and industry penned open letters to Prime Ministers Justin Trudeau and Malcolm Turnbull urging them to support the ban. Earlier this summer, over 130 leaders of AI companies signed a letter in support of this week’s discussions. These letters follow a 2015 open letter released by the Future of Life Institute and signed by more than 20,000 AI/Robotics researchers and others, including Elon Musk and Stephen Hawking.

These letters indicate both grave concern and a sense that the opportunity to curtail lethal autonomous weapons is running out.

Noel Sharkey of the International Committee for Robot Arms Control explains, “The Campaign to Stop Killer Robots is not trying to stifle innovation in artificial intelligence and robotics and it does not wish to ban autonomous systems in the civilian or military world. Rather we see an urgent need to prevent automation of the critical functions for selecting targets and applying violent force without human deliberation and to ensure meaningful human control for every attack.”

Drone technology today is very close to having fully autonomous capabilities. And many of the world’s leading AI researchers worry that if these autonomous weapons are ever developed, they could dramatically lower the threshold for armed conflict, ease and cheapen the taking of human life, empower terrorists, and create global instability. The US and other nations have used drones and semi-automated systems to carry out attacks for several years now, but fully removing a human from the loop is at odds with international humanitarian and human rights law.

A ban can exert great power on the trajectory of technological development without needing to stop every instance of misuse. Max Tegmark, MIT Professor and co-founder of the Future of Life Institute, points out, “People’s knee-jerk reaction that bans can’t help isn’t historically accurate: the bioweapon ban created such a powerful stigma that, despite treaty cheating, we have almost no bioterror attacks today and almost all biotech funding is civilian.”

As Toby Walsh, an AI professor at the University of New South Wales, argues: “The academic community has sent a clear and consistent message. Autonomous weapons will be weapons of terror, the perfect tool for those who have no qualms about the terrible uses to which they are put. We need to act now before this future arrives.”

More than 70 countries are participating in the meeting taking place November 13 – 17 organized by the 2016 Fifth Review Conference at the UN, which established a Group of Governmental Experts on lethal autonomous weapons. The meeting is chaired by Ambassador Amandeep Singh Gill of India, and the countries will continue negotiations of what could become an historic international treaty.

Developing Ethical Priorities for Neurotechnologies and AI

Private companies and military sectors have moved beyond the goal of merely understanding the brain to that of augmenting and manipulating brain function. In particular, companies such as Elon Musk’s Neuralink and Bryan Johnson’s Kernel are hoping to harness advances in computing and artificial intelligence alongside neuroscience to provide new ways to merge our brains with computers.

Musk also sees this as a means to help address both AI safety and human relevance as algorithms outperform humans in one area after another. He has previously stated, “Some high bandwidth interface to the brain will be something that helps achieve a symbiosis between human and machine intelligence and maybe solves the control problem and the usefulness problem.”

In a comment in Nature, 27 people from The Morningside Group outlined four ethical priorities for the emerging space of neurotechnologies and artificial intelligence. The authors include neuroscientists, ethicists and AI engineers from Google, top US and global universities, and several non-profit research organizations such as AI Now and The Hastings Center.

A Newsweek article describes their concern: “Artificial intelligence could hijack brain-computer interfaces and take control of our minds.” While this is not exactly the warning the Group describes, they do suggest we are in store for some drastic changes:

…we are on a path to a world in which it will be possible to decode people’s mental processes and directly manipulate the brain mechanisms underlying their intentions, emotions and decisions; where individuals could communicate with others simply by thinking; and where powerful computational systems linked directly to people’s brains aid their interactions with the world such that their mental and physical abilities are greatly enhanced.

The authors suggest that although these advances could provide meaningful and beneficial enhancements to the human experience, they could also exacerbate social inequalities, enable more invasive forms of social manipulation, and threaten core fundamentals of what it means to be human. They encourage readers to consider the ramifications of these emerging technologies now.

Referencing the Asilomar AI Principles and other ethical guidelines as a starting point, they call for a new set of guidelines that specifically address concerns that will emerge as groups like Elon Musk’s startup Neuralink and other companies around the world explore ways to improve the interface between brains and machines. Their recommendations cover four key areas: privacy and consent; agency and identity; augmentation; and bias.

Regarding privacy and consent, they posit that the right to keep neural data private is critical. To this end, they recommend opt-in policies, strict regulation of commercial entities, and the use of blockchain-based techniques to provide transparent control over the use of data. In relation to agency and identity, they recommend that bodily and mental integrity, as well as the ability to choose our actions, be enshrined in international treaties such as the Universal Declaration of Human Rights.

In the area of augmentation, the authors discuss the possibility of an augmentation arms race of soldiers in the pursuit of so-called “super-soldiers” that are more resilient to combat conditions. They recommend that the use of neural technology for military purposes be stringently regulated. And finally, they recommend the exploration of countermeasures, as well as diversity in the design process, in order to prevent widespread bias in machine learning applications.

The ways in which AI will increasingly connect with our bodies and brains pose challenging safety and ethical concerns that will require input from a vast array of people. As Dr. Rafael Yuste of Columbia University, a neuroscientist who co-authored the essay, told STAT, “the ethical thinking has been insufficient. Science is advancing to the point where suddenly you can do things you never would have thought possible.”

MIRI’s November 2017 Newsletter

Eliezer Yudkowsky has written a new book on civilizational dysfunction and outperformance: Inadequate Equilibria: Where and How Civilizations Get Stuck. The full book will be available in print and electronic formats November 16. To preorder the ebook or sign up for updates, visit equilibriabook.com.

We’re posting the full contents online in stages over the next two weeks. The first two chapters are:

  1. Inadequacy and Modesty (discussion: LessWrong, EA Forum, Hacker News)
  2. An Equilibrium of No Free Energy (discussion: LessWrong, EA Forum)

Podcast: AI Ethics, the Trolley Problem, and a Twitter Ghost Story with Joshua Greene and Iyad Rahwan

As technically challenging as it may be to develop safe and beneficial AI, this challenge also raises some thorny questions regarding ethics and morality, which are just as important to address before AI is too advanced. How do we teach machines to be moral when people can’t even agree on what moral behavior is? And how do we help people deal with and benefit from the tremendous disruptive change that we anticipate from AI?

To help consider these questions, Joshua Greene and Iyad Rahwan kindly agreed to join the podcast. Josh is a professor of psychology and member of the Center for Brain Science Faculty at Harvard University, where his lab has used behavioral and neuroscientific methods to study moral judgment, focusing on the interplay between emotion and reason in moral dilemmas. He’s the author of Moral Tribes: Emotion, Reason and the Gap Between Us and Them. Iyad is the AT&T Career Development Professor and an associate professor of Media Arts and Sciences at the MIT Media Lab, where he leads the Scalable Cooperation group. He created the Moral Machine, which is “a platform for gathering human perspective on moral decisions made by machine intelligence.”

In this episode, we discuss the trolley problem with autonomous cars, how automation will affect rural areas more than cities, how we can address potential inequality issues AI may bring about, and a new way to write ghost stories.

This transcript has been heavily edited for brevity. You can read the full conversation here.

Ariel: How do we anticipate that AI and automation will impact society in the next few years?

Iyad: AI has the potential to extract better value from the data we’re collecting from all the gadgets, devices and sensors around us. We could use this data to make better decisions, whether it’s micro-decisions in an autonomous car that takes us from A to B safer and faster, or whether it’s medical decision-making that enables us to diagnose diseases better, or whether it’s even scientific discovery, allowing us to do science more effectively, efficiently and more intelligently.

Joshua: Artificial intelligence also has the capacity to displace human value. Take the example of using artificial intelligence to diagnose disease. On the one hand, it’s wonderful if you have a system that has taken in all of the medical knowledge we have in a way that no human could and uses it to make better decisions. But at the same time, that also means that lots of doctors might be out of a job or have a lot less to do. This is the double-edged sword of artificial intelligence: the value it creates and the human value that it displaces.

Ariel: Can you explain what the trolley problem is and how does that connect to this question of what do autonomous vehicles do in situations where there is no good option?

Joshua: One of the original versions of the trolley problem goes like this (we’ll call it “the switch case”): A trolley is headed towards five people and if you don’t do anything, they’re going to be killed, but you can hit a switch that will turn the trolley away from the five and onto a side track. However on that side track, there’s one unsuspecting person and if you do that, that person will be killed.

The question is: is it okay to hit the switch to save those five people’s lives but at the cost of one life? In this case, most people tend to say yes. Then we can vary it a little bit. In “the footbridge case,” the situation is different as follows: the trolley is now headed towards five people on a single track, over that track is a footbridge and on that footbridge is a large person wearing a very large backpack. You’re also on the bridge and the only way that you can save those five people from being hit by the trolley is to push that big person off of the footbridge and onto the tracks below.

Assume that it will work. Do you think it’s okay to push the guy off the footbridge in order to save five lives? Here, most people say no, and so we have this interesting paradox. In both cases, you’re trading one life for five, yet in one case it seems like it’s the right thing to do, in the other case it seems like it’s the wrong thing to do.

One of the classic objections to these dilemmas is that they’re unrealistic. My view is that the point is not that they’re realistic, but instead that they function like high contrast stimuli. If you’re a vision researcher and you’re using flashing black and white checkerboards to study the visual system, you’re not using that because that’s a typical thing that you look at, you’re using it because it’s something that drives the visual system in a way that reveals its structure and dispositions.

In the same way, these high contrast, extreme moral dilemmas can be useful to sharpen our understanding of the more ordinary processes that we bring to moral thinking.

Iyad: The trolley problem can translate in a cartoonish way to a scenario in which an autonomous car is faced with only two options. The car is going at the speed limit on a street and due to mechanical failure is unable to stop and is going to hit a group of five pedestrians. The car can swerve and hit a bystander. Should the car swerve or should it just plow through the five pedestrians?

This has a structure similar to the trolley problem because you’re making similar tradeoffs between one and five people and the decision is not being taken on the spot, it’s actually happening at the time of the programming of the car.

There is another complication in which the person being sacrificed to save the greater number of people is the person in the car. Suppose the car can swerve to avoid the five pedestrians but as a result falls off a cliff. That adds another complication, especially since programmers are going to have to appeal to customers. If customers don’t feel safe in those cars because of some hypothetical situation that may take place in which they’re sacrificed, that pits the financial incentives against the potentially socially desirable outcome, which can create problems.

A question that raises itself is: Is it going to ever happen? How many times do we face these kinds of situations as we drive today? So the argument goes: these situations are going to be so rare that they are irrelevant, and autonomous cars promise to be so much safer than the human-driven cars we have today that the benefits significantly outweigh the costs.

There is obviously truth to this argument, if you take the trolley problem scenario literally. But what the autonomous car version of the trolley problem is doing, is it’s abstracting the tradeoffs that are taking place every microsecond, even now.

Imagine you’re driving on the road and there is a large truck in the lane to your left, and as a result you choose to stick a little bit further to the right, just to minimize risk in case the truck drifts out of its lane. Now suppose that there could be a cyclist later on the right hand side. What you’re effectively doing in this small maneuver is slightly reducing risk to yourself but slightly increasing risk to the cyclist. These sorts of decisions are being made millions and millions of times every day.

Ariel: Applying the trolley problem to self-driving cars seems to be forcing the vehicle and thus the programmer of the vehicle to make a judgment call about whose life is more valuable. Can we not come up with some other parameters that don’t say that one person’s life is more valuable than someone else’s?

Joshua: I don’t think that there’s any way to avoid doing that. If you’re a driver, there’s no way to avoid answering the question, how cautious or how aggressive am I going to be. You can decline to answer the question explicitly; you can say I don’t want to think about that, I just want to drive and see what happens. But you are going to be implicitly answering that question through your behavior, and in the same way, autonomous vehicles can’t avoid the question. Whether people are designing the machines, training the machines, or explicitly programming them to behave in certain ways, they are going to do things that affect the outcome.

The cars will constantly be making decisions that inevitably involve value judgments of some kind.

Ariel: To what extent have we actually asked customers what it is that they want from the car? In a completely ethical world, I would like the car to protect the person who’s more vulnerable, who would be the cyclist. In practice, I have a bad feeling I’d probably protect myself.

Iyad: We could say we want to treat everyone equally. On the one hand, you have this self-protective instinct, which presumably, as a consumer, is what you want to buy for yourself and your family. On the other hand, you also care for vulnerable people. Different reasonable and moral people can disagree on what the more important factors and considerations should be, and I think this is precisely why we have to think about this problem explicitly, rather than leave it purely to – whether it’s programmers or car companies or any particular single group of people – to decide.

Joshua: When we think about problems like this, we have a tendency to binarize it, but it’s not a binary choice between protecting that person or not. It’s really going to be matters of degree. Imagine there’s a cyclist in front of you going at cyclist speed and you either have to wait behind this person for another five minutes creeping along much slower than you would ordinarily go, or you have to swerve into the other lane where there’s oncoming traffic at various distances. Very few people might say I will sit behind this cyclist for 10 minutes before I would go into the other lane and risk damage to myself or another car. But very few people would just blow by the cyclist in a way that really puts that person’s life in peril.

It’s a very hard question to answer because the answers don’t come in the form of something that you can write out in a sentence like, “give priority to the cyclist.” You have to say exactly how much priority in contrast to the other factors that will be in play for this decision. And that’s what makes this problem so interesting and also devilishly hard to think about.
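One way to picture what “exactly how much priority” might look like in practice is a weighted cost over candidate maneuvers. The sketch below is only an illustration: the `Maneuver` fields, the weights, and every probability are hypothetical assumptions made up for this example, not figures from the conversation or from any real vehicle.

```python
from dataclasses import dataclass


@dataclass
class Maneuver:
    name: str
    risk_to_cyclist: float   # assumed probability of harming the cyclist
    risk_to_occupant: float  # assumed probability of harming the car's occupant
    delay_minutes: float     # extra travel time caused by the maneuver


def cost(m, cyclist_weight, occupant_weight=1.0, delay_weight=0.001):
    """Weighted sum of the competing factors; lower is better."""
    return (cyclist_weight * m.risk_to_cyclist
            + occupant_weight * m.risk_to_occupant
            + delay_weight * m.delay_minutes)


def choose(maneuvers, cyclist_weight):
    """Pick the maneuver with the lowest weighted cost."""
    return min(maneuvers, key=lambda m: cost(m, cyclist_weight))


# Hypothetical options for the cyclist scenario (all numbers invented).
OPTIONS = [
    Maneuver("wait behind cyclist", 0.0001, 0.0001, 5.0),
    Maneuver("pass with wide berth", 0.0005, 0.0020, 0.5),
    Maneuver("pass closely", 0.0100, 0.0002, 0.2),
]

for weight in (0.1, 1.0, 10.0):
    print(weight, choose(OPTIONS, weight).name)
```

With these invented numbers, changing only the weight on the cyclist’s risk changes which maneuver is selected, which is the sense in which the value judgment sits in the numbers rather than in a sentence like “give priority to the cyclist.”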

Ariel: Why do you think this is something that we have to deal with when we’re programming something in advance and not something that we as a society should be addressing when it’s people driving?

Iyad: We very much value the convenience of getting from A to B. Our lifetime odds of dying from a car accident are more than 1%, yet somehow, we’ve decided to put up with this because of the convenience. As long as people don’t run through a red light or drive drunk, you don’t really blame them for fatal accidents, we just call them accidents.

But now, thanks to autonomous vehicles that can make decisions, reevaluate situations hundreds or thousands of times per second, adjust their plan and so on, we potentially have the luxury to make those decisions a bit better, and I think this is why things are different now.

Joshua: With the human we can say, “Look, you’re driving, you’re responsible, and if you make a mistake and hurt somebody, you’re going to be in trouble and you’re going to pay the cost.” You can’t say that to a car, even a car that’s very smart by 2017 standards. The car isn’t going to be incentivized to behave better – the motivation has to be explicitly trained or programmed in.

Iyad: Economists say you can incentivize the people who make the cars to program them appropriately by fining them and engineering the product liability law in such a way that would hold them accountable and responsible for damages, and this may be the way in which we implement this feedback loop. But I think the question remains what should the standards be against which we hold those cars accountable.

Joshua: Let’s say somebody says, “Okay, I make self-driving cars and I want to make them safe because I know I’m accountable.” They still have to program or train the car. So there’s no avoiding that step, whether it’s done through traditional legalistic incentives or other kinds of incentives.

Ariel: I want to ask about some other research you both do. Iyad you look at how AI and automation impact us and whether that could be influenced by whether we live in smaller towns or larger cities. Can you talk about that?

Iyad: Clearly there are areas that may potentially benefit from AI because it improves productivity and it may lead to greater wealth, but it can also lead to labor displacement. It could cause unemployment if people aren’t able to retool and improve their skills so that they can work with these new AI tools and find employment opportunities.

Should we expect to experience this to a greater or lesser degree in smaller versus bigger cities? On one hand there are lots of creative jobs in big cities and, because creativity is so hard to automate, it should make big cities more resilient to these shocks. On the other hand if you go back to Adam Smith and the idea of the division of labor, the whole idea is that individuals become really good at one thing. And this is precisely what spurred urbanization in the first industrial revolution. Even though the system is collectively more productive, individuals may be more automatable in terms of their narrowly-defined tasks.

But when we did the analysis, we found that indeed larger cities are more resilient in relative terms. The preliminary findings are that in bigger cities there is more production that requires social interaction and very advanced skills like scientific and engineering skills. People are better able to complement the machines because they have technical knowledge, so they’re able to use new intelligent tools that are becoming available, but they also work in larger teams on more complex products and services.

Ariel: Josh, you’ve done a lot of work with the idea of “us versus them.” And especially as we’re looking in this country and others at the political situation where it’s increasingly polarized along this line of city versus smaller town, do you anticipate some of what Iyad is talking about making the situation worse?

Joshua: I certainly think we should be prepared for the possibility that it will make the situation worse. The central idea is that as technology advances, you can produce more and more value with less and less human input, although the human input that you need is more and more highly skilled.

If you look at something like TurboTax, before, you had lots and lots of accountants, and many of those accountants are being replaced by a smaller number of programmers and super-expert accountants and people on the business side of these enterprises. If that continues, then yes, you have more and more wealth being concentrated in the hands of the people whose high skill levels complement the technology and there is less and less for people with lower skill levels to do. Not everybody agrees with that argument, but I think it’s one that we ignore at our peril.

Ariel: Do you anticipate that AI itself would become a “them,” or do you think it would be people working with AI versus people who don’t have access to AI?

Joshua: The idea of the AI itself becoming the “them,” I am agnostic as to whether or not that could happen eventually, but this would involve advances in artificial intelligence beyond anything we understand right now. Whereas the problem that we were talking about earlier – humans being divided into a technological, educated, and highly-paid elite as one group and then the larger group of people who are not doing as well financially – that “us-them” divide, you don’t need to look into the future, you can see it right now.

Iyad: I don’t think that the robot will be the “them” on their own, but I think the machines and the people who are very good at using the machines to their advantage, whether it’s economic or otherwise, will collectively be a “them.” It’s the people who are extremely tech savvy, who are using those machines to be more productive or to win wars and things like that. There would be some sort of evolutionary race between human-machine collectives.

Joshua: I think it’s possible that people who are technologically enhanced could have a competitive advantage and set off an economic arms race or perhaps even literal arms race of a kind that we haven’t seen. I hesitate to say, “Oh, that’s definitely going to happen.” I’m just saying it’s a possibility that makes a certain kind of sense.

Ariel: Do either of you have ideas on how we can continue to advance AI and address these divisive issues?

Iyad: There are two new tools at our disposal: experimentation and machine-augmented regulation.

Today, [there are] cars with a bull bar in front of them. These metallic bars at the front of the car increase safety for the passenger in the case of collision, but they have disproportionate impact on other cars, on pedestrians and cyclists, and they’re much more likely to kill them in the case of an accident. By making this comparison and identifying that cars with bull bars are worse for certain groups, many countries decided the tradeoff was not acceptable and banned them, for example the UK, Australia, and many European countries.

If there were a similar tradeoff being caused by a software feature, we wouldn’t know unless we allowed for experimentation as well as monitoring – if we looked at the data to identify whether a particular algorithm is making for very safe cars for customers, but at the expense of a particular group.

In some cases, these systems are going to be so sophisticated and the data is going to be so abundant that we won’t be able to observe them and regulate them in time. Think of algorithmic trading programs. No human being is able to observe these things fast enough to intervene, but you could potentially insert another algorithm, a regulatory algorithm or an oversight algorithm, that will observe other AI systems in real time on our behalf, to make sure that they behave.
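The idea of an oversight algorithm that watches another algorithm in real time can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `OversightMonitor` class, its limits, and the notion of vetting each order by size and rate are all hypothetical, chosen only to show the shape of the idea, and do not describe any real trading or regulatory system.

```python
from collections import deque


class OversightMonitor:
    """Hypothetical watchdog that vets each action another algorithm proposes."""

    def __init__(self, max_order_size, max_orders_per_window, window_seconds):
        self.max_order_size = max_order_size
        self.max_orders_per_window = max_orders_per_window
        self.window_seconds = window_seconds
        self.recent = deque()  # timestamps of recently approved orders

    def allow(self, order_size, timestamp):
        """Return True if the order may proceed, False if it is blocked."""
        # Forget approvals that have fallen outside the sliding time window.
        while self.recent and timestamp - self.recent[0] >= self.window_seconds:
            self.recent.popleft()
        # Block any single order that is too large.
        if order_size > self.max_order_size:
            return False
        # Block orders arriving faster than the allowed rate.
        if len(self.recent) >= self.max_orders_per_window:
            return False
        self.recent.append(timestamp)
        return True


# Example: at most 2 orders per second, each no larger than 100 units.
monitor = OversightMonitor(max_order_size=100,
                           max_orders_per_window=2,
                           window_seconds=1.0)
decisions = [monitor.allow(50, 0.0), monitor.allow(50, 0.1), monitor.allow(50, 0.2)]
print(decisions)
```

Because the check runs on every proposed action, it operates at machine speed rather than human speed. The harder questions, such as which limits to set and who sets them, are left to the people configuring it.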

Joshua: There are two general categories of strategies for making things go well. There are technical solutions to things and then there’s the broader social problem of having a system of governance that can be counted on to produce outcomes that are good for the public in general.

The thing that I’m most worried about is that if we don’t get our politics in order, especially in the United States, we’re not going to have a system in place that’s going to be able to put the public’s interest first. Ultimately, it’s going to come down to the quality of the government that we have in place, and quality means having a government that distributes benefits to people in what we would consider a fair way and takes care to make sure that things don’t go terribly wrong in unexpected ways and generally represents the interests of the people.

I think we should be working on both of these in parallel. We should be developing technical solutions to more localized problems where you need an AI solution to solve a problem created by AI. But I also think we have to get back to basics when it comes to the fundamental principles of our democracy and preserving them.

Ariel: As we move towards smarter and more ubiquitous AI, what worries you most and what are you most excited about?

Joshua: I’m pretty confident that a lot of labor is going to be displaced by artificial intelligence. I think it is going to be enormously politically and socially disruptive, and I think we need to plan now. With self-driving cars especially in the trucking industry, I think that’s going to be the first and most obvious place where millions of people are going to be out of work and it’s not going to be clear what’s going to replace it for them.

I’m excited about the possibility of AI producing value for people in a way that has not been possible before on a large scale. Imagine if anywhere in the world that’s connected to the Internet, you could get the best possible medical diagnosis for whatever is ailing you. That would be an incredible life-saving thing. And as AI teaching and learning systems get more sophisticated, I think it’s possible that people could actually get very high quality educations with minimal human involvement and that means that people all over the world could unlock their potential. And I think that that would be a wonderful transformative thing.

Iyad: I’m worried about the way in which AI and specifically autonomous weapons are going to alter the calculus of war. In order to aggress on another nation, you have to mobilize humans, you have to get political support from the electorate, you have to handle the very difficult process of bringing back people in coffins, and the impact that this has on electorates.

This creates a big check on power and it makes people think very hard about making these kinds of decisions. With AI, when you’re able to wage wars with very little loss of life, especially if you’re a very advanced nation that is at the forefront of this technology, then you have disproportionate power. It’s kind of like a nuclear weapon, but maybe more because it’s much more customizable. It’s not all or nothing – you could start all sorts of wars everywhere.

I think it’s going to be a very interesting shift in the way superpowers think about wars and I worry that this might make them trigger happy. I think a new social contract needs to be written so that this power is kept in check and that there’s more thought that goes into this.

On the other hand, I’m very excited about the abundance that will be created by AI technologies. We’re going to optimize the use of our resources in many ways. In health, in transportation, in energy consumption and so on, there are so many examples in recent years in which AI systems have discovered optimizations that even the smartest humans hadn’t been able to find.

Ariel: One final thought: This podcast is going live on Halloween, so I want to end on a spooky note. And quite conveniently, Iyad’s group has created Shelley, which is a Twitter chatbot that will help you craft scary ghost stories. Shelley is, of course, a nod to Mary Shelley who wrote Frankenstein, which is the most famous horror story about technology. Iyad, I was hoping you could tell us a bit about how Shelley works.

Iyad: Yes, well this is our second attempt at doing something spooky for Halloween. Last year we launched the Nightmare Machine, which was using deep neural networks and style transfer algorithms to take ordinary photos and convert them into haunted houses and zombie-infested places. And that was quite interesting; it was a lot of fun. More recently, we’ve launched Shelley, which people can visit on shelley.ai, and it is named after Mary Shelley who authored Frankenstein.

This is a neural network that generates text and it’s been trained on a very large data set of over 100,000 short horror stories from a subreddit called No Sleep. And so it’s basically got a lot of human knowledge about what makes things spooky and scary, and the nice thing is that it generates part of the story and people can tweet back at it a continuation of the story and then basically take turns with the AI to craft stories. And we feature those stories on the website afterwards. If I’m correct, this is the first collaborative human-AI horror writing exercise ever.

Tokyo AI & Society Symposium

I just spent a week in Japan to speak at the inaugural symposium on AI & Society – my first conference in Asia. It was inspiring to take part in an increasingly global conversation about AI impacts, and interesting to see how the Japanese AI community thinks about these issues. Overall, Japanese researchers seemed more open to discussing controversial topics like human-level AI and consciousness than their Western counterparts. Most people were more interested in near-term AI ethics concerns but also curious about long term problems.

The talks were a mix of English and Japanese with translation available over audio (high quality but still hard to follow when the slides are in Japanese). Here are some tidbits from my favorite talks and sessions.

Danit Gal’s talk on China’s AI policy. She outlined China’s new policy report aiming to lead the world in AI by 2030, and discussed various advantages of collaboration over competition. It was encouraging to see that China’s AI goals include “establishing ethical norms, policies and regulations” and “forming robust AI safety and control mechanisms”. Danit called for international coordination to help ensure that everyone is following compatible concepts of safety and ethics.

Next breakthrough in AI panel (Yasuo Kuniyoshi from U Tokyo, Ryota Kanai from Araya and Marek Rosa from GoodAI). When asked about immediate research problems they wanted the field to focus on, the panelists highlighted intrinsic motivation, embodied cognition, and gradual learning. In the longer term, they encouraged researchers to focus on generalizable solutions and to not shy away from philosophical questions (like defining consciousness). I think this mindset is especially helpful for working on long-term AI safety research, and would be happy to see more of this perspective in the field.

Long-term talks and panel (Francesca Rossi from IBM, Hiroshi Nakagawa from U Tokyo and myself). I gave an overview of AI safety research problems in general and recent papers from my team. Hiroshi provocatively argued that a) AI-driven unemployment is inevitable, and b) we need to solve this problem using AI. Francesca talked about trustworthy AI systems and the value alignment problem. In the panel, we discussed whether long-term problems are a distraction from near-term problems (spoiler: no, both are important to work on), to what extent work on safety for current ML systems can carry over to more advanced systems (high-level insights are more likely to carry over than details), and other fun stuff.

Stephen Cave’s diagram of AI ethics issues. Helpfully color-coded by urgency.
[Image: Stephen Cave’s diagram of AI ethics issues]

Luba Elliott’s talk on AI art. Style transfer has outdone itself with a Google Maps Mona Lisa.
[Image: Google Maps Mona Lisa]

There were two main themes I noticed in the Western presentations. People kept pointing out that AlphaGo is not AGI because it’s not flexible enough to generalize to hexagonal grids and such (this was before AlphaGo Zero came out). Also, the trolley problem was repeatedly brought up as a default ethical question for AI (it would be good to diversify this discussion with some less overused examples).

The conference was very well-organized and a lot of fun. Thanks to the organizers for bringing it together, and to all the great people I got to meet!

[This post originally appeared on the Deep Safety blog. Thanks to Janos Kramar for his feedback.]

Understanding Artificial General Intelligence — An Interview With Hiroshi Yamakawa

Artificial general intelligence (AGI) is something of a holy grail for many artificial intelligence researchers. Today’s narrow AI systems are only capable of specific tasks — such as internet searches, driving a car, or playing a video game — but none of the systems today can do all of these tasks. A single AGI would be able to accomplish a breadth and variety of cognitive tasks similar to that of people.

How close are we to developing AGI? How can we ensure that the power of AGI will benefit the world, and not just the group who develops it first? Will AGI become an existential threat for humanity, or an existential hope?

Dr. Hiroshi Yamakawa, Director of Dwango AI Laboratory, is one of the leading AGI researchers in Japan. Members of the Future of Life Institute sat down with Dr. Yamakawa and spoke with him about AGI and his lab’s progress in developing it. In this interview, Dr. Yamakawa explains how AI can model the human brain, his vision of a future where humans coexist with AGI, and why the Japanese think of AI differently than many in the West.

This transcript has been heavily edited for brevity. You can see the full conversation here.

Why did the Dwango Artificial Intelligence Laboratory make a large investment in [AGI]?

HY: Usable AI that has been developed up to now is essentially for solving specific areas or addressing a particular problem. Rather than just solving a number of problems using experience, AGI, we believe, will be more similar to human intelligence that can solve various problems which were not assumed in the design phase.

What is the advantage of the Whole Brain Architecture approach?

HY: The whole brain architecture is an engineering-based research approach “to create a human-like artificial general intelligence (AGI) by learning from the architecture of the entire brain.” Basically, this approach to building AGI is the integration of artificial neural networks and machine-learning modules while using the brain’s hard wiring as a reference.

I think it will be easier to create an AI with the same behavior and sense of values as humans this way. Even if superintelligence exceeds human intelligence in the near future, it will be comparatively easy to communicate with AI designed to think like a human, and this will be useful as machines and humans continue to live and interact with each other.

General intelligence is a function of many combined, interconnected features produced by learning, so we cannot manually break down these features into individual parts. Because of this difficulty, one meaningful characteristic of whole brain architecture is that though based on brain architecture, it is designed to be a functional assembly of parts that can still be broken down and used.

The functional parts of the brain are to some degree already present in artificial neural networks. It follows that we can build a roadmap of AGI based on these technologies as pieces and parts.

It is now said that convolutional neural networks have essentially outperformed the system/interaction between the temporal lobe and visual cortex in terms of image recognition tasks. At the same time, deep learning has been used to achieve very accurate voice recognition. In humans, the neocortex contains about 14 billion neurons, but about half of those can be partially explained with deep learning. From this point on, we need to come closer to simulating the functions of different structures of the brain, and even without the whole brain architecture, we need to be able to assemble several structures together to reproduce some behavioral-level functions. Then, I believe, we’ll have a path to expand that development process to cover the rest of the brain functions, and finally integrate them as a whole brain.

You also started a non-profit, the Whole Brain Architecture Initiative. How does the non-profit’s role differ from the commercial work?

HY: The Whole Brain Architecture Initiative serves as an organization that helps promote whole brain AI architecture R&D as a whole.

The Basic Ideas of the WBAI:

  • Our vision is to create a world in which AI exists in harmony with humanity.
  • Our mission is to promote the open development of whole brain architecture.
    • In order to make human-friendly artificial general intelligence a public good for all of mankind, we seek to continually expand open, collaborative efforts to develop AI based on an architecture modeled after the brain.
  • Our values are Study, Imagine and Build.
    • Study: Deepen and spread our expertise.
    • Imagine: Broaden our views through public dialogue.
    • Build: Create AGI through open collaboration.

What do you think poses the greatest existential risk to global society in the 21st century?

HY: The risk is not just limited to AI; basically, as human scientific and technological abilities expand, and we become more empowered, risks will increase, too.

Imagine a large field where everyone only has weapons as dangerous as bamboo spears. The risk that human beings would go extinct by killing each other is extremely small. On the other hand, as technologies develop, we have bombs in a very small room and no matter who detonates the bomb, we approach a state of annihilation. That risk should concern everyone.

If there are only 10 people in the room, they will mutually monitor and trust each other. However, imagine trusting 10 billion people each with the ability to destroy everyone: such a scenario is beyond our ability to comprehend. Of course, technological development will advance not only offensive power but also defensive power, but it is not easy to have defensive power to contain attacking power at the same time. If scientific and technological development are promoted using artificial intelligence technology, for example, many countries will easily hold intercontinental ballistic fleets, and artificial intelligence can be extremely dangerous to living organisms by using nanotechnology. This could lead to a scenario in which mankind is extinguished through the development or use of dangerous substances. Generally speaking, new offensive weapons are developed utilizing the progress of technology, and defensive weapons are developed to neutralize them. Therefore, it is inevitable that periods will exist where the offensive power needed to destroy humanity exceeds its defensive power.

What do you think is the greatest benefit that AGI can bring society?

HY: AGI’s greatest benefit comes from acceleration of development for science and technology. More sophisticated technology will offer solutions for global problems such as environmental issues, food problems and space colonization.

Here I would like to share my vision for the future: “In a desirable future, the happiness of all humans will be balanced against the survival of humankind under the support of superintelligence. In that future, society will be an ecosystem formed by augmented human beings and various public AIs, in what I dub ‘an ecosystem of shared intelligent agents’ (EcSIA).

“Although no human can completely understand EcSIA—it is too complex and vast—humans can control its basic directions. In implementing such control, the grace and wealth that EcSIA affords needs to be properly distributed to everyone.”

Assuming no global catastrophe halts progress, what are the odds of human level AGI in the next 10 years?

HY: I think there’s a possibility that it can happen soon, but taking the average of the estimates of people involved in WBAI, we came up with 2030.

In my current role as the editorial chairman for the Japanese Society of Artificial Intelligence (JSAI) journal, I’m promoting a plan to have a series of discussions starting in the July edition on the theme of “Singularity and AI,” in which we’ll have AI specialists discuss the singularity from a technical viewpoint. I want to help spread calm, technical views on the issue in this way, starting in Japan.

Once human level AGI is achieved, how long would you expect it to take for it to self-modify its way up to massive superhuman intelligence?

HY: If human-level AGI is achieved, it could take on the role of an AI researcher itself. Therefore, immediately after the AGI is built, it could start rapidly cultivating great numbers of AI-researcher AIs that work 24/7, and AI R&D would be drastically accelerated.

What probability do you assign to negative consequences as a result of badly done AI design or operation?

HY: If you include the risk of something like some company losing a lot of money, that will definitely happen.

The range of things that can be done with AI is becoming wider, and the disparity will widen between those who profit from it and those who do not. When that happens, the bad economic situation will give rise to dissatisfaction with the system, and that could create a breeding ground for war and strife. This could be perceived as the evils brought about by capitalism. It’s important that we try to curtail the causes of instability as much as possible.

Is it too soon for us to be researching AI Safety?

HY: I do not think it is at all too early to act for safety, and I think we should progress forward quickly. If possible, we should have several methods to be able to calculate the existential risk brought about by AGI.

Is there anything you think that the AI research community should be more aware of, more open about, or taking more action on?

HY: There are a number of actions that are obviously necessary. Based on this notion, we have established a number of measures like the Japanese Society for Artificial Intelligence Ethics in May 2015 (http://ai-elsi.org/ [in Japanese]), and subsequent Ethical Guidelines for AI researchers (http://ai-elsi.org/archives/514).

A majority of the content of these ethical guidelines expresses the standpoint that researchers should move forward with research that contributes to humanity and society. Additionally, one special characteristic of these guidelines is that the ninth principle listed, a call for ethical compliance of AI itself, states that AI in the future should also abide by the same ethical principles as AI researchers.

Japan, as a society, seems more welcoming of automation. Do you think the Japanese view of AI is different than that in the West?

HY: If we look at things from the standpoint of a moral society, we are all human, and without even looking from the viewpoints of one country or another, in general we should start with the mentality that we have more common characteristics than different.

When looking at AI from the traditional background of Japan, there is a strong influence from beliefs that spirits or “kami” are dwelling in all things. The boundary between living things and humans is relatively unclear, and along the same lines, the same boundaries for AI and robots are unclear. For this reason, in the past, robotic characters like “Tetsuwan Atom” (Astro Boy) and Doraemon were depicted as living and existing in the same world as humans, a theme that has been pervasive in Japanese anime for a long time.

From here on out, we will see humans and AI not as separate entities. Rather I think we will see the appearance of new combinations of AI and humans. Becoming more diverse in this way will certainly improve our chances of survival.

As a very personal view, I think that “surviving intelligence” is something that should be preserved in the future because I feel that it is very fortunate that we have established an intelligent society now, beyond the stormy sea of evolution. Imagine a future in which our humanity is living with intelligent extraterrestrials after first contact. We will start caring about the survival of not only humanity but also intelligent extraterrestrials. If that happens, one future scenario is that our dominant values will be extended to the survival of intelligence rather than the survival of the human race itself.

Hiroshi Yamakawa is the Director of Dwango AI Laboratory, Director and Chief Editor of the Japanese Society for Artificial Intelligence, a Fellow Researcher at the Brain Science Institute at Tamagawa University, and the Chairperson of the Whole Brain Architecture Initiative. He specializes in cognitive architecture, concept acquisition, neuro-computing, and opinion collection. He is one of the leading researchers working on AGI in Japan.

To learn more about Dr. Yamakawa’s work, you can read the full interview transcript here.

This interview was prepared by Eric Gastfriend, Jason Orlosky, Mamiko Matsumoto, Benjamin Peterson, Kazue Evans, and Tucker Davey. Original interview date: April 5, 2017. 

DeepMind’s AlphaGo Zero Becomes Go Champion Without Human Input

DeepMind’s AlphaGo Zero AI program just became the Go champion of the world without human data or guidance. This new system marks a significant technological jump from the AlphaGo program which beat Go champion Lee Sedol in 2016.

The game of Go has been played for more than 2,500 years and is widely viewed as not only a game, but a complex art form. And a popular one at that. When the artificially intelligent AlphaGo from DeepMind played its first game against Lee Sedol in March 2016, 60 million viewers tuned in to watch in China alone. AlphaGo went on to win four of five games, surprising the world and signifying a major achievement in AI research.

Unlike the chess match between Deep Blue and Garry Kasparov in 1997, AlphaGo did not win by brute-force computing alone. The more complex programming of AlphaGo amazed viewers not only with the excellence of its play, but also with its creativity. The famous “move 37” in game two was described by Go player Fan Hui as “So beautiful.” It was also so unusual that one of the commentators thought it was a mistake. Fan Hui explained, “It’s not a human move. I’ve never seen a human play this move.”

In other words, AlphaGo not only signified an iconic technological achievement, but also shook deeply held social and cultural beliefs about mastery and creativity. Yet, it turns out that AlphaGo was only the beginning. Today, DeepMind announced AlphaGo Zero.

Unlike AlphaGo, AlphaGo Zero was not shown a single human game of Go from which to learn. AlphaGo Zero learned entirely from playing against itself, with no prior knowledge of the game. Although its first games were random, the system used what DeepMind is calling a novel form of reinforcement learning to combine a neural network with a powerful search algorithm to improve each time it played.
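
To make the idea of learning purely from self-play concrete, here is a minimal sketch in Python. It is not DeepMind's method: AlphaGo Zero pairs a deep neural network with a search algorithm, whereas this toy swaps in a plain lookup table and a Q-learning-style update on a tiny Nim-like game (take 1 to 3 stones from a pile; whoever takes the last stone wins). The game, the function names, and every parameter value are assumptions chosen for illustration only.

```python
import random


def legal_moves(stones, max_take=3):
    """Moves available in the toy game: take 1..max_take stones."""
    return range(1, min(max_take, stones) + 1)


def train_self_play(max_stones=10, episodes=20000, alpha=0.5, epsilon=0.2, seed=0):
    """Learn move values from self-play only; no human games are used.

    q[(stones, action)] estimates the outcome for the player to move:
    +1 for a forced win, -1 for a forced loss.
    """
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(1, max_stones + 1) for a in legal_moves(s)}
    for _ in range(episodes):
        stones = rng.randint(1, max_stones)  # random start covers all positions
        while stones > 0:
            moves = list(legal_moves(stones))
            if rng.random() < epsilon:
                action = rng.choice(moves)  # occasional exploratory move
            else:
                action = max(moves, key=lambda a: q[(stones, a)])
            remaining = stones - action
            if remaining == 0:
                target = 1.0  # taking the last stone wins
            else:
                # The same table plays the opponent; its best outcome is our loss.
                target = -max(q[(remaining, a)] for a in legal_moves(remaining))
            q[(stones, action)] += alpha * (target - q[(stones, action)])
            stones = remaining  # the other side moves next, using the same table
    return q


def best_move(q, stones):
    """Greedy move under the learned values."""
    return max(legal_moves(stones), key=lambda a: q[(stones, a)])
```

Calling `q = train_self_play()` and then `best_move(q, 10)` should recover the known strategy for this game, which is to leave the opponent a multiple of four stones. As in the description above, the first games are essentially random, and play improves only because each game updates the values used in the next one.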

In a DeepMind blog post about the announcement, the authors write, “This technique is more powerful than previous versions of AlphaGo because it is no longer constrained by the limits of human knowledge. Instead, it is able to learn tabula rasa from the strongest player in the world: AlphaGo itself.”

Though previous AIs from DeepMind have mastered Atari games without human input, as the authors of the Nature article note, “the game of Go, widely viewed as the grand challenge for artificial intelligence, [requires] a precise and sophisticated lookahead in vast search spaces.” While the old Atari games were much more straightforward, the new AI system for AlphaGo Zero had to master the strategy for immediate moves, as well as how to anticipate moves that might be played far into the future.

That all of this was done without human demonstrations also takes the program a step beyond the original AlphaGo systems. But in addition to that, this new system learned with fewer input features than its predecessors, and while the original AlphaGo systems required two separate neural networks, AlphaGo Zero was built with only one.

AlphaGo Zero is not just marginally better than its predecessor; it is in an entirely new class of “superhuman performance” with an intelligence that is notably more general. After just three days of playing against itself (4.9 million times), AlphaGo Zero beat AlphaGo by 100 games to 0. It independently learned the ancient secrets of the masters, but also chose moves and developed strategies never before seen among human players.

Co-founder and CEO of DeepMind, Demis Hassabis, said: “It’s amazing to see just how far AlphaGo has come in only two years. AlphaGo Zero is now the strongest version of our program and shows how much progress we can make even with less computing power and zero use of human data.”

Hassabis continued, “Ultimately we want to harness algorithmic breakthroughs like this to help solve all sorts of pressing real world problems like protein folding or designing new materials. If we can make the same progress on these problems that we have with AlphaGo, it has the potential to drive forward human understanding and positively impact all of our lives.”