Cybersecurity and Machine Learning

When it comes to cybersecurity, no nation can afford to slack off. If a nation’s defense systems cannot anticipate how an attacker will try to fool them, then an especially clever attack could expose military secrets or use disguised malware to cause major networks to crash.

A nation’s defense systems must keep up with the constant threat of attack, but this is a difficult and never-ending process. It seems that the defense is always playing catch-up.

Ben Rubinstein, a professor at the University of Melbourne in Australia, asks: “Wouldn’t it be good if we knew what the malware writers are going to do next, and to know what type of malware is likely to get through the filters?”

In other words, what if defense systems could learn to anticipate how attackers will try to fool them?

Adversarial Machine Learning

In order to address this question, Rubinstein studies how to prepare machine-learning systems to catch adversarial attacks. In the game of national cybersecurity, these adversaries are often individual hackers or governments who want to trick machine-learning systems for profit or political gain.

Nations have become increasingly dependent on machine-learning systems to protect against such adversaries. Unaided by humans, machine-learning systems in anti-malware and facial recognition software have the ability to learn and improve their function as they encounter new data. As they learn, they become better at catching adversarial attacks.

Machine-learning systems are generally good at catching adversaries, but they are not completely immune to threats, and adversaries are constantly looking for new ways to fool them. Rubinstein says, “Machine learning works well if you give it data like it’s seen before, but if you give it data that it’s never seen before, there’s no guarantee that it’s going to work.”

With adversarial machine learning, security agencies address this weakness by presenting the system with different types of malicious data to test the system’s filters. The system then digests this new information and learns how to identify and capture malware from clever attackers.

Security Evaluation of Machine-Learning Systems

Rubinstein’s project is called “Security Evaluation of Machine-Learning Systems”, and his ultimate goal is to develop a software tool that companies and government agencies can use to test their defenses. Any company or agency that uses machine-learning systems could run his software against their system. Rubinstein’s tool would attack and try to fool the system in order to expose the system’s vulnerabilities. In doing so, his tool anticipates how an attacker could slip by the system’s defenses.

The software would evaluate existing machine-learning systems and find weak spots that adversaries might try to exploit – much as one might inspect a castle’s defenses.

“We’re not giving you a new castle,” Rubinstein says, “we’re just going to walk around the perimeter and look for holes in the walls and weak parts of the castle, or see where the moat is too shallow.”

By analyzing many different machine-learning systems, his software program will pick up on trends and be able to advise security agencies to either use a different system or bolster the security of their existing system. In this sense, his program acts as a consultant for every machine-learning system.

Consider a program that does facial recognition. This program would use machine learning to identify faces and catch adversaries who try to pass as someone else.

Rubinstein explains: “Our software aims to automate this security evaluation so that it takes an image of a person and a program that does facial recognition, and it will tell you how to change its appearance so that it will evade detection or change the outcome of machine learning in some way.”

This is called a mimicry attack – when an adversary makes one instance (one face) look like another, and thereby fools a system.
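To make the idea concrete, here is a minimal sketch of a mimicry attack against a toy classifier. It is an illustration under stated assumptions, not Rubinstein’s software: the nearest-centroid classifier, the two-dimensional “face embeddings”, the names alice and bob, and the step size are all hypothetical choices made for this example.

```python
import math

def centroid(points):
    """Mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def classify(x, centroids):
    """Return the label whose centroid is closest to x (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(x, centroids[label]))

def mimicry_attack(x, target_label, centroids, step=0.05, max_steps=100):
    """Nudge x toward the target centroid until the classifier is fooled.

    Returns (modified vector, steps taken), or (None, max_steps) if the
    attack did not succeed within the step budget.
    """
    current = list(x)
    target = centroids[target_label]
    for n_steps in range(max_steps + 1):
        if classify(current, centroids) == target_label:
            return current, n_steps
        # Move a small fraction of the remaining distance toward the target.
        current = [c + step * (t - c) for c, t in zip(current, target)]
    return None, max_steps

# Hypothetical 2-D "face embeddings" for two enrolled people.
centroids = {
    "alice": centroid([[0.0, 0.0], [0.2, 0.1], [0.1, 0.2]]),
    "bob": centroid([[1.0, 1.0], [0.9, 1.1], [1.1, 0.9]]),
}
probe = [0.1, 0.1]  # starts out recognised as "alice"
adversarial, steps = mimicry_attack(probe, "bob", centroids)
print(classify(probe, centroids), "->", classify(adversarial, centroids), "after", steps, "steps")
```

The attack stops as soon as the label flips, so the modified input is only as far from the original as it needs to be. A defender could feed such inputs back into training, which is the retraining loop the article describes.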

To make this example easier to visualize, Rubinstein’s group built a program that demonstrates how to change a face’s appearance to fool a machine-learning system into thinking that it is another face.

In the image below, the two faces don’t look alike, but the left image has been modified so that the machine-learning system thinks it is the same as the image on the right. This example provides insight into how adversaries can fool machine-learning systems by exploiting quirks.

[Image: the two faces from the facial-recognition demonstration by Rubinstein’s group]

When Rubinstein’s software fools a system with a mimicry attack, security personnel can then take that information and retrain their program to establish more effective security when the stakes are higher.

While Rubinstein’s software will help to secure machine-learning systems against adversarial attacks, he has no illusions about the natural advantages that attackers enjoy. It will always be easier to attack a castle than to defend it, and the same holds true for a machine-learning system. This is called the ‘asymmetry of cyberwarfare.’

“The attacker can come in from any angle. It only needs to succeed at one point, but the defender needs to succeed at all points,” says Rubinstein.

In general, Rubinstein worries that the tools available to test machine-learning systems are theoretical in nature, and put too much responsibility on the security personnel to understand the complex math involved. A researcher might redo the mathematical analysis for every new learning system, but security personnel are unlikely to have the time or resources to keep up.

Rubinstein aims to “bring what’s out there in theory and make it more applied and more practical and easy for anyone who’s using machine learning in a system to evaluate the security of their system.”

With his software, Rubinstein intends to help level the playing field between attackers and defenders. By giving security agencies better tools to test and adapt their machine-learning systems, he hopes to improve the ability of security personnel to anticipate and guard against cyberattacks.

This article is part of a Future of Life series on the AI safety research grants, which were funded by generous donations from Elon Musk and the Open Philanthropy Project.

Supervising AI Growth

When Apple released its software application, Siri, in 2011, iPhone users had high expectations for their intelligent personal assistants. Yet despite its impressive and growing capabilities, Siri often makes mistakes. The software’s imperfections highlight the clear limitations of current AI: today’s machine intelligence can’t understand the varied and changing needs and preferences of human life.

However, as artificial intelligence advances, some experts believe that intelligent machines will eventually – and probably soon – understand the world better than humans do. While it might be easy to understand how or why Siri makes a mistake, figuring out why a superintelligent AI made the decision it did will be much more challenging.

If humans cannot understand and evaluate these machines, how will they control them?

Paul Christiano, a Ph.D. student in computer science at UC Berkeley, has been working on addressing this problem. He believes that to ensure safe and beneficial AI, researchers and operators must learn to measure how well intelligent machines do what humans want, even as these machines surpass human intelligence.

Semi-supervised Learning

The most obvious way to supervise the development of an AI system also happens to be the hard way. As Christiano explains: “One way humans can communicate what they want, is by spending a lot of time digging down on some small decision that was made [by an AI], and try to evaluate how good that decision was.”

But while this is theoretically possible, the human researchers would never have the time or resources to evaluate every decision the AI made. “If you want to make a good evaluation, you could spend several hours analyzing a decision that the machine made in one second,” says Christiano.

For example, suppose an amateur chess player wants to understand a better chess player’s previous move. Merely spending a few minutes evaluating this move won’t be enough, but if she spends a few hours she could consider every alternative and develop a meaningful understanding of the better player’s moves.

Fortunately for researchers, they don’t need to evaluate every decision an AI makes in order to be confident in its behavior. Instead, researchers can choose “the machine’s most interesting and informative decisions, where getting feedback would most reduce our uncertainty,” Christiano explains.

“Say your phone pinged you about a calendar event while you were on a phone call,” he elaborates, “That event is not analogous to anything else it has done before, so it’s not sure whether it is good or bad.” Due to this uncertainty, the phone would send the transcript of its decisions to an evaluator at Google, for example. The evaluator would study the transcript, ask the phone owner how he felt about the ping, and determine whether pinging users during phone calls is a desirable or undesirable action. By providing this feedback, Google teaches the phone when it should interrupt users in the future.
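One simple way to pick the “most informative” decisions is uncertainty sampling: send a human evaluator only the decisions the system is least sure about. The sketch below assumes the system can attach a probability that each decision was good; the decision names, the probabilities, and the review budget are hypothetical, and this is not a description of Christiano’s actual method.

```python
def uncertainty(p_good):
    """Score how unsure the model is: 1.0 at p=0.5, 0.0 at p=0 or p=1."""
    return 1.0 - abs(p_good - 0.5) * 2.0

def select_for_review(decisions, budget):
    """Pick the `budget` decisions the model is least sure about."""
    ranked = sorted(decisions, key=lambda d: uncertainty(d["p_good"]), reverse=True)
    return [d["name"] for d in ranked[:budget]]

# Hypothetical decisions with the model's own estimate that each was good.
decisions = [
    {"name": "silence alarm at night", "p_good": 0.97},
    {"name": "ping during phone call", "p_good": 0.52},
    {"name": "auto-reply to unknown sender", "p_good": 0.35},
    {"name": "dim screen on low battery", "p_good": 0.99},
]
to_review = select_for_review(decisions, budget=2)
print(to_review)
```

With a budget of two, the evaluator’s time goes to the phone-call ping and the auto-reply, while the decisions the system is already confident about are skipped.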

This active learning process is an efficient method for humans to train AIs, but what happens when humans need to evaluate AIs that exceed human intelligence?

Consider a computer that is mastering chess. How could a human give appropriate feedback to the computer if the human has not mastered chess? The human might criticize a move that the computer makes, only to realize later that the machine was correct.

With increasingly intelligent phones and computers, a similar problem is bound to occur. Eventually, Christiano explains, “we need to handle the case where AI systems surpass human performance at basically everything.”

If a phone knows much more about the world than its human evaluators, then the evaluators cannot trust their human judgment. They will need to “enlist the help of more AI systems,” Christiano explains.

Using AIs to Evaluate Smarter AIs

When a phone pings a user while he is on a call, the user’s reaction to this decision is crucial in determining whether the phone will interrupt users during future phone calls. But, as Christiano argues, “if a more advanced machine is much better than human users at understanding the consequences of interruptions, then it might be a bad idea to just ask the human ‘should the phone have interrupted you right then?’” The human might express annoyance at the interruption, but the machine might know better and understand that this annoyance was necessary to keep the user’s life running smoothly.

In these situations, Christiano proposes that human evaluators use other intelligent machines to do the grunt work of evaluating an AI’s decisions. In practice, a less capable System 1 would be in charge of evaluating the more capable System 2. Even though System 2 is smarter, System 1 can process a large amount of information quickly, and can understand how System 2 should revise its behavior. The human trainers would still provide input and oversee the process, but their role would be limited.

This training process would help Google understand how to create a safer and more intelligent AI – System 3 – which the human researchers could then train using System 2.

Christiano explains that these intelligent machines would be like little agents that carry out tasks for humans. Siri already has this limited ability to take human input and figure out what the human wants, but as AI technology advances, machines will learn to carry out complex tasks that humans cannot fully understand.

Can We Ensure that an AI Holds Human Values?

As Google and other tech companies continue to improve their intelligent machines with each evaluation, the human trainers will fulfill a smaller role. Eventually, Christiano explains, “it’s effectively just one machine evaluating another machine’s behavior.”

Ideally, “each time you build a more powerful machine, it effectively models human values and does what humans would like,” says Christiano. But he worries that these machines may stray from human values as they surpass human intelligence. To put this in human terms: a complex intelligent machine would resemble a large organization of humans. If the organization does tasks that are too complex for any individual human to understand, it may pursue goals that humans wouldn’t like.

In order to address these control issues, Christiano is working on an “end-to-end description of this machine learning process, fleshing out key technical problems that seem most relevant.” His research will help bolster the understanding of how humans can use AI systems to evaluate the behavior of more advanced AI systems. If his work succeeds, it will be a significant step in building trustworthy artificial intelligence.

How Can AI Learn to Be Safe?

As artificial intelligence improves, machines will soon be equipped with intellectual and practical capabilities that surpass those of the smartest humans. But not only will machines be more capable than people, they will also be able to make themselves better. That is, these machines will understand their own design and how to improve it – or they could create entirely new machines that are even more capable.

The human creators of AIs must be able to trust these machines to remain safe and beneficial even as they self-improve and adapt to the real world.

Recursive Self-Improvement

This idea of an autonomous agent making increasingly better modifications to its own code is called recursive self-improvement. Through recursive self-improvement, a machine can adapt to new circumstances and learn how to deal with new situations.

To a certain extent, the human brain does this as well. As a person develops and repeats new habits, connections in their brain can change. The connections grow stronger and more effective over time, making the new, desired action easier to perform (e.g. changing one’s diet or learning a new language). In machines, though, this ability to self-improve is much more drastic.

An AI agent can process information much faster than a human, and if it does not properly understand how its actions impact people, then its self-modifications could quickly fall out of line with human values.

For Bas Steunebrink, a researcher at the Swiss AI lab IDSIA, solving this problem is a crucial step toward achieving safe and beneficial AI.

Building AI in a Complex World

Because the world is so complex, many researchers begin AI projects by developing AI in carefully controlled environments. Then they create mathematical proofs that can assure them that the AI will achieve success in this specified space.

But Steunebrink worries that this approach puts too much responsibility on the designers and too much faith in the proof, especially when dealing with machines that can learn through recursive self-improvement. He explains, “We cannot accurately describe the environment in all its complexity; we cannot foresee what environments the agent will find itself in in the future; and an agent will not have enough resources (energy, time, inputs) to do the optimal thing.”

If the machine encounters an unforeseen circumstance, then that proof the designer relied on in the controlled environment may not apply. Says Steunebrink, “We have no assurance about the safe behavior of the [AI].”

Experience-based Artificial Intelligence

Instead, Steunebrink uses an approach called EXPAI (experience-based artificial intelligence). EXPAI are “self-improving systems that make tentative, additive, reversible, very fine-grained modifications, without prior self-reasoning; instead, self-modifications are tested over time against experiential evidences and slowly phased in when vindicated, or dismissed when falsified.”
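The flavor of “tentative, reversible, fine-grained” modification can be sketched with a toy trial-and-error loop. This is only an illustration under strong simplifying assumptions, not Steunebrink’s EXPAI architecture: the agent here is a single number, the “experience” is a short list of hypothetical target values, and a change is kept only when it scores better against that experience.

```python
import random

def evaluate(param, experiences):
    """Average score of a parameter against observed experience (higher is better)."""
    return sum(-(param - target) ** 2 for target in experiences) / len(experiences)

def grow(param, experiences, rounds=200, step=0.05, seed=0):
    """Try small changes one at a time; keep each only if experience vindicates it."""
    rng = random.Random(seed)
    accepted = 0
    for _ in range(rounds):
        candidate = param + rng.choice([-step, step])  # tentative, fine-grained change
        if evaluate(candidate, experiences) > evaluate(param, experiences):
            param = candidate  # vindicated by the evidence: phase it in
            accepted += 1
        # Otherwise the candidate is simply discarded, so the change is reversible.
    return param, accepted

experiences = [0.9, 1.0, 1.1]  # hypothetical observations the agent should match
final_param, accepted = grow(0.0, experiences)
print(round(final_param, 2), accepted)
```

No proof of optimality is involved. Each modification is judged only by how it fares against experience, which is the contrast with the proof-based approach described above.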

Instead of trusting only a mathematical proof, researchers can ensure that the AI develops safe and benevolent behaviors by teaching and testing the machine in complex, unforeseen environments that challenge its function and goals.

With EXPAI, AI machines will learn from interactive experience, and therefore monitoring their growth period is crucial. As Steunebrink posits, the focus shifts from asking, “What is the behavior of an agent that is very intelligent and capable of self-modification, and how do we control it?” to asking, “How do we grow an agent from baby beginnings such that it gains both robust understanding and proper values?”

Consider how children grow and learn to navigate the world independently. If provided with a stable and healthy childhood, children learn to adopt values and understand their relation to the external world through trial and error, and by examples. Childhood is a time of growth and learning, of making mistakes, of building on success – all to help prepare the child to grow into a competent adult who can navigate unforeseen circumstances.

Steunebrink believes that researchers can ensure safe AI through a similar, gradual process of experience-based learning. In an architectural blueprint developed by Steunebrink and his colleagues, the AI is constructed “starting from only a small amount of designer-specific code – a seed.” Like a child, the beginnings of the machine will be less competent and less intelligent, but it will self-improve over time, as it learns from teachers and real-world experience.

As Steunebrink’s approach focuses on the growth period of an autonomous agent, the teachers, not the programmers, are most responsible for creating a robust and benevolent AI. Meanwhile, the developmental stage gives researchers time to observe and correct an AI’s behavior in a controlled setting where the stakes are still low.

The Future of EXPAI

Steunebrink and his colleagues are currently creating what he describes as a “pedagogy to determine what kind of things to teach to agents and in what order, how to test what the agents understand from being taught, and, depending on the results of such tests, decide whether we can proceed to the next steps of teaching or whether we should reteach the agent or go back to the drawing board.”

A major issue Steunebrink faces is that his method of experience-based learning diverges from the most popular methods for improving AI. Instead of doing the intellectual work of crafting a proof-backed optimal learning algorithm on a computer, EXPAI requires extensive in-person work with the machine to teach it like a child.

Creating safe artificial intelligence might prove to be more a process of teaching and growth than a function of creating the perfect mathematical proof. While such a shift in responsibility may be more time-consuming, it could also help establish a far more comprehensive understanding of an AI before it is released into the real world.

Steunebrink explains, “A lot of work remains to move beyond the agent implementation level, towards developing the teaching and testing methodologies that enable us to grow an agent’s understanding of ethical values, and to ensure that the agent is compelled to protect and adhere to them.”

The process is daunting, he admits, “but it is not as daunting as the consequences of getting AI safety wrong.”

If you would like to learn more about Bas Steunebrink’s research, visit http://people.idsia.ch/~steunebrink/. He is also the co-founder of NNAISENSE, which you can learn about at https://nnaisense.com/.

Training Artificial Intelligence to Compromise

Imagine you’re sitting in a self-driving car that’s about to make a left turn into oncoming traffic. One small system in the car will be responsible for making the vehicle turn, one system might speed it up or hit the brakes, other systems will have sensors that detect obstacles, and yet another system may be in communication with other vehicles on the road. Each system has its own goals (starting or stopping, turning or traveling straight, recognizing potential problems, and so on), but they all have to work together toward one common goal: turning into traffic without causing an accident.

Harvard professor and FLI researcher David Parkes is trying to solve just this type of problem. Parkes told FLI, “The particular question I’m asking is: If we have a system of AIs, how can we construct rewards for individual AIs, such that the combined system is well behaved?”

Essentially, an AI within a system of AIs, like that in the car example above, needs to learn how to meet its own objective, as well as how to compromise so that its actions will help satisfy the group objective. On top of that, the system of AIs needs to consider the preferences of society. The safety of the passenger in the car or a pedestrian in the crosswalk is a higher priority than turning left.

Training a well-behaved AI

Because environments like a busy street are so complicated, an engineer can’t just program an AI to act in some way to always achieve its objectives. AIs need to learn proper behavior based on a rewards system. “Each AI has a reward for its action and the action of the other AI,” Parkes explained. With the world constantly changing, the rewards have to evolve, and the AIs need to keep up not only with how their own goals change, but also with the evolving objectives of the system as a whole.

The idea of a rewards-based learning system is something most people can likely relate to. Who doesn’t remember the excitement of a gold star or a smiley face on a test? And any dog owner has experienced how much more likely their pet is to perform a trick when it realizes it will get a treat. A reward for an AI is similar.

A technique often used in designing artificial intelligence is reinforcement learning. With reinforcement learning, when the AI takes some action, it receives either positive or negative feedback. And it then tries to optimize its actions to receive more positive rewards. However, the reward can’t just be programmed into the AI. The AI has to interact with its environment to learn which actions will be considered good, bad or neutral. Again, the idea is similar to a dog learning that tricks can earn it treats or praise, but misbehaving could result in punishment.
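The dog analogy can be turned into a very small reinforcement-learning sketch. It assumes a hypothetical environment with three actions and fixed rewards that the agent cannot see in advance; the agent uses a simple epsilon-greedy rule, mostly repeating the action it currently rates best and occasionally trying a random one. The action names, reward values, and parameters are illustrative only.

```python
import random

# Hypothetical environment: the agent never reads this table directly,
# it only observes the reward after taking an action.
REWARDS = {"chew shoe": -1.0, "bark": 0.0, "sit": 1.0}

def train(steps=500, epsilon=0.2, seed=0):
    """Learn a value estimate for each action from feedback alone."""
    rng = random.Random(seed)
    actions = list(REWARDS)
    q = {a: 0.0 for a in actions}       # estimated value of each action
    counts = {a: 0 for a in actions}    # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(actions)                  # explore
        else:
            action = max(actions, key=lambda a: q[a])     # exploit current best
        reward = REWARDS[action]                          # feedback from the environment
        counts[action] += 1
        q[action] += (reward - q[action]) / counts[action]  # running average
    return q

q_values = train()
best_action = max(q_values, key=q_values.get)
print(best_action, q_values)
```

Nothing tells the agent up front that sitting is the rewarded trick. It discovers this by acting and observing the feedback, which is the point the paragraph above makes about rewards not being programmed in directly.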

More than this, Parkes wants to understand how to distribute rewards to subcomponents – the individual AIs – in order to achieve good system-wide behavior. How often should there be positive (or negative) reinforcement, and in reaction to which types of actions?

For example, if you were to play a video game without any points or lives or levels or other indicators of success or failure, you might run around the world killing or fighting aliens and monsters, and you might eventually beat the game, but you wouldn’t know which specific actions led you to win. Instead, games are designed to provide regular feedback and reinforcement so that you know when you make progress and what steps you need to take next. To train an AI, Parkes has to determine which smaller actions will merit feedback so that the AI can move toward a larger, overarching goal.

Rather than programming a reward specifically into the AI, Parkes shapes the way rewards flow from the environment to the AI in order to promote desirable behaviors as the AI interacts with the world around it.

But this is all for just one AI. How do these techniques apply to two or more AIs?

Training a system of AIs

Much of Parkes’ work involves game theory. Game theory helps researchers understand what types of rewards will elicit collaboration among otherwise self-interested players, or in this case, rational AIs. Once an AI figures out how to maximize its own reward, what will entice it to act in accordance with another AI? To answer this question, Parkes turns to an economic theory called mechanism design.

Mechanism design theory is a Nobel Prize-winning theory that allows researchers to determine how a system with multiple parts can achieve an overarching goal. It is a kind of “inverse game theory.” How can rules of interaction – ways to distribute rewards, for instance – be designed so individual AIs will act in favor of system-wide and societal preferences? Among other things, mechanism design theory has been applied to problems in auctions, e-commerce, regulations, environmental policy, and now, artificial intelligence.
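A textbook example of mechanism design is the second-price (Vickrey) auction: the highest bidder wins but pays the second-highest bid, and under that rule no bidder can do better than bidding their true value. The sketch below is offered as a familiar illustration of designing rules so that self-interested agents behave well, not as Parkes’ own method. The agent names and values are hypothetical.

```python
def second_price_auction(bids):
    """Highest bidder wins and pays the second-highest bid."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price

def utility(agent, true_value, bids):
    """What the agent gains: its value minus the price if it wins, else nothing."""
    winner, price = second_price_auction(bids)
    return true_value - price if winner == agent else 0.0

# Hypothetical agents: "a" truly values the item at 8.0; the others bid 6.0 and 4.0.
others = {"b": 6.0, "c": 4.0}
true_value = 8.0
truthful = utility("a", true_value, {**others, "a": true_value})

# Try bidding below and above the true value; none of these should beat honesty.
deviations = [utility("a", true_value, {**others, "a": bid})
              for bid in (3.0, 5.0, 7.0, 9.0, 20.0)]
print(truthful, deviations)
```

Because the price a winner pays does not depend on their own bid, shading or inflating the bid can only lose the item or change nothing. The rule itself aligns individual incentives with an honest outcome.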

The difference between Parkes’ work with AIs and mechanism design theory is that the latter requires some sort of mechanism or manager overseeing the entire system. In the case of an automated car or a drone, the AIs within have to work together to achieve group goals, without a mechanism making final decisions. As the environment changes, the external rewards will change. And as the AIs within the system realize they want to make some sort of change to maximize their rewards, they’ll have to communicate with each other, shifting the goals for the entire autonomous system.

Parkes summarized his work for FLI, saying, “The work that I’m doing as part of the FLI grant program is all about aligning incentives so that when autonomous AIs decide how to act, they act in a way that’s not only good for the AI system, but also good for society more broadly.”

Parkes is also involved with the One Hundred Year Study on Artificial Intelligence, and he explained that his “research with FLI has informed a broader perspective on thinking about the role that AI can play in an urban context in the near future.” As he considers the future, he asks, “What can we see, for example, from the early trajectory of research and development on autonomous vehicles and robots in the home, about where the hard problems will be in regard to the engineering of value-aligned systems?”

The Evolution of AI: Can Morality be Programmed?

The following article was originally posted on Futurism.com.

Recent advances in artificial intelligence have made it clear that our computers need to have a moral code. Disagree? Consider this: A car is driving down the road when a child on a bicycle suddenly swerves in front of it. Does the car swerve into an oncoming lane, hitting another car that is already there? Does the car swerve off the road and hit a tree? Does it continue forward and hit the child?

Each solution comes with a problem: It could result in death.

It’s an unfortunate scenario, but humans face such scenarios every day, and if an autonomous car is the one in control, it needs to be able to make this choice. And that means that we need to figure out how to program morality into our computers.

Vincent Conitzer, a Professor of Computer Science at Duke University, recently received a grant from the Future of Life Institute in order to try and figure out just how we can make an advanced AI that is able to make moral judgments…and act on them.

MAKING MORALITY

At first glance, the goal seems simple enough: make an AI that behaves in a way that is ethically responsible. However, it’s far more complicated than it initially seems, as a remarkable number of factors come into play. As Conitzer’s project outlines, “moral judgments are affected by rights (such as privacy), roles (such as in families), past actions (such as promises), motives and intentions, and other morally relevant features. These diverse factors have not yet been built into AI systems.”

That’s what Conitzer’s team is trying to do now.

In a recent interview with Futurism, Conitzer clarified that, while the public may be concerned about ensuring that rogue AI don’t decide to wipe out humanity, such a thing really isn’t a viable threat at the present time (and it won’t be for a long, long time). As a result, his team isn’t concerned with preventing a global robotic apocalypse by making selfless AI that adore humanity. Rather, on a much more basic level, they are focused on ensuring that our artificial intelligence systems are able to make the hard, moral choices that humans make on a daily basis.

So, how do you make an AI that is able to make a difficult moral decision?

Conitzer explains that, to reach their goal, the team is following a two-part process: having people make ethical choices in order to find patterns, and then figuring out how those patterns can be translated into an artificial intelligence. He clarifies, “what we’re working on right now is actually having people make ethical decisions, or state what decision they would make in a given situation, and then we use machine learning to try to identify what the general pattern is and determine the extent that we could reproduce those kind of decisions.”
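As a rough sketch of what “using machine learning to identify the general pattern” could look like, the example below trains a perceptron on a handful of made-up survey answers and then checks how many of those judgments it reproduces. Everything here is an assumption for illustration: the three features (harm, broken promise, consent), the labels, and the choice of a perceptron are hypothetical and are not taken from Conitzer’s project.

```python
# Each row: (causes_harm, breaks_promise, consent_given) -> 1 if respondents
# judged the action acceptable, 0 otherwise. All values are hypothetical.
SURVEY = [
    ((0, 0, 0), 1), ((0, 0, 1), 1), ((1, 0, 0), 0), ((0, 1, 0), 0),
    ((1, 1, 0), 0), ((1, 0, 1), 1), ((0, 1, 1), 1), ((1, 1, 1), 0),
]

def predict(weights, bias, features):
    """Predict 1 (acceptable) when the weighted sum is positive, else 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0

def train_perceptron(rows, epochs=100):
    """Classic perceptron rule: adjust weights only when a prediction is wrong."""
    weights, bias = [0.0, 0.0, 0.0], 0.0
    for _ in range(epochs):
        mistakes = 0
        for features, label in rows:
            error = label - predict(weights, bias, features)
            if error != 0:
                weights = [w + error * x for w, x in zip(weights, features)]
                bias += error
                mistakes += 1
        if mistakes == 0:  # every survey answer is reproduced; stop early
            break
    return weights, bias

weights, bias = train_perceptron(SURVEY)
reproduced = sum(predict(weights, bias, f) == y for f, y in SURVEY) / len(SURVEY)
print(weights, bias, reproduced)
```

The fraction of judgments reproduced is one crude measure of “the extent that we could reproduce those kind of decisions.” Real moral judgments would involve far richer features and much more disagreement among respondents.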

In short, the team is trying to find the patterns in our moral choices and translate these patterns into AI systems. Conitzer notes that, on a basic level, it’s all about making predictions regarding what a human would do in a given situation: “If we can become very good at predicting what kind of decisions people make in these kind of ethical circumstances, well then, we could make those decisions ourselves in the form of the computer program.”

However, one major problem with this is, of course, that morality is not objective — it’s neither timeless nor universal.

Conitzer articulates the problem by looking to previous decades: “If we did the same ethical tests a hundred years ago, the decisions that we would get from people would be much more racist, sexist, and all kinds of other things that we wouldn’t see as ‘good’ now. Similarly, right now, maybe our moral development hasn’t come to its apex, and a hundred years from now people might feel that some of the things we do right now, like how we treat animals, is completely immoral. So there’s kind of a risk of bias and with getting stuck at whatever our current level of moral development is.”

And of course, there is the aforementioned problem regarding how complex morality is. “Pure altruism, that’s very easy to address in game theory, but maybe you feel like you owe me something based on previous actions. That’s missing from the game theory literature, and so that’s something that we’re also thinking about a lot—how can you make this, what game theory calls ‘Solutions Concept’—sensible? How can you compute these things?”

To solve these problems, and to help figure out exactly how morality functions and can (hopefully) be programmed into an AI, the team is combining the methods from computer science, philosophy, and psychology. “That’s, in a nutshell, what our project is about,” Conitzer asserts.

But what about those sentient AI? When will we need to start worrying about them and discussing how they should be regulated?

THE HUMAN-LIKE AI

According to Conitzer, human-like artificial intelligence won’t be around for some time yet (so yay! No Terminator-style apocalypse…at least for the next few years).

“Recently, there have been a number of steps towards such a system, and I think there have been a lot of surprising advances….but I think having something like a ‘true AI,’ one that’s really as flexible, able to abstract, and do all these things that humans do so easily, I think we’re still quite far away from that,” Conitzer asserts.

True, we can program systems to do a lot of things that humans do well, but there are some things that are exceedingly complex and hard to translate into a pattern that computers can recognize and learn from (which is ultimately the basis of much of modern AI).

“What came out of early AI research, the first couple decades of AI research, was the fact that certain things that we had thought of as being real benchmarks for intelligence, like being able to play chess well, were actually quite accessible to computers. It was not easy to write and create a chess-playing program, but it was doable.”

Indeed, today we have computers that are able to beat the best players in the world in a host of games, chess and Go among them.

But Conitzer clarifies that, as it turns out, playing games isn’t exactly a good measure of human-like intelligence. Or at least, there is a lot more to the human mind. “Meanwhile, we learned that other problems that were very simple for people were actually quite hard for computers, or to program computers to do. For example, recognizing your grandmother in a crowd. You could do that quite easily, but it’s actually very difficult to program a computer to recognize things that well.”

Since the early days of AI research, we have made computers that are able to recognize and identify specific images. However, to sum up the main point, it is remarkably difficult to program a system that is able to do all of the things that humans can do, which is why it will be some time before we have a ‘true AI.’

Yet Conitzer asserts that now is the time to start considering the rules we will use to govern such intelligences. “It may be quite a bit further out, but to computer scientists, that means maybe just on the order of decades, and it definitely makes sense to try to think about these things a little bit ahead.” And he notes that, even though we don’t have any human-like robots just yet, our intelligence systems are already making moral choices and could, potentially, save or end lives.

“Very often, many of these decisions that they make do impact people and we may need to make decisions that we will typically be considered to be a morally loaded decision. And a standard example is a self-driving car that has to decide to either go straight and crash into the car ahead of it or veer off and maybe hurt some pedestrian. How do you make those trade-offs? And that I think is something we can really make some progress on. This doesn’t require superintelligent AI, this can just be programs that make these kind of trade-offs in various ways.”

But of course, knowing what decision to make will first require knowing exactly how our morality operates (or at least having a fairly good idea). From there, we can begin to program it, and that’s what Conitzer and his team are hoping to do.

So welcome to the dawn of moral robots.

This interview has been edited for brevity and clarity. 

This article is part of a Future of Life series on the AI safety research grants, which were funded by generous donations from Elon Musk and the Open Philanthropy Project.

New International Grants Program Jump-Starts Research to Ensure AI Remains Beneficial

Elon-Musk-backed program signals growing interest in new branch of artificial intelligence research

July 1, 2015
Amid rapid industry investment in developing smarter artificial intelligence, a new branch of research has begun to take off aimed at ensuring that society can reap the benefits of AI while avoiding potential pitfalls.

The Boston-based Future of Life Institute (FLI) announced the selection of 37 research teams around the world to which it plans to award about $7 million from Elon Musk and the Open Philanthropy Project as part of a first-of-its-kind grant program dedicated to “keeping AI robust and beneficial”. The program launches as an increasing number of high-profile figures including Bill Gates, Elon Musk and Stephen Hawking voice concerns about the possibility of powerful AI systems having unintended, or even potentially disastrous, consequences. The winning teams, chosen from nearly 300 applicants worldwide, will research a host of questions in computer science, law, policy, economics, and other fields relevant to coming advances in AI.

The 37 projects being funded include:

  • Three projects developing techniques for AI systems to learn what humans prefer from observing our behavior, including projects at UC Berkeley and Oxford University
  • A project by Benja Fallenstein at the Machine Intelligence Research Institute on how to keep the interests of superintelligent systems aligned with human values
  • A project led by Manuela Veloso from Carnegie Mellon University on making AI systems explain their decisions to humans
  • A study by Michael Webb of Stanford University on how to keep the economic impacts of AI beneficial
  • A project headed by Heather Roff studying how to keep AI-driven weapons under “meaningful human control”
  • A new Oxford-Cambridge research center for studying AI-relevant policy

As Skype founder Jaan Tallinn, one of FLI’s founders, has described this new research direction, “Building advanced AI is like launching a rocket. The first challenge is to maximize acceleration, but once it starts picking up speed, you also need to focus on steering.”

When the Future of Life Institute issued an open letter in January calling for research on how to keep AI both robust and beneficial, it was signed by a long list of AI researchers from academia, nonprofits and industry, including AI research leaders from Facebook, IBM, and Microsoft and the founders of Google’s DeepMind Technologies. It was seeing that widespread agreement that moved Elon Musk to seed the research program that has now begun.

“Here are all these leading AI researchers saying that AI safety is important”, said Musk at the time. “I agree with them, so I’m today committing $10M to support research aimed at keeping AI beneficial for humanity.”

“I am glad to have an opportunity to carry this research focused on increasing the transparency of AI robotic systems,” said Manuela Veloso, past president of the Association for the Advancement of Artificial Intelligence (AAAI) and winner of one of the grants.

“This grant program was much needed: because of its emphasis on safe AI and multidisciplinarity, it fills a gap in the overall scenario of international funding programs,” added Prof. Francesca Rossi, president of the International Joint Conference on Artificial Intelligence (IJCAI), also a grant awardee.

Tom Dietterich, president of the AAAI, described how his grant — a project studying methods for AI learning systems to self-diagnose when failing to cope with a new situation — breaks the mold of traditional research:

“In its early days, AI research focused on the ‘known knowns’ by working on problems such as chess and blocks world planning, where everything about the world was known exactly. Starting in the 1980s, AI research began studying the ‘known unknowns’ by using probability distributions to represent and quantify the likelihood of alternative possible worlds. The FLI grant will launch work on the ‘unknown unknowns’: How can an AI system behave carefully and conservatively in a world populated by unknown unknowns — aspects that the designers of the AI system have not anticipated at all?”

As Terminator Genisys debuts this week, organizers stressed the importance of separating fact from fiction. “The danger with the Terminator scenario isn’t that it will happen, but that it distracts from the real issues posed by future AI”, said FLI president Max Tegmark. “We’re staying focused, and the 37 teams supported by today’s grants should help solve such real issues.”

The full list of research grant winners can be found here. The plan is to fund these teams for up to three years, with most of the research projects starting by September 2015, and to focus the remaining $4M of the Musk-backed program on the areas that emerge as most promising.

FLI has a mission to catalyze and support research and initiatives for safeguarding life and developing optimistic visions of the future, including positive ways for humanity to steer its own course considering new technologies and challenges.

Contacts at the Future of Life Institute:

  • Max Tegmark: max@futureoflife.org
  • Meia Chita-Tegmark: meia@futureoflife.org
  • Jaan Tallinn: jaan@futureoflife.org
  • Anthony Aguirre: anthony@futureoflife.org
  • Viktoriya Krakovna: vika@futureoflife.org
  • Jesse Galef: jesse@futureoflife.org

 

Elon Musk donates $10M to keep AI beneficial

Thursday January 15, 2015

We are delighted to report that technology inventor Elon Musk, creator of Tesla and SpaceX, has decided to donate $10M to the Future of Life Institute to run a global research program aimed at keeping AI beneficial to humanity.

There is now a broad consensus that AI research is progressing steadily, and that its impact on society is likely to increase. A long list of leading AI researchers have signed an open letter calling for research aimed at ensuring that AI systems are robust and beneficial, doing what we want them to do. Musk’s donation aims to support precisely this type of research: “Here are all these leading AI researchers saying that AI safety is important”, says Elon Musk. “I agree with them, so I’m today committing $10M to support research aimed at keeping AI beneficial for humanity.”

Musk’s announcement was welcomed by AI leaders in both academia and industry:

“It’s wonderful, because this will provide the impetus to jump-start research on AI safety”, said AAAI president Tom Dietterich. “This addresses several fundamental questions in AI research that deserve much more funding than even this donation will provide.”

“Dramatic advances in artificial intelligence are opening up a range of exciting new applications”, said Demis Hassabis, Shane Legg and Mustafa Suleyman, co-founders of DeepMind Technologies, which was recently acquired by Google. “With these newfound powers comes increased responsibility. Elon’s generous donation will support researchers as they investigate the safe and ethical use of artificial intelligence, laying foundations that will have far reaching societal impacts as these technologies continue to progress”.

The $10M program will be administered by the Future of Life Institute, a non-profit organization whose scientific advisory board includes AI researchers Stuart Russell and Francesca Rossi. “I love technology, because it’s what’s made 2015 better than the Stone Age”, says MIT professor and FLI president Max Tegmark. “Our organization studies how we can maximize the benefits of future technologies while avoiding potential pitfalls.”

The research supported by the program will be carried out around the globe via an open grants competition, through an application portal at http://futureoflife.org that will open by Thursday January 22. The plan is to award the majority of the grant funds to AI researchers, and the remainder to AI-related research involving other fields such as economics, law, ethics and policy (a detailed list of examples can be found here). “Anybody can send in a grant proposal, and the best ideas will win regardless of whether they come from academia, industry or elsewhere”, says FLI co-founder Viktoriya Krakovna.

“This donation will make a major impact”, said UCSC professor and FLI co-founder Anthony Aguirre: “While heavy industry and government investment has finally brought AI from niche academic research to early forms of a potentially world-transforming technology, to date relatively little funding has been available to help ensure that this change is actually a net positive one for humanity.”

“That AI systems should be beneficial in their effect on human society is a given”, said Stuart Russell, co-author of the standard AI textbook “Artificial Intelligence: A Modern Approach”. “The research that will be funded under this program will make sure that happens. It’s an intrinsic and essential part of doing AI research.”

Skype founder Jaan Tallinn, one of FLI’s founders, agrees: “Building advanced AI is like launching a rocket. The first challenge is to maximize acceleration, but once it starts picking up speed, you also need to focus on steering.”

Along with research grants, the program will also include meetings and outreach programs aimed at bringing together academic AI researchers, industry AI developers and other key constituents to continue exploring how to maximize the societal benefits of AI; one such meeting was held in Puerto Rico last week with many of the open-letter signatories.

“Hopefully this grant program will help shift our focus from building things just because we can, toward building things because they are good for us in the long term”, says FLI co-founder Meia Chita-Tegmark.

Contacts at Future of Life Institute:

  • Max Tegmark: max@futureoflife.org
  • Meia Chita-Tegmark: meia@futureoflife.org
  • Jaan Tallinn: jaan@futureoflife.org
  • Anthony Aguirre: anthony@futureoflife.org
  • Viktoriya Krakovna: vika@futureoflife.org

Contacts among AI researchers:

  • Prof. Tom Dietterich, President of the Association for the Advancement of Artificial Intelligence (AAAI), Director of Intelligent Systems: tgd@eecs.oregonstate.edu
  • Prof. Stuart Russell, Berkeley, Director of the Center for Intelligent Systems, and co-author of the standard textbook Artificial Intelligence: A Modern Approach: russell@cs.berkeley.edu
  • Prof. Bart Selman, co-chair of the AAAI presidential panel on long-term AI futures: selman@cs.cornell.edu
  • Prof. Francesca Rossi, Professor of Computer Science, University of Padova and Harvard University, president of the International Joint Conference on Artificial Intelligence (IJCAI): frossi@math.unipd.it
  • Prof. Murray Shanahan, Imperial College: m.shanahan@imperial.ac.uk
