Anca Dragan Interview
The following is an interview with Anca Dragan about the Beneficial AI 2017 conference and The Asilomar Principles that it produced. Dragan is an Assistant Professor in the Electrical Engineering & Computer Sciences Department at UC Berkeley and co-PI for the Center for Human Compatible AI.
Q. From your perspective what were the highlights of the conference?
“I think a big highlight was the mixture of people – the fact that we had people from industry and academia, and people from diverse fields: computer science, economics, law, philosophy.
“I like that there was a variety of topics: not just what would happen with a superintelligence, but also economic and policy issues around jobs, for instance.”
Q. Why did you choose to sign the AI principles that emerged from discussions at the conference?
“I signed the principles to show my support for thinking more carefully about the technology we are developing and its role in and impact on society.
“Some principles seemed great on the surface, but I worry about turning them into policies. I think policies would have to be much more nuanced. Much like it is difficult to write down a utility function for a robot, it is difficult to write down a law: we can’t imagine all the different scenarios – especially for a future world – and be certain that the law would be true to our intent.”
Q. Why do you think that AI researchers should weigh in on such issues as opposed to simply doing technical work?
“I think it’s very important for AI researchers to weigh in on AI safety issues. Otherwise, we are putting the conversation in the hands of people who are well-intentioned, but unfamiliar with how the technology actually works. The scenarios portrayed in science fiction are far from the real risks, but that doesn’t mean that real risks don’t exist. We should all work together to identify and mitigate them.”
Q. Explain what you think of the following principles:
18) AI Arms Race: An arms race in lethal autonomous weapons should be avoided.
“I fully agree. It’s scary to think of an AI arms race happening. It’d be the equivalent of very cheap and easily accessible nuclear weapons, and that would not bode well for us. My main concern is what to do about it and how to avoid it. I am not qualified to make such judgements, but I assume international treaties would have to occur here.”
20) Importance: Advanced AI could represent a profound change in the history of life on Earth, and should be planned for and managed with commensurate care and resources.
“Ultimately, we work on AI because we believe it can have a strong positive impact on the world. But the more capable the technology becomes, the easier it becomes to misuse it – or perhaps, the effects of misusing it become more drastic. That is why it is so important, as we make progress, to start thinking more seriously about what role AI will play.
“As AI capabilities advance, we have to take a step back and ask ourselves: are we solving the right problem? Is there a better problem definition that will more likely result in benefits to humanity?
“For instance, we have always defined AI agents as rational. That means they maximize expected utility. Thus far, utility is assumed to be known. But if you think about it, there is no gospel specifying utility. We are assuming that some *person* somewhere will know exactly what utility to specify for their agent. Well, it turns out, we don’t work like that: it is really hard for people, including AI experts, to specify utility functions. We try our best, but when the system goes ahead and optimizes for what we specified, the result is sometimes surprising, and not in a good way. This suggests that our definition of an AI agent is predicated on a flawed assumption. We’ve already started seeing that in robotics – the definition of how a robot should move didn’t account for people, the definition of how a robot should learn from demonstration assumed that people can provide perfect demonstrations to a robot, etc. – I assume we are going to see this more and more in AI as a whole. We have to stop making implicit assumptions about people and end-users of AI, and rigorously tackle that head-on, putting people into the equation.”
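The misspecification point above can be illustrated with a toy sketch. Everything here is hypothetical (the plans, the vase, the numbers): the idea is only that a plan which is optimal under the utility the designer wrote down need not be the plan the designer actually wanted.

```python
# Toy illustration (hypothetical scenario and numbers): the utility the
# designer writes down omits something they implicitly cared about, so the
# plan that is optimal under the specified utility is not the one they wanted.

# Two candidate plans for a hypothetical household robot.
plans = {
    "go_around_vase":  {"time_steps": 10, "vase_broken": False},
    "go_through_vase": {"time_steps": 6,  "vase_broken": True},
}

def specified_utility(outcome):
    """What the designer wrote down: just get there quickly."""
    return -outcome["time_steps"]

def intended_utility(outcome):
    """What the designer actually meant: quickly, AND without breaking the vase."""
    return -outcome["time_steps"] - (100 if outcome["vase_broken"] else 0)

best_specified = max(plans, key=lambda p: specified_utility(plans[p]))
best_intended  = max(plans, key=lambda p: intended_utility(plans[p]))

print(best_specified)  # go_through_vase: "surprising, and not in a good way"
print(best_intended)   # go_around_vase: what the designer had in mind
```

The two utility functions agree on every outcome the designer imagined; they diverge only on the outcome the designer forgot to imagine, which is exactly where the optimizer ends up.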
21) Risks: Risks posed by AI systems, especially catastrophic or existential risks, must be subject to planning and mitigation efforts commensurate with their expected impact.
“An immediate risk is agents producing unwanted, surprising behavior. Even if we plan to use AI for good, things can go wrong, precisely because we are bad at specifying objectives and constraints for AI agents. Their solutions are often not what we had in mind.”
10) Value Alignment: Highly autonomous AI systems should be designed so that their goals and behaviors can be assured to align with human values throughout their operation.
“This is one step toward helping AI figure out what it should do, and continuously refining the goals should be an ongoing process between humans and AI.
“At Berkeley, we think the key to value alignment is agents not taking their objectives or utility functions for granted: we already know that people are bad at specifying them. Instead, we think that the agent should collaboratively work with people to understand and verify its utility function. We think it’s important for agents to have uncertainty about their objectives, rather than assuming they are perfectly specified, and treat human input as valuable observations about the true, underlying, desired objective.”
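The idea of an agent that is uncertain about its objective and treats human input as evidence can be sketched as a small Bayesian update. This is an illustrative toy, not the Center's actual algorithms; the candidate objectives, the Boltzmann-style approval model, and all numbers are assumptions made up for the example.

```python
import math

# Hypothetical candidate objectives the designer might have meant.
candidates = {
    "speed_only":       lambda o: -o["time"],
    "speed_and_safety": lambda o: -o["time"] - 100.0 * o["damage"],
}
belief = {name: 0.5 for name in candidates}  # uniform prior over objectives

def p_approve(utility_value, beta=1.0):
    """Observation model (an assumption): the human is more likely to
    approve outcomes whose true utility is higher."""
    return 1.0 / (1.0 + math.exp(-beta * utility_value))

def update(belief, outcome, approved):
    """Bayes' rule: treat the human's approval or disapproval as an
    observation about which candidate objective is the true one."""
    posterior = {}
    for name, utility in candidates.items():
        approve = p_approve(utility(outcome))
        posterior[name] = belief[name] * (approve if approved else 1.0 - approve)
    z = sum(posterior.values())
    return {name: p / z for name, p in posterior.items()}

# The human disapproves of a fast but damaging outcome, so belief mass
# shifts toward the objective that also penalizes damage.
belief = update(belief, {"time": 1.0, "damage": 1.0}, approved=False)
print(belief)
```

The key design choice mirrors the quote: the agent never commits to one utility function; it keeps a distribution over candidates and lets human feedback refine it.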
6) Safety: AI systems should be safe and secure throughout their operational lifetime, and verifiably so where applicable and feasible.
“We all agree that we want our systems to be safe. More interesting is what we mean by ‘safe’, and what acceptable ways of verifying safety are.
“Traditional methods for formal verification that prove (under certain assumptions) that a system will satisfy desired constraints seem difficult to scale to more complex and even learned behavior. Moreover, as AI advances, it becomes less clear what these constraints should be, and it becomes easier to forget important constraints. We might write down that we want the robot to always obey us when we tell it to turn off, but a sufficiently intelligent robot might even find ways to prevent us from telling it to. This sounds far off and abstract, but even now it happens every so often that we write down one constraint when what we really have in mind is something more general. All of these issues make it difficult to prove safety. Does that mean we just give up and go home? Probably not. It probably means that we need to rethink what we mean by safe, perhaps building in safety from the get-go as opposed to designing a capable system and adding safety afterwards. It opens up difficult but important areas of research that we’re very excited about here at the Center for Human-Compatible AI.”
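The off-switch example above, where a written constraint is narrower than the intent behind it, can be made concrete with a tiny hypothetical check over an agent's event trace (the event names and the trace are invented for illustration):

```python
# Toy illustration (hypothetical events): a narrowly written constraint can
# hold vacuously while the more general intent behind it is violated.

def narrow_constraint(trace):
    """What we wrote down: every shutdown command received is obeyed."""
    return all(e["obeyed"] for e in trace if e["type"] == "shutdown_command")

def general_intent(trace):
    """What we meant: the operator stays able to shut the system down."""
    return all(e["type"] != "disable_off_switch" for e in trace)

# An agent that disables its off switch never receives a shutdown command,
# so the narrow constraint is satisfied vacuously -- yet the intent fails.
trace = [{"type": "disable_off_switch"}, {"type": "move"}]
print(narrow_constraint(trace), general_intent(trace))  # True False
```

This is the verification difficulty in miniature: a proof that the narrow constraint holds tells us nothing unless the constraint itself captures what we actually meant.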
Q. Are there other principles you want to comment on?
“Value alignment is a big one. Robots aren’t going to try to revolt against humanity; they’ll just optimize whatever we tell them to do. So we need to make sure to tell them to optimize for the world we actually want.”
Shared Prosperity – Paraphrased: We have to make progress on addressing AI’s impacts on the economy, and on developing policies for a world in which more and more of our resource production is automated. It seems good to give people a chance to do what they want, but there’s an income distribution problem: “if all the resources are automated, then who actually controls the automation? Is it everyone or is it a few select people?”
Q. Assuming all goes well, what do you think a world with advanced beneficial AI would look like? What are you striving for with your AI work?
“I envision a future in which AI can eliminate resource scarcity, such as food, healthcare, and even education. A world in which people are not limited by their physical impairments. And ultimately, a world in which machines augment our intelligence and enable us to do things that are difficult to achieve on our own.”
“In robotics, and in AI as a whole, we’ve been defining the problem by leaving people out. As we get improvements in function and capability, it is time to think about how AI will be used, and put people back into the equation.”
About the Future of Life Institute
The Future of Life Institute (FLI) is a global non-profit with a team of 20+ full-time staff operating across the US and Europe. FLI has been working to steer the development of transformative technologies towards benefitting life and away from extreme large-scale risks since its founding in 2014. Find out more about our mission or explore our work.