AI Safety Research

Wendell Wallach

Lecturer

Yale Interdisciplinary Center for Bioethics

wendell.wallach@yale.edu

Project: Control and Responsible Innovation in the Development of Autonomous Machines

Amount Recommended:    $180,000

Project Summary

Driverless cars, service robots, surveillance drones, computer networks collecting data, and autonomous weapons are just a few examples of increasingly intelligent technologies scientists are developing. As they progress, researchers face a series of questions about whether these machines can be designed and engineered to take morally significant actions previously reserved for human actors. Can they ensure that artificially intelligent systems will always be demonstrably beneficial, safe, controllable, and sensitive to human values? Many individuals and groups have begun tackling the various subprojects entailed in this challenge. They are, however, often unaware of efforts in complementary fields. Thus they lose opportunities for creative collaboration, miss gaps in their own research, and reproduce work being performed by potential colleagues. The Hastings Center proposes to convene a series of three solution-directed workshops with national and international experts in the various pertinent fields. Together they will develop collaborative strategies and research projects, and forge an outline for a comprehensive plan to insure autonomous systems will be demonstrably beneficial, and that this innovative research progresses in a responsible manner. The results of the workshop will be conveyed through a special report, a dedicated edition of a scholarly journal, and two public symposia.

Technical Abstract

The vast array of challenges entailed in designing, engineering, and implementing demonstrably beneficial, safe and controllable AI systems are slowly being addressed by scholars working on distinct research trajectories across many disciplines. They are often unaware of efforts in complementary fields, thus losing opportunities for creative synergies, missing gaps in their own research, and reproducing the work of potential colleagues. The Hastings Center proposes to convene a series of three solution-directed workshops with national and international experts in the varied fields. Together they will address trans-disciplinary questions, develop collaborative strategies and research projects, and forge an outline for a comprehensive plan encompassing the many elements of ensuring autonomous systems will be demonstrably beneficial, and that this innovative research progresses in a responsible manner. The workshops’ research and policy agenda will be published as a Special Report of the journal Hastings Center Report and in short form in a science or engineering journal. Findings will also be presented through two public symposia, one of which will be webcast and available on demand. We anticipate significant progress given the high caliber of the people who are excited by this project and have already committed to join their workshops.

wendell-wallach-grant3

Silo Busting in AI Research

Artificial intelligence may seem like a computer science project, but if it’s going to successfully integrate with society, then social scientists must be more involved.

Developing an intelligent machine is not merely a problem of modifying algorithms in a lab. These machines must be aligned with human values, and this requires a deep understanding of ethics and the social consequences of deploying intelligent machines.

Getting people with a variety of backgrounds together seems logical enough in theory, but in practice, what happens when computer scientists, AI developers, economists, philosophers, and psychologists try to discuss AI issues? Do any of them even speak the same language?

Social scientists and computer scientists will come at AI problems from very different directions. And if they collaborate, everybody wins. Social scientists can learn about the complex tools and algorithms used in computer science labs, and computer scientists can become more attuned to the social and ethical implications of advanced AI.

Through transdisciplinary learning, both fields will be better equipped to handle the challenges of developing AI, and society as a whole will be safer.

Silo Busting

Too often, researchers focus on their narrow area of expertise, rarely reaching out to experts in other fields to solve common problems. AI is no different, with thick walls – sometimes literally – separating the social sciences from the computer sciences. This process of breaking down walls between research fields is often called silo-busting.

If AI researchers largely operate in silos, they may lose opportunities to learn from other perspectives and collaborate with potential colleagues. Scientists might miss gaps in their research or reproduce work already completed by others, because they were secluded away in their silo. This can significantly hamper the development of value-aligned AI.

To bust these silos, Wendell Wallach organized workshops to facilitate knowledge-sharing among leading computer and social scientists. Wallach, a consultant, ethicist, and scholar at Yale University’s Interdisciplinary Center for Bioethics, holds these workshops at The Hastings Center, where he is a senior advisor.

With co-chairs Gary Marchant, Stuart Russell, and Bart Selman, Wallach held the first workshop in April 2016. “The first workshop was very much about exposing people to what experts in all of these different fields were thinking about,” Wallach explains. “My intention was just to put all of these people in a room and hopefully they’d see that they weren’t all reinventing the wheel, and recognize that there were other people who were engaged in similar projects.”

The workshop intentionally brought together experts from a variety of viewpoints, including engineering ethics, philosophy, and resilience engineering, as well as participants from the Institute of Electrical and Electronics Engineers (IEEE), the Office of Naval Research, and the World Economic Forum (WEF). Wallach recounts, “some were very interested in how you implement sensitivity to moral considerations in AI computationally, and others were more interested in how AI changes the societal context.”

Other participants studied how the engineers of these systems may be susceptible to harmful cognitive biases and conflicts of interest, while still others focused on governance issues surrounding AI. Each of these viewpoints is necessary for developing beneficial AI, and The Hastings Center’s workshop gave participants the opportunity to learn from and teach each other.

But silo-busting is not easy. Wallach explains, “everybody has their own goals, their own projects, their own intentions, and it’s hard to hear someone say, ‘maybe you’re being a little naïve about this.’” When researchers operate exclusively in silos, “it’s almost impossible to understand how people outside of those silos did what they did,” he adds.

The intention of the first workshop was not to develop concrete strategies or proposals, but rather to open researchers’ minds to the broad challenges of developing AI with human values. “My suspicion is, the most valuable things that came out of this workshop would be hard to quantify,” Wallach clarifies. “It’s more like people’s minds were being stretched and opened. That was, for me, what this was primarily about.”

The workshop did yield some tangible results. For example, Marchant and Wallach introduced a pilot project for the international governance of AI, and nearly everyone at the workshop agreed to work on it. Since then, the IEEE, the International Committee of the Red Cross, the UN, the World Economic Forum, and other institutions have agreed to become active partners with The Hastings Center in building global infrastructure to ensure that AI and Robotics are beneficial.

This transdisciplinary cooperation is a promising sign that Wallach’s efforts are succeeding in strengthening the global response to AI challenges.

Value Alignment

Wallach and his co-chairs held a second workshop at the end of October. The participants were mostly scientists, but also included social theorists, a legal scholar, philosophers, and ethicists. The overall goal remained – to bust AI silos and facilitate transdisciplinary cooperation – but this workshop had a narrower focus.

“We made it more about value alignment and machine ethics,” he explains. “The tension in the room was between those who thought the problem [of value alignment] was imminently solvable and those who were deeply skeptical about solving the problem at all.”

In general, Wallach observed that “the social scientists and philosophers tend to overplay the difficulties [of creating AI with full value alignment] and computer scientists tend to underplay the difficulties.”

Wallach believes that while computer scientists will build the algorithms and utility functions for AI, they will need input from social scientists to ensure value alignment. “If a utility function represents 100,000 inputs, social theorists will help the AI researchers understand what those 100,000 inputs are,” he explains. “The AI researchers might be able to come up with 50,000-60,000 on their own, but they’re suddenly going to realize that people who have thought much more deeply about applied ethics are perhaps sensitive to things that they never considered.”

“I’m hoping that enough of [these researchers] learn each other’s language and how to communicate with each other, that they’ll recognize the value they can get from collaborating together,” he says. “I think I see evidence of that beginning to take place.”

Moving Forward

Developing value-aligned AI is a monumental task with existential risks. Experts from various perspectives must be willing to learn from each other and adapt their understanding of the issue.

In this spirit, The Hastings Center is leading the charge to bring the various AI silos together. After two successful events that resulted in promising partnerships, Wallach and his co-chairs will hold their third workshop in Spring 2018. And while these workshops are a small effort to facilitate transdisciplinary cooperation on AI, Wallach is hopeful.

“It’s a small group,” he admits, “but it’s people who are leaders in these various fields, so hopefully that permeates through the whole field, on both sides.”

This article is part of a Future of Life series on the AI safety research grants, which were funded by generous donations from Elon Musk and the Open Philanthropy Project.

Workshops

  1. Control and Responsible Innovation in the Development of Autonomous Systems Workshop: April 24-26, 2016. The Hastings Center, Garrison, NY.
    • The four co-­chairs (Gary Marchant, Stuart Russell, Bart Selman, and Wendell Wallach) and The Hastings Center staff (particularly Mildred Solomon and Greg Kaebnick) designed this first workshop. This workshop was focused on exposing participants to relevant research progressing in an array of fields, stimulating extended reflection upon key issues and beginning a process of dismantling intellectual silos and loosely knitting the represented disciplines into a transdisciplinary community. Twenty-five participants gathered at The Hastings Center in Garrison, NY from April 24th – 26th, 2016. The workshop included representatives from key institutions that have entered this space, including IEEE, the Office of Naval Research, the World Economic Forum, and of course AAAI.
  2. Wallach and others are planning a second workshop, scheduled for October 30-November 1, 2016. The invitees for the second workshop are primarily scientists, but also include social theorists, legal scholars, philosophers, and ethicists. The expertise of the social scientists will be drawn upon in clarifying the application of research in cognitive science and legal and ethical theory to the development of autonomous systems. Not all of the invitees to the second workshop have considered the challenge of developing beneficial trustworthy artificial agents. However, Wallach and his team believe they are bringing together brilliant and creative minds to collectively address this challenge. They hope that scientific and intellectual leaders, new to the challenge and participating in the second workshop, will take on the development of beneficial, robust, safe, and controllable AI as a serious research agenda.