We focus on current and future complex autonomous AI systems that integrate sensors, computation, and actuation to perform tasks of benefit to humans. Examples of such systems are auto-pilots, medical assistants, internet-of-things components, and mobile service robots. One of the keys to bringing such complex AI systems to safe and acceptable existence is their ability to provide transparency into their representations, interpretations, choices, and decisions; in short, their internal state.
We believe that, to build AI systems that are safe, as well as accepted and trusted by humans, we need to equip them with the capability to explain their actions, recommendations, and inferences. Our proposed project aims to research the specification, formalization, and generation of explanations, with a concrete focus on seamlessly integrated AI systems that sense and reason about multi-modal information in symbiosis with humans. As a result, humans will be able to query robots for explanations about their recommendations or actions, and make any needed corrections.
AI systems have long been challenged with providing explanations about their reasoning. Automated theorem provers, explanation-based learning systems, and conflict-based constraint solvers are examples where inference is supplemented by the underlying processed knowledge and rules.
We focus on current and future complex AI autonomous systems that integrate perception, cognition, and action, in tasks to service humans. These systems can be viewed as cyber-physical-social systems, such as auto-pilots, medical assistants, internet-of-things components, and mobile service robots.
We propose to research how to bring such complex AI systems to safe and acceptable existence by providing transparency into their representations, interpretations, choices, and decisions. We will develop mining techniques to enable the analysis and explanation of temporally-logged sensory and execution data, constrained by the underlying behavior architecture, as well as the uncertainty of the sensed environment. We will address the need for probabilistic and knowledge-based inference; the variety of input data modalities; and the coordination of multiple reasoning agents.
Concretely, we will study autonomous mobile service robots, such as CoBots, as well as quadrotors. We envision humans querying the robots about their performance and their choice of actions. Our generated explanations will increase both human understanding of the robots and robot safety.
In the future, service robots equipped with artificial intelligence (AI) are bound to be a common sight. These bots will help people navigate crowded airports, serve meals, or even schedule meetings.
As these AI systems become more integrated into daily life, it is vital to find an efficient way to communicate with them. It is obviously more natural for a human to speak in plain language rather than a string of code. Further, as the relationship between humans and robots grows, it will be necessary to engage in conversations, rather than just give orders.
This human-robot interaction is what Manuela M. Veloso’s research is all about. Veloso, a professor at Carnegie Mellon University, has focused her research on CoBots, autonomous indoor mobile service robots which transport items, guide visitors to building locations, and traverse the halls and elevators. The CoBot robots have been navigating autonomously for several years now, and have traveled more than 1,000 km. These accomplishments have enabled the research team to pursue a new direction, focusing now on novel human-robot interaction.
“If you really want these autonomous robots to be in the presence of humans and interacting with humans, and being capable of benefiting humans, they need to be able to talk with humans,” Veloso says.
Communicating With CoBots
Veloso’s CoBots are capable of autonomous localization and navigation in the Gates-Hillman Center using WiFi, LIDAR, and/or a Kinect sensor (yes, the same type used for video games).
The robots navigate by detecting walls as planes, which they match to the known maps of the building. Other objects, including people, are detected as obstacles, so navigation is safe and robust. Overall, the CoBots are good navigators and are quite consistent in their motion. In fact, the team noticed the robots could wear down the carpet as they traveled the same path numerous times.
Because the robots are autonomous, and therefore capable of making their own decisions, they are out of sight for large amounts of time while they navigate the multi-floor buildings.
The research team began to wonder about this unaccounted time. How were the robots perceiving the environment and reaching their goals? How was the trip? What did they plan to do next?
“In the future, I think that incrementally we may want to query these systems on why they made some choices or why they are making some recommendations,” explains Veloso.
The research team is currently working on the question of why the CoBots took the route they did while autonomous. The team wanted to give the robots the ability to record their experiences and then transform the data about their routes into natural language. In this way, the bots could communicate with humans and reveal their choices and hopefully the rationale behind their decisions.
Levels of Explanation
The “internals” underlying the functions of any autonomous robot are based entirely on numerical computations, not natural language. For example, the CoBot robots compute the distance to walls and assign velocities to their motors to move to specific map coordinates.
Asking an autonomous robot for a non-numerical explanation is complex, says Veloso. Furthermore, the answer can be provided at many possible levels of detail.
“We define what we call the ‘verbalization space’ in which this translation into language can happen with different levels of detail, with different levels of locality, with different levels of specificity.”
For example, a developer asking a robot to detail its journey might expect a lengthy retelling, with details that include battery levels. But a random visitor might just want to know how long it takes to get from one office to another.
Therefore, the research is not just about the translation from data to language, but also the acknowledgment that the robots need to explain things with more or less detail. If a human were to ask for more detail, the request triggers CoBot “to move” into a more detailed point in the verbalization space.
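To make the idea of "moving" through the verbalization space concrete, here is a minimal sketch in Python. It is not the CoBot implementation: the class `VerbalizationPoint`, the three-level scale, the trip fields, and the sentence templates are all hypothetical, chosen only to illustrate how one logged trip could be described differently depending on the requested level of detail. Only the `detail` dimension changes the output in this sketch; `locality` and `specificity` are carried along as placeholders.

```python
from dataclasses import dataclass, replace

MAX_LEVEL = 2  # assumption: each dimension has levels 0 (coarsest) to 2 (finest)


@dataclass(frozen=True)
class VerbalizationPoint:
    """A point in a hypothetical three-dimensional verbalization space."""
    detail: int = 0
    locality: int = 0      # placeholder, unused in this sketch
    specificity: int = 0   # placeholder, unused in this sketch


def more_detail(point: VerbalizationPoint) -> VerbalizationPoint:
    """Return a new point one step more detailed, saturating at MAX_LEVEL."""
    return replace(point, detail=min(point.detail + 1, MAX_LEVEL))


def describe_trip(trip: dict, point: VerbalizationPoint) -> str:
    """Turn numeric trip data into sentences; higher detail adds sentences."""
    parts = [f"I went from {trip['start']} to {trip['goal']}."]
    if point.detail >= 1:
        parts.append(f"It took {trip['duration_s']} seconds.")
    if point.detail >= 2:
        parts.append(f"My battery ended at {trip['battery_pct']} percent.")
    return " ".join(parts)


# Hypothetical logged trip, not real CoBot data.
trip = {"start": "office 7002", "goal": "office 7416",
        "duration_s": 95, "battery_pct": 81}

visitor_view = VerbalizationPoint()
developer_view = more_detail(more_detail(visitor_view))
print(describe_trip(trip, visitor_view))
print(describe_trip(trip, developer_view))
```

In this sketch a visitor gets a single sentence, while each request for more detail shifts the point one level and adds information such as trip duration and battery level.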
“We are trying to understand how to empower the robots to be more trustable through these explanations, as they attend to what the humans want to know,” says Veloso. The ability to generate explanations, in particular at multiple levels of detail, will be especially important in the future, as the AI systems will work with more complex decisions. Humans could have a more difficult time inferring the AI’s reasoning. Therefore, the bot will need to be more transparent.
For example, if you go to a doctor’s office and the AI there makes a recommendation about your health, you may want to know why it came to this decision, or why it recommended one medication over another.
Currently, Veloso’s research focuses on getting the robots to generate these explanations in plain language. The next step will be to have the robots incorporate natural language when humans provide them with feedback. “[The CoBot] could say, ‘I came from that way,’ and you could say, ‘well next time, please come through the other way,’” explains Veloso.
These sorts of corrections could be programmed into the code, but Veloso believes that “trustability” in AI systems will benefit from our ability to dialogue, query, and correct their autonomy. She and her team aim at contributing to a multi-robot, multi-human symbiotic relationship, in which robots and humans coordinate and cooperate as a function of their limitations and strengths.
“What we’re working on is to really empower people – a random person who meets a robot – to still be able to ask things about the robot in natural language,” she says.
In the future, as more and more AI systems are able to perceive the world, make decisions, and support human decision-making, the ability to engage in these types of conversations will be essential.
This article is part of a Future of Life series on the AI safety research grants, which were funded by generous donations from Elon Musk and the Open Philanthropy Project.
Abstract: With a growing number of robots performing autonomously without human intervention, it is difficult to understand what the robots experience along their routes during execution without looking at execution logs. Rather than looking through logs, our goal is for robots to respond to queries in natural language about what they experience and what routes they have chosen. We propose verbalization as the process of converting route experiences into natural language, and highlight the importance of varying verbalizations based on user preferences. We present our verbalization space, which represents the different dimensions along which verbalizations can be varied, and our algorithm for automatically generating them on our CoBot robot. Then we present our study of how users can request different verbalizations in dialog. Using the study data, we learn a language model to map user dialog to the verbalization space. Finally, we demonstrate the use of the learned model within a dialog system so that any user can request information about CoBot’s route experience at varying levels of detail.
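As a rough illustration of mapping user dialog to the verbalization space, the sketch below trains a tiny naive Bayes text classifier in Python. This is a stand-in technique, not the model from the paper: the abstract does not say which learning method was used, and the labels (`more_detail`, `less_detail`), the training utterances, and the function names are all invented for the example.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count words per label from (utterance, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest add-one-smoothed log probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        n_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log(
                    (word_counts[label][word] + 1) / (n_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data; a real study would use collected user requests.
examples = [
    ("give me more detail", "more_detail"),
    ("tell me more about the trip", "more_detail"),
    ("just a short summary please", "less_detail"),
    ("keep it brief", "less_detail"),
]
model = train(examples)
print(predict(model, "can you give more detail"))
print(predict(model, "a brief summary"))
```

A dialog system could use the predicted label to decide in which direction to shift the current point in the verbalization space before generating the next description.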
Abstract: Autonomous mobile robots navigate in our spaces by planning and executing routes to destinations. When a mobile robot appears at a location, there is no clear way to understand what navigational path the robot planned and experienced just by looking at it. In this work, we address the generation of narrations of autonomous mobile robot navigation experiences. We contribute the concept of verbalization as a parallel to the well-studied concept of visualization. Through verbalizations, robots can describe in language what they experience, in particular along their paths. For every executed path, we consider many possible verbalizations that could be generated. We introduce the verbalization space that covers the variability of utterances that the robot may use to narrate its experience to different humans. We present an algorithm for segmenting a path and mapping each segment to an utterance, as a function of the desired point in the verbalization space, and demonstrate its application using our mobile service robot moving in our buildings. We believe our verbalization space and algorithm are applicable to different narrative aspects for many mobile robots, including autonomous cars.
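The idea of segmenting a path and mapping each segment to an utterance can be sketched in Python as follows. This is an assumption-laden toy, not the published algorithm: it assumes each logged waypoint carries a corridor label, splits the path wherever that label changes, and uses a single boolean `detailed` flag as a stand-in for the desired point in the verbalization space. The waypoint fields, function names, and sentence templates are hypothetical.

```python
import math
from itertools import groupby


def segment_path(waypoints):
    """Group consecutive waypoints that share a corridor label."""
    segments = []
    for corridor, group in groupby(waypoints, key=lambda w: w["corridor"]):
        points = [(w["x"], w["y"]) for w in group]
        # Distance covered inside this segment only (transitions are ignored).
        length = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
        segments.append({"corridor": corridor, "length_m": length})
    return segments


def narrate(waypoints, detailed=False):
    """Map each segment to one utterance; 'detailed' adds the distance."""
    utterances = []
    for seg in segment_path(waypoints):
        if detailed:
            utterances.append(
                f"I traveled {seg['length_m']:.1f} meters "
                f"along corridor {seg['corridor']}."
            )
        else:
            utterances.append(f"I went along corridor {seg['corridor']}.")
    return utterances


# Hypothetical logged path with coordinates in meters.
path = [
    {"corridor": "7100", "x": 0, "y": 0},
    {"corridor": "7100", "x": 3, "y": 4},
    {"corridor": "7400", "x": 3, "y": 4},
    {"corridor": "7400", "x": 3, "y": 10},
]
for sentence in narrate(path, detailed=True):
    print(sentence)
```

The same segmented path yields a short or a longer narration depending on the flag, which mirrors how one executed path can admit many possible verbalizations.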
Invited talk at the OSTP/NYU Workshop on The Social and Economic Implications of Artificial Intelligence Technologies in the Near-Term, NYC, July 2016.
Invited talk at the Intelligent Autonomous Vehicles Conference, Leipzig, July 2016.