AI Safety Research

Brian Ziebart

Assistant Professor

Department of Computer Science

University of Illinois at Chicago

Project: Towards Safer Inductive Learning

Amount Recommended:    $134,247

Project Summary

“I don’t know” is a safe and appropriate answer that people provide to many posed questions. To act appropriately in a variety of complex tasks, our artificial intelligence systems should incorporate similar levels of uncertainty. Instead, the state-of-the-art statistical models and algorithms that enable computer systems to answer such questions based on previous experience often produce overly confident answers. Due to widely used modeling assumptions, this is particularly true when new questions come from situations that differ substantially from previous experience. In other words, exactly when human-level intelligence provides less certainty when generalizing from the known to the unknown, artificial intelligence tends to provide more. Rather than trying to engineer fixes for this phenomenon into existing methods, we propose a more pessimistic approach based on the question: “What is the worst case possible for predictive data that still matches previous experiences (observations)?” We propose to analyze the theoretical benefits of this approach and demonstrate its applied benefits on prediction tasks.
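The overconfidence phenomenon described above can be seen in a minimal numpy sketch (not the project's method, just an illustration): a 1-D logistic regression trained on inputs in [0, 2] will, because its logit is linear, report near-certain predictions for inputs far outside anything it has seen.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training data: inputs in [0, 2], label 1 exactly when x > 1.
X = rng.uniform(0, 2, size=200)
y = (X > 1.0).astype(float)

# Fit a 1-D logistic regression by plain gradient descent.
w, b = 0.0, 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(w * X + b)))
    w -= 0.5 * np.mean((p - y) * X)
    b -= 0.5 * np.mean(p - y)

def predict_prob(x):
    """Predicted probability of label 1 at input x."""
    return 1.0 / (1.0 + np.exp(-(w * x + b)))

# In-distribution query near the decision boundary: moderate confidence.
print(round(float(predict_prob(1.1)), 3))
# Far outside the training range the linear logit keeps extrapolating,
# so confidence saturates toward 1.0 despite zero nearby evidence.
print(round(float(predict_prob(50.0)), 3))
```

A prediction of "I don't know" (probability near 0.5, or an abstention) would arguably be the safer output at x = 50, which is the behavior the proposed worst-case formulation aims to recover.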

Technical Abstract

Reliable inductive reasoning that uses previous experiences to make predictions of unseen information in new situations is a key requirement for enabling useful artificial intelligence systems.

Tasks ranging from recognizing objects in camera images to predicting the outcomes of possible autonomous system controls and understanding the intentions of other intelligent entities each depend on this type of reasoning. Unfortunately, existing techniques produce significant unforeseen errors when the underlying statistical assumptions they are based upon do not hold in reality. The nearly ubiquitous assumption that estimated relationships in future situations will be similar to previous experiences (i.e., past and future data are assumed to be exchangeable, or independent and identically distributed (IID) according to a common distribution) is particularly brittle when employed within artificial intelligence systems that autonomously interact with the physical world. We propose an adversarial formulation for cost-sensitive prediction under covariate shift, a relaxation of this statistical assumption. This approach provides robustness to data shifts between predictive model estimation and deployment while incorporating mistake-specific costs for different errors that can be tied to application outcomes. We propose theoretical analysis and experimental investigation of this approach for standard and active learning tasks.
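To make the covariate-shift setting concrete, here is a small numpy sketch of the classical importance-weighting baseline (the standard correction the adversarial formulation improves upon, not the proposed method itself): inputs are drawn from one distribution at estimation time and another at deployment, and reweighting by the density ratio recovers deployment-distribution expectations from training samples. The Gaussian densities here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Covariate shift: inputs come from p_train when the model is estimated
# but from p_test when it is deployed (unit-variance Gaussians here).
mu_train, mu_test = 0.0, 1.0
x = rng.normal(mu_train, 1.0, size=200_000)

# Density ratio p_test(x) / p_train(x) for two unit-variance Gaussians:
# exp((mu_test - mu_train) * x - (mu_test^2 - mu_train^2) / 2).
ratio = np.exp((mu_test - mu_train) * x - (mu_test**2 - mu_train**2) / 2.0)

# Importance weighting reweights training samples so that expectations
# match the deployment distribution: E_test[f(X)] = E_train[ratio(X) f(X)].
naive = x.mean()              # estimates E_train[X], which is approx. 0
shifted = (ratio * x).mean()  # estimates E_test[X], which is approx. 1
print(round(float(naive), 2), round(float(shifted), 2))
```

Importance weighting requires the density ratio to be known or well estimated and can have high variance; the adversarial formulation proposed above instead commits only to the worst case consistent with the training observations.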


Publications

  1. Chen, Xiangli, et al. Robust Covariate Shift Regression. International Conference on Artificial Intelligence and Statistics (AISTATS), 2016.
  2. Fathony, Rizal, et al. Multiclass Classification: A Risk Minimization Perspective. Neural Information Processing Systems (NIPS), 2016.


Workshops

  1. Ethics for Artificial Intelligence: IJCAI 2016. July 9, 2016. New York, NY.
    • This workshop focused on selecting papers that speak to the themes of law and autonomous vehicles, ethics of autonomous systems, and superintelligence.

Ongoing Projects/Recent Progress

  1. Covariate shift: This team’s main progress on the ARM approach to covariate shift is two-fold. They have successfully extended its applicability to learning for regression settings under covariate shift, using logarithmic loss as the performance measure. They have also established a new formulation that allows assumptions to be expressed about how each input-specific feature, i, generalizes with respect to covariate shift. These researchers believe this flexibility will prove essential for applying ARM for covariate shift to high-dimensional or structured prediction tasks.
  2. Non-convex losses: A longstanding gap between theory and practice has existed for multi-class support vector machines: formulations that provide Fisher consistency (i.e., loss minimization given infinite data) typically perform worse in practice than inconsistent formulations on finite amounts of data. From the ARM formulation for 0-1 loss, these researchers derive an equivalent ERM loss function, which they term AL0-1, and close this gap by establishing Fisher consistency and showing competitive performance on finite amounts of data. They view this result as a fundamental and substantial endorsement of the ARM approach as a whole, and they are exploring the optimization of additional performance measures that cannot be converted to an ERM loss in this manner.
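As a rough sketch of the AL0-1 surrogate, assuming we recall the subset-maximization form from Fathony et al. (2016) correctly, the loss for class potentials f and true label y is the maximum over non-empty label subsets S of (sum over S of (f_j - f_y) + |S| - 1) / |S|. The brute-force and sorted-prefix computations below are illustrative, not the paper's reference implementation.

```python
import numpy as np
from itertools import combinations

def al01_loss(scores, y):
    """AL0-1 surrogate (as recalled from Fathony et al. 2016):
    max over non-empty subsets S of ((sum_{j in S} (f_j - f_y)) + |S| - 1) / |S|.
    Brute force over all subsets; fine for small label counts."""
    psi = scores - scores[y]  # potential differences f_j - f_y
    k = len(scores)
    best = -np.inf
    for m in range(1, k + 1):
        for S in combinations(range(k), m):
            best = max(best, (sum(psi[j] for j in S) + m - 1) / m)
    return float(best)

def al01_loss_fast(scores, y):
    """Equivalent O(k log k) computation: for each size m, the maximizing
    subset consists of the m largest potentials, so only sorted prefixes
    need checking."""
    psi = np.sort(scores - scores[y])[::-1]
    m = np.arange(1, len(scores) + 1)
    return float(np.max((np.cumsum(psi) + m - 1) / m))

scores = np.array([2.0, 0.5, -1.0])  # classifier potentials for 3 classes
print(al01_loss(scores, y=0))  # correct class with margin > 1: loss 0
print(al01_loss(scores, y=2))  # true class scored lowest: large loss
```

Like the hinge loss, this surrogate is zero once the true class wins by a sufficient margin, but unlike the Crammer-Singer multiclass hinge it is Fisher consistent for the 0-1 loss, which is the gap-closing property described above.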