Andrew Ilyas

Position: Professor
Organisation: Carnegie Mellon University
Biography

Why do you care about AI Existential Safety?

AI models are becoming both increasingly capable and increasingly influential in our businesses, governments, and societies. At the same time, these models are largely unpredictable: we struggle to anticipate the failure modes and harms they will exhibit until it is too late. My research focuses on improving the predictability of AI systems, which I believe is fundamental to ensuring that AI has a long-term positive (and, importantly, non-negative) impact on humanity.

Please give at least one example of your research interests related to AI existential safety:

Much of my early work in AI was on adversarial examples: probing and understanding how AI systems can fail in unexpected, non-human ways. This behavior turns out to be a direct microcosm of the alignment problem at the heart of AI existential risk: the (perfectly valid) way that computer vision classifiers viewed data simply did not align with the human perspective. Much of my later research has aimed to create systems that behave as intended, even in unfamiliar conditions, and whose failure modes and shortcomings align with those of humans.
