Andrew Ilyas

Position: Professor
Organisation: Carnegie Mellon University
Biography

Why do you care about AI Existential Safety?

AI models are becoming both increasingly capable and increasingly influential in our businesses, governments, and societies. At the same time, these models are largely unpredictable: we struggle to anticipate the failure modes and harms they will exhibit until it is too late. My research focuses on improving the predictability of AI systems, which I believe is fundamental to ensuring that AI has a long-term positive (and, importantly, non-negative) impact on humanity.

Please give at least one example of your research interests related to AI existential safety:

Much of my early work in AI was on adversarial examples: probing and understanding how AI systems can fail in unexpected, non-human ways. This behavior turns out to be a direct microcosm of the alignment problem at the heart of AI existential risk: the (perfectly valid) way that computer vision classifiers viewed data simply did not align with the human perspective. Much of my later research has aimed to create systems that behave as intended, even in unfamiliar conditions, and whose failure modes and shortcomings align with those of humans.
