
Eric Wong

Position
Assistant professor
Organisation
University of Pennsylvania
Biography

Why do you care about AI Existential Safety?

Recent growth in the capabilities of AI systems has made it increasingly possible to substitute AI for human effort. At the same time, the pursuit of efficiency creates a strong incentive to reduce or eliminate human oversight of these systems. Together, these trends raise the risk that increasingly autonomous systems will operate without human oversight, making downstream dangers difficult to predict and mitigate. The core challenge is that the complexity, speed, and potential impact of these systems may soon outpace traditional methods of manual supervision and empirical validation, leading to large-scale, catastrophic consequences. My hope is that by studying the fundamentals of AI safety in conjunction with the risks of frontier models, we can lay the foundation for scalable oversight and control that allows safety research to keep pace with AI systems and mitigate future existential risks.

Please give at least one example of your research interests related to AI existential safety:

What is possible or impossible in safety research given an increasingly capable AI-enabled adversary? Concretely, how can we control or monitor AI systems without relying on flaws or imperfect knowledge within the AI system? As capabilities continue to grow, any long-lasting safety mechanism that can prevent existential risks must presume that the target AI system is highly capable, potentially more capable than ourselves or any other monitoring system. To ensure that safety research is not outscaled, we must first understand, at a fundamental level, which safety problems are even possible to solve against a highly competent adversary.
