Ethan Perez

Organisation
New York University
Biography

Why do you care about AI Existential Safety?

AI has done great good and great harm in the world already, and the potential benefits and harms will only grow as we develop more capable systems. The amount of harm or good done by AI often depends on how similar its training objective is to the objective we actually care about; the greater the misalignment, the greater the harm (including the possibility of existential catastrophes). I’m interested in reducing such misalignment by developing training objectives that better capture what we actually care about, even though such objectives are often hard to quantify and evaluate. In particular, the aim of my research is to train AI to tell us novel, true statements about the world rather than human-like statements as current systems do. In doing so, I also hope that we learn insights about AI alignment that are useful more broadly for maximizing the good and minimizing the harm from AI.

Please give one or more examples of research interests relevant to AI existential safety:

My research focuses on aligning language models with human preferences, e.g., for content that is helpful, honest, and harmless. In particular, I am excited about developing learning algorithms that outdo humans at generating such content by producing text that is free of social biases, cognitive biases, common misconceptions, and other limitations.
