Fellowship Winners 2022

These are the winners of our 2022 grant programs for research concerning the safe development and deployment of AI.

Vitalik Buterin PhD Fellows

The Vitalik Buterin PhD Fellowship in AI Existential Safety is for students starting PhD programs in 2022 who plan to work on AI existential safety research, or for existing PhD students who would not otherwise have funding to work on AI existential safety research. It will fund students for 5 years of their PhD, with extension funding possible. At universities in the US, UK, or Canada, annual funding will cover tuition, fees, and the stipend of the student’s PhD program up to $40,000, as well as a fund of $10,000 that can be used for research-related expenses such as travel and computing. At universities not in the US, UK or Canada, the stipend amount will be adjusted to match local conditions. Fellows will also be invited to workshops where they will be able to interact with other researchers in the field. Here you can read more about the program.

University of Cambridge

Anwar, Usman

Also an ‘Open Philanthropy AI Fellow’

Advisor: David Krueger
Research on inverse constrained reinforcement learning, impact regularizers, and distribution generalization

Usman is a PhD student at the University of Cambridge. His research interests span Reinforcement Learning, Deep Learning, and Cooperative AI. Usman’s goal in AI research is to develop useful, versatile, and human-aligned AI systems that can learn from humans and from each other. His research focuses on identifying the factors that make it difficult to develop human-aligned AI systems and on developing techniques to work around those factors. In particular, he is interested in exploring ways in which rich human preferences and desires may be adaptively communicated to AI agents, especially in complex scenarios such as multi-agent planning and time-varying preferences, with the ultimate goal of both broadening the scope of tasks that AI agents can undertake and making AI agents more aligned and trustworthy. For publications and other details, please visit https://uzman-anwar.github.io

Massachusetts Institute of Technology

Casper, Stephen

Advisor: Dylan Hadfield-Menell
Research on interpretable and robust AI

Stephen “Cas” Casper is a PhD student in Computer Science (EECS) at MIT, working in the Algorithmic Alignment Group advised by Dylan Hadfield-Menell. He has previously worked with the Harvard Kreiman Lab and the Center for Human-Compatible AI. His main focus is developing tools for more interpretable and robust AI. His research interests include interpretability, adversaries, robust reinforcement learning, and decision theory. He is particularly interested in (mostly) automated ways of finding and fixing flaws in how deep neural networks handle human-interpretable concepts. He is also an Effective Altruist trying to do the most good he can. You can visit his website here.

ETH Zurich

Chen, Xin Cynthia

Also an ‘Open Philanthropy AI Fellow’

Advisor: Andreas Krause
Research on safe reinforcement learning

Cynthia is an incoming PhD student at ETH Zurich, supervised by Prof. Andreas Krause. She is broadly interested in building AI systems that are aligned with human preferences, especially in situations where mistakes are costly and human signals are sparse. She aspires to develop AI solutions that can improve the world in the long run. Prior to ETH, Cynthia interned at the Center for Human-Compatible AI at UC Berkeley and graduated with honours from the University of Hong Kong. You can find out more about Cynthia’s research on her website.

UC Berkeley

Jenner, Erik

Also an ‘Open Philanthropy AI Fellow’

Advisor: Stuart Russell
Research on developing principled techniques for aligning even very powerful AI systems with human values

Erik is an incoming CS PhD student at UC Berkeley, advised by Stuart Russell. He is interested in developing techniques for aligning AI with human values that could scale to very powerful future AI systems. Previously, he worked on reward learning with the Center for Human-Compatible AI, focusing on the interpretability of reward models and on a better theoretical understanding of the structure of reward functions. Before joining Berkeley, Erik completed an MSc in Artificial Intelligence at the University of Amsterdam and an undergraduate degree in physics at the University of Heidelberg. For more information about his research, see his website https://ejenner.com.

Max Planck Institute

Jin, Zhijing

Also an ‘Open Philanthropy AI Fellow’

Advisor: Bernhard Schölkopf
Research on promoting NLP for social good and improving AI by connecting NLP with causality

Zhijing (she/her) is a PhD student in Computer Science at the Max Planck Institute, Germany, and ETH Zürich, Switzerland. She is co-supervised by Prof Bernhard Schoelkopf, Rada Mihalcea, Mrinmaya Sachan, and Ryan Cotterell. She is broadly interested in making natural language processing (NLP) systems better serve humanity. Specifically, she uses causal inference to improve the robustness and explainability of language models (as part of the “inner alignment” goal) and to make language models align with human values (as part of the “outer alignment” goal). Previously, Zhijing received her bachelor’s degree from the University of Hong Kong, during which she had visiting semesters at MIT and National Taiwan University. She was also a research intern at Amazon AI with Prof Zheng Zhang. For more information, see her website.

UC Berkeley

Jones, Erik

Advisor: Jacob Steinhardt
Research on robustness of forthcoming machine learning systems

Erik is a PhD student in Computer Science at UC Berkeley, advised by Jacob Steinhardt. He aims to make future machine learning systems more predictable, reliable, and aligned with human preferences. His current work studies how structured prediction tasks induce new robustness challenges. Previously, Erik received his B.S. in Mathematics and M.S. in Computer Science from Stanford.

UC Berkeley

Pan, Alexander

Advisor: Jacob Steinhardt
Research on alignment, adversarial robustness, and anomaly detection

Alex is an incoming PhD student in Computer Science at UC Berkeley, advised by Jacob Steinhardt. He is interested in making ML systems more robust and aligned with human values through better empirical and theoretical understanding of potential failure modes. Currently, he is working on using language to construct more resilient value functions and better control machine learning models. Previously, Alex completed a B.S. in mathematics and computer science at Caltech. For more information, visit his website.

UC Berkeley

Treutlein, Johannes

Also an ‘Open Philanthropy AI Fellow’

Advisor: Stuart Russell
Research on objective robustness and learned optimization

Johannes Treutlein is an incoming PhD student in Computer Science at UC Berkeley. He is broadly interested in empirical and theoretical research to ensure that AI systems remain safe and reliable with increasing capabilities. He is currently working on investigating learned optimization in machine learning models and on developing models whose objectives generalize robustly out of distribution. Previously, Johannes studied computer science and mathematics at the University of Toronto, the University of Oxford, and the Technical University of Berlin. For more information, visit his website.

Vitalik Buterin Postdoctoral Fellows

The Vitalik Buterin Postdoctoral Fellowship in AI Existential Safety is designed to support promising researchers for postdoctoral appointments starting in the fall semester of 2022 to work on AI existential safety research. Funding is for three years subject to annual renewals based on satisfactory progress reports. For host institutions in the US, UK, or Canada, the Fellowship includes an annual $80,000 stipend and a fund of up to $10,000 that can be used for research-related expenses such as travel and computing. At universities not in the US, UK or Canada, the fellowship amount will be adjusted to match local conditions. Here you can read more about the program.

UC Berkeley - Center for Human-Compatible AI (CHAI)

Stiennon, Nisan

Advisor: Andrew Critch
Research on agent foundations

Nisan earned his PhD in mathematics at Stanford in 2013, in the field of algebraic topology. He was a software engineer at Google (2014–2018), where he did ML engineering on YouTube’s recommendation system. In 2018 he joined OpenAI’s Alignment team, training language models including GPT-3 to summarize articles and even entire books using human feedback. Starting in 2021, he focused on game theory and theoretical computer science, funded by a grant and a summer research fellowship from the Center on Long-Term Risk. His research seeks to understand the notions of cooperation, competition, and bargaining between agents, both human and artificial. Nisan will carry out his work at the Center for Human-Compatible Artificial Intelligence at UC Berkeley.