He He
Why do you care about AI Existential Safety?
As AI systems become more competent and are deployed in critical social sectors, it is concerning that their profound impact on society is often studied only post hoc (e.g., influence on elections, social polarization). While the future trajectory is hard to predict (several more paradigm shifts may need to happen before we reach general or strong AI), I think "improving human wellbeing" should be the central objective from the very beginning when we design AI systems.
Please give one or more examples of research interests relevant to AI existential safety:
My current research interest lies in trustworthy AI, with a focus on natural language technologies. To make reliable and safe decisions, a learning system must avoid catastrophic failures when facing unfamiliar (out-of-distribution) scenarios. We aim to understand which types of distribution shift incur risk and how to mitigate them. We are also excited by the prospect of AI systems collaborating with and learning from their human partners through natural language interaction. To this end, our work has focused on factual text generation models (ones that do not lie) and collaborative dialogue agents.