Anca Dragan
Why do you care about AI Existential Safety?
I am highly skeptical that we can extrapolate current progress in AI to “general AI” anytime soon. However, I believe that current AI paradigms fail to enable us to design AI agents in ways that avoid negative side effects, which can be catastrophic for society even when the tools are capable yet narrow. I believe we need to think about the design of AI systems differently, and empower designers to anticipate and avoid undesired outcomes.
Please give one or more examples of research interests relevant to AI existential safety:
My research interests in AI safety include value alignment or preference learning, including accounting for human biases and suboptimality; assistive agents that empower people without having to infer their intentions; and robustness of learned rewards and of predictive human policies.