
Andrea Wynn
Why do you care about AI Existential Safety?
I believe the development of AI systems is one of the most transformative technological advancements of our time. It could significantly benefit humanity, but it also carries equally significant risks if these systems are not aligned with human intentions. AI systems will inevitably encounter situations where human oversight is limited or infeasible. In those cases especially, we must ensure that AI behaves as we want and need it to, including respecting complex or implicit desires such as human values, social norms, and common sense. AI existential safety is fundamentally about safeguarding humanity’s future. If we can create trustworthy and aligned AI systems, then beyond mitigating AI risks, they will also empower humanity to tackle complex global challenges in ways we previously couldn’t imagine.
Please give at least one example of your research interests related to AI existential safety:
AI existential safety is directly related to the critical challenge of ensuring that AI systems understand and align with human goals in complex, safety-critical scenarios, particularly when an AI system operates semi-autonomously or a human cannot provide direct oversight at every step. At a fundamental level, my research enables an AI agent to robustly infer what a human wants from it, including complex or implicit desires such as human values, social norms, and common sense. This is essential for preventing catastrophic failures stemming from misaligned incentives or goals, particularly in high-stakes applications where humans may need to rely partially or entirely on an AI system’s judgment. My work aims to mitigate the risks posed by AI systems, ensuring they operate safely and in alignment with human intentions even when their capabilities surpass those of humans.