Alex Turner
Google DeepMind
Why do you care about AI Existential Safety?
AI Existential Safety seems like a fork in the road for humanity’s future. AI is a powerful technology, and I think it will go very wrong by default. I think that we are on a “hinge of history”: in retrospect, this century may be considered the most important in human history. We still have time on the clock to make AGI go right. Let’s use it to the fullest.
Please give one or more examples of research interests relevant to AI existential safety:
I am researching “steering vectors” that manipulate and control language models at runtime, the mechanistic interpretability of maze-solving agents, and the theory of value formation in AI.
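To give a flavor of the runtime-steering idea, here is a minimal sketch in Python. It assumes a GPT-2-style causal language model loaded via Hugging Face transformers; the contrast prompts, layer index, and scale below are illustrative placeholders, not tuned values or the exact method from my work.

```python
# Minimal sketch of runtime activation steering (assumptions noted inline).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any decoder-only causal LM works similarly
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

LAYER = 6    # which transformer block's residual stream to steer (assumed)
SCALE = 4.0  # steering strength (assumed)

def residual_at_layer(prompt: str) -> torch.Tensor:
    """Return the residual-stream activations at LAYER for a prompt."""
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        hidden = model(ids, output_hidden_states=True).hidden_states
    return hidden[LAYER][0]  # (seq_len, d_model)

# Steering vector: difference of activations on two contrast prompts,
# taken at the final token position of each.
vec = residual_at_layer("Love")[-1] - residual_at_layer("Hate")[-1]

def steer_hook(module, inputs, output):
    # GPT-2 blocks return a tuple; output[0] is the residual stream.
    hidden = output[0] + SCALE * vec.to(output[0].dtype)
    return (hidden,) + output[1:]

# Add the vector to the residual stream at LAYER on every forward pass.
handle = model.transformer.h[LAYER].register_forward_hook(steer_hook)
try:
    ids = tok("I think dogs are", return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=20, do_sample=False)
    print(tok.decode(out[0], skip_special_tokens=True))
finally:
    handle.remove()  # detach the hook so later calls are unsteered
```

The point of the sketch is that no weights change: behavior is edited at inference time by adding a single vector into the model's activations.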