Tristan Trim

Position: Undergraduate
Organisation: University of Victoria
Biography

Why do you care about AI Existential Safety?

When I was a child, I was very angry a great deal of the time. It sometimes felt as if there were a deep sense of wrongness in the world that I needed to correct in order to feel at peace. As I grew older, I realized that most of the things that upset me were trivial compared to the depth of my feelings. I have, of course, developed strategies to manage my emotions and keep myself happier, but I still feel connected to those deep, upsetting emotions; only now I have found things that seem to have enough weight to justify the way I feel. The extinction of life on Earth is such a thing. Not only is it severe enough to balance how I feel, but it seems to me I am vastly incapable of feeling the true depth of emotion that is warranted.

Caring about existential safety feels like finally connecting to something I have sought since I was born. My particular focus on AI x-risk comes from my affection for math and computer science, and from my feeling that AI superintelligence represents a trap in our reality: once we have put reality into a configuration where it contains an ASI, it is very unlikely we will be able to alter its trajectory from then on.

Please give at least one example of your research interests related to AI existential safety:

I completed my honours project, “Mechanistic Interpretability of Reinforcement Learning Agents”, which describes a novel method for exploring latent spaces, extending the work of Mingwei Li. I think this direction could be very synergistic with the current mechanistic interpretability focus on sparse autoencoders (SAEs): one of the big problems in working with linear projections is finding valuable angles to look at the space from, and I think that is effectively what SAEs are doing when they find “semantically interpretable vectors”.
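
As a rough illustration of that synergy (a sketch of the general idea, not code from my project): if an SAE’s decoder rows are taken as feature directions, any two of them span a plane to project activations onto, so the SAE supplies the “angle to look from”. The names `activations`, `decoder`, and `project_onto_features` below are hypothetical placeholders.

```python
# Minimal sketch: use two SAE decoder directions as the axes of a
# 2D linear projection of a latent space. Random arrays stand in for
# real model activations and SAE decoder weights (both hypothetical).
import numpy as np

rng = np.random.default_rng(0)
activations = rng.normal(size=(1000, 64))  # (n_samples, d_model) stand-in
decoder = rng.normal(size=(512, 64))       # (n_features, d_model) stand-in

def project_onto_features(acts, dec, i, j):
    """Project activations onto the plane spanned by SAE feature
    directions i and j, yielding 2D coordinates for plotting."""
    u = dec[i] / np.linalg.norm(dec[i])
    # Gram-Schmidt step: make the second axis orthogonal to the first
    v = dec[j] - (dec[j] @ u) * u
    v = v / np.linalg.norm(v)
    return acts @ np.stack([u, v], axis=1)  # shape: (n_samples, 2)

coords = project_onto_features(activations, decoder, i=3, j=17)
print(coords.shape)  # (1000, 2) -- ready to scatter-plot
```

The point of the sketch is only that the projection axes are chosen by the SAE rather than by hand, which is exactly the “finding valuable angles” problem.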

I list more of my research interests here; they are copied below for reference:

  • I wish to deepen my understanding of mathematics and explore Vanessa Kosoy’s “Learning-Theoretic Agenda”, in which she builds on the mathematical agent model, “AIXI”, introduced by Marcus Hutter.
  • I would like to familiarize myself with “Provably safe systems: the only path to controllable AGI” by Max Tegmark and Steve Omohundro.
  • And I am interested in “Cyborgism” by NicholasKees and janus.
