Jared Moore
Why do you care about AI Existential Safety?
The AI systems of today have already transformed the world in myriad ways, good and bad. Our uncertainty about the capabilities of future systems necessitates a cautious approach. Work on AI safety now will reduce path-dependent risks, both from possible rogue agents and from the everyday havoc that can come from systems we don’t understand.
Please give at least one example of your research interests related to AI existential safety:
We want AIs to do what we want, to abide by our values. Thus I propose homing in on value alignment: getting a clearer picture of what we mean by values and what we mean by alignment. Closing these conceptual gaps and formalizing fuzzy notions of value will help us develop technical and conceptual countermeasures against AI risks.
One way to do so is to identify which values a downstream system should learn. I have pursued this in part by formalizing the values people hold and the strategies they adopt. Still, we must further operationalize those values in AI systems in order to align those systems with them. In ongoing work, I measure the degree of alignment between LLMs and humans on questions of moral disagreement.