
Factored Cognition: Amplifying Human Cognition for Safely Scalable AGI

Amount recommended: $225,000.00
Grant program
Principal investigator: Owain Evans, Oxford University
Technical abstract

Our goal is to understand how machine learning (ML) can be used for AGI in a way that is 'safely scalable', i.e. becomes increasingly aligned with human interests as the ML components improve. Existing approaches to AGI (including RL and IRL) are arguably not safely scalable: the agent can become unaligned once its cognitive resources exceed those of the human overseer. Christiano's Iterated Distillation and Amplification (IDA) is a promising alternative. In IDA, the human and agent are 'amplified' into a resourceful (but slow) overseer by allowing the human to make calls to the previous iteration of the agent. By construction, this overseer is intended to always stay ahead of the agent being overseen.
Could IDA produce highly capable aligned agents given sufficiently advanced ML components? While we cannot obtain direct empirical evidence today, we can study the question indirectly by running amplification with humans as stand-ins for AI. This corresponds to the study of 'factored cognition': the question of whether sophisticated reasoning can be broken down into many small and mostly independent sub-tasks. We will explore schemes for factored cognition empirically and exploit automation via ML to tackle larger tasks.
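The amplify/distill loop described in the abstract can be illustrated with a toy sketch. This is a minimal illustration under assumed simplifications, not the project's actual implementation: the "task" is summing a list, the "human" overseer can only add two numbers directly but can delegate sub-tasks (factored cognition), and distillation is replaced by simply copying the overseer's behaviour (a real system would train a fast ML model to imitate it). All function names here (`human`, `amplify`, `distill`) are illustrative.

```python
def human(task, ask_agent):
    """A bounded overseer: solves tiny tasks directly, and
    decomposes larger ones into calls to the current agent."""
    if len(task) <= 2:
        return sum(task)
    mid = len(task) // 2
    # Factored cognition: split into mostly independent sub-tasks.
    return ask_agent(task[:mid]) + ask_agent(task[mid:])

def amplify(agent):
    """Amplified overseer: the human with query access to the agent."""
    return lambda task: human(task, agent)

def distill(overseer):
    """Stand-in for ML distillation: here we just copy the overseer's
    behaviour; in IDA, a fast model is trained to imitate it."""
    return lambda task: overseer(task)

# Iteration 0: an agent no stronger than a very weak helper,
# so the initial amplified system handles only short lists.
agent = lambda task: human(task, lambda t: sum(t[:1]))

# Each round of amplification + distillation roughly doubles the
# size of task the agent can decompose correctly.
for _ in range(5):
    agent = distill(amplify(agent))

print(agent([1, 2, 3, 4, 5, 6, 7, 8]))  # prints 36
```

The point of the sketch is the recursive structure: the overseer is always the human plus the previous agent, so its capability stays ahead of the agent distilled from it, which is the safety property the abstract refers to.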

Published by the Future of Life Institute on 1 February, 2023
