Can Defense in Depth Work for AI? (with Adam Gleave)

Adam Gleave is co-founder and CEO of FAR.AI. In this cross-post from The Cognitive Revolution Podcast, he joins to discuss post-AGI scenarios and AI safety challenges. The conversation explores his three-tier framework for AI capabilities, gradual disempowerment concerns, defense-in-depth security, and research on training less deceptive models. Topics include timelines, interpretability limitations, scalable oversight techniques, and FAR.AI’s vertically integrated approach spanning technical research, policy advocacy, and field-building.
CHAPTERS:
(00:00) A Positive Post-AGI Vision
(10:07) Surviving Gradual Disempowerment
(16:34) Defining Powerful AIs
(27:02) Solving Continual Learning
(35:49) The Just-in-Time Safety Problem
(42:14) Can Defense-in-Depth Work?
(49:18) Fixing Alignment Problems
(58:03) Safer Training Formulas
(01:02:24) The Role of Interpretability
(01:09:25) FAR.AI's Vertically Integrated Approach
(01:14:14) Hiring at FAR.AI
(01:16:02) The Future of Governance