MIRI’s New Technical Research Agenda

Published:

March 18, 2015

Author:

Nate Soares

Luke Muehlhauser is the Executive Director of the Machine Intelligence Research Institute (MIRI), a research institute devoted to studying the technical challenges of ensuring desirable behavior from highly advanced AI agents, including those capable of recursive self-improvement. In this guest blog post, he delves into MIRI’s new research agenda.

MIRI’s current research agenda — summarized in “Aligning Superintelligence with Human Interests” — is focused on technical research problems that must be solved in order to eventually build smarter-than-human AI systems that are reliably aligned with human interests: How can we create an AI agent that will reliably pursue the goals it is given? How can we formally specify beneficial goals? How can we ensure that this agent will assist and cooperate with its programmers as they improve its design, given that mistakes in the initial version are inevitable?

Within this broad area of research, MIRI specializes in problems which have three properties:

(a) We focus on research questions that cannot be delegated to future human-level AI systems (HLAIs). HLAIs will have the incentives and capability to improve, e.g., their own machine vision algorithms, but if an HLAI’s preferences themselves are mis-specified, it may never have an incentive to “fix” the mis-specification itself.

(b) We focus on research questions that are tractable today. In the absence of concrete HLAI designs to test and verify, research on these problems must be theoretical and exploratory, but such research should be technical whenever possible so that clear progress can be shown, e.g. by discovering unbounded formal solutions for problems we currently don’t know how to solve even given unlimited computational resources. Such exploratory work is somewhat akin to the toy models Butler Lampson used to study covert channel communication two decades before covert channels were observed in the wild, and is also somewhat akin to quantum algorithms research long before any large-scale quantum computer is built.

(c) We focus on research questions that are uncrowded. Research on e.g. formal verification for near-future AI systems already receives significant funding, whereas MIRI’s chosen research problems otherwise receive limited attention.

Example research problems we will study include:

(1) Corrigibility. How can we build an advanced agent that cooperates with what its creators regard as a corrective intervention in its design, despite default incentives for rational agents to resist attempts to shut them down or modify their preferences?

(2) Value learning. Direct specification of broad human preferences in an advanced agent’s reward/value function is impractical. How can we build an advanced AI that will safely learn to act as intended?
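
To make the "default incentives" mentioned under corrigibility concrete, here is a minimal toy sketch. It is not MIRI's formalism: the function names, the utilities, and the probabilities are all hypothetical, chosen only to illustrate why a plain expected-utility maximizer might prefer to resist shutdown.

```python
# Toy model (all names and numbers are hypothetical illustrations):
# an agent maximizing expected utility under its current goal compares
# allowing a possible shutdown with resisting it.

def expected_utility(p_shutdown: float, u_goal: float, u_off: float) -> float:
    """Expected utility when shutdown happens with probability p_shutdown."""
    return p_shutdown * u_off + (1.0 - p_shutdown) * u_goal


def preferred_action(p_press: float, u_goal: float, u_off: float,
                     resist_cost: float) -> str:
    """Return "resist" or "allow", whichever has higher expected utility."""
    # Allowing: the programmers press the button with probability p_press.
    allow = expected_utility(p_press, u_goal, u_off)
    # Resisting: shutdown never happens, but resisting costs some utility.
    resist = expected_utility(0.0, u_goal, u_off) - resist_cost
    return "resist" if resist > allow else "allow"


if __name__ == "__main__":
    # Being switched off is worth less than pursuing the goal, so the agent resists.
    print(preferred_action(p_press=0.5, u_goal=10.0, u_off=0.0, resist_cost=1.0))
```

In this sketch the agent resists whenever the utility it expects to lose from being shut down exceeds the cost of resisting. That is the kind of default incentive a corrigible design would have to avoid or counteract.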

This content was first published at futureoflife.org on March 18, 2015.

