
Andy Zou

Biography

Why do you care about AI Existential Safety?

Artificial Intelligence (AI) now excels in a wide array of fields: biology, visual art, writing, math, coding, and game playing. Despite the positive impacts these systems bring, they still lack important properties, which raises urgent concerns given their heightened risks and responsibilities. Consequently, substantial effort has gone into improving the safety of AI systems; for current systems, one might want to make autonomous vehicles more reliable or language generations more factually correct. However, these short-term risks are not the only ones that require attention. There is a growing consensus in the machine learning (ML) research community that AI systems may soon reach and surpass human-level intelligence across a broad range of tasks. Such systems may come to resemble autonomous agents sharing our space and resources, deployed in both the physical and digital worlds to interact freely with humans. It is paramount that as general capabilities improve quickly, safety performance follows the same trend.

Please give at least one example of your research interests related to AI existential safety:

Currently, I am creating testbeds to track agents' deceptive and power-seeking behaviors in complex environments. This could serve as a first step towards developing agents that can reason about tradeoffs and resist temptations, and towards developing methods that better guide or regulate agents. Separately, I also work on robustness, proxy gaming, and monitoring.
