OpenAI Unconference on Machine Learning

Published: October 17, 2016

Author: Viktoriya Krakovna

Last weekend, I attended OpenAI’s self-organizing conference on machine learning (SOCML 2016), meta-organized by Ian Goodfellow (thanks Ian!). It was held at OpenAI’s new office, with several floors of large open spaces. The unconference format was intended to encourage people to present current ideas alongside completed work. The schedule mostly consisted of 2-hour blocks with broad topics like “reinforcement learning” and “generative models”, guided by volunteer moderators. I especially enjoyed the sessions on neuroscience and AI and on transfer learning, which had smaller and more manageable groups than the crowded popular sessions, and diligent moderators who wrote down the important points on the whiteboard. Overall, I had more interesting conversations but also more auditory overload at SOCML than at other conferences.

To my excitement, there was a block for AI safety along with the other topics. The safety session became a broad introductory Q&A, moderated by Nate Soares, Jelena Luketina and me. Some topics that came up: value alignment, interpretability, adversarial examples, and weaponization of AI.

AI safety discussion group (image courtesy of Been Kim)

One value alignment question was how to incorporate a diverse set of values that represents all of humanity in the AI’s objective function. We pointed out that there are two complementary problems: 1) getting the AI’s values to be in the small part of values-space that’s human-compatible, and 2) averaging over that space in a representative way. People generally focus on the ways in which human values differ from each other, which leads them to underestimate the difficulty of the first problem and overestimate the difficulty of the second. We also agreed on the importance of allowing for moral progress by not locking in the values of AI systems.

Nate mentioned some alternatives to goal-optimizing agents – quantilizers and approval-directed agents. We also discussed the limitations of using blacklisting/whitelisting in the AI’s objective function: blacklisting is vulnerable to unforeseen shortcuts and usually doesn’t work from a security perspective, and whitelisting hampers the system’s ability to come up with creative solutions (e.g. the controversial move 37 by AlphaGo in the second game against Lee Sedol).

Been Kim brought up the recent EU regulation on the right to explanation for algorithmic decisions. This seems easy to game due to the lack of good metrics for explanations. One proposed metric was that a human would be able to predict future model outputs from the explanation. This might fail for better-than-human systems by penalizing creative solutions if applied globally, but seems promising as a local heuristic.

Ian Goodfellow mentioned the difficulties posed by adversarial examples: an imperceptible adversarial perturbation to an image can make a convolutional network misclassify it with very high confidence. There might be some kind of No Free Lunch theorem where making a system more resistant to adversarial examples would trade off with performance on non-adversarial data.
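To make the adversarial example idea concrete, here is a minimal sketch using one well-known construction, the fast gradient sign method. This is my illustration, not something worked through in the session. It uses a toy logistic regression model rather than a convolutional network, and the weights, input, and `eps` value are made up for the example. The point it shows is that a tiny change to each of many input coordinates can add up to a large change in the model's output.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a toy logistic regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """Move each coordinate of x by eps in the direction that increases the loss.

    For logistic regression with cross-entropy loss, the gradient of the
    loss with respect to the input is (p - y) * w.
    """
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model and input: 10,000 "pixels" with values near 0.5.
n = 10_000
signs = [1 if i % 2 == 0 else -1 for i in range(n)]
w = [0.05 * s for s in signs]
b = 0.0
x = [0.5 + 0.002 * s for s in signs]  # correctly classified as class 1
y = 1                                 # true label
eps = 0.01                            # per-coordinate perturbation size

x_adv = fgsm(w, b, x, y, eps)
p_clean = predict(w, b, x)
p_adv = predict(w, b, x_adv)
max_change = max(abs(a - c) for a, c in zip(x_adv, x))

print(f"clean: P(class 1) = {p_clean:.3f}")
print(f"adversarial: P(class 1) = {p_adv:.3f}, max change per coordinate = {max_change:.3f}")
```

No coordinate moves by more than `eps`, yet the prediction flips from class 1 to a confident class 0, because the small changes all push the model's score in the same direction.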

We also talked about dual-use AI technologies, e.g. advances in deep reinforcement learning for robotics that could end up being used for military purposes. It was unclear whether corporations or governments can be better trusted to use these technologies ethically: corporations have a profit motive, while governments are more likely to weaponize the technology.

More detailed notes by Janos coming soon! For a detailed overview of technical AI safety research areas, I highly recommend reading Concrete Problems in AI Safety.

This content was first published at futureoflife.org on October 17, 2016.

