Why do you care about AI Existential Safety?
My research sits at the intersection of human and machine learning. I use mathematical, computational, and behavioral methods to better understand cooperation among humans, and how such cooperation can inform the development of AI systems that integrate seamlessly into society. A central issue in this work is understanding the long-run outcomes of human-AI dynamics, which bears directly on AI existential safety. Among the papers we have authored: mathematical foundations for human-AI cooperative communication, building on cognitive science models and providing proofs of robustness to violations of common ground; an extension of this work to proofs of consistency for sequential interactions between humans and machines; and a further extension that identifies and sheds light on violations of alignment that can arise in curriculum learning and other reinforcement learning settings where the AI system can plan further into the future than the human.
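The cooperative communication framework referenced above casts teaching and learning as mutually consistent inference, which the NeurIPS 2020 paper formalizes as entropy-regularized optimal transport solvable by Sinkhorn scaling of a shared data-by-hypothesis consistency matrix. A minimal illustrative sketch of that alternating-normalization idea, assuming a small discrete matrix M (the function name, matrix, and iteration count here are hypothetical choices, not taken from the papers):

```python
import numpy as np

def cooperative_plan(M, n_iters=100):
    """Illustrative Sinkhorn-style scaling of a consistency matrix.

    M[d, h] > 0 indicates that datum d is consistent with hypothesis h.
    Alternating column normalization (a teacher choosing data given a
    hypothesis) and row normalization (a learner inferring hypotheses
    given a datum) converges to mutually consistent teaching/learning
    distributions.
    """
    P = np.asarray(M, dtype=float).copy()
    for _ in range(n_iters):
        # Teacher step: each column becomes P(d | h), summing to 1 over data.
        P = P / P.sum(axis=0, keepdims=True)
        # Learner step: each row becomes P(h | d), summing to 1 over hypotheses.
        P = P / P.sum(axis=1, keepdims=True)
    return P

# Two data points, two hypotheses; d=0 rules out h=1's competitor.
M = np.array([[1.0, 1.0],
              [0.0, 1.0]])
P = cooperative_plan(M)
```

Because the last step row-normalizes, the returned matrix can be read as the learner's posterior over hypotheses for each datum; the fixed point reflects each agent reasoning about the other's reasoning.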
Please give at least one example of your research interests related to AI existential safety:
Wang, P., Wang, J., Paranamana, P., & Shafto, P. (2020). A mathematical theory of cooperative communication. Advances in Neural Information Processing Systems (NeurIPS).
Wang, J., Wang, P., & Shafto, P. (2020). Sequential cooperative Bayesian inference. International Conference on Machine Learning (ICML).
Sheller, B., Wang, J., & Shafto, P. (preprint). The alignment problem in curriculum learning.