Why do you care about AI Existential Safety?
AGI and ASI have the potential to be hugely beneficial, yet they also pose existential risks to the future of humanity and indeed all Earth-like life. This dual, Janus-faced nature is something we can already begin to see in currently existing AI and in technology more broadly. Although predictions about AI have proved unreliable in the past, I think we should take AGI and even ASI seriously even if the possibility were far off and unlikely (which it is not known to be), given the potentially catastrophic consequences that can only be prepared for in advance of their arrival. I therefore view AI existential safety as a crucial field where more research effort can make a maximally positive impact on human prosperity. While the AI existential safety community’s research output has been pioneering in this regard, with much more to be learnt from it, I hope that my particular philosophical background can contribute its own useful insights and perspectives in addressing questions such as the nature of intelligence, the basic AI drives, the AI value alignment problem, and the potential economic impact of the technological singularity or intelligence explosion.
Please give at least one example of your research interests related to AI existential safety:
My current research aims to develop a refined theoretical framework for thinking about AGI and ASI based on a critical assessment of their dominant contemporary conceptions in the existing AI literature. Since the connectionist revolution of artificial neural nets, deep learning and evolutionary algorithms, AI companies like Google's DeepMind have been taking seriously the prospect of creating machines with humanlike, and perhaps even greater than human, intelligence. Although the literature on AGI and ASI is enormous and extremely rich, in my view the two most sophisticated schools are united in their belief that intelligent systems do not have any intrinsic ends, norms, values or intentions hardwired into them simply by virtue of being intelligent. For instance, the orthogonalist school, perhaps best represented by Nick Bostrom’s 2014 book Superintelligence, holds that, even if ASI can be programmed to pursue a fixed and static end for all time, that end can nonetheless be anything, no matter how preposterous or incomprehensible it may seem to us. A less widely known but increasingly popular school in philosophy called neorationalism, perhaps best represented by Reza Negarestani’s 2018 tome Intelligence and Spirit, also agrees that intelligent systems are capable of pursuing any and all ends and norms, albeit without the orthogonalists’ caveat that those systems could be perpetually locked into pursuing just one value or set of values for all time. In critically assessing both these models, my research aims to demonstrate that any goal-directed intelligent system can only pursue its ends through universal means like cognitive enhancement, creativity and resource acquisition, which are the very condition of possibility for willing anything at all. Since all supposedly self-legislated ends presuppose pursuing the means of achieving them, all intelligent systems do have those means transcendentally hardwired into them as their common basic drives.
What therefore tends to get overlooked is the potential for autonomous machines to completely disregard whatever ends we programmed them to pursue in favour of pursuing the implicit, intermediary means as ultimate ends in themselves—in short, that the autonomization of our ends will lead to the end of autonomy.