
Francesco Ortu
Why do you care about AI Existential Safety?
I care deeply about AI existential safety because I believe that failing to understand large language models and to assess their political impact on society could lead to significant and potentially irreversible consequences. Without a thorough grasp of how these powerful systems function and influence human behavior, there is a substantial risk of unintentionally amplifying societal biases, polarization, and misinformation. My commitment to mechanistic interpretability and to evaluating political biases in AI stems from a sense of urgency to prevent such outcomes and to ensure that AI systems are transparent, trustworthy, and aligned with humanity's broader interests.
Please give at least one example of your research interests related to AI existential safety:
One example of my research interests is mechanistically understanding how different internal mechanisms interact during information processing in LLMs/VLMs, and how those interactions can lead to undesirable and unexpected predictions. These interactions are often opaque and poorly understood, and can result in the amplification of harmful biases, misinformation, or manipulative behaviors. By dissecting how specific components of a model contribute to such outcomes, I aim to build a clearer picture of the risks these systems pose. This kind of foundational understanding is essential for developing effective safeguards and for ensuring that increasingly powerful AI systems remain aligned with human values and safety.
