Why do you care about AI Existential Safety?
Modern machine learning models are routinely trained on broad data at immense scale. Models learned via self-supervision on pretext tasks can perform well across a broad range of downstream tasks. Yet this paradigm also poses serious trustworthiness challenges, spanning robustness, privacy, fairness, calibration, and interpretability. My work studies these concerns and proposes effective solutions to ensure the safe deployment of models in consequential decision making.
Please give at least one example of your research interests related to AI existential safety:
Trustworthy ML in the wild through scaling: How can we identify problematic behaviors of ML models in consequential decision-making and develop algorithmic tools to mitigate them?
Understanding and improving learning through reasoning: How can we leverage language to imbue models with useful inductive biases for reasoning, and thereby make further progress on the trustworthiness issues above?