Research Interests

STAI does research across four broad domains: fairness, explainability, robustness, and security, guided by ethics.

Areas of Trustworthy AI

Our group aims to conduct cutting-edge work at the nexus of these areas. We place a strong emphasis on research directions that are theoretically novel, practically useful, and socially beneficial.

Current Research

Differential Privacy

Differential privacy (DP) is a formal notion of privacy that places bounds on the information leakage of training data. While differential privacy has been successfully applied to simple models and datasets, large models and complex visual recognition tasks remain out of reach. Our group works on technical innovations that enable differentially private machine learning on large-scale tasks.
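
As a rough illustration of how training can be made differentially private, the sketch below shows the gradient step commonly used in DP-SGD-style training: each example's gradient is clipped to a fixed L2 norm, and Gaussian noise scaled to that norm is added before averaging. This is a minimal sketch under stated assumptions, not a description of the group's own methods; the function name `dp_mean_gradient` and its parameters `clip_norm` and `noise_multiplier` are hypothetical names chosen for this example.

```python
import numpy as np


def dp_mean_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip each example's gradient, add Gaussian noise, and average.

    Hypothetical helper for illustration only.
    per_example_grads: array of shape (batch_size, num_params).
    """
    grads = np.asarray(per_example_grads, dtype=float)
    # L2 norm of each example's gradient, kept as a column for broadcasting.
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    # Shrink any gradient whose norm exceeds clip_norm; leave the rest as-is.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = grads * scale
    # Noise standard deviation is proportional to the clipping bound, which
    # caps how much any single example can change the summed gradient.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grads.shape[1])
    return (clipped.sum(axis=0) + noise) / grads.shape[0]


rng = np.random.default_rng(0)
batch = np.array([[3.0, 4.0], [0.3, 0.4]])
noisy_grad = dp_mean_gradient(batch, clip_norm=1.0, noise_multiplier=1.1, rng=rng)
```

Clipping bounds the influence of any one training example, and the noise masks what influence remains. The privacy guarantee that results depends on the noise multiplier, the sampling rate, and the number of steps, which a separate privacy accountant would track.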

Invariant Risk Minimization

Machine learning models often fail to generalize: their performance suffers greatly when they are applied to out-of-domain tasks. Our group applies recent advances in invariant risk minimization to train robust, generalizable models on large-scale image recognition tasks.
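
To make the idea concrete, the sketch below computes an IRMv1-style objective for a squared-error loss: the sum of per-environment risks plus a penalty equal to the squared gradient of each environment's risk with respect to a dummy scalar classifier fixed at 1. It is a minimal sketch assuming scalar predictions and squared error, and is not necessarily the formulation the group uses; `irmv1_penalty`, `irm_objective`, and `penalty_weight` are hypothetical names chosen for this example.

```python
import numpy as np


def irmv1_penalty(predictions, targets):
    """Squared gradient of the risk w.r.t. a dummy scale w, evaluated at w = 1.

    The risk is R(w) = mean((w * predictions - targets) ** 2), so
    dR/dw at w = 1 is mean(2 * (predictions - targets) * predictions).
    """
    grad_w = np.mean(2.0 * (predictions - targets) * predictions)
    return grad_w ** 2


def irm_objective(environments, penalty_weight):
    """Sum of per-environment risks plus the weighted invariance penalties.

    environments: list of (predictions, targets) pairs, one per environment.
    """
    total = 0.0
    for predictions, targets in environments:
        risk = np.mean((predictions - targets) ** 2)
        total += risk + penalty_weight * irmv1_penalty(predictions, targets)
    return total


# Two toy environments with hypothetical predictions and targets.
env_a = (np.array([1.0, 2.0]), np.array([1.0, 2.0]))
env_b = (np.array([1.0, 2.0]), np.array([0.0, 0.0]))
objective = irm_objective([env_a, env_b], penalty_weight=2.0)
```

The penalty is zero in an environment where rescaling the predictions cannot lower the risk, so minimizing the objective favors predictors that are simultaneously optimal across all training environments.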

Model Inspection

Trained deep learning models are often opaque and uninterpretable. Our group takes prominent trained architectures and builds interpretability tools to understand the failure modes and biases encoded in their weights. We are interested in interpreting the representations that models learn and in aligning models with human behavior.