Dr. Peter Kairouz

Biography

Dr. Peter Kairouz is a Senior Staff Research Scientist at Google, where he leads key research and engineering initiatives. His work advances technologies such as federated learning, privacy auditing, and differential privacy, driving responsible AI development.

Before joining Google, Dr. Kairouz completed a Postdoctoral Fellowship at Stanford University and earned his Ph.D. from the University of Illinois at Urbana-Champaign (UIUC).

Awards

Dr. Kairouz is the recipient of several prestigious awards, including:

  • The 2012 Roberto Padovani Scholarship from Qualcomm’s Research Center
  • The 2015 ACM SIGMETRICS Best Paper Award
  • The 2015 Qualcomm Innovation Fellowship Finalist Award
  • The 2016 Harold L. Olesen Award for Excellence in Undergraduate Teaching from UIUC
  • The 2021 ACM CCS Best Paper Award

Dr. Kairouz has organized numerous workshops and delivered tutorials on private learning and analytics at top-tier conferences, and he continues to serve in key editorial and leadership roles within the machine learning community.

Abstract

Large language models (LLMs) present major opportunities in content generation, question answering, and information retrieval, yet their training, fine-tuning, and deployment introduce significant privacy challenges. This crash course offers a concise overview of privacy-preserving machine learning (ML) in the context of this evolving landscape and the risks associated with LLMs.

The course illuminates four key privacy principles inspired by known LLM vulnerabilities when handling user data: data minimization, data anonymization, transparency/consent, and verifiability.

Focusing on practical applications, you’ll explore federated learning (FL) as a data minimization technique, covering its various flavors, algorithms, and implementations. You’ll then examine differential privacy (DP) as a gold standard for anonymization, learning about its properties, variants, and applications in conjunction with FL, including production deployments with formal privacy guarantees.
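To make the data minimization and anonymization ideas concrete, here is a minimal sketch of one federated averaging round combined with a central Gaussian (DP) mechanism. The function names, clipping norm, and noise multiplier are illustrative assumptions rather than the production algorithms covered in the course, and real deployments add secure aggregation, adaptive clipping, and formal privacy accounting.

```python
# Illustrative sketch: one round of federated averaging (FL) with clipped
# client updates and central Gaussian noise (DP). Assumed, simplified setup.
import numpy as np

def clip_update(update, clip_norm):
    """Clip a client's model delta to bound its L2 sensitivity."""
    norm = np.linalg.norm(update)
    return update * min(1.0, clip_norm / (norm + 1e-12))

def dp_fedavg_round(global_model, client_updates, clip_norm=1.0,
                    noise_multiplier=1.0, rng=None):
    """Average clipped client updates and add Gaussian noise.

    The noise_multiplier, client sampling rate, and number of rounds together
    determine the (epsilon, delta) guarantee via a DP accountant (not shown).
    """
    rng = rng or np.random.default_rng()
    clipped = [clip_update(u, clip_norm) for u in client_updates]
    mean_update = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm / len(clipped),
                       size=mean_update.shape)
    return global_model + mean_update + noise

# Toy usage: 10 clients, each contributing a random model delta.
model = np.zeros(5)
updates = [np.random.randn(5) for _ in range(10)]
model = dp_fedavg_round(model, updates, clip_norm=1.0, noise_multiplier=1.0)
```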

In scenarios where achieving strong user-level DP proves difficult, you’ll discover a robust, task- and model-agnostic membership inference attack to quantify risk by accurately estimating the actual leakage (empirical epsilon) in a single training run. You’ll see how these state-of-the-art techniques systematically mitigate many privacy risks, albeit sometimes with trade-offs in computation or performance.
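As a rough illustration of how attack performance translates into an empirical privacy estimate, the sketch below uses the standard hypothesis-testing view of (epsilon, delta)-DP to convert a membership inference attack's false positive and false negative rates into a lower bound on epsilon. The function and example numbers are hypothetical, and rigorous single-run audits additionally use confidence intervals over many membership guesses rather than this point estimate.

```python
# Illustrative sketch: an (epsilon, delta)-DP mechanism forces any membership
# inference attack to satisfy fpr + exp(eps) * fnr >= 1 - delta (and the
# symmetric inequality). Rearranging yields a lower bound on epsilon implied
# by an observed attack. Point estimate only; not a rigorous audit.
import math

def empirical_epsilon(fpr, fnr, delta=1e-5):
    """Lower-bound epsilon implied by an attack with the given error rates."""
    bounds = []
    if fnr > 0:
        bounds.append(math.log(max(1 - delta - fpr, 1e-12) / fnr))
    if fpr > 0:
        bounds.append(math.log(max(1 - delta - fnr, 1e-12) / fpr))
    return max(0.0, *bounds) if bounds else float("inf")

# Example: an attack that catches 60% of members at a 5% false positive rate.
print(empirical_epsilon(fpr=0.05, fnr=0.40))  # ~ log(0.6 / 0.05) ≈ 2.48
```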

The course also examines verifiability through open-sourcing privacy technologies and using trusted execution environments. Finally, you’ll be introduced to the open research questions, challenges, and compelling research directions that are shaping the future of privacy-preserving ML for foundation models.