Building Sustainable and Collaborative Data-Ecosystems

Sai Praneeth Karimireddy, USC
Seminar

Abstract: Data is the most important factor determining the quality of an AI system. However, the data commons that current AI relies on is fast collapsing. This issue is only exacerbated when considering more valuable data (e.g., healthcare data), which is firmly locked behind privacy and incentive barriers. We will examine how tools from optimization, statistics, and economics can be combined to reimagine AI infrastructure and build sustainable data ecosystems. This talk will be largely based on these three papers:

  1. SCAFFOLD: Stochastic Controlled Averaging for Federated Learning (arxiv)
  2. Evaluating and Incentivizing Diverse Data Contributions in Collaborative Learning (arxiv)
  3. Data Acquisition via Experimental Design for Decentralized Data Markets (arxiv)

Bio: Sai Praneeth Karimireddy is an Assistant Professor in the Thomas Lord Department of Computer Science at USC. Before this, he was a postdoc at UC Berkeley with Michael Jordan and did his PhD among the picturesque mountains of EPFL with Martin Jaggi. He also co-leads the Data Quality and Federated Learning for Health. Praneeth’s research studies data challenges in machine learning, including privacy, incentives and data markets, and AI for health. Praneeth’s work has seen widespread adoption both by public health organizations (e.g., Doctors Without Borders, the Cancer Registry of Norway) and by industry (e.g., Meta, Google, OpenAI, and Owkin). It has also been recognized by numerous awards, such as the Patrick Denantes Memorial Prize for the best thesis in computer science, the Chorafas Foundation Award for exceptional applied research, and multiple best paper awards.

See all upcoming talks at https://www.anl.gov/mcs/lans-seminars