Day 2 Program (Oct 9, Sun)

9:00am Breakfast!
9:30am Three daunting challenges of federated learning: privacy leakage, label deficiency, and resource constraints
Salman Avestimehr (USC and FedML)
Federated learning (FL) has emerged as a promising approach to enable decentralized machine learning directly at the edge, in order to enhance users' privacy, comply with regulations, and reduce development costs. In this talk, I will provide an overview of FL and highlight three fundamental challenges for landing FL into practice: (1) privacy and security guarantees for FL; (2) label scarcity at the edge; and (3) FL over resource-constrained edge nodes. I will also provide a brief overview of FedML (https://fedml.ai), which is a platform that enables zero-code, lightweight, cross-platform, and provably secure federated learning and analytics.
10:00am A pan-disciplinary view of distributed & private computation: Statistics, Geometry, ML & Social Choice
Praneeth Vepakomma (MIT)
Data in today's world is increasingly siloed across a wide variety of entities, each facing different resource constraints when collaboratively processing that data to draw actionable insights and wisdom. The quality of such wisdom is substantially better if the data is centralized in one place before processing, but this is prohibited by privacy regulations, computational constraints, communication constraints, trade secrets, trust issues and competition. This necessitates the development of distributed algorithms that are resource-efficient for the entities involved, preserve the privacy of the data, and still ensure that the quality of wisdom obtained is on par with the centralized case. This talk covers some novel methods toward this goal in a pan-disciplinary manner, tackling upstream problems with viewpoints from statistics, geometry, machine learning and social choice. Upstream problems are root problems whose solutions feed into the solutions of several downstream problems, leading to multi-pronged downstream impact.
10:30am Federated learning, Model Parallelism, and Variational Inference
Eric Xing (Mohamed Bin Zayed University of Artificial Intelligence)
11:00am Coffee & Tea Break
11:30am The Fed-BioMed Project: Deploying Federated Learning to Real-World Healthcare Applications
Marco Lorenzi (Inria Sophia Antipolis)
Fed-BioMed is a research and development initiative aiming at translating federated learning (FL) to healthcare applications. Deploying federated learning requires tackling important challenges to meet the strict requirements of real-world conditions. While typical methodological issues concern the definition of optimization frameworks to handle clients' heterogeneity and guarantee unbiasedness, the translation of federated learning research into practice poses novel technical and societal questions. Typical problems to be addressed concern FL security, scalability and interoperability, which motivate novel research directions and promote close interaction between researchers, technicians and healthcare practitioners.
During the talk I will provide an illustration of the interplay between methodological development and translational effort that characterises the development of the Fed-BioMed FL platform, and discuss our current effort in delivering FL in hospital networks.
12:00pm Modern Tools for Collaborative Medical Image Analysis
Holger Roth (NVIDIA)
The COVID-19 pandemic has emphasized the need for large-scale collaborations by the clinical and scientific communities to tackle global healthcare challenges. However, regulatory constraints around data sharing and patient privacy might hinder access to genuinely representative patient populations on a global scale. Federated learning (FL) is a technology allowing us to work around such constraints while keeping patient privacy in mind. This talk will show how FL was used to predict clinical outcomes in patients with COVID-19 while allowing collaborators to retain governance over their data (Nature Medicine 2021). Furthermore, I will introduce several recent advances in FL, including quantifying potential data leakage, automated machine learning (AutoML) and neural architecture search (NAS), and personalization that can allow us to build more accurate and robust AI models.
12:30pm Lunch Break
1:30pm Implementable and Equitable Collaborative Learning in Healthcare: Issues and Solutions for Resource-Constrained Environments
Mary-Anne Hartley (EPFL)
Predictive medicine has particular potential in remote and resource-limited settings, where models can be trained to make predictions that replace expensive tests or unavailable expertise. In short, such algorithms can save lives (or end them depending on model performance). So how can we use collaborative learning to equitably distribute model performance?
This talk covers some examples, issues, and potential solutions for implementing collaborative learning in high-stakes, low-resource healthcare settings.
2:00pm Deep Federated Learning in Healthcare
Shadi Albarqouni (University Hospital Bonn & Helmholtz Munich)
TBA.
2:30pm Secure and Federated Algorithms for Collaborative Genomic Studies
Hoon Cho (Broad Institute of MIT and Harvard)
Genomic data are commonly held in data silos due to their sensitivity. This presents a key hurdle in genomics research, where access to data from large and diverse cohorts is crucial for extracting accurate biomedical insights. In this talk, I will describe our recent progress in facilitating cross-institutional collaboration in genomics with novel algorithmic tools. Synthesizing a range of modern techniques from applied cryptography, distributed algorithms and statistical genetics, we develop practical, secure and federated protocols for essential analysis tasks in genomics. These include the two most widely used methods for genome-wide association studies (GWAS), based on principal component analysis (PCA) and linear mixed models (LMM). We demonstrate our methods on biobank-scale genomic datasets including hundreds of thousands of genomes. Finally, I will describe our recent efforts to deploy our tools to conduct a joint study between two large-scale genomic data repositories in the US that have never been jointly analyzed due to data sharing restrictions. Our work lays the foundation for broader collaboration in biomedical research.
3:00pm Collaborative Learning in Healthcare - A Path to more Generalizable Algorithms?
Jayashree Kalpathy-Cramer (University of Colorado/MGH/Harvard Med. School)
Deep learning (DL) has demonstrated great potential in healthcare, specifically medical imaging. Despite the explosion of publications that utilize deep learning in healthcare, few DL algorithms have truly transformed patient care. An ongoing challenge to the safe deployment of DL algorithms is their brittleness, especially when applied to out-of-distribution data. Bias and fairness continue to be challenges. Large multi-institutional datasets can improve model performance and generalization. However, due to a variety of factors including patient privacy, regulatory and financial considerations, such datasets are not trivial to generate. Collaborative learning approaches are potential mitigating strategies to develop more robust models by allowing models to be built using diverse datasets without the need for data sharing. Using examples from radiology, oncology and ophthalmology, we will discuss applications of collaborative learning in healthcare. We will conclude with a brief discussion of some of the practical and theoretical challenges of collaborative learning in healthcare.
3:30pm Coffee & Tea Break
4:30pm Panel Discussion: Challenges and Opportunities in Practical Applications of Collaborative Learning
Panel discussing practical and deployment questions in CL.