| Home | Program Day 1 | Program Day 2 |
| 8:30am | Registration and Breakfast! |
| 9:15am | On Dynamics-Informed, Learning-Aware Mechanism Design |
| Michael I. Jordan (University of California, Berkeley) | |
| Statistical decisions are often given meaning in the context of other decisions, particularly when there are scarce resources to be shared. Managing such sharing is one of the classical goals of microeconomics, and it is given new relevance in the modern setting of large, human-focused datasets, and in data-analytic contexts such as classifiers and recommendation systems. I'll briefly discuss some recent projects that aim to explore the interface between machine learning and microeconomics, including strategic classification and the use of contract theory as a way to design mechanisms that perform statistical inference in the presence of information asymmetries. |
| 9:30am | Towards Trustworthy Federated Learning |
| Han Yu (Nanyang Technological University, Singapore) | |
| Federated Learning (FL) is an emerging area of AI focusing on training machine learning models in a privacy-preserving manner. The success of FL, especially in open collaboration settings, rests on being able to continuously attract high quality data owners to participate. This, at the same time, also opens the FL to adversaries trying to exploit other parties’ sensitive privacy information. It is important to adopt an ecosystem management approach to building trust and controlling risk in FL. In this talk, I will share with you some theoretical and translational research effort we made in this general direction, including data valuation under FL settings, fair treatment of FL participants, and making FL robust and scalable. |
| 10:00am | Training and Incentives in Collaborative Machine Learning |
| Nika Haghtalab (University of California, Berkeley) | |
| Many modern machine learning paradigms require large amounts of data and computation power that are rarely found in one place or owned by a single agent. In recent years, methods such as federated and collaborative learning have been embraced as an approach for bringing about collaboration across learning agents. In practice, the success of these methods relies upon our ability to pool together the efforts of large numbers of individual learning agents, dataset owners, and curators. In this talk, I will discuss how recruiting, serving, and retaining these agents requires us to address agents’ needs, limitations, and responsibilities. In particular, I will discuss two major questions in this field. First, how can we design collaborative learning mechanisms that benefit agents with heterogeneous learning objectives? Second, how can we ensure that the burden of data collection and learning is shared equitably between agents? |
| 10:30am | Coffee & Tea Break |
| 11:00am | Security and Robustness of Collaborative Learning Systems |
| Anwar Hithnawi (ETH Zurich) | |
| Collaborative, secure ML paradigms have emerged as a compelling alternative for sensitive applications in the last few years. These paradigms eliminate the need to pool data centrally for ML training and thus ensure data sovereignty and alleviate the risks associated with the large-scale collection of sensitive data. Although they provide many privacy benefits, these systems amplify ML robustness issues by exposing the learning process to an active attacker that can be present throughout the entire training process. In this talk, I will give an overview of the security and robustness challenges of collaborative learning systems and highlight why a definitive solution to robustness in these systems is challenging. |
| 11:30am | TBA |
| Ce Zhang (ETH Zurich) | |
| TBA |
| 12:00pm | Mechanisms which Incentivize Data Sharing in Collaborative Learning |
| Sai Praneeth Karimireddy (UC Berkeley) | |
| In collaborative learning, under the expectation that other agents will share their data, rational agents may be tempted to engage in detrimental behavior such as free-riding where they contribute no data but still enjoy an improved model. In this work, we propose a framework to analyze the behavior of such rational data generators. We first show how a naive scheme leads to catastrophic levels of free-riding. Then, using ideas from contract theory, we show how to maximize the amount of data generated and provably prevent free-riding. |
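The free-riding dynamic described in this abstract can be illustrated with a toy utility model (this is an illustrative sketch, not the paper's actual mechanism): model error shrinks with the total pooled data, each contributed point has a cost, and under the naive scheme every agent receives the full pooled model regardless of contribution.

```python
# Toy model of the free-riding incentive under naive data pooling.
# The accuracy curve and cost constant are illustrative assumptions.

def utility_naive(my_data, others_data, c=0.01):
    total = my_data + others_data
    accuracy = 1.0 - 1.0 / (1.0 + total)   # everyone receives the pooled model
    return accuracy - c * my_data          # minus the cost of contributing

# A rational agent compares contributing 10 points vs. free-riding
# when the other agents contribute 90 points in total:
u_contribute = utility_naive(10, 90)
u_freeride = utility_naive(0, 90)
```

Because the marginal accuracy gain from one's own data is tiny once the pool is large, `u_freeride` exceeds `u_contribute`, which is exactly the incentive failure the proposed contract-theoretic mechanisms are designed to prevent.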
| 12:30pm | Lunch Break |
| 1:30pm | On the 5th Generation of Local Training Methods in Federated Learning |
| Peter Richtarik (KAUST) | |
| Due to its communication-saving capabilities observed in practice, local training has been a key component of federated learning algorithms since the beginning of the field. However, much to the dismay of the theoreticians, the goal of confirming these practical benefits with satisfactory theory remained elusive. While our algorithmic understanding of local training already evolved through four generations---1) heuristic, 2) homogeneous, 3) sublinear and 4) linear---each advancing in one way or another over the previous one, the theoretical results belonging to this genre did not manage to provide communication complexity rates that would uncover any benefits coming from local training in the important heterogeneous data regime. In this talk, I will give a brief introduction to the 5th generation of local training methods, the first works of which were written in 2022. These methods enjoy strong theory, finally confirming that local training leads to theoretical communication acceleration. |
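The core idea of local training can be sketched in a few lines (a minimal LocalSGD-style sketch with a hypothetical toy setup, not the 5th-generation methods the talk presents): each client runs several local gradient steps between communication rounds, and the server averages the resulting iterates.

```python
# Minimal local-training sketch: each client holds a quadratic loss
# f_i(w) = 0.5 * (w - c_i)^2, runs K local gradient steps, and the
# server averages the local iterates once per communication round.

def local_sgd(centers, w0=0.0, rounds=20, local_steps=5, lr=0.1):
    w = w0
    for _ in range(rounds):
        updates = []
        for c in centers:
            w_i = w
            for _ in range(local_steps):
                w_i -= lr * (w_i - c)       # gradient step on client loss
            updates.append(w_i)
        w = sum(updates) / len(updates)     # server averaging (communication)
    return w

# Heterogeneous clients with optima at 0 and 2; the global optimum is 1.
w_star = local_sgd([0.0, 2.0])
```

Running more local steps per round reduces the number of communication rounds needed; whether this yields provable communication acceleration under data heterogeneity is precisely the question the talk's 5th-generation theory answers.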
| 2:00pm | Heterogeneity-Aware Optimization |
| Sebastian U. Stich (CISPA) | |
| We consider instances of collaborative learning problems where training data is stored in a decentralized manner and distributed across multiple client devices (this covers both decentralized and federated learning settings). We characterize the impact of data heterogeneity on the convergence rate of standard training algorithms used in such collaborative learning environments. We show how data-heterogeneity metrics can guide topology design for decentralized optimization, but also inspire bias-correction mechanisms to accelerate training. We will conclude by discussing limitations and open problems. |
| 3:00pm | Fair and Accurate Federated Learning under Heterogeneous Targets |
| Samuel Horvath (MBZUAI) | |
| Federated Learning (FL) has been gaining significant traction across different ML tasks, ranging from vision to keyboard predictions. In large-scale deployments, client heterogeneity is a fact and constitutes a primary problem for fairness, training performance and accuracy. Although significant efforts have been made to tackle statistical data heterogeneity, the diversity in the processing capabilities and network bandwidth of clients, termed system heterogeneity, has remained largely unexplored. Current solutions either disregard a large portion of available devices or set a uniform limit on the model's capacity, restricted by the least capable participants. In this talk, we introduce FjORD, which alleviates the problem of client system heterogeneity by tailoring the model width to the client's capabilities. |
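The width-tailoring idea can be illustrated with a small sketch (an illustrative toy in the spirit of FjORD's nested submodels, not its actual implementation; the client capabilities, learning rate, and toy gradients below are assumptions): each client updates only the leading fraction of the weight vector it can afford, and the server averages each coordinate over the clients that actually trained it.

```python
# Width-tailored federated round: client i trains only the first
# p_i-fraction of the weights (a nested submodel); the server averages
# per-coordinate over the clients that contributed to that coordinate.

def width_tailored_round(w, clients, lr=0.1):
    sums = [0.0] * len(w)
    counts = [0] * len(w)
    for grad_fn, p in clients:              # p = width fraction the client can run
        width = max(1, int(p * len(w)))
        sub = w[:width]
        g = grad_fn(sub)                    # gradient on the nested submodel
        for i, (wi, gi) in enumerate(zip(sub, g)):
            sums[i] += wi - lr * gi
            counts[i] += 1
    return [sums[i] / counts[i] if counts[i] else w[i] for i in range(len(w))]

# Two clients pull every coordinate toward 1.0, but the weak client
# can only afford to train half of the 4 weights.
grad_to_one = lambda sub: [s - 1.0 for s in sub]
clients = [(grad_to_one, 0.5), (grad_to_one, 1.0)]
w = [0.0] * 4
for _ in range(100):
    w = width_tailored_round(w, clients)
```

The point of the design is that weak devices still contribute to the shared lower-width coordinates instead of being dropped or forcing a uniformly small model on everyone.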
| 3:00pm | FLECS: A Federated Learning Second-Order Framework via Compression and Sketching |
| Martin Takáč (MBZUAI) | |
| Inspired by the recent work FedNL (Safaryan et al, FedNL: Making Newton-Type Methods Applicable to Federated Learning), we propose a new communication efficient second-order framework for Federated learning, namely FLECS. The proposed method reduces the high-memory requirements of FedNL by the usage of an L-SR1 type update for the Hessian approximation which is stored on the central server. A low dimensional `sketch' of the Hessian is all that is needed by each device to generate an update, so that memory costs as well as number of Hessian-vector products for the agent are low. Biased and unbiased compressions are utilized to make communication costs also low. Convergence guarantees for FLECS are provided in both the strongly convex, and nonconvex cases, and local linear convergence is also established under strong convexity. Numerical experiments confirm the practical benefits of this new FLECS algorithm. |
| 3:30pm | Coffee & Tea Break |
| 4:00pm | Variance Reduction is an Antidote to Byzantines: Better Rates, Weaker Assumptions and Communication Compression as a Cherry on the Top |
| Eduard Gorbunov (MBZUAI) | |
| Byzantine-robustness has been gaining a lot of attention due to the growth of the interest in collaborative and federated learning. However, many fruitful directions, such as the usage of variance reduction for achieving robustness and communication compression for reducing communication costs, remain weakly explored in the field. This work addresses this gap and proposes Byz-VR-MARINA - a new Byzantine-tolerant method with variance reduction and compression. A key message of our paper is that variance reduction is key to fighting Byzantine workers more effectively. At the same time, communication compression is a bonus that makes the process more communication efficient. We derive theoretical convergence guarantees for Byz-VR-MARINA outperforming previous state-of-the-art for general non-convex and Polyak-Lojasiewicz loss functions. Unlike the concurrent Byzantine-robust methods with variance reduction and/or compression, our complexity results are tight and do not rely on restrictive assumptions such as boundedness of the gradients or limited compression. Moreover, we provide the first analysis of a Byzantine-tolerant method supporting non-uniform sampling of stochastic gradients. Numerical experiments corroborate our theoretical findings. |
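A toy example shows why the aggregation step matters in the Byzantine setting (the talk's contribution, Byz-VR-MARINA, combines variance reduction and compression; this sketch only illustrates the basic vulnerability of plain averaging, and all values are illustrative assumptions):

```python
# Plain averaging vs. a robust aggregator (median) with one Byzantine
# worker. A single attacker can drag the mean arbitrarily far, while
# the median stays close to the honest workers' gradient estimates.

def mean(xs):
    return sum(xs) / len(xs)

def median(xs):
    s = sorted(xs)
    n = len(s)
    return s[n // 2] if n % 2 else 0.5 * (s[n // 2 - 1] + s[n // 2])

honest = [1.0, 1.1, 0.9, 1.05]   # honest workers' gradient estimates (~1.0)
byzantine = [-100.0]             # one attacker sends an arbitrary value
grads = honest + byzantine

mean_agg = mean(grads)           # dragged far from the honest gradient
median_agg = median(grads)       # stays near the honest value
```

Robust aggregators alone degrade when honest gradients are noisy, which is why the abstract argues that reducing variance on the honest workers is key to fighting Byzantine workers effectively.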
| 4:30pm | Panel Discussion: Theoretical advances and challenges in collaborative learning |
| A panel discussion on theoretical questions in federated learning. |