Day 1 Program (Oct 8, Sat)

8:30am Registration and Breakfast!
9:15am On Dynamics-Informed, Learning-Aware Mechanism Design
Michael I. Jordan (University of California, Berkeley)
Statistical decisions are often given meaning in the context of other decisions, particularly when there are scarce resources to be shared. Managing such sharing is one of the classical goals of microeconomics, and it is given new relevance in the modern setting of large, human-focused datasets, and in data-analytic contexts such as classification and recommendation systems. I'll briefly discuss some recent projects that aim to explore the interface between machine learning and microeconomics, including strategic classification and the use of contract theory as a way to design mechanisms that perform statistical inference in the presence of information asymmetries.
9:30am Towards Trustworthy Federated Learning
Han Yu (Nanyang Technological University, Singapore)
Federated Learning (FL) is an emerging area of AI focusing on training machine learning models in a privacy-preserving manner. The success of FL, especially in open collaboration settings, rests on being able to continuously attract high-quality data owners to participate. At the same time, this openness exposes FL to adversaries seeking to exploit other parties' sensitive private information. It is therefore important to adopt an ecosystem-management approach to building trust and controlling risk in FL. In this talk, I will share some of the theoretical and translational research efforts we have made in this general direction, including data valuation under FL settings, fair treatment of FL participants, and making FL robust and scalable.
10:00am Training and Incentives in Collaborative Machine Learning
Nika Haghtalab (University of California, Berkeley)
Many modern machine learning paradigms require large amounts of data and computation power that are rarely found in one place or owned by a single agent. In recent years, methods such as federated and collaborative learning have been embraced as an approach for bringing about collaboration across learning agents. In practice, the success of these methods relies upon our ability to pool together the efforts of large numbers of individual learning agents, dataset owners, and curators. In this talk, I will discuss how recruiting, serving, and retaining these agents requires us to address agents' needs, limitations, and responsibilities. In particular, I will discuss two major questions in this field. First, how can we design collaborative learning mechanisms that benefit agents with heterogeneous learning objectives? Second, how can we ensure that the burden of data collection and learning is shared equitably between agents?
10:30am Coffee & Tea Break
11:00am Security and Robustness of Collaborative Learning Systems
Anwar Hithnawi (ETH Zurich)
Collaborative, secure ML paradigms have emerged as a compelling alternative for sensitive applications in the last few years. These paradigms eliminate the need to pool data centrally for ML training and thus ensure data sovereignty and alleviate the risks associated with the large-scale collection of sensitive data. Although they provide many privacy benefits, these systems amplify ML robustness issues by exposing the learning process to an active attacker who can be present throughout the entire training process. In this talk, I will give an overview of the security and robustness challenges of collaborative learning systems and highlight why a definitive solution to robustness in these systems is challenging.
11:30am TBA
Ce Zhang (ETH Zurich)
TBA
12:00pm Mechanisms which Incentivize Data Sharing in Collaborative Learning
Sai Praneeth Karimireddy (UC Berkeley)
In collaborative learning, under the expectation that other agents will share their data, rational agents may be tempted to engage in detrimental behavior such as free-riding, where they contribute no data but still enjoy an improved model. In this work, we propose a framework to analyze the behavior of such rational data generators. We first show how a naive scheme leads to catastrophic levels of free-riding. Then, using ideas from contract theory, we show how to maximize the amount of data generated and provably prevent free-riding.
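To make the free-riding issue concrete, here is a toy illustration (not the paper's construction): under a naive scheme where every agent receives the model trained on the pooled data, a rational agent's best response is to contribute nothing, while a simple accuracy-shaping rule restores the incentive to contribute. The learning-curve shape, costs, and the factor of 10 in the shaped scheme are all illustrative assumptions.

```python
# Toy free-riding illustration; all quantities are stylized assumptions.
import numpy as np

def model_accuracy(total_points):
    # Stylized learning curve: accuracy improves with pooled data, with diminishing returns.
    return 1.0 - 1.0 / np.sqrt(1.0 + total_points)

def utility_naive(own_points, others_points, cost_per_point=0.01):
    # Naive scheme: every agent receives the model trained on all pooled data,
    # regardless of how much it contributed.
    return model_accuracy(own_points + others_points) - cost_per_point * own_points

def utility_shaped(own_points, others_points, cost_per_point=0.01):
    # Accuracy-shaping idea (simplified): the model served to an agent only uses the
    # pooled data up to a multiple of the agent's own contribution.
    usable = min(own_points + others_points, 10 * own_points)
    return model_accuracy(usable) - cost_per_point * own_points

others = 900                              # data contributed by the rest of the federation
candidates = np.arange(0, 501, 50)        # contribution levels the agent considers
best_naive = max(candidates, key=lambda m: utility_naive(m, others))
best_shaped = max(candidates, key=lambda m: utility_shaped(m, others))
print("best response under the naive scheme:", best_naive, "points")   # 0 -> free-riding
print("best response under accuracy shaping:", best_shaped, "points")  # strictly positive
```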
12:30pm Lunch Break
1:30pm On the 5th Generation of Local Training Methods in Federated Learning
Peter Richtarik (KAUST)
Due to its communication-saving capabilities observed in practice, local training has been a key component of federated learning algorithms since the beginning of the field. However, much to the dismay of the theoreticians, the goal of confirming these practical benefits with satisfactory theory remained elusive. While our algorithmic understanding of local training already evolved through four generations---1) heuristic, 2) homogeneous, 3) sublinear and 4) linear---each advancing in one way or another over the previous one, the theoretical results belonging to this genre did not manage to provide communication complexity rates that would uncover any benefits coming from local training in the important heterogeneous data regime. In this talk, I will give a brief introduction to the 5th generation of local training methods, the first works of which were written in 2022. These methods enjoy strong theory, finally confirming that local training leads to theoretical communication acceleration.
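As background, the sketch below shows what local training means operationally in a FedAvg / Local SGD style loop: several cheap local steps per client between expensive communication rounds. The quadratic objective, step size, and counts are illustrative assumptions; this is not the 5th-generation method discussed in the talk.

```python
# Minimal sketch of local training on a toy quadratic problem (illustrative parameters).
import numpy as np

rng = np.random.default_rng(0)
num_clients, dim, local_steps, rounds, lr = 10, 5, 20, 50, 0.05

# Each client i holds a heterogeneous quadratic f_i(x) = 0.5 * ||x - b_i||^2.
targets = rng.normal(size=(num_clients, dim))
x_global = np.zeros(dim)

for r in range(rounds):
    local_models = []
    for b in targets:
        x = x_global.copy()
        for _ in range(local_steps):      # several cheap local steps ...
            x -= lr * (x - b)             # ... before one expensive communication
        local_models.append(x)
    x_global = np.mean(local_models, axis=0)  # server averages the local iterates

# For this toy problem the optimum of the average objective is the mean of the b_i.
print("distance to optimum:", np.linalg.norm(x_global - targets.mean(axis=0)))
```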
2:00pm Heterogeneity-aware optimization
Sebastian U. Stich (CISPA)
We consider instances of collaborative learning problems where training data is stored in a decentralized manner and distributed across multiple client devices (this covers both decentralized and federated learning settings). We characterize the impact of data heterogeneity on the convergence rate of standard training algorithms used in such collaborative learning environments.
We show how data-heterogeneity metrics can guide topology design for decentralized optimization, but also inspire bias-correction mechanisms to accelerate training. We will conclude by discussing limitations and open problems.
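One common way to quantify data heterogeneity is the gradient dissimilarity across clients, zeta^2 = (1/n) * sum_i ||grad f_i(x) - grad f(x)||^2. The sketch below evaluates this standard metric at the global solution of a toy least-squares problem; the problem setup and drift pattern are illustrative assumptions, not the talk's construction.

```python
# Minimal sketch of a gradient-dissimilarity heterogeneity metric on a toy problem.
import numpy as np

rng = np.random.default_rng(1)
num_clients, samples_per_client, dim = 8, 40, 3

# Heterogeneous clients: each draws labels from a slightly different linear model.
client_data = []
for i in range(num_clients):
    A = rng.normal(size=(samples_per_client, dim))
    w_i = rng.normal(size=dim) * (0.1 + 0.2 * i)   # increasing client drift
    y = A @ w_i + 0.01 * rng.normal(size=samples_per_client)
    client_data.append((A, y))

def client_grad(x, A, y):
    return A.T @ (A @ x - y) / len(y)

# Global least-squares solution used as the reference point x*.
A_all = np.vstack([A for A, _ in client_data])
y_all = np.concatenate([y for _, y in client_data])
x_star, *_ = np.linalg.lstsq(A_all, y_all, rcond=None)

# With equal client sizes, the average of the client gradients is the global gradient.
grads = np.array([client_grad(x_star, A, y) for A, y in client_data])
zeta_sq = np.mean(np.sum((grads - grads.mean(axis=0)) ** 2, axis=1))
print("gradient dissimilarity zeta^2 at x*:", zeta_sq)
```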
2:30pm Fair and Accurate Federated Learning under heterogeneous targets
Samuel Horvath (MBZUAI)
Federated Learning (FL) has been gaining significant traction across different ML tasks, ranging from vision to keyboard predictions. In large-scale deployments, client heterogeneity is a fact and constitutes a primary problem for fairness, training performance, and accuracy. Although significant efforts have been made toward tackling statistical data heterogeneity, the diversity in the processing capabilities and network bandwidth of clients, termed system heterogeneity, has remained largely unexplored. Current solutions either disregard a large portion of available devices or set a uniform limit on the model's capacity, restricted by the least capable participants. In this talk, we introduce FjORD, which alleviates the problem of client system heterogeneity by tailoring the model width to each client's capabilities.
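For intuition, here is a minimal sketch of the width-tailoring idea: less capable clients are served a nested sub-network obtained by keeping only a prefix of the hidden units. The layer shapes and the capability-to-width mapping are assumptions, and this is a generic illustration rather than the FjORD implementation itself.

```python
# Minimal sketch of serving width-reduced sub-models to less capable clients
# (illustrative shapes and capability mapping, not FjORD's actual code).
import numpy as np

rng = np.random.default_rng(2)
hidden = 64
W1_full = rng.normal(size=(hidden, 10))   # full-width first layer
W2_full = rng.normal(size=(1, hidden))    # full-width output layer

def submodel(width_fraction):
    # Keep only the first p-fraction of hidden units, giving a nested sub-network.
    k = max(1, int(width_fraction * hidden))
    return W1_full[:k, :], W2_full[:, :k]

def forward(x, W1, W2):
    return W2 @ np.maximum(W1 @ x, 0.0)   # tiny ReLU network

# Each client receives the largest sub-model its device can afford.
client_capabilities = {"phone": 0.25, "laptop": 0.5, "workstation": 1.0}
x = rng.normal(size=10)
for name, p in client_capabilities.items():
    W1, W2 = submodel(p)
    print(name, "-> hidden units:", W1.shape[0], "prediction:", forward(x, W1, W2))
```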
3:00pm FLECS: A Federated Learning Second-Order Framework via Compression and Sketching
Martin Takáč (MBZUAI)
Inspired by the recent work FedNL (Safaryan et al., FedNL: Making Newton-Type Methods Applicable to Federated Learning), we propose a new communication-efficient second-order framework for federated learning, namely FLECS. The proposed method reduces the high memory requirements of FedNL by using an L-SR1-type update for the Hessian approximation, which is stored on the central server. A low-dimensional 'sketch' of the Hessian is all that each device needs to generate an update, so that both the memory costs and the number of Hessian-vector products per agent are low. Biased and unbiased compression is used to keep communication costs low as well. Convergence guarantees for FLECS are provided in both the strongly convex and nonconvex cases, and local linear convergence is also established under strong convexity. Numerical experiments confirm the practical benefits of the new FLECS algorithm.
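The toy sketch below conveys only the sketched-curvature idea: a device communicates a handful of Hessian-vector products instead of the full Hessian, and the server builds a low-rank curvature model from them. The Nystrom-style reconstruction here is a generic stand-in, not the L-SR1 update or the compression scheme used in FLECS, and all sizes are illustrative assumptions.

```python
# Toy sketch of communicating a low-dimensional Hessian sketch instead of the full Hessian.
import numpy as np

rng = np.random.default_rng(3)
d, k = 500, 25                                  # model dimension vs. sketch dimension

# Toy Hessian with a decaying spectrum, the regime where low-rank sketches work well.
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
H = Q @ np.diag(10.0 * 0.9 ** np.arange(d)) @ Q.T

S = rng.normal(size=(d, k))                     # sketch directions (shared randomness)
HS = H @ S                                      # k Hessian-vector products computed on-device

# Server-side low-rank reconstruction from the d*k numbers it received.
H_approx = HS @ np.linalg.pinv(S.T @ HS) @ HS.T

rel_err = np.linalg.norm(H - H_approx) / np.linalg.norm(H)
print(f"floats sent: {d*k} (vs {d*d} for the full Hessian), relative error: {rel_err:.3f}")
```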
3:30pm Coffee & Tea Break
4:00pm Variance Reduction is an Antidote to Byzantines: Better Rates, Weaker Assumptions and Communication Compression as a Cherry on the Top
Eduard Gorbunov (MBZUAI)
Byzantine robustness has been gaining a lot of attention due to the growing interest in collaborative and federated learning. However, many fruitful directions, such as the use of variance reduction for achieving robustness and of communication compression for reducing communication costs, remain weakly explored in the field. This work addresses this gap and proposes Byz-VR-MARINA, a new Byzantine-tolerant method with variance reduction and compression. A key message of our paper is that variance reduction is essential for fighting Byzantine workers more effectively, while communication compression is a bonus that makes the process more communication efficient. We derive theoretical convergence guarantees for Byz-VR-MARINA that outperform the previous state of the art for general non-convex and Polyak-Lojasiewicz loss functions. Unlike concurrent Byzantine-robust methods with variance reduction and/or compression, our complexity results are tight and do not rely on restrictive assumptions such as boundedness of the gradients or limited compression. Moreover, we provide the first analysis of a Byzantine-tolerant method supporting non-uniform sampling of stochastic gradients. Numerical experiments corroborate our theoretical findings.
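As a minimal illustration of two of the ingredients, the sketch below combines top-k compression of worker messages with a simple robust aggregator (coordinate-wise median, used here only as a stand-in). The attack model and all parameters are illustrative assumptions; this is not the Byz-VR-MARINA algorithm itself.

```python
# Minimal sketch: compressed worker messages aggregated robustly vs. naively.
import numpy as np

rng = np.random.default_rng(4)
num_workers, num_byzantine, dim, k = 20, 4, 50, 10

def top_k(v, k):
    # Keep only the k largest-magnitude coordinates (a standard biased compressor).
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

true_grad = rng.normal(size=dim)
honest = [top_k(true_grad + 0.1 * rng.normal(size=dim), k)
          for _ in range(num_workers - num_byzantine)]
byzantine = [top_k(-10.0 * true_grad, k) for _ in range(num_byzantine)]  # adversarial messages
messages = np.array(honest + byzantine)

mean_agg = messages.mean(axis=0)           # naive averaging: poisoned by the attackers
median_agg = np.median(messages, axis=0)   # robust aggregation: resists a minority of attackers

print("error of mean  :", np.linalg.norm(mean_agg - true_grad))
print("error of median:", np.linalg.norm(median_agg - true_grad))
```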
4:30pm Panel Discussion: Theoretical advances and challenges in collaborative learning
A panel discussion of theoretical questions and open challenges in collaborative and federated learning.