AI’s pivot towards decentralised Web3 AI infrastructures is marked by the resignation of Stability AI's CEO, Emad Mostaque, who tweeted “not going to beat centralized AI with more centralized AI.” His departure underlines the push to redirect governance, benefits and profit away from private corporations and towards the public.
In light of this transition, a paper by three FLock.io researchers and their co-authors, including leading professors such as Eric Xing of Carnegie Mellon University, has been published in the journal IEEE Transactions on Artificial Intelligence.
It proposes a new Federated Learning peer-to-peer voting and reward-and-slash system that protects against malicious activities by FL participants. This interview with Zhipeng Wang, one of the core authors, explores the research.
FLock.io has recently launched its collaborative model training and fine-tuning platform aimed at redirecting governance and value accrual away from the minority of private corporations and into the hands of the public.
What are the core challenges federated learning faces today? What motivated the FLock team to address them?
Federated learning (FL) faces several challenges today: data privacy, centralisation, and susceptibility to poisoning attacks.
Data privacy is paramount as FL involves processing data across multiple devices, necessitating mechanisms to protect sensitive information. Centralisation, with FL's reliance on a central server for model aggregation, poses risks of a single point of failure and potential bottlenecks, limiting scalability. Poisoning attacks, where malicious actors introduce harmful data or model updates, threaten the integrity and effectiveness of FL systems.
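To make the poisoning risk concrete, here is a minimal illustrative sketch, not taken from the paper. It assumes the simplest possible aggregator, an unweighted coordinate-wise mean, and uses made-up numbers; the function name `average_updates` is hypothetical.

```python
from statistics import fmean

def average_updates(updates):
    """Coordinate-wise mean of equally weighted client updates."""
    return [fmean(coords) for coords in zip(*updates)]

# Three honest clients send small, similar updates (illustrative values).
honest = [[0.10, -0.20], [0.12, -0.18], [0.08, -0.22]]
# One malicious client sends a large, arbitrary update (illustrative values).
poisoned = [5.0, 5.0]

clean_avg = average_updates(honest)
attacked_avg = average_updates(honest + [poisoned])

print("without attacker:", clean_avg)
print("with attacker:   ", attacked_avg)
```

With plain averaging, a single outlier among four clients is enough to pull the aggregate far from the honest consensus and even flip the sign of the second coordinate, which is why some form of validation before accepting an aggregate matters.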
Our research team was motivated to leverage blockchain to tackle these challenges, especially the poisoning attacks, to enhance FL's security, reliability, and applicability in large-scale and distributed environments.
Your paper proposes a novel FL system that utilises blockchain and distributed ledger technology. How does this contribute to a more secure and reliable FL environment, especially against malicious attacks?
In our research, we introduce a stake-based aggregation mechanism for FL that leverages blockchain technology.
This mechanism utilises blockchain's capabilities to manage the staking process of FL participants, rewarding or penalising them according to their actions. The blockchain infrastructure ensures the security and integrity of both the staking and incentive processes.
Furthermore, blockchain technology supports the verification of FL aggregation outcomes. For instance, smart contracts can be used to perform on-chain aggregation of small models, or to verify the off-chain aggregation while utilising decentralised storage solutions to store the aggregation results.
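One way to picture the off-chain case is a digest commitment: the aggregator publishes a hash of the aggregated weights on-chain, and anyone who later fetches the weights from decentralised storage checks them against that hash. The sketch below is an assumption for illustration only and is not the paper's implementation; `AggregationRegistry` is a hypothetical in-memory stand-in for a smart contract, and JSON is used as a stand-in serialisation format.

```python
import hashlib
import json

def model_digest(weights):
    """SHA-256 digest of a compact JSON encoding of the weights."""
    encoded = json.dumps(weights, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

class AggregationRegistry:
    """Hypothetical stand-in for a contract storing one digest per FL round."""

    def __init__(self):
        self._digests = {}

    def commit(self, round_id, digest):
        # A round's digest can be written once and never overwritten.
        if round_id in self._digests:
            raise ValueError("round already committed")
        self._digests[round_id] = digest

    def verify(self, round_id, weights):
        # True only if the weights hash to the digest committed for this round.
        return self._digests.get(round_id) == model_digest(weights)

registry = AggregationRegistry()
aggregated = [0.1, -0.2, 0.35]  # illustrative aggregated weights
registry.commit(1, model_digest(aggregated))

ok = registry.verify(1, aggregated)
tampered_ok = registry.verify(1, [0.1, -0.2, 0.36])
print(ok, tampered_ok)
```

Note that a check like this only shows that the stored result is the one that was committed; it does not by itself prove the aggregation was computed correctly, which is where participant validation comes in.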
Such a system can mitigate possible malicious activities by FL participants, enhancing the overall reliability and trustworthiness of the process.
You mention the integration of a peer-to-peer voting mechanism and a reward-and-slash mechanism powered by on-chain smart contracts. Can you break down how these mechanisms work within your system and how they help detect and deter dishonest behaviors among clients?
Drawing inspiration from Ethereum's Proof of Stake (PoS) consensus mechanism and the role-playing board game, The Resistance, we've developed a peer-to-peer (P2P) voting system coupled with reward and penalty mechanisms for Federated Learning (FL) systems.
- Initiation of FL round: At the beginning of each FL round, some participants are randomly selected as proposers to perform local training and upload local updates to the on-chain or off-chain aggregator.
- Global update: The aggregator combines these local updates into a global update.
- Voting process: Randomly selected voters then download the global update, perform local validation, and vote for acceptance or rejection.
- Acceptance: If the majority of voters vote for acceptance, the global model will be updated. Those who voted for acceptance will be rewarded.
- Rejection: Conversely, if the majority vote for rejection, the global model will not be updated. Those who voted for acceptance will be slashed.
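The round described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' smart contract: the names `Participant`, `select_roles` and `settle_round` are hypothetical, proposers and voters are assumed to be disjoint, a tie is treated as rejection, stakes are floored at zero, and the reward and penalty amounts are arbitrary. Only the outcomes stated in the steps above are modelled, so voters who vote for rejection keep their stake unchanged.

```python
import random
from dataclasses import dataclass

@dataclass
class Participant:
    name: str
    stake: float

def select_roles(participants, n_proposers, n_voters, rng):
    """Randomly pick proposers and voters (assumed disjoint) for one round."""
    chosen = rng.sample(participants, n_proposers + n_voters)
    return chosen[:n_proposers], chosen[n_proposers:]

def settle_round(votes, reward=1.0, penalty=1.0):
    """Apply the majority rule to a list of (voter, accept) pairs.

    Returns True if the global model should be updated.
    Assumption: a tie counts as rejection.
    """
    accept_voters = [voter for voter, accept in votes if accept]
    accepted = len(accept_voters) > len(votes) / 2
    for voter in accept_voters:
        if accepted:
            voter.stake += reward  # majority accepted: accept-voters rewarded
        else:
            # majority rejected: accept-voters slashed, never below zero
            voter.stake = max(0.0, voter.stake - penalty)
    return accepted

rng = random.Random(0)
participants = [Participant(f"client-{i}", 10.0) for i in range(6)]
proposers, voters = select_roles(participants, n_proposers=2, n_voters=3, rng=rng)

# Two of three voters accept, so the update passes and they are rewarded.
round_one = settle_round([(voters[0], True), (voters[1], True), (voters[2], False)])
# Only one voter accepts, so the update is rejected and that voter is slashed.
round_two = settle_round([(voters[0], True), (voters[1], False), (voters[2], False)])
print(round_one, round_two, [(v.name, v.stake) for v in voters])
```

Because voting for acceptance puts stake at risk, a voter has a reason to validate the aggregate locally before approving it rather than rubber-stamping every round.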
Your work shows that the framework is robust against malicious client-side behaviors. Could you discuss the implications of these findings for the future of federated learning, particularly in multi-institutional collaborations?
The robustness of our framework against malicious client-side behaviors has profound implications for FL, especially in multi-institutional collaborations that handle sensitive data, such as in healthcare and finance. By ensuring the integrity and reliability of FL systems, our approach encourages wider adoption, addressing pressing concerns of data privacy and security.
This could revolutionize how institutions collaborate on machine learning projects, incentivising them to leverage collective data insights without compromising data confidentiality or system integrity.
In addition, our work provides a solution for integrating blockchain into FL, accelerating the development of collaborative AI.
Looking ahead, what do you see as the next steps for research in this area?
Future research could focus on augmenting the security features of FL systems, exploring more complex attack scenarios and proposing sophisticated defense mechanisms. The integration of privacy-enhancing technologies, including Zero-Knowledge Proofs (ZKP), Differential Privacy (DP), Secure Multi-Party Computation, and Fully Homomorphic Encryption (FHE) with blockchain-enabled FL, promises to bolster data confidentiality significantly.
Moreover, optimising blockchain and smart contract implementations for greater scalability and efficiency will be crucial to support larger and more complex FL systems.
Investigating the applicability of these technologies in various domains and for different types of data will also be critical to unlocking the full potential of blockchain-based FL, or more generally, decentralised AI.
Co-authors of the paper:
Nanqing Dong, Shanghai Artificial Intelligence Laboratory, Shanghai, China
Zhipeng Wang, Department of Computing, Imperial College London, London, UK
Jiahao Sun, FLock.io, London, UK
Michael Kampffmeyer, Department of Physics and Technology at UiT, The Arctic University of Norway, Tromsø, Norway
William Knottenbelt, Department of Computing, Imperial College London, London, UK
Eric Xing, Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA