A Critical & Systematic Understanding of the Decentralized AI Paradigm

TL;DR: Our newly published research on DeAI is now available as a preprint! This work analyzes the role of blockchain in DeAI, investigating how blockchain features enhance the security, transparency, and trustworthiness of AI processes while ensuring fair incentives for AI data and model contributors.

Thanks to the tremendous hard work put in by our Research and Engineering teams, we are pleased to share that our research output on Decentralized Artificial Intelligence (DeAI) is now available as a preprint.

This paper presents a Systematization of Knowledge (SoK) for blockchain-based Decentralized Artificial Intelligence (DeAI) solutions. Specifically, this SoK proposes a taxonomy to classify existing DeAI protocols based on the model lifecycle, providing a critical and systematic understanding of the current landscape of DeAI protocols, and identifying their similarities and differences. Most importantly, this work explores where DeAI is today and where it could go next.

The rapid rise of AI is transforming industries such as healthcare, finance, and transportation. But along with these incredible benefits come real concerns about the centralization of AI. Centralization makes AI systems vulnerable to technical disruptions, embeds biases from controlling entities, and limits scalability and innovation due to processing bottlenecks and lack of diversity. When only a handful of entities have the resources to train advanced models, it stifles broader advancements in the AI space.

To address these challenges, DeAI has emerged as a promising solution that leverages the strengths of both blockchain and AI. Blockchain technology offers transparency, security, and decentralization, significantly enhancing the trustworthiness of AI systems. By integrating blockchain with AI, it is possible to create a more robust and decentralized ecosystem that addresses key issues like data privacy, model integrity, and equitable access to AI resources.

Lifecycle Phases of Decentralized AI

To this end, we propose a framework to ensure that AI processes are traceable and decentralized throughout their lifecycle. As shown in Figure 1, the lifecycle of a DeAI application consists of four phases (task-proposing, pre-training, training, and post-training) plus a feedback loop that may return to task proposing for model refinement or fine-tuning. In addition, Table 1 summarizes the almost 50 DeAI industry projects surveyed in our SoK. This table, along with the other DeAI-related materials covered in the paper, is now publicly available in this GitHub repository.

Figure 1: A DeAI model lifecycle

Table 1: Overview of DeAI Projects.
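
The four phases and the feedback loop described above can be read as a small state machine. The sketch below is only an illustration of that reading; the names `Phase` and `next_phase` are hypothetical and do not come from the paper.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    TASK_PROPOSING = "task-proposing"
    PRE_TRAINING = "pre-training"
    TRAINING = "training"
    POST_TRAINING = "post-training"


# Forward order of the four phases named in the text.
ORDER = [
    Phase.TASK_PROPOSING,
    Phase.PRE_TRAINING,
    Phase.TRAINING,
    Phase.POST_TRAINING,
]


def next_phase(current: Phase, refine: bool = False) -> Optional[Phase]:
    """Return the phase that follows `current`.

    After post-training the lifecycle either ends (None) or, when `refine`
    is True, loops back to task proposing for refinement or fine-tuning.
    """
    if current is Phase.POST_TRAINING:
        return Phase.TASK_PROPOSING if refine else None
    return ORDER[ORDER.index(current) + 1]
```

Calling `next_phase(Phase.POST_TRAINING, refine=True)` follows the feedback loop, while the default call ends the lifecycle.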

Pre-Training Phase

The pre-training phase generally comprises two stages: data preparation and compute.

Data Preparation

Data preparation involves processes such as data collection, cleaning, normalization, transformation, and feature selection. Recent analyses suggest that we are nearing the limit of available data for future models. This highlights critical challenges for centralized AI: data saturation, underrepresentation across domains, languages, and regions, and increasing privacy concerns that hinder access to domain-specific datasets. Decentralized AI offers a solution by enabling the use of diverse, domain-specific datasets while respecting data privacy, ultimately improving model robustness and scalability and reducing bias.

Incentive Mechanisms for Data Contributions: Decentralized data preparation platforms often use blockchain-based reward mechanisms to incentivize high-quality data contributions. Platforms like Ocean Protocol and Vana tokenize data assets, allowing data providers to publish datasets as datatokens, which data consumers can purchase, creating a market-driven approach. Proof-of-contribution mechanisms assess data quality and assign scores, linking rewards to meaningful data contributions. Additionally, stake and reputation-based systems, such as those used in Fraction AI, reward contributors based on the quality of their work and their corresponding on-chain reputation, promoting high standards and accountability.
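
To make the link between quality scores and rewards concrete, here is a minimal sketch of a score-proportional payout. It is an assumption-laden illustration, not the mechanism of Ocean Protocol, Vana, or Fraction AI: the function name `distribute_rewards`, the `min_score` cutoff, and the proportional rule are all hypothetical.

```python
from typing import Dict


def distribute_rewards(
    scores: Dict[str, float], pool: float, min_score: float = 0.5
) -> Dict[str, float]:
    """Split `pool` among contributors in proportion to their quality score.

    Contributions scoring below `min_score` (a hypothetical cutoff) earn
    nothing, so low-quality data does not dilute the rewards of others.
    """
    eligible = {who: s for who, s in scores.items() if s >= min_score}
    total = sum(eligible.values())
    if total == 0:
        # Nobody cleared the cutoff: the pool stays undistributed.
        return {who: 0.0 for who in scores}
    return {who: pool * eligible.get(who, 0.0) / total for who in scores}
```

With scores of 0.9, 0.6, and 0.2 and a pool of 150 tokens, the first two contributors split the pool 3:2 and the third receives nothing.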

Data Privacy Protection: Privacy-enhancing technologies like public-key encryption and zero-knowledge proofs (ZKPs) are often employed to protect the privacy of data contributors in decentralized data preparation platforms. For instance, Vana uses ZKP schemes such as Groth16 to verify the authenticity and integrity of data without revealing its full content. Similarly, Ocean Protocol employs encryption and blockchain-based access control mechanisms to protect data privacy, using NFTs as keys to unlock datasets, with smart contracts enforcing these access controls.
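
The idea of a token acting as the key to a dataset can be sketched with a toy in-memory registry. This is a plain-Python stand-in for a smart contract, written under our own assumptions; the class `DatasetAccessRegistry` and its methods are hypothetical and do not mirror Ocean Protocol's actual contracts.

```python
from typing import Dict


class DatasetAccessRegistry:
    """Toy stand-in for an on-chain access-control contract."""

    def __init__(self) -> None:
        self._token_owner: Dict[int, str] = {}    # token id -> current holder
        self._dataset_token: Dict[str, int] = {}  # dataset id -> gating token

    def mint(self, token_id: int, owner: str, dataset_id: str) -> None:
        """Create the token that gates `dataset_id` and assign it to `owner`."""
        if token_id in self._token_owner:
            raise ValueError("token already minted")
        self._token_owner[token_id] = owner
        self._dataset_token[dataset_id] = token_id

    def transfer(self, token_id: int, sender: str, recipient: str) -> None:
        """Move the token; only the current holder may transfer it."""
        if self._token_owner.get(token_id) != sender:
            raise PermissionError("sender does not hold this token")
        self._token_owner[token_id] = recipient

    def can_access(self, caller: str, dataset_id: str) -> bool:
        """Access is granted only to whoever currently holds the gating token."""
        token_id = self._dataset_token.get(dataset_id)
        return token_id is not None and self._token_owner.get(token_id) == caller
```

Because access follows token ownership, transferring the token moves the access right with it and no separate permission list is needed.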

Compute

Compute resources play a pivotal role in the development of AI systems and directly impact model training and inference performance. Blockchain enables permissionless access to compute resources by allowing compute users and providers to interact without intermediaries like traditional cloud service providers. This creates a distributed network where anyone with idle hardware can contribute compute power, enhancing decentralization and making high-performance compute available for AI tasks. Platforms like Lilypad leverage blockchain to enable containerized workloads on idle compute nodes, promoting an open and accessible compute environment.

Incentives and Integrity in Compute Networks: Blockchain-based compute networks use incentive mechanisms to ensure efficient resource utilization. Tokenized economies, such as those in IO.NET and Akash, allow compute providers to monetize idle resources and enable users to access high-performance compute at competitive, market-driven prices. The transparency of blockchain ensures integrity and reliability by recording every task on-chain. With consensus mechanisms like proof-of-learning, blockchain-enabled DeAI protocols like Gensyn ensure that tasks are honestly executed and compute providers are fairly rewarded.
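
As a rough illustration of market-driven pricing, the sketch below picks the cheapest provider offer that satisfies a job's requirements. The `Offer` fields and the `match_job` rule are hypothetical simplifications and are not taken from IO.NET, Akash, or Gensyn.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Offer:
    provider: str
    gpu_memory_gb: int
    price_per_hour: float  # quoted in the network's token (illustrative)


def match_job(
    offers: List[Offer], required_memory_gb: int, max_price: float
) -> Optional[Offer]:
    """Return the cheapest offer that meets the job's needs, or None."""
    candidates = [
        o
        for o in offers
        if o.gpu_memory_gb >= required_memory_gb and o.price_per_hour <= max_price
    ]
    # `default=None` covers the case where no provider qualifies.
    return min(candidates, key=lambda o: o.price_per_hour, default=None)
```

A real network would also need to verify that the matched provider actually ran the task, which is the role the text assigns to mechanisms such as proof-of-learning.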

Training Phase

In the training phase, blockchain plays an equally important role in decentralizing AI processes:

Trustless Environment and Decentralized Training: Blockchain enables a trustless environment where participants do not need to rely on a central authority. All interactions, such as task assignments, model updates, and reward distributions, are recorded on an immutable ledger, ensuring transparency. Bittensor, for instance, encodes the rules governing the training process and records participant contributions on-chain, while Numerai provides a decentralized data science competition that promotes global collaboration without centralized oversight.

Decentralized Model Validation: Model updates are validated in a distributed manner. Validators within the blockchain network verify the correctness of these updates, ensuring that malicious or low-quality contributions do not negatively impact the system. Numerai, for example, aggregates independently trained models into a meta-model, evaluating performance based on individual contributions to ensure overall system integrity.
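
One simple way to picture aggregating independently trained models into a meta-model is a weighted average of their predictions. The sketch below assumes stake-weighted averaging purely for illustration; the function `meta_model` and its inputs are hypothetical and should not be read as Numerai's actual procedure.

```python
from typing import Dict, List


def meta_model(
    predictions: Dict[str, List[float]], stakes: Dict[str, float]
) -> List[float]:
    """Combine per-model predictions into one stake-weighted prediction vector."""
    total = sum(stakes[m] for m in predictions)
    if total <= 0:
        raise ValueError("total stake must be positive")
    n = len(next(iter(predictions.values())))
    if any(len(p) != n for p in predictions.values()):
        raise ValueError("all models must predict the same number of targets")
    # Each output is the stake-weighted mean of the models' predictions.
    return [
        sum(stakes[m] * p[i] for m, p in predictions.items()) / total
        for i in range(n)
    ]
```

Under this assumption a model backed by three times the stake of another has three times the influence on the combined prediction.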

Consensus and Incentive Mechanisms: Blockchain-based training employs consensus and incentive mechanisms to reward participants for high-quality contributions. Bittensor's Yuma Consensus allocates rewards to miners and validators based on their contributions, while FLock uses staking to reward participants proportionally to their performance. These mechanisms encourage participants to develop reliable and effective models, fostering a collaborative ecosystem.

Post-Training Phase

Decentralized Model Inference

In traditional AI systems, model inference involves deploying a trained model on centralized servers to make predictions or decisions based on new input data. This centralized approach faces challenges related to information inefficiency and inference integrity. Blockchain-based decentralized model inference addresses these challenges by offering transparent and verifiable inference processes.

Decentralized Inference Verification: Two main approaches are used for inference verification: ZKP-based and optimistic proof-based verification. ZKP-based inference, as in Sertn, allows participants to prove that an inference was conducted correctly without revealing underlying details, ensuring privacy while maintaining trust. Optimistic proof-based inference, as in ORA, operates under the assumption that results are correct unless challenged, incorporating an interactive fraud-proof protocol to resolve disputes.
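
The optimistic approach can be sketched as a claim that is accepted by default once a challenge window passes. The sketch makes two simplifying assumptions that should be stated plainly: the model is deterministic, and disputes are settled by full re-execution rather than by the interactive fraud-proof protocol mentioned above. `Claim`, `challenge`, `finalize`, and the `CHALLENGE_WINDOW` value are hypothetical names and are not ORA's API.

```python
from dataclasses import dataclass
from typing import Callable

CHALLENGE_WINDOW = 10  # blocks; illustrative value only


@dataclass
class Claim:
    input_value: float
    claimed_output: float
    submitted_at: int        # block height at submission
    status: str = "pending"  # pending -> finalized | rejected


def challenge(claim: Claim, model: Callable[[float], float], now: int) -> bool:
    """Dispute a pending claim by re-running the model.

    Returns True if the challenge succeeds, meaning the claim was wrong.
    """
    if claim.status != "pending" or now > claim.submitted_at + CHALLENGE_WINDOW:
        return False  # already settled, or the window has closed
    if model(claim.input_value) != claim.claimed_output:
        claim.status = "rejected"
        return True
    return False


def finalize(claim: Claim, now: int) -> bool:
    """Accept an unchallenged claim once the window has passed."""
    if claim.status == "pending" and now > claim.submitted_at + CHALLENGE_WINDOW:
        claim.status = "finalized"
    return claim.status == "finalized"
```

The common case costs nothing beyond waiting out the window; the model is re-executed only when someone disputes a result.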

Incentive Mechanisms for Reliable Inference: Incentive mechanisms, such as those used in Allora, encourage network participation by rewarding nodes for producing high-quality inferences and validating the work performed by others. This improves information exchange efficiency and ensures reliable inference outcomes.

Decentralized AI Agents

Blockchain-based decentralized agents present a promising solution to the challenges faced by traditional AI agents. By operating in a decentralized environment, these agents eliminate reliance on centralized infrastructure. Platforms like Fetch.AI enable the creation of autonomous economic agents that interact within an Open Economic Framework (OEF), facilitating efficient transactions without central control. Similarly, Delysium and Theoriq ensure secure communication, auditability, and the ability to collaborate on complex tasks.

Trust, Transparency, and Incentives for Decentralized Agents: Blockchain also enhances trust and transparency in decentralized agent interactions. By recording actions on-chain, agents can trust that their behaviors and agreements will be executed as predefined without third-party intermediaries. Incentive mechanisms are key to fostering collaboration among decentralized agents. Protocols like Morpheus use token-based incentive models to motivate agents to contribute resources, data, or computational power. This encourages agents to collaborate on complex tasks and innovate, ultimately improving system efficiency.

Decentralized AI Model Marketplace

Blockchain-based AI model marketplaces address challenges such as unfair compensation and lack of transparency. Traditional marketplaces often fail to fairly compensate model creators and use undisclosed algorithms for ranking models. Blockchain helps address these issues by enabling decentralization, transparency, and fair incentivization for contributors.

Tokenization of AI Assets and Fair Incentives: Blockchain tokenization allows AI assets, including models and datasets, to be represented as digital tokens, which can be traded or licensed with transparent ownership and secure provenance. Projects like BalanceDAO and SingularityNET empower model contributors through token-based rewards, ensuring that model creators are directly compensated for their contributions. Similarly, the Sahara AI Marketplace creates a decentralized hub for trading AI assets, ensuring rewards are distributed fairly.

Transparent Ranking and Trustworthy Ownership: Transparent ranking algorithms are another benefit of decentralized AI marketplaces. Platforms like Immutable Labs use blockchain-based models to provide fair model rankings and recommendations, ensuring contributors are presented transparently based on performance and reputation. Sahara AI enhances transparency by using non-fungible receipts as on-chain proof of AI model ownership, countering traditional biases and building trust in the marketplace.

Open Research Questions for DeAI

DeAI is still in its nascent stage, and despite significant efforts from the industry, we still have a long way to go. In the SoK, we therefore explore several open research questions in the hope of contributing to DeAI's future, both in research and in application:

Decentralized Solutions for Task Proposing: Although task proposing marks the beginning of an AI model’s lifecycle, a decentralized solution for this stage remains absent in the current DeAI landscape. A decentralized task-proposing platform typically requires solutions for both distributed learning algorithm preparation and decentralized code verification. The latter can offer a robust approach for DeAI by enabling objective, transparent, and efficient evaluations through consensus mechanisms, distributed validation, and reputation-based incentives. These features help address traditional verification challenges such as subjectivity, risks of collusion, and inefficiency. However, there is still a need for blockchain-enabled frameworks for decentralized code verification that can ensure both code security and operational efficiency in DeAI.

Security Issues Caused by Centralized Components in DeAI: Although DeAI protocols aim for decentralization, AI model training often relies on centralized third-party services. For example, on July 2, 2024, Bittensor faced a major security breach through its Python package on PyPI. A malicious actor uploaded a compromised package disguised as a Bittensor update, containing code that stole unencrypted private keys (coldkeys) during key decryption operations. This allowed unauthorized access to users’ wallets for fund transfers. The incident highlights a critical issue: in DeAI platforms, even when the underlying blockchain itself is secure, vulnerabilities in centralized third-party tools can undermine overall system security.

Lightweight Privacy-Preserving DeAI Solutions: Our analysis of DeAI protocols shows a growing use of ZKPs for privacy, security, and integrity in decentralized ML. Examples include Vana’s proof of data contribution, OpSec’s task verification, and Sertn’s proof of service, illustrating ZKPs' importance. However, on-chain ZKP generation and verification remain computationally expensive, posing scalability challenges and necessitating optimizations. While the OML project offers a promising concept of “AI-native cryptography”, which is tailored for continuous AI data representations rather than discrete data, realizing such lightweight solutions for DeAI requires further research and innovation.

Efficiency Evaluation and Scalability of DeAI: DeAI currently lacks standardized benchmarking frameworks tailored to its unique architecture, making it difficult to assess and compare decentralized models' performance with advanced centralized models such as GPT and Llama. Developing robust evaluation criteria for DeAI models is an important research challenge. Moreover, the blockchain components in DeAI may introduce performance bottlenecks, especially when computations are performed on-chain. Moving heavy computations off-chain could reintroduce centralized control over critical elements of the AI pipeline, defeating blockchain’s purpose. Effective scaling of DeAI requires adaptive techniques for distributed model training, efficient communication, and convergence guarantees, yet real-world implementation and validation remain challenging. Addressing these issues is essential for DeAI to achieve both high performance and true decentralization.

Combining DeAI with Other Decentralized Applications: The integration of DeAI with other decentralized platforms offers promising new applications, particularly in areas such as IoT and DeFi. A notable example is PINAI, which aims to create a decentralized platform for Personal AI, prioritizing user privacy, data ownership, and crypto-economic security. PINAI builds on open-source AI, leveraging blockchain to facilitate a secure and transparent environment where AI models can access rich contextual data without compromising privacy. Additionally, DeAI’s integration with DeFi platforms, such as Giza ARMA agents, Compass Labs DeFi agents, and NOYA, can enhance algorithmic trading models, while its use in decentralized data marketplaces enables data sharing and monetization. Such combinations create new opportunities for building resilient, transparent, and efficient AI-driven applications across various domains.
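
Returning to the question of centralized components, one common mitigation for tampered packages is to pin a known-good digest and refuse any artifact that does not match it. The sketch below shows the check using only the Python standard library. The helper names `sha256_hex` and `verify_artifact` are hypothetical, and we do not claim this check alone would have prevented the incident described above.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the artifact bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, pinned_digest: str) -> bool:
    """True only if the artifact's digest matches the digest pinned in advance.

    The pinned digest must come from a trusted channel, for example a lock
    file reviewed and committed before the release was downloaded.
    """
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(sha256_hex(data), pinned_digest.lower())
```

In practice the same idea is available through pip's hash-checking mode (`pip install --require-hashes -r requirements.txt`), which rejects any download whose hash is not listed in the requirements file.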

More about FLock

At FLock, we pride ourselves on being a research-driven organization, as most of our features in production originate from peer-reviewed academic work. To read our SoK, visit here. All DeAI-related materials covered in the paper are now publicly available in this GitHub repository. FLock is fully committed to decentralizing AI. To this end, we put our research outputs into real-life application. To learn more about FLock’s overall system design and technical components, see our Whitepaper.