Jensen Huang Called Out? SN3 Surged 5x in a Month—What Did It Actually Do?

On March 20, 2026, an unusual conversation took place on the All-In Venture Podcast.

Venture capital heavyweight Chamath Palihapitiya handed the mic to NVIDIA CEO Jensen Huang, mentioning that there’s a project on Bittensor “that achieved a pretty crazy technical feat”: training a large language model over the internet using distributed computing, fully decentralized, with no centralized data centers involved.

Huang didn’t shy away. He compared it to a “modern version of Folding@home,” the distributed-computing project from the 2000s in which ordinary users contributed idle compute to simulate protein folding.

Four days earlier, on March 16, Anthropic co-founder Jack Clark released a report on AI research progress, dedicating significant space to highlight and cite this breakthrough: the Bittensor ecosystem subnet Templar (SN3) completed distributed training of a 72-billion-parameter model (Covenant 72B), with performance comparable to Meta’s LLaMA-2 released in 2023.

Clark titled that section “Challenging AI Political Economy through Distributed Training,” emphasizing that this is a technology worth tracking—he envisions a future where edge devices adopt models trained via decentralized methods, while cloud AI continues to run proprietary large models.

Market reactions were slightly delayed but intense: SN3 surged over 440% in the past month, over 340% in the past two weeks, with a market cap reaching $130 million. The narrative explosion around the subnet directly fueled TAO token buying pressure. As a result, TAO soared, reaching $377 at one point, doubling in a month, with a fully diluted valuation around $7.5 billion.

The question is: what exactly did SN3 do? Why is it in the spotlight? How will the narrative around distributed training and decentralized AI evolve?

That 72B Model

To answer, we need to look at SN3’s achievements.

On March 10, 2026, Covenant AI published a technical report on arXiv announcing the completion of Covenant-72B training. It’s a 72-billion-parameter large language model, pretrained on roughly 1.1 trillion tokens across over 70 independent nodes (about 20 nodes synchronized per round, each equipped with 8 B200 GPUs).

Templar provided some benchmark data, comparing it to Meta’s LLaMA-2-70B from 2023. As Clark noted, Covenant-72B might be considered outdated by 2026 standards. Its MMLU score of 67.1 slightly exceeds LLaMA-2-70B’s 65.6.

However, the state-of-the-art models of 2026—the GPT series, Claude, Gemini—are trained at well over a trillion parameters on hundreds of thousands of GPUs, with order-of-magnitude improvements in reasoning, coding, and math. The performance gap isn’t just a few percentage points; it’s a fundamental difference, and market hype shouldn’t obscure that reality.

But under the premise of “training with distributed internet-scale compute,” the implications are entirely different.

For comparison: INTELLECT-1 (by the Prime Intellect team), a 10-billion-parameter model trained via decentralized methods, scored 32.7 on MMLU; another project, Psyche’s Consilience (40 billion parameters), scored 24.2. Covenant-72B, with 72 billion parameters and a 67.1 MMLU score, stands out in the decentralized training space.

More importantly, this training was “permissionless”: anyone with sufficient compute can join as a node, no pre-approval or whitelist needed. Over 70 independent nodes worldwide contributed to model updates.

What Did Huang Say (and Not Say)?

Restoring the details of that podcast conversation helps clarify how the “endorsement” is being interpreted.

Chamath presented Bittensor’s technical achievement to Huang, describing it as training a Llama model “completely distributed and stateful.” Huang responded by likening it to “a modern version of Folding@home,” and discussed the necessity of parallel coexistence of open-source and proprietary models.

Notably, Huang didn’t mention Bittensor’s token or any investment implications, nor did he delve into decentralized AI training further.

Understanding Bittensor Subnets and SN3

To grasp SN3’s breakthrough, first understand Bittensor and how its subnets operate. Simply put, Bittensor is a blockchain platform for AI, with each subnet functioning as an “AI production pipeline,” each with clear objectives, incentive mechanisms, and a decentralized ecosystem.

Its operation is transparent and decentralized: subnet owners define goals and develop incentive models; miners provide compute resources and perform AI tasks (inference, training, storage); validators score miners’ contributions and upload scores to the Bittensor consensus layer; finally, the Yuma consensus algorithm distributes rewards based on accumulated subnet incentives.
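The scoring-and-emission loop above can be sketched in a few lines. This is an illustrative toy, not Bittensor’s actual implementation: the real Yuma consensus additionally clips outlier validators toward the stake-weighted median, and all function names here are hypothetical.

```python
import numpy as np

def yuma_like_consensus(scores: np.ndarray, stake: np.ndarray) -> np.ndarray:
    """scores: (n_validators, n_miners) matrix of scores in [0, 1].
    stake: (n_validators,) validator stake used to weight opinions.
    Returns a stake-weighted consensus score per miner (simplified)."""
    w = stake / stake.sum()
    return w @ scores

def distribute_emissions(consensus: np.ndarray, emission: float) -> np.ndarray:
    """Split one round's token emission in proportion to consensus scores."""
    total = consensus.sum()
    return emission * consensus / total if total > 0 else np.zeros_like(consensus)

# Two validators score three miners; validator 1 has twice the stake.
scores = np.array([[0.9, 0.4, 0.1],
                   [0.8, 0.5, 0.0]])
stake = np.array([100.0, 50.0])
rewards = distribute_emissions(yuma_like_consensus(scores, stake), emission=1.0)
```

The point is the shape of the mechanism: validators express opinions, stake weights those opinions, and emissions follow the resulting consensus rather than any single validator’s score.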

Currently, Bittensor hosts 128 subnets covering inference, serverless AI cloud services, image processing, data labeling, reinforcement learning, storage, and computation.

SN3 is one of these subnets. It doesn’t wrap applications or rent existing APIs; instead, it targets one of the most expensive and closed links in the AI supply chain: large model pretraining.

SN3 aims to coordinate heterogeneous compute resources via Bittensor’s network, demonstrating that powerful foundational models can be trained without expensive centralized supercomputers. Its core appeal is “equity”—breaking resource monopolies in centralized training, enabling individuals and small organizations to participate, and reducing training costs through distributed compute.

The main driver behind SN3 is Templar, developed by Covenant Labs. They also operate two other subnets: Basilica (SN39, focused on compute services) and Grail (SN81, focused on RL fine-tuning and evaluation). These form a vertically integrated ecosystem covering the entire process from pretraining to alignment, creating a decentralized large-model training infrastructure.

Specifically, miners contribute compute and upload gradient updates (model parameter adjustments); validators score each contribution by how much it improves the model’s loss. If a miner’s contribution looks suspicious (for example, it improves loss more on random data than on the miner’s assigned data), the miner is penalized.

Rewards are allocated based on how much a miner’s contribution improves the model, directly incentivizing honest effort rather than just raw compute. This addresses the core challenge of preventing “free-riding” in decentralized settings.

How Does Covenant-72B Address Communication and Incentive Compatibility?

Coordinating dozens of untrusted, hardware-diverse nodes to collaboratively train a single model is hard on two fronts: communication efficiency and resistance to malicious contributions.

SN3 employs two key components: SparseLoCo and Gauntlet.

SparseLoCo tackles communication efficiency. Traditional distributed training synchronizes full gradients each step, which is bandwidth-heavy. SparseLoCo runs 30 local optimization steps (AdamW) per node, then compresses and uploads “pseudo-gradients” to others. Compression techniques include Top-k sparsification (keeping only the most important gradient components), error feedback (accumulating dropped parts), and 2-bit quantization. This achieves over 146x compression, reducing data transfer from 100MB to under 1MB.
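The compression step can be sketched as top-k sparsification with error feedback. This is a minimal illustration of the technique named above, not the SparseLoCo implementation; the real system also applies 2-bit quantization to the surviving values, omitted here, and the function name is hypothetical.

```python
import numpy as np

def compress_topk(pseudo_grad: np.ndarray, error_buf: np.ndarray, k: int):
    """Keep the k largest-magnitude entries of the pseudo-gradient; fold the
    dropped mass into error_buf so it gets re-sent in later rounds."""
    g = pseudo_grad + error_buf            # add back previously dropped parts
    idx = np.argsort(np.abs(g))[-k:]       # indices of the k largest entries
    sparse = np.zeros_like(g)
    sparse[idx] = g[idx]
    new_error = g - sparse                 # what was dropped this round
    return idx, g[idx], new_error

rng = np.random.default_rng(0)
grad = rng.normal(size=10_000)
err = np.zeros_like(grad)
# Sending 68 of 10,000 values is roughly a 146x reduction in values transmitted.
idx, vals, err = compress_topk(grad, err, k=68)
```

Error feedback is what keeps this lossy scheme convergent: nothing is permanently discarded, only deferred until it accumulates enough magnitude to make the top-k cut.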

This allows the system to operate over typical internet connections (upstream 110 Mbps, downstream 500 Mbps), maintaining about 94.5% compute utilization—20 nodes, each with 8 B200 GPUs, with each round taking only 70 seconds.

Gauntlet ensures incentive compatibility. Running on Bittensor’s blockchain (Subnet 3), it verifies the quality of submitted pseudo-gradients by testing how much they reduce loss on a small dataset (LossScore). It also checks whether nodes are training on their assigned data—if a node improves loss more on random data than on its own data, it gets penalized.

Only the top-scoring nodes’ gradients are aggregated each round; others are replaced. The system maintains robustness by allowing new participants to join at any time. On average, about 16.9 nodes contribute gradients per round, with over 70 unique nodes participating over the entire training.
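The Gauntlet checks described above reduce to a simple comparison: does a node’s update help its assigned data more than random data? Below is a toy sketch of that idea with a quadratic stand-in for the model; all names are illustrative, and the real LossScore operates on actual LLM batches.

```python
import numpy as np

def loss_score(loss_fn, params, pseudo_grad, batch):
    """Loss improvement from applying the node's update, measured on a batch."""
    return loss_fn(params, batch) - loss_fn(params - pseudo_grad, batch)

def evaluate_node(loss_fn, params, pseudo_grad, assigned, random_batch):
    """Score on assigned data; flag nodes whose update helps random data more,
    a sign they did not train on what they were given."""
    own = loss_score(loss_fn, params, pseudo_grad, assigned)
    rnd = loss_score(loss_fn, params, pseudo_grad, random_batch)
    return own, own >= rnd

# Toy "model": loss is the mean squared distance from the batch target.
def loss_fn(params, batch):
    return float(np.mean((params - batch) ** 2))

params = np.array([1.0])
assigned = np.array([0.0])         # this node's data pulls params toward 0
honest_grad = np.array([0.5])      # a step toward the assigned data
score, honest = evaluate_node(loss_fn, params, honest_grad,
                              assigned, random_batch=np.array([2.0]))
```

Per round, only the top-scoring updates would then be aggregated into the shared model; low scorers and flagged nodes are simply left out, which is what makes free-riding unprofitable.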

The Fundamental Shift in Decentralized AI Narrative

From a technical and industry perspective, Covenant-72B signifies several meaningful developments.

First, it challenges the assumption that “distributed training only works for small models.” While still far from frontier models, it demonstrates scalability.

Second, permissionless participation is feasible. Previously, distributed projects relied on whitelists and pre-approval; SN3 shows anyone with sufficient compute can join, with the validation mechanism filtering malicious actors. This is a concrete step toward “true decentralization.”

Third, Bittensor’s dTAO mechanism enables market discovery of subnet value. Subnets can issue their own tokens, and markets determine which subnets receive more TAO emissions via AMM mechanisms. This provides a rough but effective value capture for productive subnets like SN3. However, this mechanism remains vulnerable to narrative and sentiment influences, and the quality of LLM training results is hard for ordinary market participants to evaluate independently.

Fourth, the political-economic implications of decentralized AI training are profound. Clark raised this in Import AI: “Who owns the future of AI?” Currently, a few large data-center owners dominate frontier model training—this is a power and commercial issue. If decentralized training continues to advance, it could foster a genuinely open ecosystem for certain models, especially smaller, domain-specific ones. But this future is still distant.

Summary: A Real Milestone and a Series of Open Questions

Huang likened this to “a modern version of Folding@home.” That project made real contributions to molecular science but didn’t threaten the core R&D of big pharma. The analogy is apt.

SN3 has validated the protocol and demonstrated the feasibility of decentralized training. But behind this achievement lie many questions rarely discussed:

  • MMLU itself is controversial; benchmark test data may leak into training sets. Comparing Covenant-72B to LLaMA-2-70B from 2023 may also understate the current state of the art. Evaluated on more recent, more robust benchmarks, the results might look different.

  • Data quality remains a bottleneck. High-quality training data—dialogues, code, scientific literature—is concentrated in the hands of a few large companies, publishers, and academic institutions. Democratizing compute doesn’t solve data monopoly issues.

  • Security concerns: permissionless participation means unknown nodes, unknown data sources. Gauntlet filters obvious anomalies but can’t prevent subtle data poisoning. If malicious nodes systematically bias training, the model’s behavior could drift in harmful directions. In high-stakes domains like finance, healthcare, or law, deploying such models could pose risks.

  • Covenant-72B is open-sourced under Apache 2.0 and does not use SN3 tokens. Holders of SN3 tokens share in future emissions from ongoing training, not direct model usage revenue. The token’s value depends on continuous training output and the health of the overall network emission mechanism. If training stalls or results decline, token valuation could weaken.

Listing these issues isn’t meant to dismiss Covenant-72B’s significance. It proves that something once thought impossible can be achieved. But achieving it and understanding what it means are two different things.

Over the past month, SN3’s token surged over 440%. The gap between price and demonstrated capability isn’t unusual; narratives routinely move faster than reality. Whether that gap gets closed by actual progress or by market correction depends on what the Covenant AI team delivers next.

Notably, Grayscale filed for a TAO ETF in January 2026, signaling institutional interest. In addition, Bittensor halved daily TAO emissions in December 2025, tightening supply-side dynamics.
