Nucleus-Image goes open source: a 17B-parameter MoE that activates only 2B at inference and surpasses Imagen4 on benchmarks without any post-training
News report, April 16 (UTC+8): according to BlockBeats monitoring, the Nucleus AI team has released the text-to-image model Nucleus-Image, simultaneously open-sourcing the model weights, training code, and training dataset under the Apache 2.0 license, which permits commercial use. The model uses a sparse mixture-of-experts (MoE) diffusion transformer architecture with 17 billion total parameters distributed across 64 routed experts per layer; at inference only about 2 billion parameters are activated, significantly cutting inference cost relative to dense models of similar size.
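The sparsity described above comes from top-k expert routing: each token is sent to only a few of the 64 experts per layer, so only those experts' parameters participate in the forward pass. The article does not state the top-k value, so the sketch below uses an assumed k of 8 purely for illustration.

```python
import math

def topk_route(logits, k):
    """Pick the top-k experts for one token and renormalize their
    router scores with a softmax (standard MoE top-k routing)."""
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    chosen = ranked[:k]
    exps = [math.exp(logits[i]) for i in chosen]
    z = sum(exps)
    return {i: e / z for i, e in zip(chosen, exps)}

# Illustrative numbers: the article gives 64 experts per layer and
# ~2B of 17B parameters active, but not the exact top-k value.
NUM_EXPERTS = 64
TOP_K = 8  # assumed for illustration

gates = topk_route([0.01 * i for i in range(NUM_EXPERTS)], TOP_K)
print(len(gates))                     # only 8 of 64 experts receive the token
print(round(sum(gates.values()), 6))  # their gate weights sum to 1.0
```

Because only the chosen experts' weight matrices are multiplied per token, compute per forward pass scales with the active-parameter count rather than the 17B total.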
On three standard benchmarks, Nucleus-Image matches or surpasses leading closed-source models: a GenEval score of 0.87, on par with Qwen-Image, with its spatial-position sub-score (0.85) ranking first among all compared models; a DPG-Bench score of 88.79, first overall; and a OneIG-Bench score of 0.522, surpassing Google's Imagen4 (0.515) and Recraft V3 (0.502). All of these results come from pure pretraining alone, with no DPO, reinforcement learning, or human-preference tuning.
Nucleus AI officially describes this as "the first fully open-source MoE diffusion model at this quality level." The training data was crawled at scale from the internet and repeatedly filtered, deduplicated, and aesthetically scored, yielding 700 million images and 1.5 billion image-text pairs. Training proceeded in three stages, raising resolution progressively from 256 to 1024 over a total of 1.7 million steps.
The text encoder is Qwen3-VL-8B-Instruct, invoked via the diffusers library, with a built-in text KV cache that reuses the prompt's key/value tensors across denoising steps, further reducing inference overhead.
For developers who need local deployment of image generation, the 17B-total / 2B-active design means consumer-grade GPUs can run it. A fully open release (weights + training code + dataset) is relatively rare: most open-source image models release only weights, keeping datasets and training details closed, which remains one of the main bottlenecks for reproducible research in text-to-image generation.
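A rough sense of why the 2B-active figure matters for consumer GPUs: at 2 bytes per parameter (assuming bf16 weights), the active parameters alone fit comfortably in typical consumer VRAM. This back-of-envelope sketch ignores activations and the question of where the full 17B weights reside (system RAM, offloading), so it is a lower bound, not a deployment guide.

```python
# Back-of-envelope VRAM estimate for the *active* parameters only,
# assuming bf16 storage (2 bytes/param). The remaining experts still
# need system RAM or streaming, which this sketch deliberately ignores.
ACTIVE_PARAMS = 2e9      # ~2B active parameters (from the article)
BYTES_PER_PARAM = 2      # bf16, an assumption
gib = ACTIVE_PARAMS * BYTES_PER_PARAM / 2**30
print(f"{gib:.1f} GiB")  # ~3.7 GiB for the active weights
```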
(Source: BlockBeats)