NVIDIA reportedly developing AI inference chips; OpenAI may become its largest customer

Caixin, February 28 (Editor: Xia Junxiong) — According to media reports citing sources, chip giant NVIDIA plans to release a new processor designed specifically for clients such as the AI research company OpenAI, helping them build faster, more efficient AI tools.

Sources revealed that NVIDIA is developing a new inference computing system. This new platform is expected to be announced next month at the NVIDIA GTC developer conference in San Jose and will incorporate chips designed by startup Groq.

Inference is the process by which a trained AI model responds to user queries, and it has become a fiercely contested focus of the industry. Companies including Google and Amazon have already designed their own chips to compete with NVIDIA's flagship systems.

The rapid growth of AI-assisted programming in the tech industry has also driven demand for new chips that can handle complex AI workloads more efficiently.

Sources said that OpenAI has agreed to become one of the largest customers for this new processor, marking a significant victory for NVIDIA.

As one of NVIDIA’s biggest clients, OpenAI has been seeking more efficient alternatives to NVIDIA chips over the past few months and signed an agreement last month with chip startup Cerebras to diversify its options.

Potential Challenges for NVIDIA GPUs

NVIDIA has long dominated the GPU (Graphics Processing Unit) market. Analysts estimate that NVIDIA controls over 90% of the GPU market share.

GPUs are processors capable of executing billions of simple calculations in parallel.

NVIDIA's Hopper, Blackwell, and Rubin series GPUs are considered the industry benchmark for training large-scale AI models, and they command premium prices.

However, for the first time since the AI boom began, NVIDIA's flagship products are showing their limits. As the market's focus shifts from training to inference, some customers are pressuring NVIDIA to develop chips better suited to running AI applications efficiently.

Over the past year, as companies deploy AI agents and other tools, the demand for advanced computing power has shifted from training to inference. AI agents are systems capable of autonomously performing tasks on behalf of users.

Many companies building and operating AI agents have found that GPUs are too costly, consume too much energy, and are not optimal for running models in practice. With the rapid rise of “agentic AI,” NVIDIA faces pressure to develop lower-cost, more energy-efficient inference chips.

Last month, OpenAI signed a multi-billion-dollar computing partnership with Cerebras, which offers inference-focused chips that Cerebras claims are faster than NVIDIA GPUs.

Google's in-house Tensor Processing Units (TPUs) also pose a significant challenge to NVIDIA; Google is actively positioning TPUs as a replacement for GPUs.

To strengthen its competitive moat, NVIDIA agreed at the end of last year to pay $20 billion to license key technology from Groq and hired its senior team, including founder Jonathan Ross. This was one of the largest “acqui-hire” deals in Silicon Valley history.

Groq's chips, called "Language Processing Units," use a different architecture from NVIDIA's and are highly efficient at inference workloads. NVIDIA has not yet publicly disclosed how it plans to use Groq's technology.
