Although a single TPU does not match the performance of the most powerful GPU, Google is leveraging ultra-large-scale clusters and higher cost-effectiveness to challenge NVIDIA's pricing power and market dominance. The real battleground lies in the ecosystem and business model — NVIDIA locks in users with CUDA, while Google opens new entry points with TPU + Gemini. NVIDIA has a clear advantage in versatility and ecosystem maturity, but as an increasing number of leading customers begin to experiment with TPUs, any slight shift will be quickly magnified by the market.
As Google $Alphabet-C (GOOG.US)$ begins attempting to sell its self-developed AI chip, the TPU (Tensor Processing Unit), into a broader market, it is bringing a 'chip competition' that was originally confined to the cloud into the open, posing a substantive challenge to NVIDIA, the dominant player in the AI chip industry.
According to a recent article by technology media outlet The Information, it is impossible for $NVIDIA (NVDA.US)$ to ignore that the world's two most advanced AI models (one from Google, the other from Anthropic) were fully or partially developed using Google's self-developed TPU chips rather than NVIDIA's GPUs. This reality has prompted $Meta Platforms (META.US)$, one of NVIDIA's largest customers, to seriously consider using Google's TPUs to develop new models.
This signifies that the role of the TPU has evolved from being an 'internal tool for Google' to becoming a viable alternative that major AI companies are seriously considering. According to a previous analysis by Morgan Stanley, Google plans to produce over 3 million TPUs by 2026 and approximately 5 million by 2027, while NVIDIA's current GPU production volume is about three times that of Google’s TPUs.
Performance Comparison: Does a Single Chip Lose While the System Wins?
In terms of pure computational power, the most advanced TPU (codenamed Ironwood) delivers about half the number of floating-point operations per second (FLOPS) compared to NVIDIA's Blackwell GPU.
However, this does not mean that the TPU is at a disadvantage.
The Information reports that Google's strategy is to amplify its performance advantage through clustering: thousands of TPUs can be linked together into a 'SuperPod' that offers excellent cost-performance and energy efficiency when training ultra-large models. In contrast, NVIDIA's systems can directly connect only around 256 GPUs, although users can scale further with additional networking equipment.
In the era of large models, it has become increasingly difficult to determine superiority based solely on 'single-chip performance.' System-level design, interconnection capabilities, and energy efficiency ratios are emerging as new core metrics.
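The system-level argument above can be illustrated with back-of-the-envelope arithmetic. The per-chip ratio (a TPU at roughly half a GPU's FLOPS) comes from the article; the pod size of 9,000 chips and the normalized compute units below are illustrative assumptions, not vendor specifications.

```python
# Rough sketch: per-chip vs. system-level compute, under assumed numbers.
GPU_FLOPS_PER_CHIP = 1.0   # normalized: one unit of compute per GPU
TPU_FLOPS_PER_CHIP = 0.5   # roughly half per chip, per the article

GPU_DOMAIN_SIZE = 256      # GPUs directly connected (per the article)
TPU_POD_SIZE = 9_000       # "thousands" of TPUs per SuperPod (assumed figure)

gpu_domain_flops = GPU_FLOPS_PER_CHIP * GPU_DOMAIN_SIZE
tpu_pod_flops = TPU_FLOPS_PER_CHIP * TPU_POD_SIZE

print(f"GPU domain total: {gpu_domain_flops:.0f} units")
print(f"TPU pod total:    {tpu_pod_flops:.0f} units")
print(f"TPU pod / GPU domain: {tpu_pod_flops / gpu_domain_flops:.1f}x")
```

Under these assumed figures, the weaker chip still yields an order of magnitude more directly connected compute, which is the article's point: interconnect domain size can matter more than single-chip FLOPS.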
Key Difference: Software Ecosystem Remains NVIDIA's Moat
What truly constitutes NVIDIA's 'moat' is not just the hardware, but the deeply integrated CUDA software ecosystem.
The Information article states that for customers already running their AI workloads on NVIDIA's CUDA software platform, renting NVIDIA chips is more cost-effective, while developers with the time and resources to rewrite their programs can cut costs by switching to TPUs.
For advanced TPU customers such as Anthropic, Apple, and Meta, the challenges of using TPUs are relatively minor, as they are more adept at writing server chip software for AI applications. TPUs demonstrate particularly high cost efficiency when running Google’s Gemini models optimized for them.
However, software compatibility remains a major challenge for TPUs. TPUs can only work seamlessly with specific AI software tools like TensorFlow, while PyTorch, which is widely used by most AI researchers, performs better on GPUs. Multiple engineers have noted that if developers invest time in creating custom software to fully leverage GPUs, their performance could surpass that of TPUs.
Cost Battle: TPUs Are Not 'Cheap'
In terms of manufacturing costs, TPUs and GPUs are actually not far apart. Ironwood uses a more advanced, and thus more expensive, process technology than Blackwell, but due to the smaller chip size, more TPUs can be cut from a single wafer, partially offsetting the cost disadvantage.
Both use High Bandwidth Memory (HBM), and Broadcom plays an extremely critical role in processes and packaging—participating not only in packaging design but also providing key IPs like SerDes (a core technology for high-speed data transmission). Analyst firms estimate that Broadcom earned at least $8 billion from the TPU project.
Notably, NVIDIA’s current hardware business gross margin is as high as 63%, whereas Google Cloud’s overall margin is only 24%. This also explains why NVIDIA can maintain strong profitability even in price wars.
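The pricing headroom implied by those margins can be sketched with simple arithmetic. The two percentages are as cited in the article; note the caveat that NVIDIA's hardware gross margin and Google Cloud's overall margin are not strictly comparable metrics, so this is only a rough illustration.

```python
# Illustrative arithmetic only: how far NVIDIA could cut prices and still
# earn a gross margin equal to Google Cloud's overall margin.
nvda_gross_margin = 0.63   # NVIDIA hardware gross margin (per the article)
gcp_margin = 0.24          # Google Cloud overall margin (per the article)

price = 1.0                                 # normalized current selling price
cost = price * (1 - nvda_gross_margin)      # implied unit cost: 0.37

# Lowest price at which NVIDIA would still clear a 24% margin:
floor_price = cost / (1 - gcp_margin)

print(f"Implied unit cost:         {cost:.2f}")
print(f"Price floor at 24% margin: {floor_price:.3f}")
print(f"Possible price cut:        {1 - floor_price:.0%}")
```

Under this sketch NVIDIA could cut prices by roughly half and still match Google Cloud's margin, which is why it can afford to fight a price war.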
Capacity Game: Taiwan Semiconductor’s 'Balancing Act'
On the foundry side, Taiwan Semiconductor will not bet all its capacity on a single customer. Even with NVIDIA’s extremely strong demand, it is difficult to secure 'unlimited supply.' This means that there will always be room in the market for other solutions—including TPUs.
According to Morgan Stanley's forecast, Google plans to produce 3 million TPUs by 2026, reaching 5 million by 2027, or possibly even higher. Currently, NVIDIA's GPU production is about three times that of TPUs, but the gap is narrowing.
As supply begins to diversify, customers will naturally become more willing to compare, negotiate prices, and spread risks.
Commercialization Challenges: Selling Chips Is Far Harder Than Expected
The Information believes that if Google truly intends to sell TPUs on a large scale, it would need to almost entirely rebuild an entire industrial chain—including server manufacturers, distribution networks, and enterprise-grade after-sales support—essentially 'replicating NVIDIA.'
Moreover, if customers deploy TPUs in their own data centers, Google would lose part of its cloud service revenue (such as storage and database services). This implies that TPUs are unlikely to be sold under a 'low-price strategy'; Google would more likely make up the revenue gap through other fees.
In other words, this is not a business where 'being cheap guarantees success,' but rather a complex strategic choice.
From a broader perspective, the significance of TPUs for Google goes beyond hardware revenue itself. More importantly, they can serve as bargaining chips in negotiations with NVIDIA, help promote Gemini and its AI ecosystem, and give Google greater autonomy in AI infrastructure. As long as customers are willing to have 'another option,' NVIDIA will no longer hold absolute pricing power.
This, perhaps, is what Google truly desires.
Editor/Liam