OpenAI's First Custom Chip 'Jalapeño' Targets the Inference Bottleneck

OpenAI and Broadcom have unveiled Jalapeño, a custom chip built specifically for inference and OpenAI's clearest move yet to escape its dependence on third-party GPUs.

OpenAI is no longer content to be just a customer in the great AI compute scramble. Together with Broadcom, the company has unveiled Jalapeño, its first custom chip — and it is purpose-built for one job: inference.

Built for serving, not training

While much of the industry's silicon arms race has centered on training ever-larger models, the real recurring cost for a company like OpenAI is inference — actually running those models for hundreds of millions of users every day. Jalapeño is optimized specifically for those serving workloads, where efficiency per token translates directly into margin. OpenAI has already received initial samples, with deployment planned by the end of the year.

Loosening Nvidia's grip

The strategic message is hard to miss. By designing its own inference silicon with Broadcom, OpenAI reduces its dependence on third-party chip suppliers and gains leverage over its single largest cost center. It joins Google (TPUs), Amazon (Trainium and Inferentia), and Microsoft (Maia) on the growing list of AI giants building in-house accelerators rather than paying premium prices for general-purpose GPUs.

Why it matters

Custom inference chips are how the economics of large-scale AI get rewritten. If Jalapeño delivers meaningful cost-per-query savings, OpenAI can serve more users at lower prices, or protect its margins as competition intensifies. It also signals a broader 2026 trend: frontier labs increasingly want to own the full stack, from model weights down to the metal.