OpenAI partnered with Broadcom in October 2025 to design a custom inference chip aimed at reducing the growing expense of ...
The next phase of AI infrastructure will not be defined by a single destination called “the cloud” or “the edge.” ...
AI inference infrastructure investment pulled $1.8 billion in 48 hours as Baseten’s $1.5B round at a $13B valuation and ...
The companies attributed this speed to a deep software-hardware co-development process that actively used OpenAI’s own models ...
Chinese AI models are rapidly closing the gap with U.S. frontier systems. This analysis examines what their growing ...
Nvidia is aiming to dramatically accelerate and optimize the deployment of generative AI large language models (LLMs) with a new approach to delivering models for rapid inference. At Nvidia GTC today, ...
Nvidia has released analysis showing a 4X to 10X reduction in cost per token for AI inferencing by switching to open source models. The cost discounts required combining Blackwell hardware with two ...
You're currently following this author! Want to unfollow? Unsubscribe via the link in your email. The AI coding sector has a problem. Heavy users of AI coding services have been racking up huge costs, ...