
The New Engine of Intelligence

In the early twentieth century, access to electrical power was the invisible variable that separated industrialising economies from those left behind. You could have the blueprints for a factory, the capital, and the workforce — but without reliable electricity, none of it ran. A century later, we find ourselves in a structurally identical moment, only the resource is compute, the factory is a data centre, and the electricity running through it is measured in GPU-hours.

Jensen Huang, NVIDIA's CEO, put it plainly in a recent earnings call: "Countries around the world are recognising AI as essential infrastructure — just like electricity and the internet." That framing is not marketing. It is a strategic assessment that is now shaping national policy, corporate capital allocation, and the competitive positioning of every serious technology organisation on earth.

I have been tracking this shift for nearly two decades. What has changed is the velocity. The GPU is no longer a component inside a workstation. It has become the primary unit of competitive advantage in the AI economy — and access to it is becoming the defining question of the decade.

The numbers behind the surge

Let us start with the facts, because the scale here tends to exceed intuition. NVIDIA — which commands roughly 86% of the AI data centre accelerator market as of late 2025, up from just 25% in 2021 — reported data centre revenue of USD 41.1 billion in a single quarter, representing a 73% year-on-year increase. That is not the revenue of a chip company. That is the revenue of a utility. (Sources: NVIDIA SEC filings (Q1–Q2 FY2026); Visual Capitalist / Voronoi AI chip market data; Deloitte Technology Predictions 2026)

USD 41.1B: NVIDIA data centre revenue, Q2 FY2026 (single quarter)

86%: NVIDIA's AI data centre market share, late 2025 (up from 25% in 2021)

USD 400–450B: global AI data centre capex forecast for 2026 (Deloitte)

USD 1T: projected annual AI data centre capex by 2028

Zoom out further and the picture becomes even more striking. Gartner projects global AI spending approaching USD 1.5 trillion in 2025, encompassing generative AI software, cloud services, and AI-optimised infrastructure. Citigroup has forecast AI-related capital spending by the major technology firms to exceed USD 2.8 trillion through 2029. These are not speculative projections. They are commitments already embedded in corporate balance sheets and government budgets.

The server market tells its own story. Worldwide server market spending grew 97.3% year-on-year in Q2 2025 — effectively doubling in twelve months. The AI data centre GPU market, valued at USD 10.5 billion in 2025, is projected to reach USD 77.15 billion by 2035, compounding at 22% annually. And by 2026, Deloitte estimates that inference workloads alone — the compute required to run, not just train, AI models — will account for roughly two-thirds of all compute, a market exceeding USD 50 billion.

The shift from training to inference is the signal that AI has crossed the threshold from laboratory to economy. When a technology's running costs dwarf its development costs, it has become infrastructure.

Why GPUs specifically — and why substituting them is harder than it looks

The GPU's dominance over AI workloads is architectural. Where a CPU executes tasks sequentially — one complex instruction at a time — a GPU executes thousands of simpler operations in parallel. Training a large language model involves billions of floating-point multiplications happening simultaneously across matrices of extraordinary dimensions. The GPU was built, almost accidentally, to do exactly this.
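
To make the architectural point concrete, here is a minimal Python sketch contrasting the two execution styles. The triple loop stands in for sequential, one-instruction-at-a-time execution; the vectorised call hands the same grid of multiply-accumulates to an optimised parallel kernel (a BLAS library on a CPU, CUDA kernels on a GPU). Sizes and timings are illustrative, not a benchmark.

```python
import time
import numpy as np

# Illustrative sizes only; real training matrices are vastly larger.
N = 128
A = np.random.rand(N, N)
B = np.random.rand(N, N)

def matmul_sequential(A, B):
    """CPU-style execution: one multiply-accumulate at a time."""
    C = np.zeros((N, N))
    for i in range(N):
        for j in range(N):
            for k in range(N):
                C[i, j] += A[i, k] * B[k, j]
    return C

start = time.perf_counter()
C_seq = matmul_sequential(A, B)
seq_time = time.perf_counter() - start

# Vectorised form: the whole grid of multiply-accumulates is handed to an
# optimised parallel kernel (a BLAS library here; CUDA kernels on a GPU).
start = time.perf_counter()
C_par = A @ B
par_time = time.perf_counter() - start

print(f"sequential loop: {seq_time:.3f}s, parallel kernel: {par_time:.5f}s")
print("results agree:", np.allclose(C_seq, C_par))
```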

But the deeper moat is not the hardware. It is the software. NVIDIA's CUDA platform, built over nearly two decades, represents accumulated developer tooling, library optimisation, and ecosystem lock-in that cannot be replicated quickly. By 2026, over 75% of AI models rely on specialised accelerator chips, and the majority of those run on CUDA-optimised stacks. Intel's trajectory is instructive: it held roughly 68% of data centre chip revenue in 2021 and has since fallen to approximately 6%, primarily because its CPU-first architecture failed to pivot fast enough when the workload shifted.
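
That lock-in is visible in everyday code. The snippet below is a sketch of the device-selection pattern you will find in most PyTorch codebases: CUDA is probed first, as the assumed default accelerated backend, and everything else is a fallback. Nothing here is NVIDIA tooling per se; it is simply where the ecosystem's defaults have settled.

```python
import torch

# The de facto pattern in most AI codebases: CUDA is checked first, as the
# assumed first-class backend; everything else is a fallback.
if torch.cuda.is_available():
    device = torch.device("cuda")   # NVIDIA GPUs via the CUDA stack
elif torch.backends.mps.is_available():
    device = torch.device("mps")    # Apple silicon
else:
    device = torch.device("cpu")

weights = torch.randn(4096, 4096, device=device)
x = torch.randn(1, 4096, device=device)
y = x @ weights  # dispatched to cuBLAS kernels on CUDA devices
print(device, tuple(y.shape))
```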

This is Clayton Christensen's disruption logic playing out in real time. NVIDIA did not win by making better CPUs. It redefined the measure of performance entirely — from single-threaded clock speed to parallelised floating-point throughput. And the incumbents, optimised for the old metric, could not respond quickly enough.

Geopolitics enters the server room

If compute is infrastructure, then controlling compute is geopolitical power. The United States understood this earlier, and has acted on it more aggressively, than most. U.S. export controls on advanced AI chips have reshaped the global market in ways that are still reverberating.

Huang stated publicly in late 2025 that NVIDIA's share of China's advanced AI accelerator market had collapsed from roughly 95% to effectively zero, following successive rounds of export restrictions. The financial impact was immediate: in Q1 FY2026 alone, NVIDIA took a USD 4.5 billion charge related to H20 chip inventory and purchase obligations after new licensing requirements for China exports took effect.

The geopolitical lesson is stark: nations and companies that do not build or secure access to sovereign compute infrastructure are making themselves dependent on supply chains that can be interrupted overnight by a single policy announcement.

China's response has been to accelerate domestic silicon development, with Huawei, Cambricon, and others scaling their own AI accelerator programmes. Europe, meanwhile, is investing in its own AI infrastructure: NVIDIA is actively collaborating with France, Germany, Italy, Spain and the UK to build Blackwell AI infrastructure for European manufacturers. Japan is developing a successor to its Fugaku supercomputer. The Stargate Project in the US represents a USD 500 billion commitment to domestic AI infrastructure dominance.

What was once a technology procurement decision has become a foreign policy matter. Sovereign compute — the idea that nations must own and control their own AI infrastructure — is now a serious framework in policy circles from Brussels to Kuala Lumpur to Riyadh.

The startup problem — and the democratisation opportunity

For startups, the GPU question is existential in a different but equally urgent way. Access to compute determines what you can build, how fast you can iterate, and whether you can compete with better-funded rivals. The H100, NVIDIA's previous-generation workhorse, carries a list price that can reach USD 30,000 per unit. A meaningful training cluster requires hundreds of them. For a pre-Series A team, that arithmetic is prohibitive.
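
The arithmetic is worth spelling out. This sketch uses the USD 30,000 list price cited above; the cluster size and overhead multiplier are my own illustrative assumptions, not vendor figures.

```python
# Back-of-envelope cluster cost. The unit price is the figure cited above;
# cluster size and overhead multiplier are illustrative assumptions.
UNIT_PRICE_USD = 30_000     # H100 list price cited above
GPUS_IN_CLUSTER = 512       # assumption: a "meaningful" training cluster
OVERHEAD_FACTOR = 1.5       # assumption: networking, power, cooling, host servers

hardware = UNIT_PRICE_USD * GPUS_IN_CLUSTER
total = hardware * OVERHEAD_FACTOR
print(f"GPUs alone:    USD {hardware / 1e6:.1f}M")   # -> USD 15.4M
print(f"with overhead: USD {total / 1e6:.1f}M")      # -> USD 23.0M
```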

But the market is responding. The GPU-as-a-service rental market reached USD 7.38 billion in 2026 and is projected to grow a further 28.73% in 2027. Cloud providers — AWS, Google Cloud, Microsoft Azure, Oracle — are racing to expand GPU-backed compute availability. Specialised providers are emerging with startup-oriented pricing and developer tooling designed to reduce the friction of access. The inference chip market, targeting real-time AI execution rather than training, is itself projected to exceed USD 50 billion in 2026, creating space for more cost-efficient deployment models.
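
A rough rent-versus-buy calculation shows why that rental market exists. The hourly rate below is a hypothetical placeholder for an on-demand H100-class instance, not a price quoted by any provider.

```python
# Hypothetical rent-versus-buy breakeven. The hourly rate is an assumed
# placeholder, not a quoted price from any provider.
BUY_PRICE_USD = 30_000     # per-GPU list price cited above
RENT_USD_PER_HOUR = 3.00   # assumption: on-demand H100-class hourly rate

breakeven_hours = BUY_PRICE_USD / RENT_USD_PER_HOUR
years = breakeven_hours / (24 * 365)
print(f"breakeven: {breakeven_hours:,.0f} GPU-hours (~{years:.1f} years of 24/7 use)")
# -> 10,000 GPU-hours, roughly 1.1 years, before power, hosting, and ops costs
```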

This democratisation trajectory matters enormously for emerging market startups. The key insight — which I will return to in a future post — is that in this environment, a startup's strategic advantage rarely comes from owning compute. It comes from using compute more efficiently, on better data, for a more precisely defined problem, than better-capitalised competitors who are solving for breadth rather than depth.

What the power grid tells us about where this is going

There is one signal that, in my view, captures the true scale of the GPU buildout better than any revenue figure: electricity. Data centres globally could consume approximately 945 terawatt-hours of electricity by 2030, roughly double today's level. From 2024 to 2030, data centre electricity demand is projected to grow around 15% per year — more than four times faster than total electricity consumption growth. In Texas alone, utility Oncor has reportedly received data centre power requests totalling 119 gigawatts — far more than the grid it operates can deliver today.
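
Those figures are worth sanity-checking against each other. The sketch below infers a 2024 baseline from the "roughly double today's level" framing and compounds it at 15% per year; the numbers agree only loosely, which is about what the hedged source figures permit.

```python
# Sanity check: does ~15%/yr growth from 2024 roughly double data centre
# electricity demand by 2030? The 2024 baseline is inferred from the
# "roughly double today's level" framing above, not an independent figure.
TARGET_TWH_2030 = 945
baseline_2024 = TARGET_TWH_2030 / 2   # ~473 TWh, per "roughly double"
growth_rate = 0.15
years = 6                             # 2024 -> 2030

projected = baseline_2024 * (1 + growth_rate) ** years
print(f"2024 baseline: ~{baseline_2024:.0f} TWh")
print(f"at 15%/yr for {years} years: ~{projected:.0f} TWh (target: {TARGET_TWH_2030} TWh)")
# -> ~1093 TWh vs 945: consistent only loosely, which is what the repeated
#    "approximately" and "around" hedges in the source figures imply.
```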

NVIDIA's response at the hardware level is telling. At CES 2026, it unveiled the Rubin architecture, a rack-level design combining six specialised chips into a unified system, claiming roughly 40% better performance per watt than its predecessor. When a chip company's primary engineering metric shifts from raw performance to performance per watt, it signals that the compute arms race has hit a physical constraint: the power grid itself.

Ray Kurzweil's observation that exponential trends are frequently dismissed until they become undeniable applies precisely here. The infrastructure buildout for AI compute looked, until recently, like an overshoot. It is now increasingly clear it may be an undershoot.

The implication for builders, leaders, and learners

I write this not as a spectator of silicon economics but as a practitioner who has watched compute constraints shape — and occasionally doom — technology strategies across multiple cycles. The implication for anyone building or leading in this environment is threefold.

First, understand your compute exposure. Whether you are a startup founder, a corporate innovation lead, or a government digital officer, you need to know where your AI capability depends on infrastructure you do not control — and what happens when access to that infrastructure is disrupted, rationed, or priced out of reach.

Second, watch the inference market more than the training market. Training grabs headlines, but inference is where AI generates economic value at scale. The companies and startups that master efficient, low-latency inference — on the right data, for the right use case — will capture disproportionate returns in the next three years.

Third, and most importantly: do not confuse access to compute with the ability to use it well. The GPU is necessary but not sufficient. The organisations creating durable value from AI are those that combine compute access with deep domain expertise, high-quality proprietary data, and — critically — the institutional will to apply what they learn to the problems of real people.

Owning the infrastructure is the starting line. Knowing what to build on it is the race.