The Layer Cake — Mapping the AI Hardware Stack
Nvidia's Jensen Huang introduced the layer cake framing to describe AI's economic architecture from the ground up. It is simple, correct, and underappreciated by investors who focus exclusively on the application layer. Each layer has distinct market size, margin characteristics, competitive structure, and investment duration. Reading from bottom to top:
| Layer | 2024 Market Size (est.) | 2027–2030 Estimate | Growth Driver | Margin Character |
|---|---|---|---|---|
| Chips (AI semis) | ~$70B (AI GPU revenue) | ~$200B by 2027 | Training + inference demand; model size scaling | Very high — 60–75% gross margin at design leaders |
| Data Centre Build-Out | ~$250B global DC capex | ~$500B+ by 2030 | Hyperscaler capex; AI-specific density requirements | Moderate — construction/equipment margins; REITs earn stable yield |
| Networking (DC) | ~$25B | ~$60B by 2028 | GPU cluster scale-out; 400G/800G Ethernet adoption | High — 50–60% gross margin for leading vendors |
| Energy / Power Infra | Difficult to isolate; AI adds ~$30B/yr to grid capex | ~$90B+ incremental by 2030 | Grid bottleneck; AI campus power requirements | Moderate — regulated utilities; equipment makers higher margin |
| Models (API / training) | ~$20B API revenue; $40B+ training capex | ~$100B+ by 2028 | Enterprise AI adoption; inference volume growth | Declining — commoditisation pressure; open-source competing |
| Applications | Nascent — most value unrealised | $500B–1T+ (highly uncertain) | Productivity gains; AI feature embedding | Variable — incumbents gain most; AI-native apps unproven at scale |
*Market size estimates are directional; sources include Morgan Stanley, Goldman Sachs, SemiAnalysis, and company disclosures. Nvidia data centre revenue is used as the primary proxy for AI GPU revenue.*
The layer cake is not just a conceptual framework — it is a capital allocation map. The highest gross margins and most durable competitive positions sit in the chip layer (Nvidia, TSMC, ASML, SK Hynix). The largest total capital deployment is at the infrastructure layer (data centres, power), but margin per dollar of investment is lower. The application layer commands the highest speculative valuations but faces the most uncertain long-term margin structure. For a research framework, this means the analytical priority should be the chip layer — and that is exactly what Part B addresses.
Energy & Power Infrastructure — The Next Binding Constraint
The US electrical grid was designed around a world of relatively predictable, geographically dispersed demand growth. AI data centres are the opposite: sudden, enormous, geographically concentrated loads appearing in areas whose grid was sized for agricultural or light industrial use. The mismatch between what hyperscalers need and what the grid can deliver is the primary real-world constraint on AI infrastructure buildout — not chip supply, not permitting for data centre construction, but transformer availability and transmission capacity.
The numbers are stark. A single large EHV (Extra High Voltage) transformer — the type required to step down transmission-level power for a hyperscale campus — costs $3–7M, weighs 200–400 tonnes, and has a lead time of 18–24 months from order to delivery. Only a handful of manufacturers globally can produce them at the required voltage class: ABB, Siemens Energy, and Hitachi Energy, plus SPX Transformer Solutions in the US. Combined global production capacity is estimated at 700–900 units per year against rising demand. Transformer supply has become one of the least-discussed but most practically constraining bottlenecks in the AI infrastructure buildout.
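To make the scale concrete, a back-of-envelope sketch of how a single AI campus maps onto transformer capacity. Every input here is an illustrative assumption chosen for order of magnitude, not a sourced figure:

```python
import math

# Back-of-envelope: grid draw of a hypothetical 100,000-GPU AI campus and
# the EHV transformer count it implies. All inputs are illustrative
# assumptions, not sourced data.

gpus = 100_000
gpu_power_kw = 1.0      # ~700 W accelerator plus server overhead (assumed)
pue = 1.3               # power usage effectiveness: cooling + losses (assumed)

it_load_mw = gpus * gpu_power_kw / 1_000    # IT load in MW
grid_draw_mw = it_load_mw * pue             # what the campus pulls at the meter

# Assume ~100 MVA of usable capacity per transformer bank, with N+1 redundancy
bank_mva = 100
banks = math.ceil(grid_draw_mw / bank_mva) + 1

print(f"IT load: {it_load_mw:.0f} MW")
print(f"Grid draw at PUE {pue}: {grid_draw_mw:.0f} MW")
print(f"Transformer banks (N+1): {banks}")
```

On these assumptions a single campus needs several EHV banks; against a global production pool of ~700–900 units per year serving utilities, renewables interconnection, and data centres simultaneously, the queue forms quickly.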
| Sub-Segment | What It Does | Key Players | Bottleneck? | Investment Angle |
|---|---|---|---|---|
| EHV Transformers | Step down transmission voltage to usable campus power. Critical single point; not substitutable. | Hitachi Energy, ABB, Siemens Energy, SPX (US) | Yes — 18–24 month lead times; global capacity ~700–900 units/yr | High — pricing power emerging; order books extended 2–3 years. Hitachi Energy most directly exposed to AI DC demand. |
| Thermal / Liquid Cooling | Remove heat generated by high-density GPU racks. Air cooling inadequate above ~40 kW/rack; AI racks run 40–100 kW. | Vertiv, Schneider Electric, Asetek, CoolIT, Alibaba (internal) | Partial — supply scaling but technology transition from air to liquid is not trivial | High — liquid cooling is a structural shift, not a cycle. Vertiv is the most liquid public exposure. Direct liquid cooling (DLC) adoption rate is the key metric. |
| UPS / Backup Power | Uninterruptible power supply and backup generation. Data centres require N+1 or 2N redundancy. | Eaton, Schneider Electric, Cummins (diesel gensets), Bloom Energy (fuel cells) | Moderate — genset lead times extended but not severe | Moderate — growing with DC construction but less differentiated. Fuel cell backup (Bloom) is interesting as grid reliability declines. |
| Grid Switchgear / Controls | Electrical switching and protection equipment within the campus power distribution system. | ABB, Eaton, Schneider Electric, Siemens | Moderate — same supply chain constraints as transformers but lower unit value | Moderate — benefits from same grid upgrade cycle but more commoditised than transformers. |
| Nuclear (Baseload) | 24/7 carbon-free baseload power. The only renewable-adjacent source that can deliver firm GW-scale power without storage. | Constellation Energy (existing fleet), Kairos Power, NuScale, X-energy (SMR), TerraPower | Long-dated — SMRs are 2030+ at scale; existing nuclear PPAs being competed for now | High structural, long dated — Microsoft Three Mile Island PPA is the template. Constellation is the most liquid near-term exposure. SMR developers are pre-revenue venture-stage. |
| Renewables + Storage | Solar/wind for partial load coverage; battery storage for short-duration grid firming. | NextEra, AES, Sunrun; battery: QuantumScape, Form Energy (long-duration) | No — renewable capacity is abundant; the issue is intermittency and transmission | Lower for pure AI thesis — renewables can't solve the 24/7 firm power requirement alone. Long-duration storage (10–100hr) is the missing piece and largely pre-commercial. |
*The power infrastructure buildout is constrained not by capital availability but by manufacturing capacity (transformers, switchgear), permitting timelines (grid upgrades, nuclear), and engineering talent. These constraints are structurally slower to relieve than semiconductor supply constraints.*
In the prior generation of data centre buildout, site selection was driven by fibre connectivity, real estate cost, and tax incentives. For AI-era hyperscale campuses, the dominant variable has shifted to available power and grid headroom. Virginia — historically the world's largest data centre market — is facing a power availability crisis: Dominion Energy has effectively paused new large-load interconnection approvals in parts of Northern Virginia due to transmission constraints. This has redirected investment to new geographies: the Pacific Northwest (hydro-abundant), Texas (ERCOT grid flexibility), the upper Midwest (cheap wind + available transmission), and internationally, the Nordic countries (hydro + cold climate) and the Middle East (sovereign capital + land + willingness to build dedicated generation).
The implication for the investment thesis is that geographic power availability is becoming a durable competitive advantage for data centre operators who secured land and power agreements early. Equinix and Digital Realty — the dominant colocation REITs — have decades of site relationships that are difficult to replicate. Hyperscalers building their own campuses are racing to secure long-term power purchase agreements before available capacity is fully contracted.
Nuclear power has been commercially stagnant in the US and Europe for decades — the combination of Chernobyl, Fukushima, cost overruns (Vogtle), and cheap gas made new nuclear economically unviable. The AI power demand shock is changing that calculus in a narrow but important way: hyperscalers need 24/7 carbon-free power at GW scale, and no combination of solar, wind, and battery storage can reliably deliver that at current technology costs. Microsoft's deal to restart Three Mile Island (announced 2024, a 20-year PPA with Constellation for 835 MW), Google's PPA with Kairos Power for SMR output, and Amazon's acquisition of a nuclear-powered data centre campus signal that the tech sector has concluded that nuclear is the only near-term answer for firm baseload power at data centre scale. This is not a speculative thesis — it is procurement behaviour by the world's most data-driven buyers.
The energy and power infrastructure layer is underweighted in most AI investment frameworks because it is less intellectually exciting than chip architecture debates and less visible than Nvidia's earnings. But it is the layer where the physical constraints are most severe, the lead times are longest, and the incumbent advantages (existing nuclear fleet, long-term transmission rights, cooling IP) are most durable. As chip supply normalises through 2026–2027, the power constraint will replace it as the primary bottleneck — and the companies positioned on that constraint will be the next wave of AI infrastructure beneficiaries. Transformer manufacturers and liquid cooling specialists are the near-term picks; nuclear operators and long-duration storage will be the medium-term ones.
Where Value Accrues — The Margin Map & Moat Inventory
Understanding the stack architecturally is one thing. Understanding where durable economics accumulate is another. The history of technology platform buildouts — railways, telephony, the internet — suggests that infrastructure layers produce durable returns for the components where switching costs are highest and replication is hardest, while applications and services above the platform compete away margins unless they develop independent network effects or data advantages. AI is following an analogous pattern.
Three structural questions determine where value accrues in any technology value chain: Who is irreplaceable? Who benefits from rising volume without proportional cost? And who controls the bottleneck that constrains everyone else? Mapping the AI hardware stack against these questions produces a clear hierarchy.
| Company / Segment | Layer | Gross Margin (FY2024) | Operating Margin | Key Margin Driver |
|---|---|---|---|---|
| Nvidia (Data Centre) | Chips — Design | ~75–78% | ~55% | Pricing power from scarcity + CUDA ecosystem lock-in |
| ASML | Chips — Equipment | ~51–53% | ~31% | Monopoly on EUV; service contracts on installed base |
| TSMC | Chips — Fab | ~53–56% | ~42% | Leading-edge fab monopoly; 3nm/2nm premium pricing |
| SK Hynix | Chips — Memory (HBM) | ~40–45% (HBM blended) | ~25% | HBM3E scarcity; 2–3 year technology lead over peers |
| Broadcom (semi) | Chips — Networking/ASIC | ~65–68% | ~35% | Custom ASIC design lock-in; networking software/IP |
| Arista Networks | Infrastructure — Networking | ~60–63% | ~37% | Software-defined networking; EOS platform stickiness |
| Vertiv | Infrastructure — Cooling/Power | ~36–38% | ~17% | Data centre thermal management; liquid cooling growth |
| Super Micro | Infrastructure — Servers | ~14–17% | ~8% | AI server assembly; direct procurement; thin margin model |
| Intel (Data Centre) | Chips — CPU/GPU | ~39–42% (blended, declining) | Negative (2024) | Margin compression from AMD competition + fab transition losses |
*Margins are approximate FY2024 figures sourced from company reports and consensus estimates. Nvidia data centre segment margins are based on blended company disclosure. HBM margins for SK Hynix are estimated — the company does not separately disclose HBM economics.*
The transition from CPU-centric to GPU/accelerator-centric data centres is the underlying force that has reshaped the entire hardware stack since 2022. A traditional data centre processes tasks sequentially on a small number of powerful general-purpose processors. An AI training cluster processes matrix operations in parallel across thousands of specialist chips. This architectural shift has: (1) made Nvidia the most valuable company in the world; (2) made HBM memory bandwidth — not raw compute — the primary performance bottleneck; (3) made TSMC's leading-edge packaging capabilities as important as its fabrication; and (4) made power consumption per rack 10–20x what it was in the prior generation, triggering the grid infrastructure crisis that is now the next binding constraint.
The margin map makes clear that the chip layer is where the value chain's structural economics sit. But within chips, the hierarchy matters: equipment (ASML) and design (Nvidia) generate the highest margins; fabrication (TSMC) is high-margin but capex-intensive; memory (SK Hynix) is lucrative during scarcity but cyclical. The infrastructure layer is large in total dollar terms but margin-thin in most segments except networking. Part B now interrogates each dimension of the chip layer in depth.
The Semiconductor Value Chain — From Sand to System
A semiconductor chip is one of the most complex manufactured objects in human history — a product requiring the coordinated output of dozens of specialist industries, across multiple geographies, under tolerances measured in atoms. The value chain that produces it is not a linear assembly line but an intricate lattice of interdependent specialisations, each representing decades of accumulated knowledge that cannot be transferred quickly or cheaply.
Understanding the structure of this value chain — who does what, where the margin sits, and where the chokepoints are — is prerequisite to understanding why certain companies are structurally irreplaceable. It also explains why the geopolitical contest over semiconductor supply chains has become the central economic and strategic conflict of the 2020s.
The semiconductor industry organises itself along a fundamental structural choice: whether to design chips, fabricate chips, or both. This choice — made once and expensive to reverse — shapes everything about a company's economics, risk profile, and competitive position.
| Model | Definition | Economics | Examples | AI Winners/Losers |
|---|---|---|---|---|
| Fabless | Designs chips; outsources all fabrication to foundries | Asset-light; very high gross margins (50–78%); R&D intensive; no fab capex | Nvidia, AMD, Qualcomm, Apple Silicon, Broadcom (mostly), Marvell | Big winner — fabless model captured most AI design value; low fixed cost base amplifies margin leverage |
| Pure-Play Foundry | Fabricates chips for others; no proprietary chip designs | Extremely capex-intensive ($20–30B per fab); high gross margin at leading edge once scale achieved; long payback periods | TSMC (dominant), GlobalFoundries, SMIC (China) | Big winner — TSMC specifically; foundry model concentrates fab expertise, producing better yields than IDM fabs at equivalent nodes |
| IDM (Integrated Device Manufacturer) | Designs and fabricates own chips; may also sell foundry capacity | Highest capital intensity; lower margins than fabless for design; internal fab often runs below leading edge efficiency; strategic flexibility | Intel, Samsung (semi), Micron (memory IDM) | Mixed — Samsung benefits from memory IDM; Intel is the prominent casualty, struggling to compete at leading edge while carrying full fab cost structure |
*The fabless model's dominance in AI reflects a structural truth: design intelligence is the source of value; fabrication is the enabler. The separation of these two activities — pioneered by TSMC in the late 1980s — is one of the most consequential structural changes in the history of the technology industry.*
Morris Chang's founding insight at TSMC in 1987 was that chip designers were constrained by having to manage fab operations, and that a dedicated foundry — with no competing chip designs — could serve all designers without conflict and invest purely in fabrication excellence. The result 35 years later: fabless companies like Nvidia can design the most complex chips in history without owning a single clean room, while TSMC runs fabs at yields and node advances no IDM has matched. The division of labour that Adam Smith described for pin-making applies, at extraordinary technological intensity, to semiconductors.
The Compute Architecture Battle — GPU vs ASIC vs CPU
The most consequential product decision in AI hardware is the choice of compute architecture. Training and running AI models requires processing enormous volumes of matrix multiplication — an operation that different chip architectures handle with very different efficiency, cost, and flexibility trade-offs. The dominant architecture today is the GPU. The challengers are custom ASICs. And the incumbent general-purpose CPU is now largely irrelevant as the engine of AI compute at scale.
| Architecture | How It Works | Strengths for AI | Weaknesses | Market Position |
|---|---|---|---|---|
| GPU (Graphics Processing Unit) | Massively parallel processor with thousands of small cores optimised for matrix operations. Originally designed for graphics; repurposed for AI compute. | Extreme flexibility — runs any AI workload without code rewrite. Massive software ecosystem (CUDA). Best-in-class tools and developer familiarity. Rapid iteration — new Nvidia generation every 12–18 months. | Power-hungry. Not maximally efficient for any specific task — optimised for breadth, not depth. Very expensive per unit (~$30–40K for H100). | ~80% AI accelerator market share (Nvidia). The default choice for training and general inference. |
| ASIC (Application-Specific Integrated Circuit) | Custom chip designed for one specific task. Google's TPU is designed specifically for TensorFlow/JAX matrix operations; Amazon's Trainium targets Transformer training and Inferentia targets inference. | 10–30% more energy efficient than GPUs for the specific workload they target. Lower total cost of ownership at hyperscaler scale once volume justifies design investment. No royalty paid to Nvidia. | Inflexible — useless outside the target workload. Requires massive investment to design (>$500M). Takes 2–4 years from design to volume production. Limited software ecosystem. | Growing rapidly within hyperscalers (Google, Amazon, Microsoft, Meta all have custom silicon programs). ~10–15% of AI compute at hyperscalers currently; projected to grow to 25–30% by 2028. |
| CPU (Central Processing Unit) | Sequential general-purpose processor. The traditional data centre workhorse. AMD Epyc and Intel Xeon dominate data centre CPUs. | Unmatched for complex sequential logic. Essential for orchestration, data preprocessing, inference serving (small batches). Still required in every AI server alongside the GPU. | Fundamentally inefficient for the parallel matrix operations that dominate AI training and large-batch inference. GPU performs the same AI computation 100–1,000x faster per watt. | Remains necessary but is no longer the primary value driver. AMD is gaining CPU share from Intel. Arm-based CPUs (Ampere, AWS Graviton) growing for cloud workloads. |
*The architecture war is not zero-sum — all three will coexist. The question is market share at the margin and which architecture captures the incremental unit of AI spend as the market grows.*
Nvidia's dominance in AI is not primarily a hardware story. It is a software story. CUDA — Nvidia's parallel computing platform launched in 2006 — is the reason. When AI researchers began experimenting with GPU training in the early 2010s, they did so on CUDA. Every framework (TensorFlow, PyTorch), every library (cuDNN, cuBLAS), every optimised model was built for and tested on CUDA-enabled hardware. By the time AI demand exploded in 2023, the software ecosystem lock-in was so deep that switching away from Nvidia was not a hardware decision — it was a software, tooling, and retraining decision affecting every ML engineer in the world.
This is the key insight that gets lost in the GPU vs ASIC debate: Nvidia's moat is CUDA, not the H100. The H100 will be superseded by the H200, B100, B200, and whatever comes next. But the CUDA ecosystem — 4 million+ developers, 3,500+ GPU-accelerated applications, 15+ years of optimisation — is extremely hard to displace regardless of hardware competition.
As of 2024, PyTorch — the dominant AI research and production framework — shows CUDA as the default backend in over 95% of production deployments. AMD's ROCm (the CUDA competitor) has made meaningful progress but remains 2–3 years behind in software maturity and library optimisation. Google's JAX runs on TPUs natively but has ~5% of the production deployment base vs PyTorch/CUDA. The economic consequence: developers optimise for CUDA because that is where the tools work best; this creates demand for Nvidia hardware; this generates revenue that funds further CUDA investment. The flywheel has been spinning for 15 years and is now self-sustaining.
The custom silicon programs at Google (TPU v5), Amazon (Trainium 2, Inferentia 3), Microsoft (Maia 100), and Meta (MTIA) represent the most credible threat to Nvidia's market share. The economics are compelling at hyperscaler scale: a custom chip designed specifically for your inference workload can deliver 20–40% better performance-per-watt, which at a million-chip deployment translates into hundreds of millions of dollars annually in energy savings and foregone Nvidia procurement spend.
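The back-of-envelope version of that economics can be sketched as follows. All inputs are illustrative assumptions (the deployment size and design cost echo figures in the text; the power, price, and utilisation numbers are invented for the sketch):

```python
# Sketch of the ASIC-vs-GPU economics at hyperscaler scale.
# Every input is an illustrative assumption, not a disclosed figure.

chips = 1_000_000             # deployment size (from the text)
accel_power_kw = 0.7          # per accelerator, assumed
perf_per_watt_gain = 0.30     # midpoint of the 20-40% range above
power_price = 0.06            # $/kWh industrial rate, assumed
hours = 8_760 * 0.8           # one year at 80% utilisation, assumed

# Annual energy bill for the GPU fleet, and for ASICs doing the same work
gpu_energy_cost = chips * accel_power_kw * hours * power_price
asic_energy_cost = gpu_energy_cost / (1 + perf_per_watt_gain)
energy_saving = gpu_energy_cost - asic_energy_cost

design_cost = 500e6           # one-off design investment (from the text)

print(f"GPU fleet energy bill:  ${gpu_energy_cost/1e6:,.0f}M/yr")
print(f"Energy saving on ASICs: ${energy_saving/1e6:,.0f}M/yr")
print(f"Years of energy saving to repay design: {design_cost/energy_saving:.1f}")
```

Note what the sketch shows: on these assumptions, energy savings alone repay the design cost slowly; the larger term in practice is the foregone accelerator procurement spend, which is why the case only closes at million-unit scale.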
But the threat is structurally bounded. Custom ASICs require: (1) sufficient volume to justify $500M+ design investment; (2) workload stability — the chip must be designed for a workflow that will not change before the chip ships; (3) internal silicon engineering teams of 500–1,000+ engineers (rare outside the Big 4 hyperscalers); and (4) 2–4 years of lead time from design to production. These requirements confine viable ASIC programs to a handful of the world's most resource-endowed companies. The long tail of AI companies — thousands of enterprises, startups, and research institutions — will buy Nvidia for the foreseeable future.
The compute architecture battle is ultimately a question of software ecosystem durability, not hardware performance. Nvidia's moat is unusually deep precisely because it is not primarily a hardware moat — hardware can be copied or improved upon; 15 years of software ecosystem development cannot. The base case is gradual, manageable share erosion at the very largest customers, while Nvidia retains dominant share everywhere else. For investors, the relevant risk is not GPU displacement but Nvidia's ability to sustain 70%+ gross margins as supply normalises — the margin question is more pressing near-term than the share question.
The Fab Moat — Why TSMC Is Nearly Impossible to Replicate
TSMC is the most consequential company that most investors outside the technology sector have never deeply analysed. It fabricates approximately 90% of the world's most advanced semiconductors. It is the reason Nvidia's H100 exists, the reason Apple's M-series chips are competitive, and the reason the US government has spent $52 billion via the CHIPS Act trying to build domestic alternatives. Understanding why TSMC's moat exists — and why it is so hard to replicate — requires engaging with physics, economics, and organisational knowledge simultaneously.
TSMC's primary fabs are in Taiwan — a geopolitical fact that has become the semiconductor industry's most discussed and least resolved risk. Approximately 92% of the world's leading-edge (sub-5nm) chip production capacity is located in Taiwan as of 2024. The US CHIPS Act, Japan's semiconductor subsidies, and the EU Chips Act are all attempts to redistribute this concentration — but the pace of geographic diversification is structurally constrained by the same factors that created the concentration in the first place.
The geographic diversification of semiconductor manufacturing is slower and harder than policymakers publicly acknowledge. TSMC's Arizona fabs have faced significant challenges: skilled technician shortages, yield rates initially below Taiwan counterparts, construction delays, and cost overruns. The reported 50% cost premium for chips fabricated in Arizona vs Taiwan reflects these structural disadvantages — higher labour costs, less dense supplier ecosystems, and the absence of the deep talent pool that Taiwan's semiconductor education system has spent 40 years building. Intel's domestic US fabs, receiving the largest single CHIPS Act allocation (~$8.5B), remain behind TSMC at the leading edge despite a decade of aggressive investment. Meaningful geographic rebalancing of chip fabrication is a 10–20 year project, not a 3–5 year one.
TSMC's moat is among the most durable in the global economy precisely because it is multi-dimensional — physical, capital, organisational, and ecosystem-based simultaneously. Any single dimension could theoretically be addressed by a well-resourced competitor; the combination of all four is what makes catch-up take decades. For investors, the relevant debate is not "will TSMC's moat erode?" but "what is TSMC worth relative to its irreplaceability?" — a question that intersects with Taiwan geopolitical risk, capex cycle timing, and whether leading-edge chip demand grows fast enough to justify continued node investment. All three variables are in TSMC's favour through at least 2027.
Memory — HBM, DRAM, and the Bandwidth Bottleneck
Memory is the least glamorous part of the semiconductor value chain — and, since 2023, the most important bottleneck in the AI hardware stack. The reason is simple: AI inference and training are not compute-bound at the frequencies that matter; they are memory-bandwidth-bound. The limiting factor in running a large language model is not the speed at which the GPU performs matrix multiplications — it is the speed at which it can feed data to itself from memory. This is why HBM (High Bandwidth Memory) became the critical component, and why SK Hynix — the company that delivers it at volume — became arguably as important to the AI supply chain as TSMC.
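The bandwidth-bound claim can be made concrete with a standard roofline-style estimate: at batch size 1, generating each token requires streaming every model weight from memory once, so peak HBM bandwidth sets a hard ceiling on tokens per second. The model size, precision, and compute throughput below are illustrative assumptions; the 3.35 TB/s figure is the H100's, as in the table that follows:

```python
# Why LLM inference is memory-bandwidth-bound, not compute-bound.
# Illustrative numbers: a 7B-parameter model in FP16 on a single H100.

params = 7e9                  # 7B-parameter model (assumed)
bytes_per_param = 2           # FP16/BF16 weights
hbm_bandwidth = 3.35e12       # H100 SXM HBM bandwidth, bytes/s
compute_flops = 990e12        # approx. H100 dense BF16 throughput (assumed)

weight_bytes = params * bytes_per_param          # bytes streamed per token
mem_ceiling = hbm_bandwidth / weight_bytes       # bandwidth-limited tokens/s
flops_per_token = 2 * params                     # one multiply-add per weight
compute_ceiling = compute_flops / flops_per_token

print(f"Weights streamed per token: {weight_bytes/1e9:.0f} GB")
print(f"Bandwidth-limited ceiling:  {mem_ceiling:.0f} tokens/s")
print(f"Compute-limited ceiling:    {compute_ceiling:,.0f} tokens/s")
```

The compute ceiling is orders of magnitude above the bandwidth ceiling, which is exactly why adding HBM capacity and bandwidth, rather than arithmetic throughput, is what moves inference performance.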
| Memory Type | What It Is | Bandwidth | AI Role | Key Players | Investment Relevance |
|---|---|---|---|---|---|
| HBM3E (High Bandwidth Memory) | Stacked DRAM dies connected to the GPU via silicon interposer. Physically mounted on the same package as the compute chip. | ~1.2 TB/s per stack; the H100's previous-generation HBM3 delivers 3.35 TB/s total from 5 stacks | The critical AI bottleneck. GPU performance is limited by how fast HBM can supply weights and activations. More HBM = faster inference for large models. | SK Hynix (~50% share), Samsung (~30%), Micron (~20%) | High — HBM supply is currently constrained; SK Hynix has a 2-year technology lead; HBM ASPs are 5–8x standard DRAM. Tight supply expected through 2026. |
| DRAM (Standard) | Standard volatile memory used in servers, PCs, and phones. DDR5 is the current generation. | ~50–80 GB/s (DDR5) | Server memory for CPU workloads; smaller AI inference deployments. Less relevant for large model training but required in every server. | Samsung (~40%), SK Hynix (~30%), Micron (~25%) | Cyclical — DRAM markets swing between oversupply and shortage. AI demand adds a structural growth layer but does not eliminate the cycle. Samsung has historically been the price-setting swing producer. |
| NAND Flash | Non-volatile storage. Used for SSDs, data storage in data centres. | Low relative to DRAM | Training data storage; checkpoint storage. Not on the critical AI latency path. Demand grows with data centre scale-out but is less AI-specific than DRAM/HBM. | Samsung (~30%), SK Hynix (~20%), Kioxia (~18%), Micron (~12%), WD (~13%) | Lower near-term — NAND cycle is independent of the AI upcycle; oversupply has compressed margins since 2022. Recovery is underway but pricing remains below 2021 peak. |
*HBM content per GPU is growing with each generation: the H100 has 80GB of HBM3, the H200 has 141GB of HBM3E, and the B200 (Blackwell) has 192GB of HBM3E. Each generational increase in HBM content per GPU amplifies the revenue opportunity for HBM suppliers without requiring proportional production capacity increases.*
The reason HBM became scarce — and why SK Hynix captured a disproportionate share of AI memory value — is a story about a technology bet made years before the AI boom. In 2018–2020, SK Hynix made a strategic decision to invest heavily in HBM3 development, betting that the convergence of GPU compute and memory-intensive ML workloads would create demand that standard DRAM could not serve. Samsung, the larger company, hedged more conservatively. Micron focused on standard DRAM and NAND cycles.
When Nvidia began designing the H100 in 2020–2021, it turned to SK Hynix as the only supplier with both the HBM3 technology and the production ramp capability at the necessary scale. By the time the H100 launched in 2022 and AI demand exploded in 2023, SK Hynix was effectively the sole-source supplier for the critical memory component of the world's most sought-after chip. The resulting pricing power — HBM3E sells at 5–8x the price of equivalent-capacity standard DRAM — has driven SK Hynix's DRAM gross margins to levels not seen in a decade.
Each successive generation of Nvidia's flagship AI GPU carries more HBM capacity: H100 (80GB) → H200 (141GB) → B200 Blackwell (192GB). The GB200 NVL72 rack system — Nvidia's most advanced AI infrastructure product — contains 13.5TB of HBM3E memory across 72 B200 GPUs, representing approximately 70kg of HBM stacks per rack. This is not just a technology progression: it is a structural revenue escalator for HBM suppliers. Even if Nvidia sells the same number of GPU units in 2025 as in 2024, the total HBM content — and total HBM revenue — grows automatically because each chip requires more of it.
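The escalator arithmetic is easy to see in a toy sketch. The GB-per-GPU figures come from the text; the unit volume and $/GB price are illustrative assumptions, not market data:

```python
# The HBM "revenue escalator": flat GPU unit volume still grows HBM revenue
# because memory content per GPU rises each generation.

hbm_gb = {"H100": 80, "H200": 141, "B200": 192}   # GB per GPU (from the text)
units = 4_000_000          # annual GPU units, held flat (assumed)
price_per_gb = 15          # $/GB for HBM, illustrative assumption

for gpu, gb in hbm_gb.items():
    revenue = units * gb * price_per_gb
    print(f"{gpu}: {gb} GB/GPU -> ${revenue/1e9:.1f}B HBM revenue at flat volume")
```

Under these assumptions, HBM revenue grows ~2.4x from the H100 to the B200 generation with zero growth in GPU units sold — content growth alone carries it.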
| Company | HBM Share (2024 est.) | HBM Generation | Technology Gap vs Leader | Strategic Position |
|---|---|---|---|---|
| SK Hynix | ~50% | HBM3E in volume; HBM4 in development | Leader — sets the pace | Sole qualified HBM3E supplier for Nvidia H200/B200 at launch. 2-year technology lead over Samsung at the time of the AI boom. Benefits from the technology bet made in 2019–2021. Key risk: Samsung catching up on HBM4 for the next GPU generation. |
| Samsung | ~30% | HBM3E ramping; yield issues delayed market entry | ~12–18 months behind SK Hynix | Faced yield qualification failures for Nvidia H100/H200 HBM3E in 2023. Recovered with improved processes. Expected to gain HBM share in 2025 with Blackwell-era products. Scale advantage in DRAM production is a tailwind for cost-down. Key risk: continued yield lagging in advanced packaging for HBM stacks. |
| Micron | ~20% | HBM3E qualified; ramping for Nvidia | ~18–24 months behind SK Hynix at HBM entry | Qualified for Nvidia HBM supply in late 2024. Receives meaningful volume from Nvidia for Blackwell GB200. US-domiciled company benefits from CHIPS Act incentives and US government preference for domestic supply. Fastest-growing HBM share trajectory. Key opportunity: US customers (hyperscalers, defence) may preference Micron for supply chain security reasons. |
*HBM market share will shift with each new GPU generation qualification cycle. The current hierarchy (Hynix → Samsung → Micron) is not fixed — qualification for each new Nvidia platform resets the competitive landscape.*
Memory is the AI hardware supply chain's most underappreciated value pool. HBM is structurally different from commodity DRAM: it is technology-intensive, supply-constrained, and growing in content per GPU with every product cycle. SK Hynix's current position is the result of a contrarian technology bet made when AI was not yet a consensus investment theme — a useful reminder that the most valuable positions in technology supply chains are often built years before the demand they serve materialises. The key forward question is whether SK Hynix's lead is durable enough to persist through the HBM4 transition, or whether Samsung's scale closes the gap on the next technology node.
Geopolitics & Cycle — The US-China Chip War, Taiwan Risk & Where We Are
The semiconductor industry has always been globally interdependent — chips designed in California, fabricated in Taiwan, assembled in Malaysia, sold worldwide. That interdependence is now the central theatre of the US-China strategic competition. Since 2019, the US has progressively restricted China's access to advanced semiconductors and the tools to make them. The restrictions have escalated with each administration, producing a supply chain fragmentation that is reshaping where chips are made, who can buy them, and what the long-run cost structure of the industry looks like.
| Year | Action | Effect |
|---|---|---|
| 2019 | Huawei Entity List addition — US suppliers require licences to sell to Huawei | TSMC, Qualcomm, Google cut Huawei supply; Huawei loses access to leading-edge process nodes for Kirin chips |
| 2020 | TSMC fabrication ban for Huawei — US foreign direct product rule extended to TSMC | Huawei loses ability to manufacture its own mobile chips; Kirin 9000 is last leading-edge SoC |
| 2022 | Comprehensive export controls — ban on sale of advanced chips (>A100 performance) and related equipment to China | Nvidia forced to develop downgraded China-specific products (A800, H800); ASML restricted from shipping DUV machines to China |
| 2023 | Controls extended — H800 and A800 also banned; 40+ Chinese chip entities added to Entity List | Nvidia H20 (further downgraded) becomes the China-compliant product; ~$12–15B of annual Nvidia revenue redirected |
| 2023–24 | Huawei Mate 60 Pro launch with SMIC-fabricated 7nm chip (Kirin 9000S) | Demonstration that China can make near-leading-edge chips domestically, though at lower yields and higher cost; significant shock to US policymakers |
| 2025 | H20 banned outright; BIS rules tightened further; ASML DUV sales to China halted | Near-total severance of China from US-ecosystem advanced chips; Chinese companies redirecting investment to self-sufficiency |
| The US-China chip war is one of the most consequential structural forces in the semiconductor industry. Its effects compound over time — each restriction accelerates China's domestic investment, which creates long-run supply competition even if near-term Chinese capabilities remain limited. | ||
The geopolitics section in v1 of this document recorded what happened chronologically but did not commit to an analytical view on what it means. That is an evasion. The US-China chip restrictions produce two materially different futures with genuinely different investment implications: Scenario A, in which export controls durably preserve the Western technology lead, and Scenario B, in which they accelerate Chinese self-sufficiency to near-parity. A serious framework should be explicit about which is more probable and what the second-order effects of each are.
Scenario A is assigned the higher probability (roughly 60%) not because China lacks the intent or capital — it demonstrably has both — but because the physics of advanced semiconductor manufacturing impose real constraints that money alone cannot quickly solve. EUV is not just an export-controlled product; it is a system requiring 100,000+ precision components, many of which have their own supply chains that are also export-controlled. Reproducing it from scratch is a decade-plus project even with full state resources. However, Scenario B deserves 40% weight because history repeatedly shows that technology restrictions accelerate rather than prevent indigenous development when the prize is large enough. China's semiconductor industry in 2025 is more capable than US policymakers assumed it would be in 2019. Extrapolating that trajectory is not wishful thinking — it is reading the evidence.
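The 60/40 weighting can be made explicit as a probability-weighted expected value. A minimal sketch, assuming the probabilities above; the impact multipliers are hypothetical placeholders for illustration, not forecasts:

```python
# Probability-weighted framing of the two chip-war scenarios.
# Probabilities follow the text (A: 60%, B: 40%); the "impact"
# multipliers on chip-layer value are assumed placeholders.
scenarios = {
    "A: controls hold, Western lead persists": {"p": 0.60, "impact": 1.0},
    "B: indigenous Chinese catch-up by ~2030": {"p": 0.40, "impact": 0.7},
}

# Sanity check: probabilities must sum to 1.
assert abs(sum(s["p"] for s in scenarios.values()) - 1.0) < 1e-9

expected = sum(s["p"] * s["impact"] for s in scenarios.values())
print(f"probability-weighted impact factor: {expected:.2f}")
```

The point of the exercise is not the number itself but the discipline: any position sized off Scenario A alone implicitly assigns Scenario B a probability of zero, which the evidence does not support.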
One structural shift that the chip architecture analysis in Section 05 raised but did not fully resolve: the workload mix is shifting from training to inference, and this shift has different implications for the competitive landscape than a simple "more AI = more Nvidia" extrapolation.
Training workloads are large, parallelisable, and batch-processed. They run for days or weeks on clusters of thousands of GPUs. Flexibility across model architectures matters because researchers iterate frequently. This is Nvidia's home turf — CUDA + H100/B100 is optimised for exactly this workload.
Inference workloads are different in character: they require low latency (milliseconds, not minutes), high throughput at lower batch sizes, and relentless cost-per-token optimisation. A frontier LLM serving 2.5 billion queries per day is an inference problem, not a training problem. The optimal chip architecture for inference prioritises energy efficiency and memory bandwidth utilisation over raw floating-point performance. This is where the architecture competition opens up:
| Workload | % of AI Compute (2024 est.) | % by 2030 (est.) | Optimal Architecture | Competitive Implication |
|---|---|---|---|---|
| Training (frontier models) | ~70–80% | ~25–30% | GPU clusters (Nvidia H/B series) | Nvidia dominant; ASIC alternatives limited by flexibility requirements during research iteration |
| Inference (production serving) | ~20–30% | ~70–75% | Specialised inference ASICs, efficient GPUs, edge chips | Most competitive battleground — Google TPU, Amazon Inferentia, Groq LPU, and Nvidia's own inference-optimised products all competing; CUDA advantage is weaker here |
| Fine-tuning / adaptation | Small but growing | Meaningful share of enterprise AI | Mid-range GPUs; cloud-based fine-tuning services | Nvidia H100 overkill for most fine-tuning; AMD MI300 gaining traction for cost efficiency |
| The inference growth projection is the single most important structural variable for the next phase of the AI chip market. Inference-optimised products — including Nvidia's own L40S and the upcoming inference-specific lines — will be the primary battleground for chip market share through 2028. | ||||
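The scale of the inference problem can be sanity-checked with a back-of-envelope calculation. This is an illustrative sketch, not a sourced estimate: the query volume comes from the text, but tokens per query, per-GPU serving throughput, and GPU cost are assumed placeholder values.

```python
# Back-of-envelope inference economics for a frontier LLM.
# QUERIES_PER_DAY is from the text; everything else is an
# assumed placeholder for illustration.
QUERIES_PER_DAY = 2.5e9       # serving load (from the text)
TOKENS_PER_QUERY = 500        # assumed average tokens generated per query
GPU_TOKENS_PER_SEC = 1500     # assumed sustained per-GPU serving throughput
GPU_HOURLY_COST = 2.50        # assumed all-in $/GPU-hour (capex + power)

tokens_per_sec = QUERIES_PER_DAY * TOKENS_PER_QUERY / 86_400
gpus_needed = tokens_per_sec / GPU_TOKENS_PER_SEC
daily_cost = gpus_needed * GPU_HOURLY_COST * 24
cost_per_million_tokens = daily_cost / (QUERIES_PER_DAY * TOKENS_PER_QUERY / 1e6)

print(f"sustained throughput: {tokens_per_sec:,.0f} tokens/s")
print(f"GPUs required: {gpus_needed:,.0f}")
print(f"cost per 1M tokens: ${cost_per_million_tokens:.2f}")
```

Under these assumptions the serving fleet runs to roughly ten thousand GPUs operating continuously, which is why cost-per-token, not peak FLOPS, is the metric that decides which architecture wins the inference segment: a chip that halves `GPU_HOURLY_COST` per token served halves the entire serving bill.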
The v1 document mentioned hyperscaler custom ASICs as a competitive threat to Nvidia. The fuller picture is more structural: hyperscalers are not just building chips — they are vertically integrating across the entire stack in a way that progressively reduces their dependence on third-party vendors at every layer.
Google designs its own TPUs, runs them in its own data centres, on its own grid-connected power infrastructure, serving its own model (Gemini) through its own cloud platform. The only external dependencies are TSMC (fabrication) and ASML (via TSMC). Amazon has Trainium chips, Graviton CPUs, its own DC construction capability, AWS Outposts for edge, and long-term nuclear PPAs. Microsoft has Maia ASICs, Azure infrastructure, and the Constellation Energy nuclear PPA. Meta has MTIA inference chips, its own massive DC build programme, and is the largest single customer for Nvidia — while simultaneously building the alternative.
This vertical integration dynamic is not a short-term threat to chip layer economics — the volumes required and the complexity of the transition mean hyperscalers will remain large Nvidia customers for years. But it is the structural force that will gradually compress Nvidia's pricing power in the hyperscaler segment and push Nvidia's growth increasingly toward enterprise and cloud customers, where CUDA lock-in is strongest and custom silicon economics are least compelling.
| Segment | Current Cycle Position | Key Risk | Outlook 2025–2027 |
|---|---|---|---|
| AI GPUs (Nvidia) | Supply-constrained through 2025; Blackwell ramp expected to ease shortage in 2026 | Hyperscaler capex moderation; ASIC share shift at margin | Strong demand, supply normalising; pricing power moderates from peak but remains elevated. Margin risk is compression from current ~75% gross margin toward ~65–68%. |
| HBM Memory | Severely supply-constrained; SK Hynix fully allocated through 2025 | Samsung qualification closing gap; next-gen HBM4 resets pecking order | Tight through 2025; modest easing in 2026 as Samsung/Micron ramp. ASP premium likely to compress but remains well above standard DRAM. SK Hynix maintains lead on HBM4 timing. |
| Advanced Logic (TSMC 3nm/2nm) | Fully allocated; Nvidia, Apple, AMD competing for capacity | Arizona ramp slower/costlier than expected; geopolitical Taiwan risk | TSMC pricing power intact through at least 2026. 2nm node ramp (2025–2026) expected to sustain premium. Long-term Taiwan geopolitical risk is an investor concern but not a near-term operational constraint. |
| Standard DRAM | Recovering from 2022 oversupply; prices improving in 2024 | HBM capacity conversion reducing standard DRAM supply | Structural support from server demand and AI-adjacent server builds. Samsung's swing capacity now partially committed to HBM, tightening standard DRAM supply. |
| NAND Flash | Still recovering; prices below 2021 peak | Oversupply risk if data centre storage demand disappoints | Weaker than DRAM/HBM near-term. Recovery in 2025 partially underway but NAND is less directly AI-driven. Watch Samsung capex discipline as the key variable. |
| Cycle views as of early 2026. Semiconductor cycle dynamics shift rapidly; these assessments are directional, not precise forecasts. The most important variable is hyperscaler capex guidance, which provides the most reliable forward signal for AI chip demand. | |||
The paradox at the centre of AI hardware investing is this: the companies with the most durable structural moats (ASML, TSMC) are also the most geopolitically exposed. The company with the most visible near-term earnings power (Nvidia) faces the most uncertain long-run competitive structure as inference growth shifts architecture dynamics. And the memory companies (SK Hynix, Micron) that benefit most from the AI cycle also carry the highest cyclicality risk when the upcycle eventually normalises. There is no position in the AI hardware stack that captures structural moat, cyclical safety, and geopolitical insulation simultaneously. The investment framework should therefore be portfolio-level: own the moat (ASML, TSMC) for structural compounding; own the cycle leader (Nvidia) for near-term earnings; own the memory optionality (SK Hynix, Micron) for cycle leverage; own the power constraint (Vertiv, Hitachi Energy, Constellation) for the next bottleneck — with explicit position sizing to reflect each risk dimension.
Across the six sections of this study, a clear hierarchy of competitive positioning has emerged. The moat scores below reflect structural durability (not near-term earnings), assessed across switching cost, replication difficulty, and time to catch up.
The AI hardware value chain is not a sector — it is a collection of structurally distinct businesses connected by supply chain relationships. ASML and TSMC are infrastructure monopolies compounding quietly. Nvidia is a platform business with a software moat dressed up as a hardware company, facing a gradual workload-mix headwind as inference growth reshapes optimal chip architecture. SK Hynix and Micron are technology-cycle businesses with near-term tailwinds but long-run cyclicality. The power infrastructure layer — transformers, cooling, nuclear baseload — is the emerging constraint that most chip-focused AI frameworks have underweighted. The unifying insight across all layers: the binding physical constraint is always where the value goes, and that constraint migrates as each bottleneck is relieved.
What Breaks the Thesis — Tail Risks and Architectural Discontinuities
A rigorous investment framework must ask not just what makes the thesis work but what would cause it to fail entirely. The AI hardware thesis — own the physical constraint, compound with the moat — rests on a set of assumptions that are well-grounded in current evidence but are not immutable laws. The following are the scenarios that would materially break the thesis rather than merely slow it down. They are assigned low probability but deserve explicit framing because their consequences are high enough to warrant position sizing that accounts for them.
None of these four risks is high probability over a 3–5 year investment horizon. But their combination — and the correlation between them in a stress scenario (a Taiwan contingency simultaneously triggers a capex freeze and an accelerated push for alternative architectures) — means that concentrated positions in AI hardware warrant explicit tail risk management. The practical implication is not to avoid the sector but to size positions such that the portfolio can survive a 40–60% drawdown in AI hardware names without permanent capital impairment. The structural thesis is intact; the entry price and position sizing matter as much as the thesis quality.
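The sizing discipline described above can be made concrete. A minimal sketch: the stress drawdown comes from the text's 40–60% range; the tolerable portfolio loss is an assumed policy parameter, not a recommendation.

```python
# Sizing sketch: how large can an AI-hardware sleeve be if the
# portfolio must survive a correlated stress drawdown intact?
# STRESS_DRAWDOWN uses the worst case of the text's 40-60% range;
# MAX_PORTFOLIO_LOSS is an assumed risk-policy parameter.
STRESS_DRAWDOWN = 0.60      # assumed worst-case sleeve drawdown
MAX_PORTFOLIO_LOSS = 0.15   # assumed tolerable hit to total portfolio

# A sleeve weight w loses w * drawdown of the portfolio in stress,
# so the binding constraint is w <= MAX_PORTFOLIO_LOSS / STRESS_DRAWDOWN.
max_sleeve_weight = MAX_PORTFOLIO_LOSS / STRESS_DRAWDOWN
print(f"max AI-hardware sleeve: {max_sleeve_weight:.0%}")
```

Under these placeholder inputs the sleeve caps out at a quarter of the portfolio; tightening either parameter (a 70% stress case, or a 10% loss budget) shrinks it proportionally, which is the practical meaning of "size for the tail rather than time it."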
The AI hardware thesis is strong — the physical constraints are real, the moats are deep, and the demand drivers are secular. But strong theses break in specific, identifiable ways, and intellectual honesty requires naming them rather than burying them in footnotes. The most actionable of the four risks is the capex air pocket — it is the most near-term, the most monitorable (hyperscaler capex guidance is public), and the one most likely to produce a buying opportunity rather than a thesis break. The architectural discontinuity and Taiwan risks are genuine but long-dated and unactionable as trading signals. Size for them; do not obsess over timing them.
China's response to US chip restrictions has been a massive, state-directed investment in domestic semiconductor capability. The numbers are significant: over $150B in committed government funding for semiconductor industry development through 2030, channelled through the China Integrated Circuit Industry Investment Fund (the "Big Fund"). SMIC — China's leading foundry — has been ramping 7nm production using DUV lithography (not EUV) through multi-patterning techniques that achieve similar geometry at lower yield and higher cost. Huawei has rebuilt its chip design capabilities from scratch, developing its own AI accelerators (Ascend series) as a CUDA-alternative ecosystem.
The honest assessment: China is building a parallel semiconductor ecosystem that is 3–5 years behind the global frontier today, cannot make sub-5nm chips without EUV, and operates at substantially higher cost per wafer than TSMC. But it is building — and the restrictions are accelerating rather than stopping that effort. The long-run scenario where China has a credible 5nm foundry by 2030 is not implausible, even if it requires continued sacrifice of economic efficiency for strategic independence.