Technology/March 30, 2026/13 min read/youtu.be

The AI Silicon Shortage Explained: TSMC, Nvidia CPO, Memory Crisis & What Comes Next | SemiAnalysis


"we're arguing that there's not enough wafer fab capacity power is no longer the biggest constraint and right now it's really not enough front end capacity" - Ivan [00:01:56]

"recently Nvidia overtook Apple as the largest TSMC customer in 2025" - Shravan [00:04:18]


Disclaimer: Original content is owned by or sourced from third parties. It does not represent the views of the 'Nuggets' platform or its team. AI is used extensively across this platform, including for summaries. Accuracy is not guaranteed, and there can be mistakes. Nothing on this platform is financial, legal, or investment advice. Do your own research. For complete disclosures, refer to the Terms of Use and the Full Disclaimer.


"based on our modeling AI as a percentage of [N3] output in 2026 is 60% and then this goes to 90% in uh or 85% in 2027" - Ivan [00:10:22]

"it's like trying to find airplane tickets on like the last flight out because the price is just ramping up so quickly" - Jordan Nanos [00:22:07]

"they're signing GPUs through to 2030 that's a 8-year life that somebody wants to commit to pay for it" - Jordan Nanos [00:27:17]

"at the end of the day it's a cyclical industry booms and busts will happen" - Shravan [00:33:11]


Speakers & Credentials

  • Jordan Nanos: Host at SemiAnalysis Weekly, covering semiconductor industry trends, GPU rental markets, and infrastructure deployment dynamics.
  • Ivan: Analyst at SemiAnalysis, specializing in wafer fab modeling, hardware supply chain constraints, and memory bandwidth (HBM) technical requirements.
  • Shravan: Analyst at SemiAnalysis, focusing on TSMC capacity allocations, consumer electronics demand cycles, memory procurement, and semiconductor cyclicality.
  • Dan Nystedt: Analyst at SemiAnalysis, specializing in optical networking architectures (CPO, DSPs, DWDM), data center switching, and hyperscaler capital expenditures.

1. Executive Summary

  • The primary bottleneck for the AI rollout has fundamentally shifted from advanced packaging (CoWoS) and data center power to a severe shortage in front-end wafer fabrication capacity, specifically at TSMC's N3 (3nm) node.
  • This paradigm shift is driven by skyrocketing token demand from advanced agentic workflows and massive model adoption, forcing AI accelerators to monopolize up to 90% of leading-edge wafer output by 2027.
  • The collateral damage is falling heavily on the consumer electronics and smartphone sectors: memory costs inflated by High Bandwidth Memory (HBM) demand are compressing operating margins and forcing unit contractions in low-end smartphones.
  • In the networking space, the industry is approaching a critical architectural pivot toward Co-Packaged Optics (CPO) and Dense Wavelength Division Multiplexing (DWDM) to bypass the electrical limitations of copper and solve bandwidth escape bottlenecks in next-generation GPU clusters.
  • Hyperscaler CapEx cuts appear highly unlikely as the return on investment (ROI) for advanced automation remains overwhelmingly positive, reinforcing a seemingly insatiable "last flight out" scramble for GPU compute that is driving hardware lease contracts into the 2030s.

2. Chronological Table of Contents

  • [00:00:06] Introduction & The Shifting AI Bottlenecks
  • [00:03:09] TSMC Demand Dynamics: Nvidia Overtakes Apple
  • [00:10:15] The Front-End N3 Silicon Squeeze & Consumer Electronics Impact
  • [00:17:05] The Structural HBM (High Bandwidth Memory) Crisis
  • [00:20:00] Cloud Economics, GPU Rental Prices, and ROI
  • [00:32:30] Semiconductor Cyclicality and the Double-Ordering Illusion
  • [00:39:16] OFC Recap: Exploring Co-Packaged Optics (CPO)
  • [00:44:13] Scale-up vs. Scale-out CPO & Nvidia's Roadmap
  • [00:49:45] Multi-Source Agreements (MSAs): OCI vs. CPX vs. XPO

3. Detailed Thematic Summary

The Evolution of the AI Buildout Bottlenecks [00:01:26]

  • The primary constraints on AI infrastructure scaling have cycled systematically since the dawn of ChatGPT in late 2022 [00:01:26].
  • The first phase in 2023 was constrained by a lack of advanced packaging (CoWoS) capacity [00:01:35].
  • The 2023 to 2025 era was largely defined by a shortage of data center electrical power and physical space [00:01:41].
  • The current bottleneck is characterized by a severe lack of front-end wafer fabrication capacity, rendering power constraints secondary [00:01:56].
  • This supply crunch is colliding with exploding token demand, heavily driven by multi-step agentic workflows and tools like Claude Code, which helped Anthropic add $6 billion in Annual Recurring Revenue in the month of February alone [00:02:42].

TSMC's Changing Demand Profile and Capital Discipline [00:03:40]

  • Historically, consumer electronics and smartphones—led by Apple and Qualcomm—were TSMC's primary demand drivers [00:03:47].
  • A major industry shift occurred as Nvidia overtook Apple as TSMC's largest customer in 2025 [00:04:18].
  • TSMC exited 2025 with an N3 (3nm) capacity of 120k wafer starts per month, with approximately two-thirds (70k-80k) previously dedicated to consumer smartphones and PCs [00:05:22].
  • TSMC is investing heavily, stepping up CapEx from roughly $30 billion in 2024/2025 to $52-$54 billion in 2026, with expectations to hit $70 billion in 2027 [00:04:52].
  • Despite these investments, TSMC behaves as a disciplined "Kingmaker," favoring stable customers like Apple (which represents 25-30% of leading-edge demand) over volatile demand profiles, having learned hard lessons from the 2018 crypto bubble burst [00:08:43].
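
As a quick sanity check on the figures above, the sketch below works out the implied CapEx growth rates and the consumer share of N3 capacity. It uses only the numbers quoted in this section; the helper names are illustrative, and the low/high pairs simply carry the quoted ranges through the arithmetic.

```python
# Figures quoted in the summary (USD billions; wafer starts per month).
CAPEX_BASELINE = 30.0            # roughly $30B in 2024/2025
CAPEX_2026 = (52.0, 54.0)        # low/high end of the quoted range
CAPEX_2027 = 70.0
N3_CAPACITY_WSPM = 120_000       # N3 exit-rate capacity, 2025
CONSUMER_WSPM = (70_000, 80_000) # previously dedicated to smartphones and PCs


def pct_change(old: float, new: float) -> float:
    """Percentage change going from old to new."""
    return (new - old) / old * 100.0


def share(part: float, whole: float) -> float:
    """Part expressed as a percentage of whole."""
    return part / whole * 100.0


capex_growth_2026 = tuple(pct_change(CAPEX_BASELINE, c) for c in CAPEX_2026)
capex_growth_2027 = tuple(pct_change(c, CAPEX_2027) for c in CAPEX_2026)
consumer_share = tuple(share(w, N3_CAPACITY_WSPM) for w in CONSUMER_WSPM)

print(f"2026 CapEx growth vs baseline: {capex_growth_2026[0]:.0f}% to {capex_growth_2026[1]:.0f}%")
print(f"2027 CapEx growth vs 2026:     {capex_growth_2027[1]:.0f}% to {capex_growth_2027[0]:.0f}%")
print(f"Consumer share of N3 capacity: {consumer_share[0]:.0f}% to {consumer_share[1]:.0f}%")
```

On these inputs the consumer allocation works out to a bit under, or right at, two-thirds of N3 capacity, which matches the "approximately two-thirds" wording.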

The N3 Silicon Shortage and Consumer Squeeze [00:10:15]

  • AI accelerators are cannibalizing the N3 node. In 2025, AI accounted for just 9% of N3 wafer demand [00:12:50].
  • By 2026, AI is projected to consume 60% of N3 capacity, and a staggering 85-90% by 2027 [00:10:22].
  • Consumer electronics are functioning as a reluctant "release valve." A 10-15% decline in smartphone and PC units is expected due to component costs and lack of fab capacity [00:06:23].
  • Memory costs are crushing lower-end manufacturers. The memory slice of a smartphone Bill of Materials (BOM) has surged from 17-20% up to 25-30% [00:11:53].
  • Consequently, Chinese mid-range manufacturers like Xiaomi, Oppo, and Vivo are cutting orders by up to 30% [00:12:15].
  • Even if the industry reallocates 25% of smartphone N3 wafer capacity to AI, it would only yield roughly 700,000 Rubin GPUs and 1.5 million TPU V7s, which remains mathematically insufficient to clear the AI backlog [00:16:46].
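
To make the squeeze concrete, here is a rough sketch that converts the quoted AI shares into wafer starts per month. It assumes, purely for illustration, that N3 capacity stays flat at the 2025 exit rate of 120k wafer starts per month and that the "share of demand" and "share of capacity" figures can be applied to the same base; the summary does not confirm either assumption, and real capacity is expected to grow.

```python
N3_CAPACITY_WSPM = 120_000  # 2025 exit rate, held flat here as a simplifying assumption

# (low, high) AI share of N3 by year, as quoted in the summary.
AI_SHARE = {
    2025: (0.09, 0.09),
    2026: (0.60, 0.60),
    2027: (0.85, 0.90),
}


def split_wafers(capacity: int, ai_share: float) -> tuple[int, int]:
    """Return (AI wafers, wafers left for everything else) per month."""
    ai = round(capacity * ai_share)
    return ai, capacity - ai


for year, (low, high) in AI_SHARE.items():
    ai_low, rest_high = split_wafers(N3_CAPACITY_WSPM, low)
    ai_high, rest_low = split_wafers(N3_CAPACITY_WSPM, high)
    print(f"{year}: AI {ai_low:,}-{ai_high:,} wspm, everything else {rest_low:,}-{rest_high:,} wspm")
```

Under the flat-capacity assumption, the non-AI remainder in 2027 would be far below the 70k-80k wafers per month that consumer devices historically used, which is the "release valve" dynamic described above.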

The Memory (HBM) Squeeze [00:17:05]

  • The memory shortage is inherently tied to fabrication physics. High Bandwidth Memory (HBM) consumes three times more wafer capacity per bit than standard commodity DRAM [00:17:47].
  • As the industry migrates to HBM4 and HBM4E, this ratio will worsen, requiring four times the wafer capacity of commodity DRAM [00:17:53].
  • Compounding the issue, there is engineering friction between Nvidia, which is requesting ultra-high pin speeds (e.g., 11 Gb/s per pin), and memory vendors struggling to hit those speeds, which keeps effective supply tight [00:18:11].
  • Meaningful new memory capacity from fabs (Samsung, SK Hynix, Micron) is not expected to come online until the second half of 2027 [00:18:53].
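
The wafer-per-bit ratio above can be turned into a simple model of how total bit output shrinks as wafers shift to HBM. This is a minimal sketch, not the speakers' model: the function name is made up for illustration, and the 30% HBM wafer fraction in the usage lines is a hypothetical input, not a figure from the talk.

```python
def relative_bit_output(hbm_wafer_fraction: float, hbm_wafer_multiple: float) -> float:
    """Total DRAM bits produced, relative to an all-commodity baseline of 1.0.

    hbm_wafer_fraction: share of DRAM wafers devoted to HBM (0..1).
    hbm_wafer_multiple: wafers HBM needs per bit versus commodity DRAM
                        (3 today, 4 for HBM4/HBM4E per the summary).
    """
    if not 0.0 <= hbm_wafer_fraction <= 1.0:
        raise ValueError("hbm_wafer_fraction must be between 0 and 1")
    commodity_bits = 1.0 - hbm_wafer_fraction
    hbm_bits = hbm_wafer_fraction / hbm_wafer_multiple
    return commodity_bits + hbm_bits


# Hypothetical scenario: 30% of DRAM wafers go to HBM.
HYPOTHETICAL_HBM_FRACTION = 0.30
for multiple in (3, 4):
    output = relative_bit_output(HYPOTHETICAL_HBM_FRACTION, multiple)
    print(f"HBM at {multiple}x wafers per bit: total bit output = {output:.3f} of baseline")
```

The point of the sketch is directional: every wafer moved to HBM removes more commodity bits than it adds HBM bits, and the gap widens as the ratio moves from 3x to 4x.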

Cloud Economics, GPU Rental Prices, and ROI [00:20:00]

  • Hyperscaler CapEx cuts are practically impossible due to overwhelming workflow ROI. For instance, executing a complex analytical task via Claude Code costs SemiAnalysis roughly $5 to $7, replacing tasks that would traditionally take three to four hours of junior analyst time [00:20:00].
  • Increased memory pricing has pushed the baseline cost of producing an AI server up by 5% to 10% [00:24:48].
  • The compute lease market is fractured into three tiers: Long-end off-take contracts (4-5 years), Middle-market contracts (1-4 years), and the smallest tier, On-Demand [00:25:16].
  • GPU rental prices are defying depreciation gravity. Original models predicted a 30% drop in H100 pricing for 2026 as superior GB300 chips came online. Instead, H100 rental prices bottomed at $1.70 per GPU-hour, inflected back up to $1.80, surged an additional 15-20% early in the year, and rose another 10% in March [00:21:24].
  • The market is behaving like "the last flight out," characterized by extreme illiquidity where NeoCloud providers are locking in clients to 4-year contract extensions, forcing buyers to commit to an 8-year lifecycle for silicon (paying for H100s until 2030) [00:27:17].
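
Two pieces of arithmetic sit behind the bullets above, sketched here under stated assumptions. The first is the break-even hourly labor cost implied by the Claude Code example. The second is a price index for H100 rentals relative to the bottom, assuming the quoted increases compound one after another (the summary does not say whether they do). Function names are illustrative.

```python
def break_even_hourly_rate(task_cost_usd: float, hours_replaced: float) -> float:
    """Hourly labor cost above which the AI-run task is the cheaper option."""
    return task_cost_usd / hours_replaced


def price_index(start_ratio: float, *pct_steps: float) -> float:
    """Compound a starting ratio through successive percentage increases."""
    index = start_ratio
    for pct in pct_steps:
        index *= 1.0 + pct / 100.0
    return index


# Claude Code example: $5-$7 per task versus 3-4 hours of junior analyst time.
cheapest_case = break_even_hourly_rate(5.0, 4.0)   # low cost, most hours replaced
priciest_case = break_even_hourly_rate(7.0, 3.0)   # high cost, fewest hours replaced
print(f"Break-even labor cost: ${cheapest_case:.2f} to ${priciest_case:.2f} per hour")

# H100 rentals: ratio of the rebound level to the bottom (scale-independent),
# then +15-20% early in the year and +10% in March, assumed to compound.
REBOUND_OVER_BOTTOM = 180 / 170
low = price_index(REBOUND_OVER_BOTTOM, 15, 10)
high = price_index(REBOUND_OVER_BOTTOM, 20, 10)
print(f"H100 price vs bottom: {low:.2f}x to {high:.2f}x")
```

Against an expected 30% decline, a compounded rise of roughly a third or more from the bottom is what the "last flight out" framing refers to.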

Optical Networking: Co-Packaged Optics (CPO) Deep Dive [00:39:16]

  • At the OFC and GTC conferences, networking architectures took center stage. Standard copper connections max out at roughly 2 meters at 224 Gb/s per lane, severely limiting multi-rack GPU scaling [00:42:07].
  • Co-Packaged Optics (CPO) physically integrates the optical engine directly adjacent to the switch/GPU chip. This eliminates the need for long electrical traces to faceplate pluggables and bypasses energy-intensive Digital Signal Processors (DSPs) [00:40:47].
  • Scale-out CPO: Used for connecting tens of thousands of GPUs across data centers to save power and simplify deployments for NeoClouds. Nvidia's upcoming multi-plane CPO switches can push 409 Tb/s of aggregate bandwidth, up from 100 Tb/s in Broadcom's Tomahawk 6 [00:43:54].
  • Scale-up CPO: Used for high-bandwidth domains. Even though it can squeeze 144 GPUs into a 600-kilowatt rack using copper (Kyber), Nvidia announced products like the NVL576 (connecting 8 racks using Rubin scale-up CPO) and the NVL1152 (connecting 8 Kyber racks using CPO) [00:45:30].
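
The rack and switch figures above can be cross-checked with a few lines of arithmetic. This is a sketch under assumptions: it treats the NVL product numbers as GPU counts measured the same way as the 144-GPU Kyber figure, treats both switch bandwidth figures as being in the same unit (terabits per second), and divides whole-rack power evenly across GPUs, which overstates per-chip power because the rack also feeds switches, CPUs, and cooling.

```python
KYBER_GPUS_PER_RACK = 144
KYBER_RACK_KW = 600
RACKS_PER_DOMAIN = 8


def domain_size(racks: int, gpus_per_rack: int) -> int:
    """GPUs in a scale-up domain built from identical racks."""
    return racks * gpus_per_rack


def implied_gpus_per_rack(domain_gpus: int, racks: int) -> int:
    """Per-rack GPU count implied by a domain size; requires an even split."""
    if domain_gpus % racks != 0:
        raise ValueError("domain does not divide evenly across racks")
    return domain_gpus // racks


def rack_kw_per_gpu(rack_kw: float, gpus: int) -> float:
    """Rack-level power budget per GPU (not the chip's own power draw)."""
    return rack_kw / gpus


print("8 Kyber racks:", domain_size(RACKS_PER_DOMAIN, KYBER_GPUS_PER_RACK), "GPUs")
print("NVL576 over 8 racks implies", implied_gpus_per_rack(576, RACKS_PER_DOMAIN), "GPUs per rack")
print(f"Kyber rack power per GPU: {rack_kw_per_gpu(KYBER_RACK_KW, KYBER_GPUS_PER_RACK):.2f} kW")
print(f"CPO switch vs Tomahawk 6 bandwidth: {409 / 100:.2f}x")
```

The 8 x 144 product reproduces the 1152 in the NVL1152 name, which is a useful consistency check on the summary's description.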

Multi-Source Agreements (MSAs) and Optical Standardization [00:49:45]

  • The OCI MSA, backed by heavyweights like Nvidia, Broadcom, AMD, Meta, and Microsoft, marks a massive architectural pivot from gray optics (one lambda per fiber) to Dense Wavelength Division Multiplexing (DWDM), allowing multiple signals over a single fiber using a 4-wavelength transmit/receive bi-directional architecture at 50 Gig NRZ [00:52:03].
  • Rival consortium CPX MSA focuses on standardizing the physical connector (plug-ability) rather than the modulation technique, heavily implying the use of ring modulators for "near package optics" (flexible, pluggable CPO) [00:54:30].
  • Meanwhile, the XPO architecture attempts to extend the lifespan of traditional faceplate pluggables natively cooled via co-packaged copper flyover cables, bridging the gap toward Linear Pluggable Optics (LPO) [00:56:03].
  • Production for ultra-high power lasers to satisfy these new optical demands is incredibly hard to forecast, though companies like Lumentum are expected to scale production by 20 to 30 times their current base [00:49:05].
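
The move from one wavelength per fiber to DWDM is easiest to see as a fiber-count calculation. The sketch below is illustrative only: it compares one wavelength against four at the same 50 Gb/s NRZ rate (gray optics in practice may run a different per-wavelength rate), counts one direction of traffic, and uses a hypothetical 1.6 Tb/s target that does not come from the talk.

```python
import math


def per_fiber_gbps(wavelengths: int, gbps_per_wavelength: float) -> float:
    """One-direction throughput of a single fiber carrying several wavelengths."""
    return wavelengths * gbps_per_wavelength


def fibers_needed(target_gbps: float, wavelengths: int, gbps_per_wavelength: float) -> int:
    """Fibers required to carry target_gbps in one direction."""
    return math.ceil(target_gbps / per_fiber_gbps(wavelengths, gbps_per_wavelength))


RATE_GBPS = 50.0               # 50G NRZ per wavelength, as quoted for the OCI MSA
HYPOTHETICAL_TARGET = 1600.0   # hypothetical 1.6 Tb/s of one-direction bandwidth

single = fibers_needed(HYPOTHETICAL_TARGET, wavelengths=1, gbps_per_wavelength=RATE_GBPS)
dwdm = fibers_needed(HYPOTHETICAL_TARGET, wavelengths=4, gbps_per_wavelength=RATE_GBPS)
print(f"Per-fiber throughput with 4 wavelengths: {per_fiber_gbps(4, RATE_GBPS):.0f} Gb/s")
print(f"Fibers for the target: {single} with one wavelength, {dwdm} with four")
```

Under these assumptions, four wavelengths cut the fiber count by a factor of four, which is the practical appeal of DWDM when fiber attach area at the package edge is scarce.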

The Reference Vault

4. Data & Figures

Data Point | Value | Context | Timestamp
Anthropic Monthly ARR Added | $6 billion | Incredible revenue growth driven by advanced agentic workflows like Claude Code in February. | [00:02:42]
TSMC 2024/2025 CapEx | ~$30 billion | Historical baseline capital expenditure for TSMC. | [00:04:52]
TSMC 2026 Expected CapEx | $52 - $54 billion | Massive ramp in spend to combat the AI semiconductor shortage. | [00:04:52]
TSMC 2027 Projected CapEx | ~$70 billion | Projected expenditure to scale next-generation fabs. | [00:04:52]

5. Core Frameworks & Mental Models

  1. The Shifting Cycle of Bottlenecks: This framework posits that in hyperscale technological buildouts, solving one fundamental constraint immediately exposes the next. As the AI industry worked through advanced packaging (CoWoS) constraints, it ran into data center power constraints, and it has now shifted to front-end wafer fabrication (N3 capacity) limitations. [00:01:26]
  2. The "Kingmaker" Allocation Model: A strategic operational posture adopted by monopolistic suppliers (like TSMC). Rather than chasing peak opportunistic pricing in a supply squeeze, the supplier curates its clientele, deliberately starving volatile buyers (like crypto or some AI startups) in favor of allocating capacity to legacy buyers (like Apple) who offer highly predictable, stabilizing, multi-year demand profiles. [00:08:43]
  3. The Three-Tiered GPU Market: A structural lens to view GPU compute leasing. The market fractures into long-end offtake contracts (4-5 years, mostly large AI labs), a middle contract market (1-4 years for AI natives/labs), and the tightest, most illiquid tier: on-demand pricing. [00:25:16]
  4. The "Double Ordering" Bullwhip Effect: A foundational mental model for understanding cyclical semiconductor inventory gluts. In tight supply environments, end-customers panic and order the same component allocation from multiple distinct distributors concurrently. This creates a phantom surge in aggregate demand signals to the fab. Once the buyer secures one allocation, they cancel the others, precipitating a sudden inventory bust. [00:34:30]
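
The double-ordering model above can be illustrated with a toy calculation. This is a minimal sketch with entirely hypothetical buyers and quantities; it only shows how duplicate orders inflate the demand signal the fab sees and how large the cancellation wave is once each buyer has been served once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Buyer:
    true_demand: int   # units the buyer actually needs
    distributors: int  # how many distributors receive the same order


def real_demand(buyers: list[Buyer]) -> int:
    """Units that will actually be consumed."""
    return sum(b.true_demand for b in buyers)


def apparent_demand(buyers: list[Buyer]) -> int:
    """Units the supply chain sees while every duplicate order is open."""
    return sum(b.true_demand * b.distributors for b in buyers)


def phantom_demand(buyers: list[Buyer]) -> int:
    """Orders that get cancelled once each buyer secures one allocation."""
    return apparent_demand(buyers) - real_demand(buyers)


# Hypothetical order book during a shortage.
buyers = [Buyer(100, 3), Buyer(50, 2), Buyer(200, 1)]

print("Real demand:    ", real_demand(buyers))
print("Apparent demand:", apparent_demand(buyers))
print("Phantom demand: ", phantom_demand(buyers))
print(f"Signal inflation: {apparent_demand(buyers) / real_demand(buyers):.2f}x")
```

If a fab sizes capacity to the apparent figure, the phantom portion is what turns into excess inventory when the cancellations arrive.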

6. Anecdotes

  • TSMC's 2018 Crypto Burn: Shravan highlights how TSMC learned its allocation discipline the hard way in 2018. The CEO touted massive incoming demand from cryptocurrency customers, but within just three quarters, the crypto market imploded, the demand evaporated, and TSMC was left holding the bag. This history explains why TSMC remains intensely protective of Apple's baseline allocation today. [00:07:38]
  • The "Last Flight Out" GPU Market: Jordan describes the frantic, zero-liquidity environment of trying to secure Nvidia H100 capacity from NeoClouds in real time. He compares the skyrocketing prices and instant rejections to trying to buy a ticket on the last airplane out of a disaster zone. When capacity is gone, providers simply stop responding because their backlogs stretch out to August or September. [00:22:07]
  • A 10x Analyst Productivity Boom: Shravan shares a personal anecdote regarding AI ROI. By using Claude Code and SemiAnalysis's internal dashboard APIs, he can quickly parse and cross-reference thousands of words across dozens of earnings transcripts and PDF slides. He likens the AI to having "two to three people working for me always," boosting his output by 10x, which he offers as the reason hyperscalers refuse to cut CapEx. [00:36:32]

7. References & Recommendations

  • Companies/Entities: TSMC, Nvidia, Apple, AMD, Broadcom, Microsoft, Meta, OpenAI, Anthropic, Qualcomm, MediaTek, Samsung, SK Hynix, Micron, Xiaomi, Oppo, Vivo, CoreWeave, Arista, Lumentum.
  • Products/Technologies: ChatGPT, Claude Code, H100, Blackwell, Rubin GPU, TPU V7, TPU V8, MI400, Kyber, Feynman NVL1152, NVL576, HBM, HBM4, HBM4E, Co-Packaged Optics (CPO), Dense Wavelength Division Multiplexing (DWDM), Linear Pluggable Optics (LPO), Tomahawk 6, Tomahawk 7.
  • Conferences/Agreements: GTC (GPU Technology Conference), OFC (Optical Fiber Communication Conference), OCI MSA, CPX MSA, XPO.

4. Data & Figures (continued)

Data Point | Value | Context | Timestamp
TSMC 3nm (N3) Capacity 2025 | 120k wafer starts/mo | The total N3 exit rate capacity of TSMC in 2025. | [00:05:22]
Consumer Baseline N3 Allocation | 70k - 80k wafers/mo | Traditional demand historically dedicated to Apple, Qualcomm, etc. | [00:05:42]
Projected Smartphone Unit Decline | 10 - 15% | Reduction in global smartphone unit volume due to memory & wafer constraints. | [00:06:23]
Apple Share of TSMC Leading Edge | 25 - 30% | Apple's historically dominant position in securing advanced node capacity. | [00:08:43]
AI Share of N3 Capacity (2025) | 9% | Accelerator percentage of wafer demand prior to the squeeze. | [00:12:50]
AI Share of N3 Capacity (2026) | 60% | The projected percentage of N3 output commanded strictly by AI. | [00:10:22]
AI Share of N3 Capacity (2027) | 85 - 90% | Accelerators almost entirely squeezing out consumer electronics at the leading edge. | [00:10:22]
Smartphone Memory BOM Share | 25 - 30% | Memory's share of the device Bill of Materials, up from a historical baseline of 17-20%. | [00:11:53]
Chinese OEM Order Cuts | Up to 30% | Reduction in mid-to-low-end smartphone build orders (Xiaomi, Oppo, Vivo). | [00:12:15]
Reallocation Impact (5% N3 shift) | >100k Rubin / >300k TPUv7 | What moving 5% of smartphone capacity yields in physical AI chips. | [00:16:28]
Reallocation Impact (25% N3 shift) | 700k Rubin / 1.5M TPUv7 | What moving 25% of smartphone capacity yields (still insufficient). | [00:16:46]
HBM Wafer Consumption Ratio | 3x to 4x | The multiple of wafer capacity per bit required by HBM vs standard commodity DRAM. | [00:17:47]
Nvidia HBM Pin Speed Request | 11 Gb/s per pin | The extremely high pin speeds Nvidia is pushing memory vendors to hit. | [00:18:11]
Cost of Task via Claude Code | $5 - $7 | Cost of AI performing a 3-4 hour analyst task, explaining high cloud ROI. | [00:20:00]
H100 Expected 2026 Price Drop | -30% | The unfulfilled expectation that newer chips would slash legacy rental costs. | [00:21:24]
H100 Actual Price Rebound | $1.70 to $1.80+ per GPU-hour | Prices bottomed and then rallied 15-20% early in the year, climbing another 10% in March. | [00:21:24]
Server Cost Increase | 5 - 10% | Increase in the cost of an AI server directly attributed to higher memory prices. | [00:24:48]
Max Copper Reach at 224 Gb/s | ~2 meters | The physical limit of sending high-speed electrical signals without optics. | [00:42:07]
Tomahawk 6 Base Capacity | 100 Tb/s | The aggregate bandwidth capacity for a standard switch chip. | [00:43:54]
Nvidia Scale-up Rack Power | 600 kilowatts | Massive power density of a 144-GPU Kyber rack. | [00:45:30]
Lumentum Laser Production | 20 - 30x | The expected scale-up multiplier for ultra-high power laser production. | [00:49:05]
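
The two reallocation rows can be checked against each other. The sketch below assumes chip yield scales linearly with the share of smartphone N3 capacity moved, which the summary does not state, and derives the per-5% yield implied by the 25% row. The function name is illustrative.

```python
def implied_yield(total_units: float, total_shift_pct: float, step_pct: float = 5.0) -> float:
    """Units per step_pct of reallocated capacity, assuming linear scaling."""
    return total_units * step_pct / total_shift_pct


# 25% reallocation row: 700k Rubin GPUs, 1.5M TPU v7.
rubin_per_5pct = implied_yield(700_000, 25)
tpu_per_5pct = implied_yield(1_500_000, 25)

print(f"Implied Rubin GPUs per 5% shift: {rubin_per_5pct:,.0f}")
print(f"Implied TPU v7 per 5% shift:     {tpu_per_5pct:,.0f}")
```

Under linear scaling, the implied per-5% figures are in line with the ">100k Rubin" and roughly 300k TPU v7 quoted for the 5% row.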