AI Gigawatt Campuses: Power Quality, Cooling, and Supply Risk

  • My 'briefing notes' summarize the content of podcast episodes; they do not reflect my own views.
  • They contain (1) a summary of podcast content, (2) potential information gaps, and (3) some speculative views on wider Bitcoin implications.
  • Pay attention to broadcast dates (I often summarize older episodes).
  • Some episodes I summarize may be sponsored: don't trust, verify if the information you are looking for will be used for decision-making.

Summary

The October 02, 2025 episode of Anastasi In Tech features Anastasi's deep dive on “Colossus 2,” xAI's gigawatt-scale AI campus. She explains how siting, interconnection, sub-cycle power control, liquid cooling, and on-site recycled-water systems now determine uptime and cost. She also details the network fabric choices and vendor concentration that constrain scale as clusters approach hundreds of thousands of GPUs.

Take-Home Messages

  1. Interconnection First: Queue position and local politics drive timelines more than engineering drawings.
  2. Sub-Cycle Stability: Battery buffering and fast controls are core facility equipment now that GPU load spikes hit the grid.
  3. Thermal Discipline: Direct-to-chip liquid cooling plus tested failover is mandatory as racks near ~130 kW.
  4. Water On-Site: Recycled-water plants reduce municipal strain and stabilize cooling through seasonal swings.
  5. Supply Bottlenecks: GPU/HBM availability and fabric efficiency, not floor space, cap practical scale.

Overview

Anastasi explains why the xAI project shifted across the Tennessee–Mississippi border to secure faster interconnection and expandable power rights. Early capacity comes from repowered gas turbines while large battery systems damp millisecond GPU transients that otherwise cause voltage sags and costly restarts. She frames siting as a race for grid access in which interconnection position dictates the critical path.

Power density rises from roughly 50 kW per rack toward ~130 kW with Blackwell-class systems, forcing adoption of direct-to-chip liquid cooling. Anastasi describes building-level chilled water tied to extensive air-cooled chillers and warns that short interruptions can crash racks or damage hardware. She stresses tight electrical-mechanical-controls coordination to keep utilization high.
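
To make the thermal stakes concrete, here is a back-of-envelope sketch of the coolant flow a single ~130 kW rack demands. The coolant choice and the 10 K loop temperature rise are my assumptions, not figures from the episode:

```python
# Back-of-envelope coolant flow for one direct-to-chip liquid-cooled rack.
# Assumed (not from the episode): water coolant, 10 K supply/return rise.
RACK_POWER_W = 130_000.0   # ~130 kW rack density discussed in the episode
CP_WATER = 4186.0          # J/(kg*K), specific heat of water
DELTA_T = 10.0             # K, assumed loop temperature rise
RHO_WATER = 997.0          # kg/m^3, density of water

m_dot = RACK_POWER_W / (CP_WATER * DELTA_T)   # kg/s, from Q = m_dot*cp*dT
flow_lpm = m_dot / RHO_WATER * 1000 * 60      # liters per minute

print(f"~{m_dot:.1f} kg/s, ~{flow_lpm:.0f} L/min per rack")  # ~3.1 kg/s, ~187 L/min
```

At that flow, a brief pump or valve interruption leaves only seconds of thermal margin, which is why tested failover reads as a hard requirement rather than a nice-to-have.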

Water strategy receives unusual weight as the campus adds a ceramic membrane bioreactor to recycle on the order of 13 million gallons per day. Anastasi presents this as a reliability play that reduces dependence on potable supplies and cushions seasonal variability. She signals similar systems will spread as multi-hundred-megawatt halls concentrate cooling demand.
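
For scale, a unit-conversion sketch of the ~13 million gallons per day figure. The evaporative framing is my assumption for an upper bound, not from the episode; the campus reportedly leans on air-cooled chillers, and actual water use splits across treatment, makeup, and process loads:

```python
# Rough scale check on ~13 million gallons/day of recycled water.
# Assumption (not from the episode): upper bound if all of it
# were evaporated for heat rejection.
GALLONS_PER_DAY = 13e6
M3_PER_GALLON = 0.003785      # cubic meters per US gallon
LATENT_HEAT = 2.26e6          # J/kg, latent heat of vaporization of water
SECONDS_PER_DAY = 86_400

m3_per_day = GALLONS_PER_DAY * M3_PER_GALLON       # ~49,200 m^3/day
kg_per_s = m3_per_day * 1000 / SECONDS_PER_DAY     # ~570 kg/s
heat_gw = kg_per_s * LATENT_HEAT / 1e9             # ~1.3 GW

print(f"{m3_per_day:,.0f} m^3/day -> up to ~{heat_gw:.1f} GW of evaporative rejection")
```

The order of magnitude lines up with a gigawatt-class campus, which is why water capacity gets treated as a reliability asset rather than a utility afterthought.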

On the compute side, NVLink aggregates local GPU “islands” while Spectrum-class Ethernet stitches halls together at high line rates. Anastasi points to DPUs offloading networking and storage so training jobs maintain high GPU utilization. She closes by highlighting vendor concentration and HBM throughput as gating factors as clusters scale toward hundreds of thousands of accelerators.
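
To see why fabric efficiency gates scale, consider the time a naive flat ring all-reduce would need to synchronize gradients across a cluster of this size. All parameters here are illustrative assumptions, not figures from the episode:

```python
# Illustrative (assumed values): flat ring all-reduce time for one
# gradient sync, ignoring latency, congestion, and topology overheads.
GPUS = 100_000            # "hundreds of thousands" scale from the episode
GRAD_BYTES = 140e9 * 2    # assumed: 140B-parameter model in bf16
LINK_GBPS = 400           # assumed per-GPU Ethernet line rate

# A ring all-reduce moves 2*(N-1)/N of the payload across each link.
bytes_per_link = 2 * (GPUS - 1) / GPUS * GRAD_BYTES
seconds = bytes_per_link / (LINK_GBPS / 8 * 1e9)

print(f"~{seconds:.1f} s per full-cluster gradient sync at line rate")  # ~11.2 s
```

Even at line rate the flat ring is untenable, which is the motivation for keeping most traffic inside NVLink islands and letting the Ethernet fabric carry only the hierarchical reductions between them.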

Stakeholder Perspectives

  1. Grid operators: Require sub-cycle controls and storage to prevent voltage excursions from GPU spikes.
  2. Municipal leaders: Weigh tax revenue and jobs against water use, noise, land, and traffic externalities.
  3. Environmental regulators: Scrutinize gas reliance, emissions intensity, and recycled-water system impacts.
  4. AI tenants: Prioritize reliability, fabric efficiency, and predictable $/MW while hedging single-supplier risk.
  5. Financiers and insurers: Tie capital to interconnection milestones, supply assurances, and demonstrated thermal resilience.

Implications and Future Outlook

Operators will harden power quality with right-sized batteries, faster controls, and tighter orchestration between GPUs and facility systems. Liquid cooling becomes standard as densities rise, with verified failover windows treated as product requirements. On-site recycled-water capacity shifts from optional to strategic, smoothing seasonal stress and easing community relations.

Medium-term focus moves to fabric scalability and supply concentration as clusters span halls and campuses. Multi-vendor strategies, memory-supply guarantees, and standardized telemetry aim to preserve training efficiency at scale. Heat-recovery pilots and cleaner long-term power contracts reduce operating risk and public pushback.

Regions that align permitting, grid upgrades, and industrial-water solutions will capture outsized investment. Campuses will stage capacity in tranches linked to interconnection and component deliveries to avoid stranded assets. Governance around monitoring, reporting, and community benefits will shape the social license to operate.

Some Key Information Gaps

  1. What control strategies and storage sizing best damp millisecond GPU spikes without overspending on batteries? Facility-level optimization here directly protects uptime and grid stability (a sizing sketch follows this list).
  2. What cooling redundancy and failover times are necessary to avoid crash or damage as densities approach ~130 kW per rack? Clear thresholds guide procurement, testing, and insurance.
  3. What portfolio of fuels and contracts lowers long-run $/MWh while meeting emerging emissions requirements? Power cost and policy compliance determine siting and competitiveness.
  4. What fabric topologies and congestion controls sustain >90% efficiency across hundreds of thousands of GPUs? Network choices set the ceiling on effective compute and training time.
  5. Which heat-recovery architectures can economically capture a meaningful share of waste heat at campus scale? Bankable offtakes convert a liability into local development value.
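
On gap 1, a minimal sizing sketch (all figures assumed, not from the episode) shows why the trade-off centers on converter power rating and control latency rather than battery energy capacity:

```python
# Energy a buffer must absorb per GPU power transient (assumed numbers).
CLUSTER_MW = 300.0      # assumed hall size
SWING_FRACTION = 0.5    # assumed load swing when training steps synchronize
EVENT_MS = 50.0         # assumed transient duration to ride through

swing_w = CLUSTER_MW * 1e6 * SWING_FRACTION
energy_kwh = swing_w * (EVENT_MS / 1000) / 3.6e6

print(f"Buffer must source/sink {swing_w / 1e6:.0f} MW for {EVENT_MS:.0f} ms "
      f"= {energy_kwh:.2f} kWh per event")  # 150 MW, ~2 kWh
```

The energy per event is tiny, but the power rating is enormous, so the cost question is dominated by power electronics and sub-cycle control speed, not by battery capacity.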

Broader Implications for Bitcoin

Grid Markets Converge with Compute Demand

Large AI campuses normalize sub-second ancillary services at the load, accelerating markets for fast frequency response and voltage support. Demand-response Bitcoin miners can monetize these sub-second services alongside AI loads, anchoring new ancillary markets and improving curtailment economics. Over time, telemetry-rich mining fleets become dispatchable grid assets that stabilize variable generation while preserving hash-rate targets.

Energy Siting Competition Intensifies

Jurisdictions that pre-permit interconnection capacity, recycled-water plants, and heat-aware cooling will outcompete peers for capital. Regions that streamline behind-the-meter generation and flexible-load programs will attract co-located Bitcoin mining that stabilizes demand during AI maintenance or ramp cycles. This alignment reshapes regional energy maps as utilities adopt transparent capacity reservation and performance-based tariffs.

Water Becomes a First-Class Datacenter Resource

Mandated recycled-water systems and closed-loop cooling will become standard for high-density compute. These designs reduce political risk for large Bitcoin mines in arid basins and enable scale without drawing on municipal potable supplies. Over three to five years, permitting tied to water circularity will gate both AI growth and energy-intensive Bitcoin operations.

Supply Chain Concentration Drives Policy Responses

Concentration in accelerators, memory, and power electronics will trigger export controls, subsidies, and diversification mandates. Bitcoin ASIC and HBM chokepoints invite shared procurement, domestic assembly, and inventory hedges that align mining and AI supply strategies. Joint qualification frameworks lower systemic risk while preserving performance across heterogeneous fleets.

Waste Heat as Infrastructure

Gigawatt campuses can unlock district-energy markets if standardized interfaces and finance mature. High-temperature loops from Bitcoin halls and AI clusters can feed greenhouses, industrial pre-heating, or residential networks, converting thermal liabilities into contracted revenue. Municipalities that pilot bankable heat offtakes will set templates replicable across regions.

From PUE to System-Level Efficiency

Operational metrics will expand beyond PUE to include grid services delivered, water circularity, and effective compute per megawatt. Bitcoin operators should report miner uptime under curtailment and ancillary revenue earned to demonstrate net system value. This reframing influences tariffs, interconnection terms, and public acceptance.
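
As a sketch of what such reporting could look like, here is a hypothetical composite (metric names, weights, and figures are mine, not from the episode):

```python
# Hypothetical reporting bundle extending PUE with system-level terms.
def system_efficiency_report(it_mwh, facility_mwh, curtailed_h, total_h,
                             ancillary_usd, recycled_frac):
    """Combine PUE with grid-service and water-circularity metrics."""
    return {
        "pue": facility_mwh / it_mwh,
        "uptime_under_curtailment": 1 - curtailed_h / total_h,
        "ancillary_usd_per_mwh": ancillary_usd / facility_mwh,
        "water_circularity": recycled_frac,
    }

# Example month for a ~110 MW hall (all figures illustrative).
print(system_efficiency_report(it_mwh=70_000, facility_mwh=80_000,
                               curtailed_h=36, total_h=720,
                               ancillary_usd=250_000, recycled_frac=0.9))
```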

Bitcoin–AI Grid Symbiosis

Hybrid campuses will use miners as flexible ballast during AI training lulls, cutting stranded capacity and smoothing demand ramps. This flexibility strengthens power-purchase negotiation leverage and improves asset utilization across turbines, batteries, and cooling. Over time, modular mining capacity becomes a standard reliability tool embedded in campus energy-management systems.
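
A minimal sketch of the dispatch rule such a hybrid campus might run (capacities, reserve, and load points are assumed, not from the episode):

```python
# Flexible-ballast rule: miners fill spare capacity minus an operating reserve.
def miner_setpoint_mw(campus_cap_mw, ai_load_mw, reserve_mw=10.0):
    """Return the mining load that keeps total campus draw near capacity."""
    headroom = campus_cap_mw - ai_load_mw - reserve_mw
    return max(0.0, headroom)

# Training full-tilt, a lull, and a maintenance window on a 1 GW campus.
for ai_mw in (950, 700, 400):
    print(f"AI load {ai_mw} MW -> miners {miner_setpoint_mw(1000, ai_mw):.0f} MW")
```

Holding total draw flat this way is what makes a mining fleet read to the grid operator as a reliability asset rather than as additional volatile load.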