On September 6, Google announced at the Hot Chips 2025 closing session that Ironwood, its seventh-generation TPU unveiled in April and the first Google chip designed primarily for large-scale inference, has been deployed to Google Cloud data centers.
Core Specifications
- Dual computing cores delivering 4,614 TFLOPS at FP8 precision
- 192 GB of HBM3e memory (7.3 TB/s bandwidth) and 1.2 TB/s of single-chip I/O bandwidth
- Scales to 9,216 chips per system, reaching 42.5 ExaFLOPS of peak FP8 performance and 1.77 PB of shared memory (claimed as a global record)
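The pod-level figures follow directly from the per-chip specifications above; a quick arithmetic check confirms they are consistent:

```python
# Sanity check: scale the per-chip numbers from the spec list up to a
# full 9,216-chip system and compare against the published pod figures.
chips_per_pod = 9216
tflops_per_chip = 4614        # FP8 TFLOPS per chip
hbm_gb_per_chip = 192         # GB of HBM3e per chip

pod_exaflops = chips_per_pod * tflops_per_chip / 1e6  # TFLOPS -> ExaFLOPS
pod_hbm_pb = chips_per_pod * hbm_gb_per_chip / 1e6    # GB -> PB (decimal)

print(f"{pod_exaflops:.1f} EFLOPS")  # -> 42.5 EFLOPS
print(f"{pod_hbm_pb:.2f} PB")        # -> 1.77 PB
```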
Reliability & Thermal Management
- Built-in root of trust, built-in self-test, and silent data corruption detection, with logical repair used to improve yield; together these make up the chip's reliability, availability, and serviceability (RAS) features
- Auto-reconfigures on node failure and resumes tasks via checkpoints
- Google's third-generation liquid cooling with cold plates, which Google says delivers roughly twice the stability of air cooling
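The checkpoint-and-resume recovery described above follows a standard pattern: persist progress periodically, and on restart pick up from the last saved state rather than from scratch. A minimal sketch, with all names illustrative (this is not a Google API):

```python
# Sketch of checkpoint-based resumption: save progress every few steps so
# that a restart after a node failure resumes from the last checkpoint.
import json
import os

CKPT_PATH = "train_state.json"  # illustrative path, not a real system file

def load_checkpoint():
    """Return the last checkpointed step, or 0 if starting fresh."""
    if os.path.exists(CKPT_PATH):
        with open(CKPT_PATH) as f:
            return json.load(f)["step"]
    return 0

def save_checkpoint(step):
    """Persist the current step so a restart can resume from it."""
    with open(CKPT_PATH, "w") as f:
        json.dump({"step": step}, f)

def run(total_steps=100, ckpt_every=10):
    step = load_checkpoint()   # after a failure, restart resumes here
    while step < total_steps:
        step += 1              # stand-in for one training/inference step
        if step % ckpt_every == 0:
            save_checkpoint(step)
    return step
```

On a clean start the loop runs all steps; after a crash, `load_checkpoint` skips the already-completed work, which is what bounds the cost of a node failure.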
Efficiency & Real-World Impact
- Twice the performance per watt of the previous-generation Trillium TPU; supports dynamic voltage and frequency scaling
- Optimized circuitry and fourth-generation sparse cores, well suited to recommendation engines and multimodal generation
- Reported real-world gains: pharmaceutical genetic-sequencing analysis cut to days, bank credit approvals to minutes, and 99.9% fraud-detection accuracy
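The recommendation workloads that the sparse cores target are dominated by sparse embedding lookups: each categorical feature ID gathers one row from a large embedding table, and the gathered rows are pooled. A pure-Python sketch of that access pattern (table sizes and names are illustrative):

```python
# Illustrative sparse embedding lookup, the core operation in recommendation
# models that dedicated sparse-gather hardware is designed to accelerate.
import random

EMBED_DIM = 4   # illustrative; real tables use much wider vectors
VOCAB = 1000    # illustrative; real vocabularies run to billions of IDs

# Dense embedding table: one learned vector per categorical ID.
random.seed(0)
table = [[random.random() for _ in range(EMBED_DIM)] for _ in range(VOCAB)]

def embed_bag(ids):
    """Gather the rows for the active feature IDs and sum-pool them.

    Only len(ids) of the VOCAB rows are touched per query, which is why
    the access pattern is "sparse" and memory-bound rather than compute-bound.
    """
    pooled = [0.0] * EMBED_DIM
    for i in ids:
        row = table[i]
        for d in range(EMBED_DIM):
            pooled[d] += row[d]
    return pooled
```

Because each query touches only a handful of rows scattered across a huge table, throughput is limited by memory gathers rather than arithmetic, which is the case a dedicated sparse core addresses.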
ICgoodFind Summary: Ironwood's rollout showcases Google's strength in AI hardware, accelerates the industrialization of ultra-large-scale inference, and sets a high-performance, high-reliability benchmark for the semiconductor industry.