IBM’s Co-Packaged Optics (CPO) Technology for Training and Running Generative AI Models in Data Centers and Other Computing Applications
- Latitude Design Systems
- Apr 10
Introduction
Generative artificial intelligence is driving a fundamental transformation in data center technology. Approximately three-quarters of data center traffic now occurs internally, which has sharply increased demand for high-speed data transmission. Traditional copper cables have long served as the foundation for data transfer, but they are increasingly limited by signal attenuation over long distances. Co-packaged optics (CPO), which places optical components in the same package as the electronics they serve, is reshaping how interconnect bandwidth density and energy efficiency are achieved [1].
Evolution of Computing Performance
Over the past two decades, computing performance has grown by a factor of roughly 60,000, driven by the continued scaling that Moore’s law describes. I/O bandwidth, however, has improved by only a factor of 30 over the same period, leaving a roughly 2,000-fold gap. This widening discrepancy between computational capacity and data transmission capability has become a major challenge in modern data centers.
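
To make the scale of this mismatch concrete, the short Python sketch below works through the two growth factors quoted above. It assumes that "two decades" means exactly 20 years and that growth compounded smoothly; both are simplifications for illustration, and the names `annual_rate`, `compute_cagr`, and `io_cagr` are not taken from the cited report.

```python
# Growth factors quoted in the text for the past two decades.
COMPUTE_GROWTH = 60_000
IO_GROWTH = 30
YEARS = 20  # assumption: "two decades" taken as exactly 20 years


def annual_rate(total_growth: float, years: int) -> float:
    """Return the compound annual growth rate implied by a total growth factor."""
    return total_growth ** (1 / years) - 1


# How far compute has pulled ahead of I/O over the whole period.
gap = COMPUTE_GROWTH / IO_GROWTH

# Per-year rates implied by each total, assuming smooth compounding.
compute_cagr = annual_rate(COMPUTE_GROWTH, YEARS)
io_cagr = annual_rate(IO_GROWTH, YEARS)

print(f"compute-to-I/O gap: {gap:,.0f}x")
print(f"implied compute growth per year: {compute_cagr:.0%}")
print(f"implied I/O growth per year: {io_cagr:.0%}")
```

Under these assumptions, compute performance grew several times faster per year than I/O bandwidth, which is why the gap compounds rather than staying constant.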

Impact on AI Model Training
The limitations of current network infrastructure severely affect the efficiency of AI model training. Recent studies reveal that networks often become bottlenecks in GPU training, with about one-third of users experiencing GPU utilization below 15%. The cost is considerable: training a single GPT-4 model consumes around 50 GWh of electricity, underscoring the urgent need for more efficient solutions.

CPO Technology Innovations
IBM’s work on CPO has produced several major advances in photonic integration. By significantly shortening electrical circuit lengths and using advanced packaging technologies, it makes substantial progress on both bandwidth density and energy efficiency.

Advanced Module Design
The integrated module design combines optical and electronic components: a photonic integrated circuit (PIC) measuring 8 mm × 10 mm, a 17 mm × 17 mm substrate, and waveguides less than 12 mm long. This level of integration represents a significant step forward in packaging density and efficiency.

Implementation and Assembly
The optical test vehicles (OTV-1a and OTV-1b) demonstrate the precise integration of optical and electronic components. The assembly process utilizes lead-free flip-chip technology and micro-BGA card connections, marking a significant advancement in manufacturing techniques. With meticulous design and optimization, the technology achieves minimal insertion loss variation across multiple reflow cycles.


Performance Achievements and Reliability
Current CPO technology shows substantial improvements over traditional methods. Bandwidth density has increased from 0.15–0.25 Tbps/mm to 2–10 Tbps/mm, and with optimization across 4–16 wavelengths, future targets range from 20 to 80 Tbps/mm. Interface density between PICs and waveguides has increased sixfold. Rigorous testing has confirmed reliability through multiple reflow cycles, environmental testing, thermal cycling from −40 °C to +125 °C, and 1,000-hour damp heat testing at 85 °C and 85% relative humidity.
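
The bandwidth density figures above are ranges, so the improvement factor depends on which ends are compared. The Python sketch below computes the most conservative and most optimistic ratios. It also divides the future targets by the wavelength count, under the assumption (not stated in the source) that the 20 Tbps/mm target corresponds to 4 wavelengths and the 80 Tbps/mm target to 16. The `DensityRange` class and all variable names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DensityRange:
    """A bandwidth density range in Tbps/mm."""
    low: float
    high: float


# Figures quoted in the text.
BASELINE = DensityRange(0.15, 0.25)     # traditional methods
CPO_CURRENT = DensityRange(2.0, 10.0)   # current CPO
CPO_TARGET = DensityRange(20.0, 80.0)   # future targets
WAVELENGTHS = (4, 16)                   # wavelength counts for the targets


def improvement(new: DensityRange, old: DensityRange) -> tuple[float, float]:
    """Return (conservative, optimistic) improvement factors between two ranges."""
    conservative = new.low / old.high   # worst new value vs. best old value
    optimistic = new.high / old.low     # best new value vs. worst old value
    return conservative, optimistic


conservative, optimistic = improvement(CPO_CURRENT, BASELINE)

# Assumption: low target pairs with 4 wavelengths, high target with 16.
per_wavelength = (
    CPO_TARGET.low / WAVELENGTHS[0],
    CPO_TARGET.high / WAVELENGTHS[1],
)

print(f"current CPO vs. baseline: {conservative:.0f}x to {optimistic:.0f}x")
print(f"target density per wavelength: {per_wavelength} Tbps/mm")
```

If that pairing holds, both ends of the target range work out to the same density per wavelength, which would suggest the targets scale linearly with the number of wavelengths.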
Future Development
Next-generation CPO technology focuses on advancing several critical areas. Development efforts target waveguide spacing under 20 μm, increased waveguide channel density, and enhanced multi-wavelength compatibility. The technology roadmap includes multilayer connector/termination assembly schemes and improved manufacturing processes. These innovations aim to further reduce energy consumption while enhancing performance.
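
As a rough illustration of what the waveguide spacing target means for channel density, the Python sketch below counts how many waveguides fit along an edge at a given pitch. It rests on several assumptions that the source does not confirm: "spacing" is read as center-to-center pitch, the waveguides sit in a single layer, and the full 8 mm side of the PIC is available for them. The function name `waveguides_per_edge` is hypothetical.

```python
def waveguides_per_edge(edge_mm: float, pitch_um: float) -> int:
    """Count waveguides that fit side by side along an edge in a single layer."""
    if pitch_um <= 0:
        raise ValueError("pitch_um must be positive")
    # Convert the edge length to micrometers, then divide by the pitch.
    return int((edge_mm * 1000) // pitch_um)


TARGET_PITCH_UM = 20.0  # upper bound on spacing quoted in the text
PIC_EDGE_MM = 8.0       # assumption: shorter PIC side, fully usable

per_mm = waveguides_per_edge(1.0, TARGET_PITCH_UM)
per_edge = waveguides_per_edge(PIC_EDGE_MM, TARGET_PITCH_UM)

print(f"waveguides per mm at {TARGET_PITCH_UM:.0f} um pitch: {per_mm}")
print(f"waveguides along an {PIC_EDGE_MM:.0f} mm edge: {per_edge}")
```

Because the count is inversely proportional to pitch, any spacing below 20 μm raises these numbers, and a multilayer scheme such as the one on the roadmap would multiply them further.
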
The successful integration of optical and electronic components, backed by reliability demonstrated in rigorous testing, makes CPO a pivotal enabling technology for next-generation high-performance computing systems. Its significant improvements in bandwidth density and energy efficiency offer a scalable way to address both current data center challenges and anticipated future computational demands.
Reference
[1] J. Knickerbocker, J. B. Heroux, G. Bonilla, H. Hsu, N. Liu, A. P. Ramos, F. Arguin, Y. Tribodeau, B. Terjani, M. Schultz, R. K. Ganti, L. Chu, C. Marushima, Y. Taira, S. Kohara, A. Horibe, H. Mori, and H. Numata, "Next generation Co-Packaged Optics Technology to Train & Run Generative AI Models in Data Centers and other computing applications," Technical Report, IBM Research, 2024.