OSP Engineering
Addressing the deluge of bandwidth from AI/ML evolutions.
As the deadline for this article was fast approaching, I was tempted to ask Microsoft’s Copilot or ChatGPT to take a crack at it. While I didn’t accept their help this time, we are experimenting with AI assistance in multiple areas of our marketing work, including some early-phase writing. If you attended this year’s Mobile World Congress, OFC, or ECOC events, you know that AI was everywhere.
So, let’s go back to the subtitle of this article. What is a deluge? Does it really exist, and if so, how should we address it? In the Merriam-Webster dictionary, deluge is defined as “an overwhelming amount.” During the early days of COVID-19, research indicated that every part of the network—from residential broadband to metro, long-haul, subsea, and data center interconnect (DCI)—was growing more than 40% annually.
But fast-forward to today: my latest discussions with operators indicate that DCI capacity demand has accelerated to roughly 50% annually, while growth in other parts of the network has slowed to 20-30%. In effect, traffic demand is bifurcating, with DCI red hot and the rest of the network still growing, but at a slower pace. Any traffic that doubles in less than two years qualifies as a deluge in my book. So, yes, DCI traffic is a deluge.
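A quick check of that doubling-time claim, using the standard compound-growth formula:

$$t_{\text{double}} = \frac{\ln 2}{\ln(1 + r)} = \frac{\ln 2}{\ln 1.5} \approx 1.7 \text{ years at } r = 50\%$$

By comparison, 25% annual growth takes about 3.1 years to double, so the DCI curve clears the two-year bar while the rest of the network does not.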
But is DCI’s accelerated growth driven by AI? The short answer is yes, some of it is, but we are still in the very early days of AI/ML, and of generative AI in particular. Let me explain. Right now, the biggest impact of AI/ML is happening inside the data center rather than outside it.
Generative AI runs on high-performance GPU accelerators, not standard CPU-based servers. GPUs consume more power, use very large data sets and parameter counts, and work in parallel to accomplish their complex tasks, especially during training. This parallelism is key, as it requires coordination and the sharing of vast amounts of information. Low latency is critical: an intermediate result from one GPU often serves as an input to the calculations of others, so a late delivery stalls computation.
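To make the coordination point concrete, here is a minimal sketch of the gradient exchange used in data-parallel training, assuming PyTorch’s torch.distributed API (process-group setup omitted). The blocking collective is exactly why a single slow link can stall every GPU in the cluster:

```python
import torch
import torch.distributed as dist

def synchronize_gradients(model: torch.nn.Module) -> None:
    """Average gradients across all GPUs after the backward pass."""
    world_size = dist.get_world_size()
    for param in model.parameters():
        if param.grad is None:
            continue
        # all_reduce blocks until every rank has contributed its gradient,
        # so the training step finishes only as fast as the slowest hop.
        dist.all_reduce(param.grad, op=dist.ReduceOp.SUM)
        param.grad /= world_size
```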
Parallelism and the sharing of large data sets are why GPU clusters are interconnected with high-speed optics and optical fabrics. Meta recently announced two 24,000-GPU AI clusters. Avoiding GPU connectivity bottlenecks is why today’s short-distance 400G optical interconnects are rapidly giving way to 800G and 1.6T over the next several years.
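As a back-of-the-envelope illustration of why those line rates matter (the 10 GB payload is a hypothetical figure for illustration, not from any vendor), here is the time to move one large tensor shard across a single link:

```python
PAYLOAD_GB = 10  # hypothetical gradient/activation shard size

for label, gbps in [("400G", 400), ("800G", 800), ("1.6T", 1600)]:
    seconds = PAYLOAD_GB * 8 / gbps  # 10 GB = 80 gigabits on the wire
    print(f"{label}: {seconds * 1000:.0f} ms per transfer")

# 400G: 200 ms per transfer
# 800G: 100 ms per transfer
# 1.6T: 50 ms per transfer
```

Halving the transfer time at each step directly shortens the window in which thousands of GPUs sit idle waiting on data.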
So, while AI/ML’s effect on data center interconnect growth is modest compared to its effect inside the data center, it will increase with time. With more applications, more people taking advantage of AI/ML capabilities (think medical imaging analysis for disease detection), and generative AI creating new images and videos (consider collaboration with artists or marketing/branding), the north-south traffic to and from data centers will continue to grow.
And we know that data centers don’t exist in a vacuum. They need connectivity with other data centers, which are increasingly modular and distributed to reduce their impact on local real estate and the power grid, and to sit closer to end users for latency-sensitive applications. One estimate holds that 9% of all data center traffic is east-west, meaning DCI, or connectivity to other data centers. Thus, with more data centers coming online, many of them distributed, and more AI/ML traffic to and from data centers, AI/ML will help DCI sustain its hot growth rate for years to come.
So, how do we address increasing data center modularity and distribution while also supporting the DCI deluge that is already here and will be sustained in part by accelerating AI/ML utilization? The answer is fourfold: stackable compact modular platforms, innovations in coherent optical engines, increases in fiber spectrum, and the introduction of new transmission media like hollow-core fiber.
Right Size
Today’s compact modular optical platforms enable operators to start with a 1RU, 2RU, or 3RU chassis and stack them as needed, matching cost to capacity while minimizing complexity. The latest compact modular platforms are also designed to support mixing and matching of both optical line system and transponder (optical engine) sleds. This approach is key to minimizing costs in smaller DCI deployments, enabling line system and transponder functions to be combined into a single chassis instead of the multiple units required with a dedicated per-function design.
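As a toy illustration of matching cost to capacity (the slot counts and relative costs below are hypothetical, not vendor figures), a sizing script might brute-force the cheapest chassis stack for a required number of sleds:

```python
from itertools import product

# Hypothetical chassis options: (sled slots, relative cost)
CHASSIS = {"1RU": (2, 1.0), "2RU": (4, 1.7), "3RU": (6, 2.3)}

def cheapest_stack(sleds_needed: int, max_per_size: int = 4):
    """Brute-force the lowest-cost mix of chassis that covers the sled count."""
    best = None
    for counts in product(range(max_per_size + 1), repeat=len(CHASSIS)):
        slots = sum(n * CHASSIS[name][0] for n, name in zip(counts, CHASSIS))
        cost = sum(n * CHASSIS[name][1] for n, name in zip(counts, CHASSIS))
        if slots >= sleds_needed and (best is None or cost < best[0]):
            best = (cost, dict(zip(CHASSIS, counts)))
    return best

print(cheapest_stack(7))  # -> (3.3, {'1RU': 1, '2RU': 0, '3RU': 1})
```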
Speed Up
Leading coherent optical engines are evolving in two directions simultaneously: 1) smaller, lower-power pluggables that can reach 1,000 km or more and 2) embedded, sled-based optical engines with sophisticated transmission schemes that maximize capacity-reach and spectral efficiency.
A wide variety of 400G coherent pluggables are available today, including 400ZR, which supports fixed DCI applications up to 120 km. 400G ZR+ pluggables offer more advanced functionality, including increased programmability and better optical performance for metro-regional and some long-distance connectivity. 800G coherent pluggables are under development for delivery in early 2025, and this latest generation expands capacity-reach significantly. With such capabilities in small QSFP-DD packages, IP over DWDM (IPoDWDM) is being realized in metro DCI applications, with pluggables deployed directly into routers and switches.
This brings us to embedded optical engines. While today’s 800G embedded engines deliver enormous value for DCI applications, we are moving into the terabit era with the development of 1.2+ Tb/s engines that can transmit 800G up to 3,000 km. Thanks to their high capacity-reach, embedded optical engines are ideal for long-distance DCI, including connections across continents or oceans via subsea cables, and for any route where fiber is scarce and spectral efficiency matters most. For example, data center operators that lease fiber can use embedded optical engines to maximize transmission over a single fiber pair and avoid the incremental cost of leasing additional fibers.
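A rough sketch of that fiber-lease math (the spectral-efficiency and demand figures are hypothetical assumptions for illustration only):

```python
import math

def fiber_pairs_needed(demand_tbps: float,
                       spectrum_thz: float = 4.8,
                       spectral_eff_bps_hz: float = 6.0) -> int:
    """Fiber pairs required to carry a demand at a given spectral efficiency."""
    capacity_per_pair_tbps = spectrum_thz * spectral_eff_bps_hz
    return math.ceil(demand_tbps / capacity_per_pair_tbps)

# The better the engine's spectral efficiency, the fewer pairs to lease.
print(fiber_pairs_needed(100, spectral_eff_bps_hz=4.0))  # 6 pairs
print(fiber_pairs_needed(100, spectral_eff_bps_hz=8.0))  # 3 pairs
```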
More Spectrum
For more than 10 years, we have used the 4.8 THz extended C-band spectrum for DCI and other fiber optic applications. Occasionally, especially with internet content providers and DCI applications, the 4.8 THz L-band spectrum has been added for a combined 9.6 THz C+L delivery. But, with advancements in optical line system components like amplifiers and wavelength-selective switches, we can now cost-effectively increase the transmission spectrum from 4.8 THz to 6.1 THz in both the Super C- and Super L-bands. For a small incremental line system infrastructure cost, we can realize 27% incremental spectrum and transmission capacity per fiber pair. Super C and Super L transmission is a cost-effective way to get more out of existing fiber resources to keep up with DCI capacity demands.
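The arithmetic behind that 27% figure is straightforward:

$$\frac{6.1\ \text{THz}}{4.8\ \text{THz}} \approx 1.27$$

And running Super C plus Super L together yields 2 × 6.1 = 12.2 THz per fiber pair, versus 9.6 THz for conventional C+L.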
Change the Medium
Microsoft purchased Lumenisity, a hollow-core fiber innovator, in late 2022. A sometimes-overlooked fact is that light does not travel at its vacuum speed through a silica (glass) fiber. Because of the refractive index of the glass, an optical signal propagates at roughly 67% of the speed of light in a vacuum. Hollow-core fiber (HCF), which guides the signal through air or gas rather than silica, carries it roughly 47% faster, according to Lumenisity.
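Putting numbers to those percentages, the one-way propagation delay over a 1,000 km route works out to:

$$t_{\text{silica}} = \frac{1000\ \text{km}}{0.67 \times 3\times10^{5}\ \text{km/s}} \approx 5.0\ \text{ms}, \qquad t_{\text{HCF}} \approx \frac{5.0\ \text{ms}}{1.47} \approx 3.4\ \text{ms}$$

That is roughly 1.6 ms saved each way per 1,000 km, a meaningful margin for latency-sensitive applications.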
Faster transmission means lower latency, and we know lower latency is a desirable attribute for certain types of traffic, like AI/ML or high-speed financial trading. The knock on HCF has always been higher attenuation, or signal loss over distance, though this has improved significantly in recent years. With the increase in modular, distributed data centers, we look for hollow-core fiber to be introduced for DCI connections in metro areas in the coming years.
Bring It All Together
The commercialization of AI/ML technology, and in particular generative AI applications like ChatGPT, is having significant impacts inside the data center. In parallel, DCI traffic demands have accelerated. While modest today, AI/ML-related DCI traffic will continue to grow—and help buoy an already hot DCI market.
To accommodate the rapid connectivity growth between data centers, we will need to continue to innovate: with modular, stackable optical solutions; with pluggable and embedded optical engines that deliver more capacity in less power and footprint; with more spectrum on the fiber superhighway; and with new transmission media like hollow-core fiber.
Commercially available generative AI solutions like ChatGPT launched less than two years ago. We are just getting started. Hold on tight, and grab an optical transmission partner that’s laser-focused on what’s next.
Tim Doiron | Vice President, Solution Marketing, Infinera
Tim Doiron is Vice President, Solution Marketing at Infinera. For more information, visit www.infinera.com. Follow Tim on LinkedIn. Follow Infinera on LinkedIn, Facebook and X @Infinera.