Broadcom’s Jericho4 ASICs just opened the door to multi-datacenter AI training
(2025/08/06)
- Reference: 1754440339
- News link: https://www.theregister.co.uk/2025/08/06/broadcom_jericho_4/
- Source link:
Broadcom on Monday unveiled a new switch which could allow AI model developers to train models on GPUs spread across multiple datacenters up to 100 kilometers apart. The switch could help pave the way for an alternative to the massive facilities currently being built to power the AI boom, allowing companies to stitch together distant and less power-hungry datacenters.
Codenamed [1]Jericho4, the ASIC offers 51.2 Tb/s of aggregate bandwidth across its switch and fabric ports, according to Broadcom. But while the chip can serve double duty as a scale-out or scale-up network switch, Broadcom already offers far higher-radix, lower-latency options in its Tomahawk 6 and Tomahawk Ultra switch ASICs.
Instead, Broadcom has positioned the chip for datacenter-to-datacenter interconnect (DCI).
"If you're running a training cluster and you want to grow beyond the capacity of a single building, we're the only valid solution out there," Amir Sheffer, an associate product line manager at Broadcom told El Reg .
Each Jericho4 can be configured with up to eight of what Broadcom calls "hyper ports" – bundles of four 800GbE links that behave like one big 3.2 Tb/s port.
Compared to simply using ECMP link aggregation to bind a bunch of 800GbE ports together, Broadcom says its hyper ports can achieve 70 percent higher link utilization.
The silicon-slinger said users can scale Jericho4 into configurations of up to 36,000 hyper ports, which should be enough to connect two datacenters at a blistering 115.2 petabits per second.
That's enough bandwidth to connect 144,000 GPUs, each at 800Gbps, to an equal number in a neighboring datacenter without running into bottlenecks.
Historically, datacenter operators have employed some degree of oversubscription in their DCI deployments – 4:1 or 8:1, say – and that's likely to remain the case, Sheffer said.
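For a rough sense of how those numbers hang together – the 3.2 Tb/s hyper ports, the 115.2 Pb/s top-end fabric, and the 144,000-GPU claim – here's a back-of-envelope sketch in Python. The bandwidth figures are Broadcom's own, quoted above; the oversubscription ratios at the end are illustrative examples, not vendor guidance.

```python
# Back-of-envelope check of the bandwidth figures quoted above.

GBPS_PER_LINK = 800                 # one 800GbE link
LINKS_PER_HYPER_PORT = 4            # a "hyper port" bundles four 800GbE links
HYPER_PORT_TBPS = GBPS_PER_LINK * LINKS_PER_HYPER_PORT / 1_000   # 3.2 Tb/s

MAX_HYPER_PORTS = 36_000            # largest configuration Broadcom cites
fabric_pbps = MAX_HYPER_PORTS * HYPER_PORT_TBPS / 1_000          # 115.2 Pb/s

GPU_NIC_GBPS = 800
gpus_at_line_rate = fabric_pbps * 1_000_000 / GPU_NIC_GBPS       # 144,000

print(f"Hyper port:         {HYPER_PORT_TBPS:.1f} Tb/s")
print(f"36,000 hyper ports: {fabric_pbps:.1f} Pb/s")
print(f"GPUs at 800 Gb/s:   {gpus_at_line_rate:,.0f}")

# Illustrative oversubscription ratios (not vendor-specified): each ratio
# divides the per-GPU share of the inter-site pipe accordingly.
for ratio in (1, 4, 8):
    print(f"{ratio}:1 oversubscription -> ~{GPU_NIC_GBPS / ratio:.0f} Gb/s of DCI bandwidth per GPU")
```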
As appealing as it might sound to ease the power constraints of large-scale AI training by spreading workloads across multiple datacenters, bandwidth isn't the only factor. Latency also comes into play.
While Jericho4's deep HBM-backed buffers and congestion management tech can help with tail latency caused by packet loss, they can't change the fact that light only travels so quickly through glass fiber.
Over a 100-kilometer span, the round-trip latency works out to nearly a millisecond – and that's before you account for the latency added by transceivers and protocol overheads.
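That figure is easy to sanity-check: light in single-mode fiber travels at roughly two-thirds of its vacuum speed, or about five microseconds per kilometer. A quick sketch (the 1.47 refractive index is a typical value, not a Jericho4 specification):

```python
# Rough propagation-delay estimate for a 100 km fiber span.
# Assumes a typical single-mode fiber refractive index of ~1.47; transceiver,
# FEC, and protocol overheads are not modeled here.

C_VACUUM_KM_PER_S = 299_792                               # speed of light in vacuum
REFRACTIVE_INDEX = 1.47
speed_in_fiber = C_VACUUM_KM_PER_S / REFRACTIVE_INDEX     # ~204,000 km/s

span_km = 100
one_way_ms = span_km / speed_in_fiber * 1_000
round_trip_ms = 2 * one_way_ms

print(f"One-way:    {one_way_ms:.2f} ms")    # ~0.49 ms
print(f"Round trip: {round_trip_ms:.2f} ms") # ~0.98 ms
```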
However, progress is being made to mitigate the impact of latency on distributed training workloads. Back in late January, Google's DeepMind team [7]published a paper titled "Streaming DiLoCo with overlapping communication," in which the web giant detailed an approach to low-communication training.
The basic idea was to create distributed work groups that don't have to talk to one another all that often. By quantizing the data exchanged and strategically scheduling communication between the datacenters, the researchers suggest, many of the bandwidth and latency challenges can be overcome.
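The broad shape of that idea – long stretches of local training punctuated by infrequent, compressed synchronization – can be sketched as an outer/inner loop. To be clear, the toy below is not the Streaming DiLoCo algorithm from the paper; the function names, step counts, and 8-bit quantization scheme are all illustrative assumptions.

```python
# Toy sketch of low-communication distributed training: each "datacenter"
# runs many local SGD steps, then sites exchange quantized parameter deltas
# once per outer round. Illustrative only; not the paper's algorithm.
import numpy as np

def local_steps(params, shard, steps=500, lr=0.01):
    """Hypothetical inner loop: plain SGD on a toy quadratic loss."""
    target = shard.mean(axis=0)
    for _ in range(steps):
        params = params - lr * 2 * (params - target)
    return params

def quantize(delta, bits=8):
    """Crude uniform quantization to shrink inter-site traffic."""
    scale = float(np.abs(delta).max()) / (2 ** (bits - 1) - 1) or 1.0
    return np.round(delta / scale).astype(np.int8), scale

rng = np.random.default_rng(0)
shards = [rng.normal(1.0, 0.1, (1024, 16)), rng.normal(1.2, 0.1, (1024, 16))]
global_params = np.zeros(16)

for outer_round in range(10):                        # infrequent sync rounds
    deltas = []
    for shard in shards:                             # each site trains independently
        local = local_steps(global_params.copy(), shard)
        q, scale = quantize(local - global_params)   # ship compressed deltas only
        deltas.append(q.astype(np.float64) * scale)
    global_params = global_params + np.mean(deltas, axis=0)   # outer update

print(global_params[:4])   # drifts toward the mean of both shards (~1.1)
```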
Jericho4, which is currently available for large customers to sample so they can start designing appliances, arrives as hyperscalers and cloud providers break ground on massive multi-gigawatt datacenter campuses. These clusters are so large that in many cases they require new power plants to be built to support them. Meta, for instance, [12]contracted Entergy to construct three combined-cycle combustion turbine generators totaling 2.2 gigawatts of capacity to fuel its Richland Parish megacluster.
With Jericho4, Broadcom has presented an alternative. Rather than build one great big datacenter campus, AI outfits could build multiple smaller datacenters and pool their resources. ®
[1] https://investors.broadcom.com/news-releases/news-release-details/broadcom-ships-jericho4-enabling-distributed-ai-computing-across
[7] https://www.theregister.com/2025/02/11/deepmind_distributed_model_training_research/
[12] https://www.theregister.com/2024/12/05/meta_largestever_datacenter/