- Session
- 14:10 - 14:10
- Duration: 29 mins
- Publication date: 11 Nov 2025
- Location: Turing Lecture Theatre, IET London: Savoy Place, London, United Kingdom
- Part of event REACH 2025
About the session
Bill McColl, Director, Computing Systems Lab, Huawei Research Center, Zurich, Switzerland
The next major step in the evolution of computing will be the construction of modular superpods and massive superclusters, the largest of which will have millions of compute nodes. The dominant workload on these systems will be AI agents: commercial, industrial, research, and personal agents. A large datacenter-scale supercluster will need to run billions of complex agents concurrently with predictable ultra-high performance, where each agent may, at any point in time, be learning, reasoning, planning, decision-making, action-taking, or collaborating. Moreover, many agents may be very long-lived, or persistent and running continuously, often in the background. Agents may also migrate across superclusters over time. No parallel architecture has ever been constructed to handle such a scale or such a workload. In this talk, I will describe a new approach to the design of superpods and superclusters to address this challenge. The main focus will be on how we can build the new ultra-fast interconnects and data exchange engines required for communication and coordination between billions of complex agents.