Rethinking Precision: The Design Space of Block Floating-Point Formats for the LLM Era

Session
Monday, 10 November 2025
13:25
Duration: 27 mins
Publication date: 11 Nov 2025
Location: Turing Lecture Theatre, IET London: Savoy Place, London, United Kingdom
Part of event REACH 2025

About the session

Partha Maji, Senior Director – AI Hardware Acceleration, Microsoft, UK

As large language models (LLMs) scale to trillions of parameters, traditional floating-point formats are increasingly constrained by memory bandwidth, energy, and storage limits. Emerging block floating-point (BFP) schemes offer a promising alternative: they combine fixed-point efficiency with floating-point adaptability through shared exponents and local scaling. This talk explores the design space of BFP formats for both inference and training, focusing on how exponent sharing, mantissa precision, and block granularity interact with accuracy, stability, and hardware cost. Drawing on recent advances such as MX and NVFP variants, we will examine practical design choices (calibration, accumulation, rounding, and mixed-precision fusion) that enable 3–6× compression and improved accelerator utilization without significant accuracy loss. The discussion bridges algorithm and hardware perspectives, outlining co-design principles that make BFP numerics deployable in real systems. Finally, we highlight open research challenges in dynamic range handling, attention sensitivity, and unified training-inference numerics, inviting the community to rethink precision as a continuum, not a constant.
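
To make the idea of a shared exponent concrete, the following is a minimal illustrative sketch, not taken from the talk and not an implementation of the MX or NVFP formats. It assumes a power-of-two scale shared by the whole block, signed integer mantissas, and round-to-nearest with saturation; the names `bfp_quantize`, `bfp_dequantize`, and `mantissa_bits` are hypothetical.

```python
import math


def bfp_quantize(block, mantissa_bits=4):
    """Quantize a block of floats to one shared exponent plus integer mantissas.

    Assumption: the scale is a power of two, chosen so that the largest
    magnitude in the block fits into a signed integer of `mantissa_bits` bits.
    """
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        # An all-zero block needs no scaling.
        return 0, [0] * len(block)

    # Largest mantissa magnitude representable in a signed integer of this width.
    q_max = 2 ** (mantissa_bits - 1) - 1

    # Smallest exponent e such that max_abs / 2**e <= q_max.
    shared_exp = math.ceil(math.log2(max_abs / q_max))
    scale = 2.0 ** shared_exp

    # Round each element to the nearest mantissa and saturate to the valid range.
    mantissas = [max(-q_max, min(q_max, round(x / scale))) for x in block]
    return shared_exp, mantissas


def bfp_dequantize(shared_exp, mantissas):
    """Reconstruct approximate floats from the shared exponent and mantissas."""
    scale = 2.0 ** shared_exp
    return [m * scale for m in mantissas]
```

Under these assumptions, quantizing the block `[0.5, -1.0, 0.25, 3.0]` with 4-bit mantissas selects a shared exponent of -1 (a scale of 0.5), and the smallest element, 0.25, rounds to zero because the scale is set by the largest element. This is the interaction between block granularity and accuracy that the abstract refers to: smaller blocks or wider mantissas reduce such losses at the cost of more storage per value.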

Keywords: IET conference

REACH 2025

Reach Emerging Architectures in Computing Horizons

Savoy Place London

Sustainable Computer System Design

emerging AI hardware

memory bandwidth problem

Channels

Communications

IT

Lectures