Post by Nubank

As Large Language Models scale, the industry faces a critical challenge: the massive memory overhead required for training. Tomorrow, Nubankers will be at ICLR to present a technical contribution that addresses this head-on: “On quantizing the state of the Muon optimizer.”

The study introduces the 8-bit Muon, a solution using blockwise quantization that achieves a 62% reduction in optimizer state footprint while maintaining full performance parity, even at large parameter scales. The research reveals that Muon’s update mechanism is uniquely compatible with simple linear quantization, bypassing the complex scaling usually required for optimizers like AdamW.

For those attending the conference in Rio, the authors Rafael Celente and Hiroto Udagawa will be at the poster session in Room 202C this Monday, starting at 9:00 AM. It is a great opportunity to discuss how Nubank applies cutting-edge research to redefine financial services at scale.

To learn more about the research, access the full paper here: https://lnkd.in/diqbU9Vw
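
For readers new to the idea, here is a minimal sketch of blockwise linear 8-bit quantization in general. It is not the paper's implementation: the function names, the block size of 4, and the symmetric absmax scaling are all assumptions chosen for illustration. Each block of values gets its own scale, so one large value only affects the precision of its own block.

```python
from typing import List, Tuple


def quantize_blockwise(values: List[float], block_size: int = 4) -> Tuple[List[int], List[float]]:
    """Illustrative linear (absmax) 8-bit quantization, one scale per block."""
    codes: List[int] = []
    scales: List[float] = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        absmax = max(abs(v) for v in block)
        # Map the largest magnitude in the block to 127; avoid dividing by zero.
        scale = absmax / 127.0 if absmax > 0 else 1.0
        scales.append(scale)
        codes.extend(round(v / scale) for v in block)
    return codes, scales


def dequantize_blockwise(codes: List[int], scales: List[float], block_size: int = 4) -> List[float]:
    """Recover approximate floats by multiplying each code by its block's scale."""
    return [c * scales[i // block_size] for i, c in enumerate(codes)]


# Hypothetical optimizer-state values; the second block has a larger range.
momentum = [0.5, -1.0, 0.25, 0.0, 8.0, -2.0, 4.0, 1.0]
codes, scales = quantize_blockwise(momentum, block_size=4)
restored = dequantize_blockwise(codes, scales, block_size=4)
max_error = max(abs(a - b) for a, b in zip(momentum, restored))
print(codes, scales, max_error)
```

Each stored value fits in one signed byte plus a small per-block scale, and the rounding error for any element is at most half of its block's scale.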