Post by Tuan Tran Anh
Researcher at IML - DFKI
Excited to share that our new work has been accepted to NeurIPS 2025!

This paper continues my research on making multi-modal models more data- and compute-efficient. We show that leading 3D point-cloud transformers (PointTransformer-V3, Sonata, SpatialLM) are substantially over-tokenized. Our globally informed, graph-based 3D token-merging method adaptively removes 70-95% of tokens with minimal accuracy loss. Performance can be fully recovered with only a few epochs of fine-tuning, or even straight off the shelf.

This challenges the assumption that more tokens = better performance and opens the door to lighter, faster, and more scalable 3D foundation models.

Paper & code: https://lnkd.in/eHBYJ9zm

Huge thanks for the immense efforts of Hoai-Chau Tran, Duy Ho Minh Nguyen, Paul Swoboda, and Daniel Sonntag, and to our collaborators Michael Barz, Khoa Doan, Roger Wattenhofer, Vien Ngo, and Mathias Niepert. Let's keep pushing forward!

If you're around in San Diego this week, let's meet up for coffee!