Post by Tuan Tran Anh

Researcher at IML - DFKI

🎉 Excited to share that our new work has been accepted to NeurIPS 2025! This paper continues my research on making multi-modal models more data- and compute-efficient.

We show that leading 3D point-cloud transformers (PointTransformer-V3, Sonata, SpatialLM) are substantially over-tokenized. Our globally informed, graph-based 3D token-merging method adaptively removes 70–95% of tokens with minimal accuracy loss, and can fully recover performance with only a few epochs of fine-tuning, or even off the shelf with no fine-tuning at all. ⚡📉

This challenges the assumption that more tokens = better performance and opens the door to lighter, faster, and more scalable 3D foundation models. 🚀

🔗 Paper & code: https://lnkd.in/eHBYJ9zm

Huge thanks for the immense efforts of Hoai-Chau Tran, Duy Ho Minh Nguyen, Paul Swoboda, and Daniel Sonntag, and to our collaborators Michael Barz, Khoa Doan, Roger Wattenhofer, Vien Ngo, and Mathias Niepert. Let's keep pushing forward! 💪✨

If you're in San Diego this week, let's meet up for coffee! ☕
