Post by National AI Research Lab (NAIRL)

How does an AI decide what a scene should sound like? Not just whether a sound matches the picture, but why a heavy object landing fast should sound different from a light one landing slow?

Researchers from Professor Tae-Hyun Oh's team at KAIST, affiliated with the National AI Research Lab (NAIRL), have been selected for an Oral presentation at #CVPR2026, an honor reserved for papers ranking within the top 1% worldwide, for work that tackles exactly this question. The study, conducted jointly with Pohang University of Science and Technology and SonyAI, introduces PAVAS (Physics-Aware Video-to-Audio Synthesis), an AI that generates realistic sound effects by reasoning about the physical properties of objects in a video.

When we watch a giant dinosaur step forward on screen, we instinctively expect a low, heavy sound, because our brains anticipate sound by combining an object's shape, size, weight, and movement speed. Earlier video-to-audio models generated sound based mainly on what was visible on screen, often failing to reflect differences in mass and speed. PAVAS instead estimates physical quantities such as mass and velocity from a video and feeds them into the sound generation model, so that volume and timbre change according to collision intensity and object weight, all in about ten seconds.

The team expects the technology to extend across film and game sound effects, augmented and virtual reality, metaverse content, robotics simulation, and even the detection of manipulated content such as deepfakes. As Professor Oh notes, while generative AI has advanced largely by scaling data and models, this work is meaningful in designing AI to reflect physical quantities and causality, pointing toward next-generation multimodal AI that understands text, video, and audio together.

We extend our appreciation to Professor Oh and the research team for their contribution to this important area of AI research.

Watch the broadcast coverage (KBS): https://naver.me/FfssJib1
