Post by Yuzhe Yang

AI Prof @ UCLA | Scientist @ Google | PhD @ MIT

📈 Can LLMs really reason over health time series? Introducing HEARTS ❤️, the first living benchmark for health time-series reasoning.

Most current evaluations of health time series are still narrow in scope. With HEARTS, we move beyond that and study how modern LLMs handle real physiological data at scale.

We built a large-scale benchmark with
 • 🧪 20K+ test samples
 • 🧩 110 tasks
 • 🏥 12 health domains (metabolism, motion, cardiac, sleep, audio, ...)
 • 📡 20 signal modalities (ECG, PPG, EEG, IMU, EMG, CGM, ...)

📊 It offers the broadest coverage to date of
 • sequence lengths (up to 1M+ steps),
 • sampling frequencies (up to 48 kHz),
 • time spans (from seconds to years).

🚀 Rather than focusing on a narrow slice of prediction, HEARTS covers four levels of reasoning in one unified benchmark:
🧠 Perception
🔍 Inference
✍️ Generation
⚙️ Deduction

Across 14 state-of-the-art LLMs 🤖, we find that strong general capability does not yet translate into strong health time-series reasoning. Many models still struggle with long-range temporal structure, high-frequency signals, and tasks that require more than simple pattern matching or heuristic shortcuts.

HEARTS is designed as a living and evolving community benchmark. We hope it will continue to grow with community input on new datasets, tasks, and models, and help push toward AI that can better understand and reason over health time series in the real world!
👇
📄 Paper: https://lnkd.in/gf5-UBeA
🌐 Website: https://lnkd.in/gNEnwjXB
🕵️ Code: https://lnkd.in/gzcfmYCZ
🤗 Dataset: https://lnkd.in/g7Ea6zvj
🏆 Leaderboard: https://lnkd.in/gc5y_8EX

Great work led by my students Sirui Li, Shuhan Xiao, Mihir Joshi and collaborators Ahmed Abdelhadi Metwally, Daniel McDuff, and Wei Wang! We are also grateful for generous compute support from Google, OpenAI, Anthropic, and xAI.

UCLA UCLA Computer Science Computational Medicine Department UCLA Henry Samueli School of Engineering and Applied Science

#AI #HealthAI #LLM #TimeSeries #MultimodalAI #FoundationModels
