Post by Robotrove

๐—ง๐—ต๐—ฒ ๐—ฟ๐—ผ๐—ฏ๐—ผ๐˜๐—ถ๐—ฐ๐˜€ ๐—ฑ๐—ฎ๐˜๐—ฎ ๐—ฏ๐—ผ๐˜๐˜๐—น๐—ฒ๐—ป๐—ฒ๐—ฐ๐—ธ ๐—ท๐˜‚๐˜€๐˜ ๐—ด๐—ผ๐˜ ๐˜‚๐—ป๐—น๐—ผ๐—ฐ๐—ธ๐—ฒ๐—ฑ. 14.6 years of internet video. 147 million action segments. Zero annotation. Rice University's @RobotPI Lab just dropped EgoInfinity โ€” not another static dataset, but a data engine that turns arbitrary internet videos into executable robot training data. Here's what it does: โ†’ Takes any static-camera RGB video of human manipulation โ†’ Recovers metric 3D hand trajectories + 6-DoF object poses + contact states โ†’ Retargets the motion onto any robot morphology No mocap. No wearables. No manual labels. No human in the loop. The key insight: it's not a naive pipeline. Cross-module metric calibration and interaction-aware refinement fix the drift and contact inconsistency you'd get from just chaining vision models together. Already validated on 4 robots: Unitree G1, Robonaut2, Dual-Franka, XLeRobot โ€” with real-robot demos on Franka FR3 and LEAP dexterous hand (grasping, cutting, wiping, pouring). The addressable corpus is bounded only by Action100M. The engine itself is corpus-agnostic. Scale is literally infinite. This is how you go from hundreds of lab demos to millions of real-world manipulation trajectories. Huge thanks to the team behind it Kejia Ren, @Yiting Chen, Howard Qian, Podshara Chanrungmaneekul, and Kaiyu Hang, with @Andrew Morgan from RAI Institute. ย Project: https://lnkd.in/eDCtAtZU ย Paper: https://lnkd.in/eRVG3xiU ย Code: https://lnkd.in/eEwsatwx Follow the Robotrove page for more drops that push physical AI forward.
