Post by Meta
12,063,290 followers
Segmenting an image with AI used to require guiding the model one object at a time. However, that approach breaks down as use cases grow more complex. SAM 3, was built to handle that shift — identifying and tracking multiple objects across images and video from a single prompt while running in near real time. This concept-based approach uses semantic understanding to unlock new real-world applications in video editing and analysis at scale. Getting there meant iterating across data, architecture and training, and making sure those capabilities hold up together under real-world constraints. Meta Research Engineer Chay Ryali breaks down his approach to building SAM 3 for semantic understanding 👇 #LifeAtMeta #MetaCareers