Meta’s V-JEPA 2 Model Trains AI To Understand Its Surroundings

On Wednesday, Meta introduced its latest AI model, V-JEPA 2 — a "world model" aimed at enabling AI agents to better interpret and navigate their surroundings.
Image Credits: Pixabay

V-JEPA 2 extends the V-JEPA model that Meta released last year and was trained on over 1 million hours of video. That training is meant to help robots and other AI systems operate in the physical world by letting them understand and predict how forces like gravity will affect what happens next.

Young children and animals develop this kind of intuition naturally as their brains mature. When playing fetch, for instance, a dog will (ideally) grasp that a ball bouncing on the ground will spring back up, and that it should run toward where it expects the ball to land rather than toward where the ball is right now.

Teaching AI to Understand and Act in the Physical World

Meta gives the example of a robot that sees, from a first-person perspective, that it is holding a plate and a spatula while approaching a stove with cooked eggs on it. The AI can then infer that a likely next step is to use the spatula to move the eggs onto the plate.

Meta claims that V-JEPA 2 is 30 times faster than Nvidia’s Cosmos model, which also focuses on improving physical-world intelligence. However, it’s possible that Meta is using different evaluation criteria than Nvidia to measure performance.

“We believe that world models will mark the beginning of a new era in robotics, allowing AI agents to assist with everyday chores and physical tasks in the real world—without requiring massive amounts of robotic training data,” said Meta’s chief AI scientist Yann LeCun in a video.


Read the original article on: TechCrunch
