BEHAVIOR is a human-centered simulation benchmark to evaluate embodied AI solutions
Embodied artificial intelligence (EAI) is advancing. But where are we now? We propose to test EAI agents with the physical challenges humans need to solve in their everyday lives: household activities such as doing laundry, picking up toys, setting the table, or cleaning floors. BEHAVIOR is a simulation benchmark in which EAI agents must plan and execute navigation and manipulation strategies based on sensor information to complete up to 1,000 household activities. BEHAVIOR tests the ability of agents to perceive the environment, plan, and execute complex long-horizon activities that involve multiple objects, rooms, and state changes, all with the reproducibility, safety, and observability offered by a realistic physics simulation.
The broader goal of BEHAVIOR is to fuel the development of general, effective EAI that brings major benefits to people’s daily lives – human-centered AI that serves human needs, goals, and values. BEHAVIOR achieves this by selecting activities from human time-use surveys and by conducting large-scale preference surveys that ask people: “What everyday activities do you want robots to do for you?” Furthermore, these activities are defined following the principles of participatory design: a team of crowdworkers and researchers works together to provide the knowledge and definitions for BEHAVIOR activities.
BEHAVIOR Benchmarks
BEHAVIOR-1K
BEHAVIOR-100