BEHAVIOR is a human-centered simulation benchmark to evaluate embodied AI solutions

Embodied artificial intelligence (EAI) is advancing rapidly, but how far along are we? We propose to test EAI agents on the physical challenges humans solve in their everyday lives: household activities such as doing laundry, picking up toys, setting the table, or cleaning floors. BEHAVIOR is a simulation benchmark in which EAI agents must plan and execute navigation and manipulation strategies from sensor information to fulfill up to 1,000 household activities. BEHAVIOR tests an agent's ability to perceive its environment, plan, and execute complex, long-horizon activities involving multiple objects, rooms, and state changes, all with the reproducibility, safety, and observability offered by realistic physics simulation.

The broader goal of BEHAVIOR is to fuel the development of general, effective EAI that brings major benefits to people’s daily lives: human-centered AI that serves human needs, goals, and values. BEHAVIOR pursues this goal by selecting activities from human time-use surveys and by conducting large-scale preference surveys that ask people: “What everyday activities do you want robots to do for you?” Furthermore, the activities are defined following the principles of participatory design: crowdworkers and researchers work together to provide the knowledge and definitions behind each BEHAVIOR activity.

BEHAVIOR Benchmarks



BEHAVIOR-1K

  • 1,000 activities from a large-scale human preference survey
  • 8 scene types, 50 fully interactive scenes
  • 1,900+ object types, 9,000+ object models
  • Based on OmniGibson, powered by NVIDIA's Omniverse (see the sketch after this list)
  • Supports flexible materials and deformable bodies
  • Supports realistic fluids and thermal effects
  • Requires an NVIDIA GeForce RTX graphics card
  • Explore BEHAVIOR-1K
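
As a rough illustration of how OmniGibson is used, the sketch below shows one plausible way to load a BEHAVIOR-1K activity and step the simulation. It is a minimal sketch, not an excerpt from the benchmark: the scene model, robot type, activity name, and the step return signature are assumptions based on OmniGibson's documented config-driven pattern, so check the current OmniGibson documentation before relying on any of them.

    # Minimal sketch: loading a BEHAVIOR-1K activity in OmniGibson.
    # Scene model, robot, and activity name below are illustrative choices,
    # not prescribed values.
    import omnigibson as og

    cfg = {
        "scene": {
            "type": "InteractiveTraversableScene",
            "scene_model": "Rs_int",          # one of the interactive scenes
        },
        "robots": [
            {
                "type": "Fetch",              # example robot embodiment
                "obs_modalities": ["rgb", "depth"],
            }
        ],
        "task": {
            "type": "BehaviorTask",
            "activity_name": "picking_up_toys",   # illustrative activity name
            "online_object_sampling": True,
        },
    }

    env = og.Environment(configs=cfg)
    env.reset()
    for _ in range(100):
        action = env.action_space.sample()    # random placeholder policy
        # Gymnasium-style return; older OmniGibson versions return
        # (obs, reward, done, info) instead.
        obs, reward, terminated, truncated, info = env.step(action)
    env.close()

The config-driven pattern is the point of the sketch: scenes, robots, and tasks are selected declaratively rather than constructed by hand, which is how the 50 scenes and 1,000 activities stay interchangeable.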


BEHAVIOR-100

  • 100 activities from the American Time Use Survey
  • Residential house scenes: 15 fully interactive scenes
  • 390+ object types, 1,200+ object models
  • Based on iGibson 2.0 and PyBullet (see the sketch after this list)
  • Human demonstration dataset with 500 VR demos
  • Activity lengths of ~300 to 20,000 simulation steps
  • Explore BEHAVIOR-100
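
For comparison, here is a similarly hedged sketch of stepping a BEHAVIOR-100 activity through iGibson 2.0's Gym-style interface. The config file name is hypothetical: in iGibson, the scene, robot, and task (including BEHAVIOR activities) are specified in a YAML config rather than in code, and BEHAVIOR-100 may provide its own wrapper classes on top of the general environment shown here.

    # Minimal sketch: stepping a BEHAVIOR-100 activity in iGibson 2.0.
    # "behavior_activity.yaml" is a hypothetical config path; the activity
    # and scene are selected through such a YAML file.
    from igibson.envs.igibson_env import iGibsonEnv

    env = iGibsonEnv(
        config_file="behavior_activity.yaml",  # hypothetical config file
        mode="headless",                       # no on-screen rendering
    )
    obs = env.reset()
    for _ in range(100):
        action = env.action_space.sample()     # random placeholder policy
        obs, reward, done, info = env.step(action)
        if done:
            obs = env.reset()
    env.close()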

The BEHAVIOR-1K Team


    Chengshu (Eric) Li
    Ruohan Zhang
    Josiah Wong
    Cem Gokmen
    Sanjana Srivastava
    Hang Yin
    Wensi Ai
    Sujay Garlanka
    Benjamin Martinez
    Roberto Martín-Martín
    Silvio Savarese
    Hyowon Gweon
    Karen Liu
    Jiajun Wu
    Li Fei-Fei

Alumni


    Chen Wang
    Gabrael Levine
    Michael Lingelbach
    Ayano Hiranaka
    Minjune Hwang
    Jiankai Sun
    Mona Anvari
    Arman Aydin
    Emily Jin
    Manasi Sharma
    Dhruva Bansal
    Samuel Hunter
    Kyu-Young Kim
    Alan Lou
    Caleb Matthews
    Ivan Villa-Renteria
    Jerry Tang
    Claire Tang
    Fei Xia
    Kent Vainio
    Zheng Lian
    Shyamal Buch
    Yunzhu Li