Habitat Rearrangement Challenge 2022

Overview

For NeurIPS 2022, we are hosting the Object Rearrangement challenge in the Habitat simulator [1, 2]. Object rearrangement focuses on mobile manipulation, low-level control, and task planning.

For details on how to participate, submit, and train agents, refer to the github.com/facebookresearch/habitat-challenge/tree/rearrangement-challenge-2022 repository.

In the object rearrangement task, a Fetch robot is randomly spawned in an unknown environment and asked to move one object from its initial position to a desired position, picking it from and placing it on receptacles (counter, sink, sofa, table) and opening or closing containers (drawers, fridges) as necessary. The task is communicated to the robot using the GeometricGoal specification, which provides the initial 3D center-of-mass position of the target object along with its desired 3D center-of-mass position. An episode is successful if the target object is within 15cm of its desired position (without considering orientation).

The Fetch robot is equipped with an egocentric 256x256 90-degree FoV RGBD camera on the robot head. The agent also has access to idealized base-egomotion giving the relative displacement and angle of the base since the start of the episode. Additionally, the robot has proprioceptive joint sensing providing access to the current robot joint angles.
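As a rough illustration of what "relative displacement and angle of the base since the start of the episode" can mean, the sketch below expresses a current planar base pose in the frame of the episode-start pose. The `(x, y, heading)` pose layout, the choice of a flat 2D plane, and the function name are assumptions made for this example and are not taken from the Habitat sensor code.

```python
import math
from typing import Tuple

# Hypothetical pose layout: (x, y, heading in radians) on a flat plane.
Pose2D = Tuple[float, float, float]


def relative_base_pose(start: Pose2D, current: Pose2D) -> Tuple[float, float, float]:
    """Express the current base pose in the frame of the episode-start pose."""
    x0, y0, th0 = start
    x1, y1, th1 = current
    dx, dy = x1 - x0, y1 - y0
    # Rotate the world-frame displacement by -th0 to land in the start frame.
    cos0, sin0 = math.cos(th0), math.sin(th0)
    forward = cos0 * dx + sin0 * dy
    left = -sin0 * dx + cos0 * dy
    # Wrap the heading change into [-pi, pi].
    dtheta = math.atan2(math.sin(th1 - th0), math.cos(th1 - th0))
    return forward, left, dtheta
```

For example, a robot that starts facing along +y and drives one meter in that direction while turning a quarter turn to the left would report roughly one meter of forward displacement, no lateral displacement, and a heading change of pi/2.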

There are two tracks in the Habitat Rearrangement Challenge.

Rearrange-Easy: The agent must rearrange one object. All containers (such as the fridge, cabinets, and drawers) start open, so the agent never needs to open a container to access the object or the goal. The task plan in rearrange-easy is static and always follows the same sequence: navigate to the object, pick the object, navigate to the goal, and place the object at the goal. The maximum episode length is 1500 time steps.
Rearrange: The agent must rearrange one object, but containers may start closed or open. Since the object may start in a closed receptacle, the agent may need to perform intermediate actions to access it. For example, an apple may start in a closed fridge and have a goal position on the table. To rearrange the apple, the agent first needs to open the fridge before picking the apple. The agent is not told whether these intermediate open actions are needed; this must be inferred from the egocentric observations and the goal specification. The maximum episode length is 5000 time steps.
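The static plan of the rearrange-easy track and the per-track step limits can be written down as plain data, as in the minimal sketch below. The skill names, the dictionary keys, and the `current_skill` helper are hypothetical labels chosen for illustration; they are not identifiers from the challenge code.

```python
from typing import Optional, Sequence

# Fixed skill order described for the rearrange-easy track (names are illustrative).
EASY_PLAN = ("navigate_to_object", "pick", "navigate_to_goal", "place")

# Maximum episode length, in time steps, for each track.
MAX_EPISODE_STEPS = {"rearrange-easy": 1500, "rearrange": 5000}


def current_skill(plan: Sequence[str], num_completed: int) -> Optional[str]:
    """Return the skill to run next, or None once the plan is finished."""
    if 0 <= num_completed < len(plan):
        return plan[num_completed]
    return None
```

In the rearrange track no such fixed list applies, since the agent has to decide from its observations whether an open action must come before the pick.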

Dataset

For both tracks, there are the following dataset splits. All datasets use scenes from the ReplicaCAD dataset [2]. Objects are from YCB [3], and the same object types are present in all dataset splits (there are no new objects in any of the splits).

train: 50k episodes specifying rearrangement problems across 63 scenes. Robot positions are procedurally generated in these episodes, giving an infinite number of training episodes.
minival: 20 episodes for quick evaluation to ensure consistency between your local setup, Docker evaluation, and EvalAI.
val: 1k episodes in 21 different scenes from ReplicaCAD.
test-standard: 1k episodes from 21 unseen scenes from ReplicaCAD for the public EvalAI leaderboard. There is no public access to this dataset.
test-challenge: 1k episodes on the same 21 scenes as test-standard. This dataset will be used to announce the competition winners.

Agent

We first describe the Fetch robot’s sensors and the information provided from these sensors as inputs to your policy in the challenge. Next, we describe the action space of the robot and the values your policy should output.

Agent Observations: The Fetch robot is equipped with an RGBD camera on the head, joint proprioceptive sensing, and egomotion. The task is specified as the starting position of the object and the goal position relative to the robot start in the scene. From these sensors, the task provides the following inputs:

256x256 90-degree FoV RGBD camera on the Fetch Robot head.
The starting position of the object to rearrange relative to the robot’s end-effector in cartesian coordinates.
The goal position relative to the robot’s end-effector in cartesian coordinates.
The starting position of the object to rearrange relative to the robot’s base in polar coordinates.
The goal position relative to the robot’s base in polar coordinates.
The joint angles in radians of the seven joints on the Fetch arm.
A binary indicator if the robot is holding an object (1 when holding an object, 0 otherwise). This indicates if the robot is holding any object, not just the target object.
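To make the polar-coordinate inputs concrete, the sketch below converts a planar offset in the robot base frame into a (distance, heading) pair. The axis convention (x forward, y to the left, angle measured counter-clockwise from the forward axis) and the function name are assumptions for illustration; the actual sensor convention should be checked in the challenge repository.

```python
import math
from typing import Tuple


def to_polar(x_forward: float, y_left: float) -> Tuple[float, float]:
    """Convert a planar offset in the base frame to (distance, heading in radians)."""
    distance = math.hypot(x_forward, y_left)
    heading = math.atan2(y_left, x_forward)
    return distance, heading


# A target one meter ahead and one meter to the left of the base.
rho, phi = to_polar(1.0, 1.0)
```

Under this assumed convention, a target directly ahead has a heading of zero and a target to the right has a negative heading.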

Outputs: The robot can control its base and arm to interact with the world. The robot grasps objects through a suction gripper. If the tip of the suction gripper is in contact with an object, and the suction is engaged, the object will stick to the end of the gripper. The suction is capable of grasping any object in the scene and holding onto the handles of receptacles for opening or closing them. Finally, the robot also outputs a binary episode termination signal.

The relative displacement, in radians, of the joint position targets for the PD controller. The robot controls all 7 DoF of the arm.
An indicator of whether the robot should engage the suction on the arm. A value greater than or equal to 0 will engage the suction and attempt to grasp an object. A value less than 0 will disengage the suction.
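The outputs listed above can be bundled as in the following sketch, which only covers the arm displacement, the suction indicator, and the termination signal. The class name, field names, and helper function are hypothetical and do not reflect the action keys used by the challenge code; only the seven-joint arm and the sign convention for the suction come from the description above.

```python
from dataclasses import dataclass
from typing import Sequence

NUM_ARM_JOINTS = 7  # the Fetch arm has seven controllable joints


@dataclass
class RearrangeAction:
    arm_delta: Sequence[float]  # relative joint-target displacement, in radians
    grip: float                 # >= 0 engages the suction, < 0 disengages it
    stop: bool = False          # binary episode-termination signal

    def __post_init__(self) -> None:
        # Reject actions that do not provide one value per arm joint.
        if len(self.arm_delta) != NUM_ARM_JOINTS:
            raise ValueError(f"expected {NUM_ARM_JOINTS} arm values, got {len(self.arm_delta)}")


def suction_engaged(grip: float) -> bool:
    """Decode the suction indicator: non-negative values engage the suction."""
    return grip >= 0.0
```

A policy that outputs a continuous grip value can therefore treat zero as the decision boundary, with zero itself counting as "engage".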

Evaluation

The primary evaluation criterion is Success. We also include additional metrics to diagnose agent efficiency and partial progress.

Success: Whether the target object is placed within 15cm of its goal position and the agent called the stop action. This is the primary metric used to judge submissions.
Rearrangement Progress: The Euclidean distance between the object and goal position at the end of the episode, expressed in meters.
Time Taken: The amount of time (in seconds) needed to solve the episode.
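The Success and Rearrangement Progress definitions can be sketched as below. This is an illustrative reading of the metric descriptions, not the evaluation code itself: the function names are made up here, and treating a distance of exactly 15cm as a success is an assumption, since the text only says "within 15cm".

```python
import math
from typing import Sequence

SUCCESS_THRESHOLD_M = 0.15  # 15cm, expressed in meters


def object_to_goal_distance(obj_pos: Sequence[float], goal_pos: Sequence[float]) -> float:
    """Euclidean distance in meters between the object and its goal position."""
    return math.dist(obj_pos, goal_pos)


def is_success(obj_pos: Sequence[float], goal_pos: Sequence[float], called_stop: bool) -> bool:
    """Success needs both the stop action and the object close enough to the goal."""
    return called_stop and object_to_goal_distance(obj_pos, goal_pos) <= SUCCESS_THRESHOLD_M
```

Note that an agent that places the object correctly but never calls stop does not count as successful under this reading.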

The episode will terminate with failure if the agent accumulates more than 100kN of force throughout the episode or 10kN of instantaneous force.
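A minimal sketch of the two force limits follows. It assumes that a single contact-force magnitude in newtons is available at every step and that the accumulated force is a plain running sum; how the simulator actually measures and accumulates force is not specified here. The class and constant names are hypothetical.

```python
ACCUM_FORCE_LIMIT_N = 100_000.0    # 100kN accumulated over the episode
INSTANT_FORCE_LIMIT_N = 10_000.0   # 10kN in a single step


class ForceMonitor:
    """Tracks contact force and reports when an episode should fail."""

    def __init__(self) -> None:
        self.accumulated_n = 0.0

    def update(self, step_force_n: float) -> bool:
        """Add this step's force; return True if either limit is exceeded."""
        self.accumulated_n += step_force_n
        return (
            step_force_n > INSTANT_FORCE_LIMIT_N
            or self.accumulated_n > ACCUM_FORCE_LIMIT_N
        )
```

A tracker like this can be useful during training to penalize collisions before they reach the termination limits.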

Participation Guidelines

Participate in the contest by registering on the EvalAI challenge page and creating a team. Participants upload Docker containers with their agents, which are evaluated on an AWS GPU-enabled instance. Before pushing a submission for remote evaluation, participants should test the submission Docker image locally to make sure it works. Instructions for training, local evaluation, and online submission are provided in the challenge repository.

Valid challenge phases are habitat-rearrangement-{minival, test-standard, test-challenge}.

The challenge consists of the following phases:

Minival phase: The purpose of this phase is sanity checking — to confirm that our remote evaluation reports the same result as the one you see locally. Each team is allowed a maximum of 100 submissions per day for this phase, but we will block and disqualify teams that spam our servers.
Test Standard phase: The purpose of this phase is to serve as the public leaderboard establishing state-of-the-art; this is what should be used to report results in papers. Each team is allowed a maximum of 10 submissions per day for this phase, but again, please use them judiciously. Don’t overfit to the test set.
Test Challenge phase: This phase will be used to decide challenge winners. Each team is allowed a total of 5 submissions until the end of the challenge submission phase. The highest performing of these five will be automatically chosen. Results on this phase will not be made public until the announcement of the final results at the NeurIPS 2022 competition.

Note: Your agent will be evaluated on 1000 episodes and will have a total available time of 48 hours to finish. Your submissions will be evaluated on an AWS EC2 p2.xlarge instance, which has a Tesla K80 GPU (12 GB memory), 4 CPU cores, and 61 GB RAM. If you need more time or resources to evaluate your submission, please get in touch.
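The evaluation limits imply a rough wall-clock budget per episode and per step, worked out below. The calculation assumes the worst case in which every episode runs to its maximum length, and the budget covers simulation as well as policy inference, so the time left for the policy itself is smaller.

```python
TOTAL_HOURS = 48
NUM_EPISODES = 1000
MAX_EPISODE_STEPS = {"rearrange-easy": 1500, "rearrange": 5000}

# Average wall-clock time available per episode: 172.8 seconds.
seconds_per_episode = TOTAL_HOURS * 3600 / NUM_EPISODES

# Worst-case time per step if every episode hits its step limit.
seconds_per_step = {
    track: seconds_per_episode / max_steps
    for track, max_steps in MAX_EPISODE_STEPS.items()
}
# rearrange-easy: about 0.115 s per step; rearrange: about 0.035 s per step
```

Under these assumptions, the rearrange track leaves roughly 35 ms per step for simulation and inference combined on the evaluation instance.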

Workshop

Rearrange-Easy track final leaderboard:

Rank	Team	SUCCESS
1	Multi-skill mobile manipulation	0.64
2	TP-SRL	0.30
3	UESTC-529	0.21
4	srye	0.20
5	VER	0.16
6	Skill Transformer	0.14
7	Monolithic	0.00

Citing Habitat Rearrangement Challenge 2022

@misc{habitatrearrangechallenge2022,
  title={Habitat Rearrangement Challenge 2022},
  author={Szot, Andrew and Yadav, Karmesh and Clegg, Alex and Berges, Vincent-Pierre and Gokaslan, Aaron and Chang, Angel and Savva, Manolis and Kira, Zsolt and Batra, Dhruv},
  howpublished={\url{https://aihabitat.org/challenge/2022_rearrange}},
  year={2022}
}

@article{szot2021habitat,
  title={Habitat 2.0: Training home assistants to rearrange their habitat},
  author={Szot, Andrew and Clegg, Alexander and Undersander, Eric and Wijmans, Erik and Zhao, Yili and Turner, John and Maestre, Noah and Mukadam, Mustafa and Chaplot, Devendra Singh and Maksymets, Oleksandr and others},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  pages={251--266},
  year={2021}
}

Acknowledgments

The Habitat challenge would not have been possible without the infrastructure and support of the EvalAI team.

References

[1] Habitat: A Platform for Embodied AI Research. Manolis Savva*, Abhishek Kadian*, Oleksandr Maksymets*, Yili Zhao, Erik Wijmans, Bhavana Jain, Julian Straub, Jia Liu, Vladlen Koltun, Jitendra Malik, Devi Parikh, Dhruv Batra. IEEE/CVF International Conference on Computer Vision (ICCV), 2019.
[2] Habitat 2.0: Training Home Assistants to Rearrange their Habitat. Andrew Szot, Alex Clegg, Eric Undersander, Erik Wijmans, Yili Zhao, John Turner, Noah Maestre, Mustafa Mukadam, Devendra Chaplot, Oleksandr Maksymets, Aaron Gokaslan, Vladimir Vondrus, Sameer Dharur, Franziska Meier, Wojciech Galuba, Angel Chang, Zsolt Kira, Vladlen Koltun, Jitendra Malik, Manolis Savva, Dhruv Batra. NeurIPS, 2021.
[3] The YCB Object and Model Set: Towards Common Benchmarks for Manipulation Research. B. Calli, A. Singh, A. Walsman, S. Srinivasa, P. Abbeel, A. M. Dollar. International Conference on Advanced Robotics (ICAR), 2015.