Habitat Matterport Dataset

The Habitat-Matterport 3D Research Dataset (HM3D) is the largest-ever dataset of 3D indoor spaces. It consists of 1,000 high-resolution 3D scans (or digital twins) of building-scale residential, commercial, and civic spaces generated from real-world environments.

HM3D is free and available here for academic, non-commercial research. Researchers can use it with FAIR’s Habitat simulator to train embodied agents, such as home robots and AI assistants, at scale.


Citing HM3D

If you use the HM3D dataset in your research, please cite the HM3D paper:

@inproceedings{ramakrishnan2021hm3d,
  title={Habitat-Matterport 3D Dataset ({HM}3D): 1000 Large-scale 3D Environments for Embodied {AI}},
  author={Santhosh Kumar Ramakrishnan and Aaron Gokaslan and Erik Wijmans and Oleksandr Maksymets and Alexander Clegg and John M Turner and Eric Undersander and Wojciech Galuba and Andrew Westbury and Angel X Chang and Manolis Savva and Yili Zhao and Dhruv Batra},
  booktitle={Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2)},
  year={2021}
}

Dataset metrics

Compared to popular existing 3D scene datasets, HM3D is the largest, with significantly more scenes and floor space. Compared to other large-scale datasets, it has comparable scene clutter and navigation complexity.

Metadata description

We characterize each scene using the following metadata, computed through automated scripts or manual inspection.

Split specifies whether the scene belongs to the train or validation split. The test scenes are held out for evaluation and are not publicly released.

Reviewer rating (1-5) is a subjective assessment of the scene quality by human reviewers. The rating is based on visual inspection of the scene against criteria such as navigability, holes/cracks, furnishing, number of open/closed doors, and interactive potential. Ratings range from 1 (lowest) to 5 (highest).

# Floors is the number of floors in the scene. This was obtained by manually inspecting the scene.

# Rooms is the number of rooms in the scene. This was obtained by automatically analyzing the 3D scan meshes and may contain inaccuracies.

Furnished specifies whether large parts of the scene have furniture or not. This was obtained by manually inspecting the scene.

Floor space (m²) measures the overall extent of the navigable regions in the scene. This is the area of the 2D convex hull of all navigable locations within a floor. For scenes with multiple floors, the floor space is summed over all floors.
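The floor space definition above can be illustrated with a minimal sketch. It assumes the navigable locations of each floor are already available as 2D points in metres; the names `convex_hull`, `polygon_area`, and `floor_space` are hypothetical helpers for illustration and are not taken from the HM3D scripts.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def convex_hull(points: List[Point]) -> List[Point]:
    """Convex hull in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> float:
        # z-component of (a - o) x (b - o); positive for a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def polygon_area(polygon: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    n = len(polygon)
    if n < 3:
        return 0.0
    total = 0.0
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def floor_space(points_by_floor: Dict[int, List[Point]]) -> float:
    """Sum of per-floor convex hull areas (assumed input: floor index -> points)."""
    return sum(polygon_area(convex_hull(pts)) for pts in points_by_floor.values())


# Illustrative data only: a 4 m x 5 m floor with one interior point,
# plus an upper floor shaped like a right triangle.
example_scene = {
    0: [(0, 0), (4, 0), (4, 5), (0, 5), (2, 2)],
    1: [(0, 0), (4, 0), (0, 3)],
}
print(floor_space(example_scene))  # 20.0 + 6.0 = 26.0
```

Because the convex hull also covers any non-navigable area enclosed by the navigable locations, floor space computed this way is an upper bound on the navigable area of each floor.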

Navigable space (m²) measures the total navigable area in the scene. This is computed with reference to a cylindrical robot with radius 0.1m and height 1.5m.

Navigation complexity measures the difficulty of navigating in a scene. This is computed as the maximum ratio of geodesic to Euclidean distance between any two navigable locations in the scene.
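As a rough illustration of this ratio, the sketch below works on a small occupancy grid rather than on a real scene mesh. The grid encoding ("." navigable, "#" blocked, all rows the same length), the 8-connected Dijkstra search, and the function names are assumptions made for the example; how the actual HM3D scripts sample locations and measure geodesic distance is not described here.

```python
import heapq
import math
from typing import Dict, List, Tuple

Cell = Tuple[int, int]


def geodesic_distances(grid: List[str], source: Cell) -> Dict[Cell, float]:
    """Shortest path lengths (in cells) from source to every reachable cell."""
    rows, cols = len(grid), len(grid[0])

    def free(r: int, c: int) -> bool:
        return 0 <= r < rows and 0 <= c < cols and grid[r][c] == "."

    dist: Dict[Cell, float] = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[(r, c)]:
            continue  # stale queue entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                nr, nc = r + dr, c + dc
                if not free(nr, nc):
                    continue
                # Disallow diagonal moves that would cut an obstacle corner.
                if dr != 0 and dc != 0 and not (free(r, nc) and free(nr, c)):
                    continue
                nd = d + math.hypot(dr, dc)
                if nd < dist.get((nr, nc), math.inf):
                    dist[(nr, nc)] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return dist


def navigation_complexity(grid: List[str]) -> float:
    """Maximum geodesic/Euclidean ratio over all mutually reachable cell pairs."""
    cells = [(r, c) for r, row in enumerate(grid) for c, ch in enumerate(row) if ch == "."]
    best = 1.0  # the ratio can never be below 1
    for src in cells:
        for dst, geo in geodesic_distances(grid, src).items():
            if dst == src:
                continue
            euclid = math.hypot(dst[0] - src[0], dst[1] - src[1])
            best = max(best, geo / euclid)
    return best


# Two corridors separated by a wall and joined only at the bottom row.
u_shaped = [
    ".#.",
    ".#.",
    "...",
]
print(navigation_complexity(u_shaped))  # 3.0: a 6-cell detour for a 2-cell gap
```

On a grid, paths through open space are slightly longer than the true straight line, so this sketch overestimates the ratio for open rooms. A straight corridor still yields exactly 1.0, while walls that force detours push the value up.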

Scene clutter measures the amount of clutter in the scene. This is computed as the ratio between the scene mesh area within 0.5m of the navigable regions and the navigable space.
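The clutter ratio can be sketched as follows, under several assumptions that are not stated in the description above: the mesh is given as a list of 3D triangles, the navigable regions are approximated by sampled 3D points, and a triangle counts as "within 0.5m" when its centroid lies within that distance of any sampled point. All names are hypothetical, and the actual HM3D computation may treat these details differently.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Triangle = Tuple[Vec3, Vec3, Vec3]


def triangle_area(tri: Triangle) -> float:
    """Area of a 3D triangle: half the norm of the edge cross product."""
    a, b, c = tri
    u = (b[0] - a[0], b[1] - a[1], b[2] - a[2])
    v = (c[0] - a[0], c[1] - a[1], c[2] - a[2])
    cx = u[1] * v[2] - u[2] * v[1]
    cy = u[2] * v[0] - u[0] * v[2]
    cz = u[0] * v[1] - u[1] * v[0]
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)


def centroid(tri: Triangle) -> Vec3:
    a, b, c = tri
    return (
        (a[0] + b[0] + c[0]) / 3.0,
        (a[1] + b[1] + c[1]) / 3.0,
        (a[2] + b[2] + c[2]) / 3.0,
    )


def scene_clutter(
    triangles: List[Triangle],
    navigable_points: List[Vec3],
    navigable_area: float,
    radius: float = 0.5,
) -> float:
    """Mesh area near the navigable regions divided by the navigable area."""
    if navigable_area <= 0:
        raise ValueError("navigable_area must be positive")
    nearby_area = 0.0
    for tri in triangles:
        c = centroid(tri)
        # Assumption: proximity is tested at the triangle centroid only.
        if any(math.dist(c, p) <= radius for p in navigable_points):
            nearby_area += triangle_area(tri)
    return nearby_area / navigable_area


# Illustrative data only: one triangle next to a navigable point, one far away.
near_tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
far_tri = ((10.0, 0.0, 0.0), (11.0, 0.0, 0.0), (10.0, 1.0, 0.0))
clutter = scene_clutter([near_tri, far_tri], [(0.3, 0.3, 0.0)], navigable_area=2.0)
print(clutter)  # 0.25: only the near triangle (area 0.5) is counted
```

The brute-force distance check is quadratic in the number of triangles and points. For real scans, a spatial index such as a k-d tree would be the more practical choice.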


Habitat team members and contributors to the HM3D effort: Dhruv Batra, Angel Chang, Alex Clegg, Aaron Gokaslan, Wojtek Galuba, Oleksandr Maksymets, Manolis Savva, Santhosh Kumar Ramakrishnan, John Turner, Eric Undersander, Andrew Westbury, Erik Wijmans, Yili Zhao.
We also thank all the volunteers who contributed to the dataset curation effort: Harsh Agrawal, Sashank Gondala, Rishabh Jain, Shawn Jiang, Yash Kant, Noah Maestre, Yongsen Mao, Abhinav Moudgil, Sonia Raychaudhuri, Ayush Shrivastava, Andrew Szot, Joanne Truong, Madhawa Vidanapathirana, Joel Ye.