AI Habitat Concepts and Terminology Reference

High-level descriptions of common terms used throughout the Habitat ecosystem.

Scene: defines an isolated 3D world composed of a STATIC stage and a variable number of objects, Agents, and Sensors.

SceneDataset: a metadata configuration structure which defines system parameters, links a set of available assets, and organizes them together into one or many scene descriptions which can be instanced in the Simulator.

SceneGraph: a hierarchical representation of a 3D Scene that organizes the contents into a tree composed of regions, Objects, and parts. Can be programmatically manipulated. A Habitat Simulator instance operates on a single active SceneGraph with an optional SemanticSceneGraph used for semantic sensor rendering of Scenes with separate semantic asset geometry.

Stage: the set of STATIC mesh components which make up the backdrop of a Scene.

Object: a set of 3D assets, or parts, connected rigidly or by joints which can be collectively referred to as a single semantic unit. Every Object is represented by a subtree with a single root node on the SceneGraph.
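
To make the tree structure concrete, here is a minimal sketch of a scene graph in which each Object occupies a subtree with a single root node. The SceneNode class, its add_child and subtree methods, and the node names are all hypothetical and written for illustration only; they are not the habitat-sim API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class SceneNode:
    """Hypothetical tree node for illustration; not the habitat-sim class."""

    name: str
    children: List["SceneNode"] = field(default_factory=list)
    parent: Optional["SceneNode"] = field(default=None, repr=False)

    def add_child(self, name: str) -> "SceneNode":
        """Create a node under this one and return it."""
        child = SceneNode(name, parent=self)
        self.children.append(child)
        return child

    def subtree(self) -> List[str]:
        """Names of this node and all of its descendants, depth first."""
        names = [self.name]
        for child in self.children:
            names.extend(child.subtree())
        return names


root = SceneNode("scene_root")
cabinet = root.add_child("cabinet")   # root node of one Object
cabinet.add_child("cabinet_door")     # part attached to the Object root
cabinet.add_child("cabinet_drawer")   # another part of the same Object
chair = root.add_child("chair")       # a second Object with no parts

print(cabinet.subtree())
```

Because every part hangs off the Object's root node, operations on the whole Object (for example, collecting all of its parts) reduce to a traversal of that one subtree.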

MotionType: a classification of the capability of an object’s state to change and the modality by which it can:

  • STATIC: The state of the object remains constant. A Scene’s Stage is always STATIC.
  • KINEMATIC: The state can be modified manually, but the object does not react to collisions, and dynamics such as forces cannot be applied to it.
  • DYNAMIC: The state of the object is modified by physics simulation, which integrates forces, torques, and impulses (e.g., from gravity, internal actuation, constraints, and collisions with other objects).
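
The three categories above can be summarized as a small enumeration. This is a standalone sketch based only on the descriptions in this list; the helper functions state_can_change and is_physics_driven are hypothetical names, and the enum is not imported from any Habitat package.

```python
from enum import Enum


class MotionType(Enum):
    """Illustrative stand-in for the three motion types described above."""

    STATIC = "static"
    KINEMATIC = "kinematic"
    DYNAMIC = "dynamic"


def state_can_change(motion_type: MotionType) -> bool:
    """Only STATIC objects keep a constant state."""
    return motion_type is not MotionType.STATIC


def is_physics_driven(motion_type: MotionType) -> bool:
    """Only DYNAMIC objects are updated by physics simulation."""
    return motion_type is MotionType.DYNAMIC


for motion_type in MotionType:
    print(motion_type.name, state_can_change(motion_type), is_physics_driven(motion_type))
```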

Agent: a physically embodied agent with a suite of Sensors. Can observe the environment and is capable of taking actions that change agent or environment state.

Sensor: a device capable of returning state observation data from the environment at a specified frequency. Simulator sensors are often attached to agents, but may also be globally defined independent of agents.

Observation: data representing an observation from the Sensor class. This can correspond to physical sensors on an Agent (e.g., RGB, depth, semantic segmentation masks, collision sensors) or more abstract sensors such as the current agent state. The structure and format of observation data are constant after an Agent is initialized. Observations are usually passed as input for model training.

Simulator: an instance of a simulator backend. Given actions for a set of configured Agents and SceneGraphs, can update the state of the Agents and SceneGraphs, and provide observations for all active Sensors possessed by the Agents. See Simulator.

Task: this class extends the simulator’s Observations class and action space with task-specific observations and actions. The criteria for episode termination and the measures of success are provided by the Task. For example, in goal-driven navigation, the Task provides the goal and the evaluation metric. To support this kind of functionality, the Task has read-only access to the Simulator and the Episode-Dataset. See EmbodiedTask.

Episode: a class for episode specification that includes the initial position and orientation of an Agent, scene id, goal position, and optionally the shortest path to the goal. An episode is a description of an instance of the task.
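
As a rough illustration of the fields listed above, an episode could be modeled as a plain data record. The class name NavEpisodeSketch, the field names, and the example values (including the scene path) are assumptions chosen for this sketch and should not be read as the actual Habitat class definition.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]
Quat = Tuple[float, float, float, float]


@dataclass
class NavEpisodeSketch:
    """Hypothetical episode record mirroring the description above."""

    scene_id: str                 # which scene to load
    start_position: Vec3          # initial Agent position
    start_rotation: Quat          # initial Agent orientation as a quaternion
    goal_position: Vec3           # where the Agent should end up
    shortest_path: Optional[List[Vec3]] = None  # optional reference path


episode = NavEpisodeSketch(
    scene_id="example_scene.glb",            # made-up identifier
    start_position=(0.0, 0.0, 0.0),
    start_rotation=(0.0, 0.0, 0.0, 1.0),     # identity rotation
    goal_position=(2.0, 0.0, -1.5),
)
print(episode.scene_id, episode.shortest_path)
```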

Episode-Dataset: contains a list of task-specific episodes from a particular data split (e.g., a set of PointGoal targets for a navigation task, or question-answer pairs for EmbodiedQA, paired with a set of scene assets such as Matterport3D or Gibson) and additional dataset-wide information. Habitat-API handles loading and saving of episode-datasets to disk, getting a list of associated 3D scenes, and getting a list of episodes for a particular scene.
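
The two lookups mentioned above (the scenes used by a dataset, and the episodes for one scene) can be sketched with plain dictionaries. The function names scene_ids and episodes_for_scene, the dictionary keys, and the sample data are hypothetical; this is not how Habitat-API implements them.

```python
from typing import Dict, List

# Made-up episodes for illustration; each one records its scene and goal.
episodes: List[Dict[str, object]] = [
    {"episode_id": "0", "scene_id": "scene_a", "goal": (1.0, 0.0, 2.0)},
    {"episode_id": "1", "scene_id": "scene_b", "goal": (0.5, 0.0, 0.5)},
    {"episode_id": "2", "scene_id": "scene_a", "goal": (3.0, 0.0, 1.0)},
]


def scene_ids(dataset: List[Dict[str, object]]) -> List[str]:
    """Unique scene ids in the dataset, sorted for a stable order."""
    return sorted({str(ep["scene_id"]) for ep in dataset})


def episodes_for_scene(
    dataset: List[Dict[str, object]], scene_id: str
) -> List[Dict[str, object]]:
    """All episodes that take place in the given scene."""
    return [ep for ep in dataset if ep["scene_id"] == scene_id]


print(scene_ids(episodes))
print([ep["episode_id"] for ep in episodes_for_scene(episodes, "scene_a")])
```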

Environment: the fundamental environment concept for Habitat, abstracting all the information needed for working on embodied tasks with a simulator. This class acts as a base for other derived environment classes. Environment consists of three major components: a Simulator, an Episode-Dataset (containing Episodes), and a Task, and it serves to connect all these components together. Concretely, a custom extension of a Habitat Env is capable of initializing, stepping, and resetting itself as well as defining both action and observation spaces. An RLEnv extends the OpenAI Gym Environment for use with reinforcement learning by additionally providing reward, termination, and episode info.
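
The way an Environment ties a Simulator, a Task, and a list of Episodes together can be sketched with a toy one-dimensional world. Every class here (ToySimulator, ToyTask, ToyEnv, Episode) is a hypothetical stand-in written for this example, and the reset/step interface is an assumption rather than the real Habitat Env signature.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Episode:
    start: int  # initial agent position on a 1-D line
    goal: int   # goal position on the same line


class ToySimulator:
    """Holds agent state and returns raw observations."""

    def __init__(self) -> None:
        self.position = 0

    def reset(self, start: int) -> Dict[str, int]:
        self.position = start
        return {"position": self.position}

    def step(self, action: str) -> Dict[str, int]:
        self.position += {"forward": 1, "backward": -1}[action]
        return {"position": self.position}


class ToyTask:
    """Adds a task-specific observation and the termination criterion."""

    def augment(self, obs: Dict[str, int], episode: Episode) -> Dict[str, int]:
        return {**obs, "distance_to_goal": abs(episode.goal - obs["position"])}

    def is_done(self, obs: Dict[str, int]) -> bool:
        return obs["distance_to_goal"] == 0


class ToyEnv:
    """Connects the simulator, the task, and the episode list."""

    def __init__(self, episodes: List[Episode]) -> None:
        self.sim = ToySimulator()
        self.task = ToyTask()
        self.episodes = episodes
        self._next = 0
        self.current = episodes[0]
        self.episode_over = True

    def reset(self) -> Dict[str, int]:
        # Cycle through the episodes and place the agent at the start.
        self.current = self.episodes[self._next % len(self.episodes)]
        self._next += 1
        self.episode_over = False
        return self.task.augment(self.sim.reset(self.current.start), self.current)

    def step(self, action: str) -> Dict[str, int]:
        obs = self.task.augment(self.sim.step(action), self.current)
        self.episode_over = self.task.is_done(obs)
        return obs


env = ToyEnv([Episode(start=0, goal=3)])
obs = env.reset()
steps = 0
while not env.episode_over:
    obs = env.step("forward")
    steps += 1
print(steps, obs)
```

In this sketch the simulator knows nothing about goals; the task layer supplies the goal-related observation and decides when the episode ends, which mirrors the division of responsibilities described in the Task and Simulator entries.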