
Embodied Agents Workshop, CVPR 2019

CVPR 2019 Workshop, Long Beach, California

Sunday, 16th June, 09:00 AM to 06:00 PM, Room: TBD



There has been a recent shift in the computer vision community from tasks focusing on internet images to active settings involving embodied agents that perceive and act within 3D environments. Practical deployment of AI agents in the real world requires the study of active perception and of the coupling of perception with control, both of which are central to embodied agents.

This workshop has two primary objectives. The first is to establish a unified platform and a set of benchmarks to measure progress in embodied agents by hosting an embodied agents challenge, the Habitat Challenge. These benchmarks will evaluate algorithms for navigation and question-answering tasks using the Habitat platform. The benchmarks and the unified embodied agents platform will catalyze future work, promoting reproducibility, reusability of code, and consistency in evaluation.

The second objective of the workshop is to bring together researchers from the fields of computer vision, language, graphics, and robotics to share work in the area of multi-modal AI and to discuss future directions in research on embodied agents. The discussion will connect several recent threads of research: visual navigation, natural language instruction following, embodied question answering, and language grounding.

Embodied Agents Challenge and Call for Papers

The challenge will be based on the Navigation task in which the agent must navigate to an object of a specific category. The metric for this task will be the final distance of the agent from the target object. A training instance for this task will be a tuple of the form (environment, category of object, ground truth location). The agent is initially given a target object category for navigation, as a category label, and has to move around in the environment to find an instance of the object.
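As a rough illustration of the training tuple and the metric described above, here is a minimal Python sketch. It is not the challenge's evaluation code: the names `Episode`, `final_distance`, and `mean_final_distance` are hypothetical, and the use of straight-line (Euclidean) distance is an assumption, since the text does not say whether distance is measured in a straight line or along a navigable path.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Tuple

# A 3D position in the environment (assumed representation).
Point = Tuple[float, float, float]


@dataclass(frozen=True)
class Episode:
    """One training instance: (environment, object category, ground truth location)."""
    environment: str       # hypothetical scene identifier
    object_category: str   # category label given to the agent, e.g. "chair"
    goal_location: Point   # ground truth location of the target object


def final_distance(agent_position: Point, episode: Episode) -> float:
    """Distance from the agent's final position to the target object.

    Assumption: Euclidean distance; the official metric may differ.
    """
    return math.dist(agent_position, episode.goal_location)


def mean_final_distance(results: Iterable[Tuple[Point, Episode]]) -> float:
    """Average final distance over (final agent position, episode) pairs."""
    distances = [final_distance(position, episode) for position, episode in results]
    if not distances:
        raise ValueError("at least one result is required")
    return sum(distances) / len(distances)


episode = Episode("scene_0", "chair", (3.0, 0.0, 4.0))
print(final_distance((0.0, 0.0, 0.0), episode))
```

Lower values are better under this reading of the metric: an agent that stops exactly at the target location scores 0.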

Challenge Timeline

Challenge starts: April 3, 2019
Challenge submission deadline: May 18, 2019

Abstract Submission Timeline

Paper submission deadline: May 24, 2019
Notification to authors: May 29, 2019

Authors are invited to submit abstracts of up to 2 pages describing work in relevant areas such as simulation environments, visual navigation, embodied question answering, simulation-to-real transfer, and vision & language. Please email abstracts by May 24, 2019.


  • 09:00 - 09:10 : Welcome and Introduction
  • 09:10 - 09:35 : Invited Talk (Richard Newcombe)
  • 09:35 - 10:00 : Invited Talk (Benjamin Kuipers)
  • 10:00 - 10:25 : Invited Talk (Raia Hadsell)
  • 10:25 - 10:45 : Morning Break
  • 10:45 - 11:10 : Invited Talk (Jitendra Malik)
  • 11:10 - 11:25 : Habitat Challenge on EvalAI (Rishabh Jain)
  • 11:25 - 12:00 : Challenge Results and Analysis
  • 12:00 - 12:20 : PointGoalNav-RGB Track Winners (oral)
  • 12:20 - 12:40 : PointGoalNav-RGBD Track Winners (oral)
  • 12:40 - 14:00 : Lunch
  • 14:00 - 14:25 : [TBD]
  • 14:25 - 14:50 : Poster Spotlights
  • 14:50 - 16:00 : Poster Session and Afternoon Break
  • 16:00 - 18:10 : Panel Discussion + Closing Remarks

Invited Speakers

Richard Newcombe is the Director of Research Science at Facebook Reality Labs, which is building the future of Augmented Reality and Contextualized AI services and devices at Facebook. He co-founded Surreal Vision Ltd. in 2014. He worked at Microsoft Research in Cambridge, gaining early access to the first commodity depth sensor, which resulted in the groundbreaking KinectFusion system. He holds a BSc from the University of Essex and a PhD from Imperial College.

Benjamin Kuipers joined the University of Michigan in January 2009 as Professor of Computer Science and Engineering. Prior to that, he held an endowed Professorship in Computer Sciences at the University of Texas at Austin. He received his B.A. from Swarthmore College, and his Ph.D. from MIT. He investigates the representation of commonsense and expert knowledge, with particular emphasis on the effective use of incomplete knowledge. He has served as Department Chair at UT Austin, and is a Fellow of AAAI, IEEE, and AAAS.  

Raia Hadsell, a senior research scientist at DeepMind, has worked on deep learning and robotics problems for over 10 years. After completing a PhD with Yann LeCun at NYU, her research continued at Carnegie Mellon's Robotics Institute and SRI International, and in early 2014 she joined DeepMind in London to study artificial general intelligence. Her current research focuses on the challenge of continual learning for AI agents and robotic systems.

Jitendra Malik is the Arthur J. Chick Professor in the Department of Electrical Engineering and Computer Sciences at UC Berkeley. His research group has worked on many different topics in computer vision, computational modeling of human vision and computer graphics. Several well-known concepts and algorithms arose in this research, such as normalized cuts, high dynamic range imaging and R-CNN. He has mentored more than 50 PhD students and postdoctoral fellows.