Spatial AI
Mobile robots need to perceive their surroundings in order to act: they need to keep track of their own motion, know their location in a map, and even create maps when they operate in unknown environments – just like humans. They may also need to understand what objects and people are around them and how they are moving and behaving. We work on a range of such Spatial AI capabilities serving as a basis for robots to safely navigate and physically interact with the world.
Simultaneous Localization and Mapping (SLAM)
We are working on various methods for robots to simultaneously create maps and localize themselves inside them using a range of sensors: cameras, inertial measurement units (IMUs), LiDAR, GNSS, and more.
The picture below shows a scalable SLAM system developed by MRL at TUM: the estimated trajectory (black line) and dense submaps (visualized here as colored meshes). Depth maps predicted by a deep network from stereo and temporally adjacent images are fused into volumetric occupancy submaps and also leveraged inside the visual-inertial estimator. Result from the Hilti-Oxford dataset.
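Fusing depth measurements into volumetric occupancy maps is commonly done with per-voxel log-odds updates, so that repeated Bayesian fusion reduces to addition. The sketch below illustrates that general technique with made-up parameter values; it is not the lab's implementation.

```python
import numpy as np

# Illustrative log-odds occupancy update for one voxel (not the lab's code).
# A voxel's occupancy probability p is stored as log-odds l = log(p/(1-p)),
# so fusing repeated depth observations is just addition, plus clamping.

L_HIT = np.log(0.7 / 0.3)    # evidence when a depth ray ends in the voxel
L_MISS = np.log(0.4 / 0.6)   # evidence when a ray passes through the voxel
L_MIN, L_MAX = -2.0, 3.5     # clamping keeps the map responsive to change

def update_voxel(log_odds: float, hit: bool) -> float:
    """Fuse one depth observation into a voxel's log-odds value."""
    log_odds += L_HIT if hit else L_MISS
    return float(np.clip(log_odds, L_MIN, L_MAX))

def occupancy_probability(log_odds: float) -> float:
    """Recover the occupancy probability from the stored log-odds."""
    return 1.0 / (1.0 + np.exp(-log_odds))

# Example: a voxel observed as occupied three times in a row.
l = 0.0
for _ in range(3):
    l = update_voxel(l, hit=True)
print(occupancy_probability(l))
```

Storing log-odds rather than probabilities keeps the update cheap and numerically stable, which matters when many depth rays touch each voxel of a submap.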
Novel Sensors
The lab is working with novel sensors and their use in Spatial AI: below, in research led by MRL at TUM, we use an event camera for Simultaneous Localization and Mapping (SLAM), employing a learning-based method that detects stable keypoints and descriptors directly from event data. Training is self-supervised using synchronized grayscale frames, which are only needed at training time.
SuperEvent: Cross-Modal Learning of Event-based Keypoint Detection for SLAM, https://smartroboticslab.github.io/SuperEvent/
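An event camera outputs an asynchronous stream of (x, y, timestamp, polarity) tuples rather than images, so learning-based detectors typically first aggregate events into a grid-like tensor. The toy sketch below shows one simple such aggregation (per-polarity event counts); it is illustrative only and not the SuperEvent pipeline, which may use a richer representation.

```python
import numpy as np

# Toy aggregation of an event stream into a two-channel image: one channel
# per polarity, each pixel counting the events it received in a time window.
# Illustrative only; real systems often use time surfaces or voxel grids.

def events_to_frame(events, height, width):
    """events: rows of (x, y, t, polarity), with polarity in {-1, +1}."""
    frame = np.zeros((2, height, width), dtype=np.float32)
    for x, y, t, p in events:
        channel = 0 if p > 0 else 1
        frame[channel, int(y), int(x)] += 1.0
    return frame

# Example: three events on a 4x4 sensor within one accumulation window.
evts = np.array([
    [1, 2, 0.001, +1],
    [1, 2, 0.002, +1],
    [3, 0, 0.003, -1],
])
f = events_to_frame(evts, 4, 4)
```

A dense tensor like this can then be fed to a convolutional keypoint detector in the same way a grayscale frame would be, which is what makes cross-modal supervision from synchronized frames possible.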
Humans in Motion
Robots should robustly detect and track moving humans in their field of view, even while they are moving themselves. The lab has been developing various such methods at Imperial College London and later at TUM, e.g. GloPro: Globally-Consistent Uncertainty-Aware 3D Human Pose Estimation & Tracking in the Wild. The method consistently estimates human body pose, posture, and shape in a global 3D coordinate system while simultaneously modeling the uncertainty distribution across the entire body.
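Modeling per-estimate uncertainty is what allows noisy pose observations to be fused consistently over time: for Gaussian estimates, fusion reduces to precision-weighted averaging. The minimal 1-D sketch below illustrates this general principle; it is not GloPro itself.

```python
# Illustrative fusion of two independent Gaussian estimates of the same
# scalar quantity (e.g. one coordinate of a body joint). Each mean is
# weighted by its precision (inverse variance), so confident estimates
# dominate. Not the lab's implementation; a textbook 1-D sketch.

def fuse_gaussian(mean_a, var_a, mean_b, var_b):
    """Return the mean and variance of the fused Gaussian estimate."""
    precision = 1.0 / var_a + 1.0 / var_b
    var = 1.0 / precision
    mean = var * (mean_a / var_a + mean_b / var_b)
    return mean, var

# Example: a confident estimate (variance 0.1) pulls the fused value
# strongly toward it, away from the uncertain one (variance 0.9).
m, v = fuse_gaussian(0.0, 0.1, 1.0, 0.9)
```

The fused variance is always smaller than either input variance, which is why propagating uncertainty through a tracker yields estimates that tighten as more observations arrive.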