Plenary Speakers

Two plenary talks are planned, one at the start of each day of the workshop.
They will be given by:

Juan D. Tardós, Universidad de Zaragoza

Visual SLAM: Algorithms, Challenges and Applications


Recent research has shown that computing medium-scale visual maps of rigid scenes in real time is feasible. Current maps, composed of a sparse set of points, are adequate for accurate camera localization, but quite poor for high-level tasks such as robot navigation, object manipulation or human-computer interaction. Furthermore, most available techniques are computationally demanding and therefore unable to map large environments.

In this talk we will present algorithms developed at the University of Zaragoza for large-scale visual mapping with monocular and stereo cameras, and efficient place recognition techniques for reliable loop detection. We will also discuss several applications and the challenges they pose for visual SLAM: most robotics and human interaction applications require scene understanding and object recognition techniques to enrich the semantic content of the maps; novel medical applications will require the ability to map non-rigid scenes.


Towards 3D Semantic Perception (pdf)


Intelligent autonomous action in ordinary environments calls for maps. 3D geometry is generally required for planning motions in complex scenarios and for self-localization with six degrees of freedom (6 DoF: x, y, z position and roll, pitch, yaw angles). Meaning, in addition to geometry, becomes indispensable if a system is to interact with its environment in a goal-directed way. A semantic stance enables systems to reason about objects; it helps disambiguate or complete sensor data; and knowledge becomes reviewable and communicable.

The talk describes an approach and two integrated robotic systems for semantic 3D mapping. The prime sensors are 3D laser scanners. Individual scans are registered into a coherent 3D geometry map by 6D SLAM, our bundle adjustment framework for laser scanner data. Coarse scene features (e.g., walls and floors in a building) are determined by semantic labeling. Finer objects are then detected and localized by a trained classifier or by matching models from the Google 3D Warehouse. Inpainting methods are used to reconstruct occluded regions. Finally, the semantic maps can be visualized for human inspection. We sketch the overall architecture of the approach, explain the respective steps and their underlying algorithms, give examples based on implementations on working robots, and discuss the findings.