Please use this identifier to cite or link to this item:
Authors: Schmid, Stephan
Title: Semi-dense filter-based visual odometry for automotive augmented reality applications
Issue Date: 2019 Dissertation 136
Abstract: In order to integrate virtual objects convincingly into a real scene, Augmented Reality (AR) systems typically need to solve two problems: Firstly, the movement and position of the AR system within the environment needs to be known to be able to compensate the motion of the AR system in order to make placement of the virtual objects stable relative to the real world and to provide overall correct placement of virtual objects. Secondly, an AR system needs to have a notion of the geometry of the real environment to be able to properly integrate virtual objects into the real scene via techniques such as the determination of the occlusion relation between real and virtual objects or context-aware positioning of virtual content. To solve the second problem, the following two approaches have emerged: A simple solution is to create a map of the real scene a priori by whatever means and to then use this map in real-time operation of the AR system. A more challenging, but also more flexible solution is to create a map of the environment dynamically from real time data of sensors of the AR-system. Our target applications are Augmented Reality in-car infotainment systems in which a video of a forward facing camera is augmented. Using map data to determine the geometry of the environment of the vehicle is limited by the fact that currently available digital maps only provide a rather coarse and abstract picture of the world. Furthermore, map coverage and amount of detail vary greatly regionally and between different maps. Hence, the objective of the presented thesis is to obtain the geometry of the environment in real time from vehicle sensors. More specifically, the aim is to obtain the scene geometry by triangulating it from the camera images at different camera positions (i.e. stereo computation) while the vehicle moves. The problem of estimating geometry from camera images where the camera positions are not (exactly) known is investigated in the (overlapping) fields of visual odometry (VO) and structure from motion (SfM). Since Augmented Reality applications have tight latency requirements, it is necessary to obtain an estimate of the current scene geometry for each frame of the video stream without delay. Furthermore, Augmented Reality applications need detailed information about the scene geometry, which means dense (or semi-dense) depth estimation, that is one depth estimate per pixel. The capability of low-latency geometry estimation is currently only found in filter based VO methods, which model the depth estimates of the pixels as the state vector of a probabilistic filter (e.g. Kalman filter). However, such filters maintain a covariance matrix for the uncertainty of the pixel depth estimates whose complexity is quadratic in the number of estimated pixel depths, which causes infeasible complexity for dense depth estimation. To resolve this conflict, the (full) covariance matrix will be replaced by a matrix requiring only linear complexity in processing and storage. This way, filter-based VO methods can be combined with dense estimation techniques and efficiently scaled up to arbitrarily large image sizes while allowing easy parallelization. For treating the covariance matrix of the filter state, two methods are introduced and discussed. These methods are implemented as modifications to the (existing) VO method LSD-SLAM, yielding the "continuous" variant C-LSD-SLAM. In the first method, a diagonal matrix is used as the covariance matrix. In particular, the correlation between different scene point estimates is neglected. For stabilizing the resulting VO method in forward motion, a reweighting scheme is introduced based on how far scene point estimates are moved when reprojecting them from one frame to the next frame. This way, erroneous scene point estimates are prevented from causing the VO method to diverge. The second method for treating the covariance matrix models the correlation of the scene point estimates caused by camera pose uncertainty by approximating the combined influence of all camera pose estimates in a small subspace of the scene point estimates. This subspace has fixed dimension 15, which forces the complexity of the replacement of the covariance matrix to be linear in the number of scene point estimates.
Appears in Collections:06 Fakultät Luft- und Raumfahrttechnik und Geodäsie

Files in This Item:
File Description SizeFormat 
Dissertation_Stephan_Schmid.pdf15,06 MBAdobe PDFView/Open

Items in OPUS are protected by copyright, with all rights reserved, unless otherwise indicated.