
3D geometry-based neural camera pose estimation

Abstract : Vision-based absolute camera pose estimation, also known as visual localization, is a backbone of many computer vision applications, such as augmented or virtual reality, robotics and autonomous driving. When working with crowdsourced images captured under challenging conditions, visual disturbances are frequently encountered. These perturbations make visual localization a very hard and so far unsolved problem. The goal of this thesis is to develop models that improve the performance of absolute camera pose estimation algorithms. The first part of this thesis focuses on the task of matching 2D keypoints against a 3D model, which is a commonly used building block of structure-based visual localization approaches. We propose a novel keypoint matching paradigm that explicitly models dense keypoint matching uncertainties in images, and find that it improves over state-of-the-art keypoint matching methods. Then, we introduce a novel reprojection error that merges feature learning and absolute camera pose estimation, which we call the Neural Reprojection Error. Our formulation reuses the previously introduced dense matching uncertainties to significantly improve camera pose estimation accuracy compared to standard approaches. This formulation is also data-driven and thus helps us avoid cumbersome hyperparameter optimization. The last contribution of this thesis is a study of the problem of visual correspondence hallucination. We train a deep learning model to regress matching distributions in non-covisible image areas (i.e. areas that are either occluded or fall outside the image boundaries). We show that our model is not only able to make such predictions, but that, when coupled with the Neural Reprojection Error, it significantly outperforms existing absolute camera pose estimation methods when presented with very low-overlap image pairs.
Contributor : ABES STAR
Submitted on : Monday, February 7, 2022 - 5:25:09 PM
Last modification on : Sunday, February 13, 2022 - 9:15:17 AM
Long-term archiving on : Sunday, May 8, 2022 - 7:19:02 PM


Version validated by the jury (STAR)


  • HAL Id : tel-03560786, version 1



Hugo Germain. 3D geometry-based neural camera pose estimation. Other [cs.OH]. École des Ponts ParisTech, 2021. English. ⟨NNT : 2021ENPC0033⟩. ⟨tel-03560786⟩


