
Object representation in local feature spaces : application to real-time tracking and detection

Abstract : Visual representation is a fundamental problem in computer vision. The aim is to reduce the information to what is strictly necessary for a given task. Many types of representation exist, such as color features (histograms, color attributes...), shape features (derivatives, keypoints...) and filter banks.

Low-level (and local) features are fast to compute. Their representational power is limited, but their genericity is of interest for autonomous or multi-task systems, since higher-level features are derived from them. We aim to build low-level, local feature spaces (color and derivatives only) and then study their impact on two tasks: generic object tracking, which requires features robust to changes in the appearance of the object and its environment over time; and object detection, for which the representation should describe the object class and cope with intra-class variations.

Rather than using global object descriptors, we use entirely local features and statistical mechanisms to estimate their distribution (histograms) and their co-occurrences (Generalized Hough Transform).

The Generalized Hough Transform (GHT), created to detect arbitrary shapes, consists in building a codebook that models an object or a class; the codebook was originally indexed by gradient orientation and was later extended to diverse features. As we work on local features, we aim to remain close to the original GHT.

In tracking, after presenting preliminary work combining the GHT with a particle filter (using color histograms), we present a lighter and faster (100 fps) tracker that is more accurate and robust. We present a qualitative evaluation and study the impact of the features used (color space, spatial derivative formulation).

In detection, we used Gall's Hough Forest. We aim to reduce Gall's feature space and discard HOG features, keeping only derivative and color features. To compensate for this reduction, we enhanced two steps: the supports of the local descriptors (patches) are partially chosen using a geometrical measure, and node training uses a specific probability map based on the patches used at this step. With the reduced feature space, the detector is less accurate than with Gall's feature space, but for the same training time, our work leads to identical results, with higher stability and therefore better repeatability.
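
As an illustration of the codebook idea mentioned in the abstract, the following is a minimal sketch of a classic GHT codebook (an R-table) indexed by quantized gradient orientation. It is not the tracker or the Hough Forest variant developed in the thesis. All function names, the `(x, y, theta)` edge-point format and the choice of 8 orientation bins are assumptions made for this example only.

```python
import math
from collections import defaultdict


def quantize(theta, n_bins=8):
    """Map an orientation in radians to one of n_bins bins (hypothetical helper)."""
    ratio = (theta % (2 * math.pi)) / (2 * math.pi)
    return int(ratio * n_bins) % n_bins


def build_r_table(edge_points, reference, n_bins=8):
    """Build the codebook: orientation bin -> displacements to the reference point."""
    table = defaultdict(list)
    for x, y, theta in edge_points:
        displacement = (reference[0] - x, reference[1] - y)
        table[quantize(theta, n_bins)].append(displacement)
    return table


def vote(edge_points, table, n_bins=8):
    """Each edge point votes for the reference positions stored under its bin."""
    accumulator = defaultdict(int)
    for x, y, theta in edge_points:
        for dx, dy in table.get(quantize(theta, n_bins), ()):
            accumulator[(x + dx, y + dy)] += 1
    return accumulator


def detect(edge_points, table, n_bins=8):
    """Return the most voted reference position and its vote count."""
    accumulator = vote(edge_points, table, n_bins)
    return max(accumulator.items(), key=lambda item: item[1])


# Toy model: four edge points (x, y, gradient orientation) around the origin.
model = [(1, 0, 0.1), (0, 1, 1.7), (-1, 0, 3.2), (0, -1, 4.8)]
r_table = build_r_table(model, reference=(0, 0))

# The same shape translated by (5, 3); voting should peak at the new center.
scene = [(x + 5, y + 3, theta) for x, y, theta in model]
position, votes = detect(scene, r_table)
```

This sketch handles translation only. Handling rotation or scale would require extra accumulator dimensions, and replacing the orientation index with other local features (such as color) would change only the key used in `build_r_table` and `vote`.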

Cited literature : 153 references
Contributor : Abes Star
Submitted on : Monday, February 19, 2018 - 11:05:06 AM
Last modification on : Wednesday, October 14, 2020 - 3:40:00 AM
Long-term archiving on : Monday, May 7, 2018 - 6:15:37 AM


Version validated by the jury (STAR)


  • HAL Id : tel-01712041, version 1


Antoine Tran. Object representation in local feature spaces : application to real-time tracking and detection. Computer Vision and Pattern Recognition [cs.CV]. Université Paris Saclay (COmUE), 2017. English. ⟨NNT : 2017SACLY010⟩. ⟨tel-01712041⟩


