Object representation in local feature spaces : application to real-time tracking and detection

Abstract : Visual representation is a fundamental problem in computer vision. The aim is to reduce the information to the strict minimum needed for a given task. Many types of representation exist, such as color features (histograms, color attributes...), shape features (derivatives, keypoints...) or filter banks.

Low-level (and local) features are fast to compute. Their representational power is limited, but their genericity makes them attractive for autonomous or multi-task systems, since higher-level features are derived from them. We aim to build low-level, local feature spaces (color and derivatives only) and then study their impact on two tasks: generic object tracking, which requires features robust to changes in the appearance of the object and its environment over time; and object detection, for which the representation should describe the object class and cope with intra-class variations.

Therefore, rather than using global object descriptors, we use entirely local features and statistical mechanisms to estimate their distribution (histograms) and their co-occurrences (Generalized Hough Transform).

The Generalized Hough Transform (GHT), created to detect arbitrary shapes, consists in building a codebook that models an object or a class; the codebook was originally indexed by gradient orientation and was later extended to diverse features. As we work on local features, we aim to remain close to the original GHT.

In tracking, after presenting preliminary work combining the GHT with a particle filter (using color histograms), we present a lighter and faster (100 fps) tracker that is more accurate and robust. We present a qualitative evaluation and study the impact of the features used (color space, spatial derivative formulation).

In detection, we used Gall's Hough Forest. We aim to reduce Gall's feature space by discarding the HOG features and keeping only the derivative and color features. To compensate for this reduction, we enhanced two steps: the supports of the local descriptors (patches) are partially chosen using a geometrical measure, and node training uses a specific probability map based on the patches used at this step. With the reduced feature space, the detector is less accurate than with Gall's feature space; for the same training time, however, our work leads to identical results with higher stability and therefore better repeatability.
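
The codebook idea behind the original GHT, as described in the abstract, can be illustrated with a minimal sketch. This is not the implementation used in the thesis: it assumes edge points and their gradient orientations are already available, ignores scale and rotation, and all names (`quantize`, `build_r_table`, `vote`, `n_bins`) are hypothetical. The codebook (often called an R-table) stores, for each quantized gradient orientation, the displacements from edge points to a reference point; at detection time each edge point votes for candidate reference positions.

```python
import numpy as np
from collections import defaultdict


def quantize(theta, n_bins):
    """Map an orientation in radians to a bin index in [0, n_bins)."""
    frac = (theta % (2 * np.pi)) / (2 * np.pi)
    return int(np.floor(frac * n_bins)) % n_bins


def build_r_table(points, orientations, reference, n_bins=16):
    """Codebook: orientation bin -> list of (dy, dx) displacements to the reference."""
    table = defaultdict(list)
    for (y, x), theta in zip(points, orientations):
        table[quantize(theta, n_bins)].append((reference[0] - y, reference[1] - x))
    return table


def vote(points, orientations, table, shape, n_bins=16):
    """Accumulate votes for the reference position from every edge point."""
    acc = np.zeros(shape, dtype=np.int32)
    for (y, x), theta in zip(points, orientations):
        for dy, dx in table.get(quantize(theta, n_bins), []):
            cy, cx = y + dy, x + dx
            if 0 <= cy < shape[0] and 0 <= cx < shape[1]:
                acc[cy, cx] += 1
    return acc


# Toy template (assumed data): four edge points with distinct orientations.
template_pts = [(0, 0), (0, 4), (4, 0), (4, 4)]
template_ori = [0.0, np.pi / 2, np.pi, 3 * np.pi / 2]
table = build_r_table(template_pts, template_ori, reference=(2, 2))

# Scene: the same shape translated by (10, 7).
scene_pts = [(y + 10, x + 7) for y, x in template_pts]
acc = vote(scene_pts, template_ori, table, shape=(32, 32))

# The accumulator peak is the estimated reference position.
peak = tuple(int(v) for v in np.unravel_index(np.argmax(acc), acc.shape))
print(peak)  # (12, 9)
```

In this toy case all four edge points vote for the same cell, so the peak recovers the translated reference point. Indexing the codebook by other local features (for example color) instead of orientation follows the same voting scheme.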

Cited literature: 153 references

https://pastel.archives-ouvertes.fr/tel-01712041
Contributor : Abes Star
Submitted on : Monday, February 19, 2018 - 11:05:06 AM
Last modification on : Wednesday, July 3, 2019 - 10:48:05 AM
Long-term archiving on : Monday, May 7, 2018 - 6:15:37 AM

File

60867_TRAN_2017_diffusion.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01712041, version 1

Citation

Antoine Tran. Object representation in local feature spaces : application to real-time tracking and detection. Computer Vision and Pattern Recognition [cs.CV]. Université Paris-Saclay, 2017. English. ⟨NNT : 2017SACLY010⟩. ⟨tel-01712041⟩
