
Cutting the visual world into bigger slices for improved video concept detection

Abstract : Visual material comprising images and videos is growing rapidly, both on the internet and in personal collections. This calls for automatic understanding of visual content and for intelligent methods to correctly index, search and retrieve images and videos. This thesis aims to improve the automatic detection of concepts in internet videos by exploring all the available information and putting the most beneficial of it to good use. Our contributions address various levels of the concept detection framework and fall into three main parts. The first part improves the Bag of Words (BOW) video representation model by proposing a novel BOW construction mechanism that uses concept labels, and by refining the BOW signature based on the distribution of its elements. The second part devises methods that incorporate knowledge from similar and dissimilar entities to build improved recognition models. Here we look at the information that concepts potentially share and build models for meta-concepts, from which concept-specific results are derived. This improves recognition for concepts lacking labeled examples. Lastly, we develop semi-supervised learning methods to make the most of the substantial amount of unlabeled data. We propose techniques to improve the semi-supervised co-training algorithm through optimal view selection.

Cited literature [220 references]
Contributor : ABES STAR
Submitted on : Tuesday, December 20, 2016 - 3:35:07 PM
Last modification on : Friday, July 31, 2020 - 10:44:08 AM
Long-term archiving on : Monday, March 20, 2017 - 4:40:38 PM


Version validated by the jury (STAR)


  • HAL Id : tel-01420419, version 1


Usman Niaz. Cutting the visual world into bigger slices for improved video concept detection. Image Processing [eess.IV]. Télécom ParisTech, 2014. English. ⟨NNT : 2014ENST0040⟩. ⟨tel-01420419⟩


