
Etude de noyaux de semigroupe pour objets structurés dans le cadre de l'apprentissage statistique

Marco Cuturi
Abstract : Kernel methods are a new family of data analysis tools that can be used in standard learning settings such as classification or regression. These tools rely on an a priori similarity measure between the objects to be handled, called a "kernel" in the statistical learning and functional analysis literature. The simplicity of kernel methods comes from the fact that, for a given learning task, they only require the definition of a kernel to compare the objects in order to yield practical results. Selecting the right kernel for a task is nonetheless tricky, notably when the objects have complex structures. In this work we propose various families of generic kernels for composite objects such as strings, graphs or images. The kernels that we obtain are tailored to compare clouds of points, histograms or, more generally, positive measures. Our approach is mainly motivated by algebraic considerations on the sets of interest, which is why we make frequent use of the theory of harmonic functions on semigroups. The theoretical justification for such kernels is further grounded in reproducing kernel Hilbert spaces, in which the measures are embedded, along with elements of convex analysis and descriptors of measures used in statistics and information theory, such as variance and entropy. By mapping any structured object to a cloud of components, e.g. taking a string and turning it into a cloud or a histogram of substrings, we apply these kernels to composite objects, coupled with discriminative methods such as the support vector machine, to address classification problems encountered in bioinformatics and image analysis. At the end of the thesis we extend this framework to a different viewpoint in which objects are no longer seen as clouds of points but rather as nested clouds, where each cloud is labelled according to a set of events endowed with a hierarchy. We show how to benefit from such a description to apply a multiresolution comparison scheme between the objects.
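As a rough illustration of the pipeline the abstract describes (string, then histogram of substrings, then a kernel between histograms), here is a minimal Python sketch. The function names, the substring length `k`, the parameter `beta` and the choice of a Jensen-Shannon based kernel are all assumptions made for this example; the abstract does not specify which kernels the thesis defines. The kernel was picked only because it depends on the two histograms through their average, which loosely echoes the semigroup viewpoint.

```python
import math
from collections import Counter


def substring_histogram(s, k=2):
    """Map a string to a normalized histogram of its length-k substrings."""
    if len(s) < k:
        raise ValueError("string is shorter than the substring length k")
    counts = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    total = sum(counts.values())
    return {sub: c / total for sub, c in counts.items()}


def entropy(p):
    """Shannon entropy (natural log) of a histogram stored as a dict."""
    return -sum(w * math.log(w) for w in p.values() if w > 0)


def jensen_shannon_kernel(p, q, beta=1.0):
    """Illustrative kernel exp(-beta * JS(p, q)) between two histograms.

    JS(p, q) = H((p + q) / 2) - (H(p) + H(q)) / 2 is the Jensen-Shannon
    divergence. This is an assumed example, not the thesis's definition.
    """
    keys = set(p) | set(q)
    mean = {key: 0.5 * (p.get(key, 0.0) + q.get(key, 0.0)) for key in keys}
    js = entropy(mean) - 0.5 * (entropy(p) + entropy(q))
    # Clamp tiny negative values caused by floating-point rounding.
    return math.exp(-beta * max(js, 0.0))


if __name__ == "__main__":
    # Hypothetical toy strings standing in for structured objects.
    strings = ["ACGTACGT", "ACGTTCGT", "GGGGCCCC"]
    hists = [substring_histogram(s, k=2) for s in strings]
    gram = [[jensen_shannon_kernel(p, q) for q in hists] for p in hists]
    for row in gram:
        print(["%.3f" % value for value in row])
```

A Gram matrix built this way could then be handed to any discriminative method that accepts a precomputed kernel, such as a support vector machine implementation with that option.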

Cited literature : 110 references
Contributor : Ecole Mines Paristech
Submitted on : Friday, June 30, 2006 - 8:00:00 AM
Last modification on : Tuesday, September 29, 2015 - 10:32:45 AM
Long-term archiving on : Tuesday, July 13, 2010 - 9:09:24 PM


  • HAL Id : pastel-00001823, version 1



Marco Cuturi. Etude de noyaux de semigroupe pour objets structurés dans le cadre de l'apprentissage statistique. Mathematics [math]. École Nationale Supérieure des Mines de Paris, 2005. English. ⟨pastel-00001823⟩


