Mouvement et vidéo : estimation, compression et filtrage morphologique

Nicolas Laveau
Abstract : The PhD work presented in this document deals with the processing of video sequences. Most of the thesis concerns video compression, but it also covers spatio-temporal filtering and video segmentation. A recurrent analysis tool in all of these applications is motion measurement, that is, the description of temporal coherence in a video sequence. A video compression system generally includes the following three components: motion estimation; temporal and spatial transforms; coefficient quantization and encoding. We focus on each of these components in turn. First, we try to adapt to video compression a motion estimation scheme based on the projection of the optical flow onto a complex-valued wavelet basis. Since the resulting field is dense and noise-sensitive, we introduce a regularizing term when solving for it in order to reduce its coding cost. Despite the clear improvement brought by our modifications, motion estimation by projection onto a wavelet basis is not competitive with block matching, which is the reference algorithm for video compression. This illustrates that the choice and design of a scheme are tightly linked to its intended use: motion estimation schemes optimized for applications as diverse as video compression, filtering and segmentation, or even 3D scene analysis are unlikely to be the same. In these experiments, we used a motion measurement scheme that optimizes a criterion which is formally equivalent to a matching criterion for video compression only under theoretical conditions that are not met in general. Such an approach is thus clearly sub-optimal. Building on this observation, we then developed another motion estimation scheme, which relies on a piecewise bilinear parametrization of the motion field and which this time directly minimizes the mean square error, our evaluation criterion. We show that it is possible to obtain good results when the motion parameters are sparse.
In video coding with temporal prediction, we need to encode heterogeneous data such as motion fields and prediction error frames. We have worked on rate allocation among error frames and, to a lesser extent, between an error frame and a motion field. We have adapted a rate planning model introduced by Mallat and Falzon, which was initially designed for still images and is currently used in flow compression of satellite pictures. This approach proves to be better than others more classically used in video compression. To be able to perform transform coding of motion fields and error frames, we tried to design new non-linear subband transforms. To this end, we used the lifting scheme, which ensures the formal invertibility of the resulting transforms, whether they are linear or not. We have designed two new non-linear decompositions. The first one aims at reducing an artifact commonly called the Gibbs effect. It uses a Deslauriers-Dubuc predictor modified so as to reduce the ringing around discontinuities, at a moderate cost in representation efficiency in the regular sections of the signal. The formulation avoids the filter-switching mechanism that is quite commonly used in this kind of approach by relying on continuous operators such as min or max, so as to ensure the continuity of the transform and thus its stability after quantization. The second one tries to improve the wavelet decomposition of the motion field by using the information that each of its components gives about the other. Indeed, our intuition is that discontinuities occur at the same positions in both components. We take advantage of this to choose the prediction and update filters. In both cases, the designed methods give encouraging results on synthetic signals, but their efficiency decreases on real data.
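The invertibility property of the lifting scheme mentioned above can be illustrated with a minimal sketch. It is not the decomposition designed in the thesis: the clamped predictor below is a hypothetical stand-in, chosen only because it combines a 4-point Deslauriers-Dubuc interpolation with min and max instead of a filter switch. The function names, the integer rounding, the edge replication at the borders and the simple update step are all assumptions made for the illustration.

```python
def _at(seq, i):
    """Read seq[i], replicating the first/last sample outside the borders."""
    return seq[min(max(i, 0), len(seq) - 1)]


def predict(even, i):
    """Predict the odd sample lying between even[i] and even[i + 1].

    Assumption (illustrative only): the 4-point Deslauriers-Dubuc
    interpolation is clamped, with min/max, to the range spanned by the two
    nearest even samples, which prevents overshoot next to a discontinuity.
    """
    cubic = (-_at(even, i - 1) + 9 * _at(even, i)
             + 9 * _at(even, i + 1) - _at(even, i + 2) + 8) // 16
    lo = min(_at(even, i), _at(even, i + 1))
    hi = max(_at(even, i), _at(even, i + 1))
    return min(max(cubic, lo), hi)


def update(detail, i):
    """Simple integer update step (an assumption, not the thesis design)."""
    return (_at(detail, i - 1) + _at(detail, i) + 2) // 4


def forward(signal):
    """One lifting level on an even-length list of integers."""
    even, odd = signal[0::2], signal[1::2]
    detail = [odd[i] - predict(even, i) for i in range(len(odd))]
    approx = [even[i] + update(detail, i) for i in range(len(even))]
    return approx, detail


def inverse(approx, detail):
    """Undo forward() by replaying the same steps in reverse order."""
    even = [approx[i] - update(detail, i) for i in range(len(approx))]
    odd = [detail[i] + predict(even, i) for i in range(len(detail))]
    signal = [0] * (len(even) + len(odd))
    signal[0::2] = even
    signal[1::2] = odd
    return signal


step = [10, 10, 10, 10, 100, 100, 100, 100]
approx, detail = forward(step)
assert inverse(approx, detail) == step
```

The inverse never has to invert the predictor itself: it recomputes the same prediction from the recovered even samples and adds it back, which is why the reconstruction is exact even though the predictor is non-linear.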
One of the main difficulties is the design of an update step in the lifting scheme. Moreover, the most efficient linear scheme is a 4-step scheme, for which it is difficult to design a corresponding non-linear step, since its properties are not easily read from the individual steps of the lifting scheme. Lastly, we have transposed ideas from video compression to design morphological filters operating on video sequences, which integrate motion estimation by using structuring elements that follow the motion. The application of these ideas gives encouraging results in filtering and segmentation, in particular thanks to the strong spatio-temporal correlation introduced in the neighbourhoods: this approach leads to more stable segmentations, since it imposes a much stronger correlation between region borders than temporally iterative schemes do. We then discuss the possibility of using sub-pixel accurate motion fields.
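The idea of a structuring element that follows the motion can be sketched as follows. This is a minimal illustration under stated assumptions, not the filters developed in the thesis: the structuring element is reduced to three samples (the current pixel and its motion-compensated positions in the previous and next frames), the motion fields are integer-valued, and the names `temporal_erosion`, `flow_to_prev` and `flow_to_next` are hypothetical.

```python
def _clamp(value, low, high):
    """Keep an index inside [low, high]."""
    return max(low, min(value, high))


def temporal_erosion(prev_frame, cur_frame, next_frame,
                     flow_to_prev, flow_to_next):
    """Erode cur_frame along motion trajectories.

    Frames are 2D lists of grey levels. flow_to_prev[y][x] and
    flow_to_next[y][x] are (dy, dx) integer displacements (an assumption)
    pointing from pixel (y, x) of the current frame to its position in the
    previous and next frames.
    """
    height, width = len(cur_frame), len(cur_frame[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            values = [cur_frame[y][x]]
            for frame, flow in ((prev_frame, flow_to_prev),
                                (next_frame, flow_to_next)):
                dy, dx = flow[y][x]
                yy = _clamp(y + dy, 0, height - 1)
                xx = _clamp(x + dx, 0, width - 1)
                values.append(frame[yy][xx])
            out[y][x] = min(values)  # erosion: minimum over the element
    return out


# A bright pixel moving one column to the right per frame.
prev_f = [[0, 9, 0, 0, 0]]
cur_f = [[0, 0, 9, 0, 0]]
next_f = [[0, 0, 0, 9, 0]]
moving_prev = [[(0, -1)] * 5]
moving_next = [[(0, 1)] * 5]
static = [[(0, 0)] * 5]

followed = temporal_erosion(prev_f, cur_f, next_f, moving_prev, moving_next)
fixed = temporal_erosion(prev_f, cur_f, next_f, static, static)
```

In this toy sequence the motion-following element keeps the moving pixel, whereas the same erosion with a static temporal element removes it, because the three samples no longer belong to the same object.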

Cited literature [51 references]
Contributor : Ecole Mines Paristech
Submitted on : Thursday, January 24, 2008 - 8:00:00 AM
Last modification on : Wednesday, November 29, 2017 - 3:02:33 PM
Long-term archiving on : Wednesday, September 8, 2010 - 5:48:29 PM


  • HAL Id : pastel-00003299, version 1



Nicolas Laveau. Mouvement et vidéo : estimation, compression et filtrage morphologique. Mathematics [math]. École Nationale Supérieure des Mines de Paris, 2005. English. ⟨pastel-00003299⟩


