PhD position in Multimodal Speech Processing - Aix-Marseille University, Marseille, France
The NLP group of the Laboratoire d'Informatique Fondamentale (LIF - http://www.lif.univ-mrs.fr ) at Aix-Marseille University (AMU - http://www.univ-amu.fr ) has one open PhD position in Computer Science, co-funded by the French DGA (Direction Générale de l'Armement).
Location: Campus de Luminy
Starting date: September 1, 2014
Deadline for application: April 25, 2014
- title: Multimodal Understanding: toward joint audio/video processing for multimodal understanding process of video broadcast
- keywords: Speech and Language Processing, Image Processing, Machine Learning, Multimedia Information Retrieval
- short description:
The goal of this PhD proposal is to develop models for processing multimedia documents that go further than a simple fusion of monomodal descriptors from the speech and image modalities. Building on preliminary work on the concept of "multimodal understanding" done in our team during the Defi-Repere challenge ( http://www.defi-repere.fr ), this research project aims to use multimodal features at each level of processing of a video document. For example, during the Defi-Repere, we showed that visual features characterizing a TV set can help the speaker diarization and identification tasks. Similarly, we want to add visual features to the spoken language understanding component of a video analysis system, and demonstrate that a certain level of understanding of the current video scene can be exploited at the language processing level.
Description of the lab:
The University of Aix-Marseille (AMU) is currently one of the largest universities in France, created in 2012 from the merger of the three former Aix-Marseille universities (Université de Provence, Université de la Méditerranée, Université Paul Cézanne).
The LIF (Fundamental Computer Science Lab) is a joint research unit (JRU) between the Centre National de la Recherche Scientifique (CNRS) and AMU. The Natural Language Processing group of LIF aims at developing symbolic and statistical methods for the automatic processing of textual and speech data.
Contact: Frederic Bechet, frederic.bechet(at)lif.univ-mrs.fr