Multimedia content analysis offers many exciting research opportunities and is a necessary step towards automatic understanding of the content of digital documents. Digital documents are typically composite. Processing the media that compose a multimedia document in parallel, and integrating the low-level information computed over each of them, can yield knowledge that isolated analysis of each medium could not discover.
Joint processing of multiple media is very challenging, even at the lowest analysis levels. Coping with imperfect synchronization of pieces of information, mixing extremely different kinds of information (numerical or symbolic descriptions, values describing intervals or instants, probabilities and distances, HMMs and Gaussians, ...), and reconciling contradictory outputs are some of the obstacles that make processing multimedia documents much more difficult than it seems at first glance.
This talk will first show what may be gained from jointly analyzing multimedia documents. It will then briefly review the typical information that can be extracted from the major media (video, sound, images and text) before focusing on the problems that arise when trying to use all this information together. We hope to convince researchers to start working on these problems, since they directly hamper the acquisition of higher-level knowledge from multimedia documents.
Patrick Gros has been involved in research in the field of Computer Vision for 14 years. After finishing his studies in Engineering Science at ``École Polytechnique'' and ``École Nationale Supérieure de Techniques Avancées'' in Paris, he joined the Fundamental Computer Science and Artificial Intelligence Laboratory (LIFIA) in 1990 to pursue a Ph.D. in computer vision. Since defending his thesis in July 1993, he has held a research position at CNRS, still at LIFIA, which has since become GRAVIR.
From November 1995 to October 1996, he was a visiting research scientist at the Robotics Institute of Carnegie Mellon University in Pittsburgh, PA, USA, working on a project on automatic landmark recognition for vehicles in urban environments.
In July 1999, he moved from Grenoble to Rennes, where he joined the IRISA research unit. In 2002, he founded TexMex, a new research group devoted to multimedia document analysis and management, with a special emphasis on the problems raised by the management of very large volumes of documents.
His research interests include image indexing and recognition in large databases and the description of multimedia documents. He teaches graduate courses in computer science and computer vision. He is an associate editor of the journal "Traitement du signal". He participates in numerous national projects on multimedia description and indexing, with applications to television archiving, copyright protection for photo agencies, and personal picture management on set-top boxes. Within the 6th Framework Programme of the European Union, he is currently involved in the MUSCLE Network of Excellence and in the Enthrone and AceMedia Integrated Projects. He has published 17 papers in journals and book chapters, and 37 papers in conferences.