The group is actively involved in several national and international projects:
Emotion-driven personalized content-based multimedia management [SNF Grant; Principal Investigator: Prof. T. Pun / project page]
(Swiss NCCR / Phase I: Jan 2002 - Dec 2005 / Phase II: Jan 2006 - Dec 2009)
Interactive Multimodal Information Management (IM)2 aims at advancing research and developing prototypes in the field of advanced multimodal man-machine interaction. More specifically, (IM)2 addresses the technologies that coordinate natural input modes (such as speech, pen, touch, hand gestures, head and body movements, and eventually physiological sensors) with multimedia system output, such as speech, sound, images, 3D graphics, and animation. The field of multimodal interaction covers a wide range of critical activities and applications, including recognition and interpretation of spoken, written and gestural language, computer vision, and the indexing and management of multimedia documents. Other key sub-themes include the protection of information content, the limiting of information access, and the structuring, retrieval and presentation of multimedia information.
These multimodal interfaces represent a new, highly strategic direction in the development of advanced information technologies. Thanks to these interfaces, man-machine interactions will be more flexible and easier to use, hence more productive. Ultimately, multimodal interfaces should flexibly accommodate a wide range of users, tasks, and environments for which any single mode may not suffice. The ideal interface should above all be able to deal with comprehensive and realistic forms of data, including mixed data types (i.e., data from different input modalities such as image and audio).
Project main website: http://www.im2.ch.
PetaMedia (EU-FP7-NoE / Mar 2008 - Feb 2012)
PetaMedia is a Network of Excellence funded by the European Commission's Seventh Framework Programme (FP7). FP7 is a key instrument for responding to Europe's needs in terms of jobs and competitiveness, and one of its goals is to maintain leadership in the global knowledge economy.
Four partners form the core of PetaMedia, each representing a national network: Delft University of Technology (The Netherlands), EPFL (Switzerland), Queen Mary University of London (UK), and the Technical University of Berlin (Germany). Delft University of Technology coordinates PetaMedia.
The goal of the PetaMedia NoE is to bring together the research of four national networks in the Netherlands, Switzerland, the UK and Germany in the areas of multimedia content analysis (MCA) and social peer-to-peer (SP2P) networks, and eventually to establish a European virtual centre of excellence.
The collective research integrating the four national networks will be directed towards the synergistic combination of user-based collaborative tagging, peer-to-peer networks and multimedia content analysis, and towards identifying and exploring the potential and limitations of combined tagging, MCA and SP2P concepts. Solutions and collaborative research field trials will be built on the coordinating partner's open-source P2P software Tribler.
Project main website: http://www.petamedia.eu
(SNF Grant number 200021-119906 / Apr 2009 - Mar 2011)
Information search systems come as a solution to mine and retrieve information from large data collections. For exploration, information networks are generally organized manually by moderated user communities (e.g. Wikipedia or YouTube). Fully automated solutions generally come from the domain of Adaptive Hypermedia, which creates information networks from the information content. In this project, we wish to design a system that, starting from a rather classical and pragmatic content-based analysis of the media collection, is able to set up an initially useful browsing environment. At this stage, experience from the design of multimedia content-based search systems is exploited for the representation and organization of the data along this network. Since data is often partly annotated, an initial propagation of the semantic knowledge along this network will enhance its quality. Users are then invited to actually use the browsing system. A short-term adaptation of the navigation environment, according to estimates of the user's mental state, will improve browsing efficiency and increase user satisfaction. From a long-term perspective, collaborative user interaction is collected and used as an extra source of semantic information to perform semantic information propagation and to incrementally enhance the initial, bootstrapped content-based navigation system (project page).
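The semantic-propagation step described above can be illustrated with a minimal label-propagation sketch (function names and parameters are hypothetical; the project's actual propagation scheme is not specified here): partial annotations spread along an item-similarity graph so that unannotated items inherit labels from their most similar annotated neighbours.

```python
import numpy as np

def propagate_labels(W, Y, alpha=0.85, iters=50):
    """Toy semantic propagation along an item-similarity graph.

    W     : (n, n) symmetric similarity matrix between collection items
    Y     : (n, c) initial label matrix; all-zero rows mark unannotated items
    alpha : weight of neighbour information versus the initial annotations
    """
    # Symmetrically normalise the similarities: S = D^{-1/2} W D^{-1/2}
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    S = D_inv_sqrt @ W @ D_inv_sqrt
    F = Y.astype(float).copy()
    for _ in range(iters):
        # Each item blends its neighbours' current labels with its own seed
        F = alpha * (S @ F) + (1 - alpha) * Y
    return F.argmax(axis=1)  # predicted label index per item
```

On a tiny collection of four items forming two similarity clusters, labelling one item per cluster is enough for the other two to inherit the correct label.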
(SNF Grant number 200021-122036 / Nov 2008 - Aug 2010)
This project aims at supporting, starting from an initial design of a feature selection methodology, a framework for the efficient calculation of N-way feature information interaction and its systematic exploitation in the context of multimodal information processing tasks such as classification or information retrieval. We will do so by reimplementing the calculation of the feature information interaction within the theoretical framework of combinatorics, which underlies its main formula. For a further speed-up, an intelligent search and subsampling algorithm seems promising. To properly exploit feature information interactions, the simple feature selection we have applied so far turns out not to be sufficient. We plan to develop an information fusion approach based on feature selection and construction such that the underlying relevant attribute relationships can be automatically learnt from training examples and exploited. We expect that improved information fusion will significantly enhance the performance of multimedia data classification and retrieval (project page).
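The central quantity, N-way feature information interaction, can be illustrated for the three-variable case. The sketch below uses the entropy-sum form under the McGill sign convention, I(X;Y;Z) = H(X)+H(Y)+H(Z) - H(X,Y) - H(X,Z) - H(Y,Z) + H(X,Y,Z); note that the opposite sign convention also appears in the literature. This is a plain empirical estimator for illustration, not the project's combinatorial reimplementation.

```python
import numpy as np
from collections import Counter

def entropy(samples):
    """Empirical Shannon entropy (in bits) of a sequence of hashable symbols."""
    counts = Counter(samples)
    n = len(samples)
    probs = np.array([c / n for c in counts.values()])
    return float(-(probs * np.log2(probs)).sum())

def interaction_information(x, y, z):
    """3-way interaction information, McGill convention:
    I(X;Y;Z) = H(X)+H(Y)+H(Z) - H(X,Y) - H(X,Z) - H(Y,Z) + H(X,Y,Z).
    Under this sign convention, positive values indicate redundancy among
    the three features and negative values indicate synergy (e.g. XOR).
    """
    hx, hy, hz = entropy(x), entropy(y), entropy(z)
    hxy = entropy(list(zip(x, y)))
    hxz = entropy(list(zip(x, z)))
    hyz = entropy(list(zip(y, z)))
    hxyz = entropy(list(zip(x, y, z)))
    return hx + hy + hz - hxy - hxz - hyz + hxyz
```

The XOR relation z = x XOR y is the classic synergy example: no pair of variables carries information about the third, yet all three together are fully determined, giving an interaction of -1 bit under this convention.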
(SNF grant number 200021-109377 / Oct 2006 - Sep 2010)
We look at multimedia information management in a 'queryless' context. We assume the presence of a (large) number of multimedia items and wish to construct a framework that effectively provides a comprehensive view of the content of such collections. The baseline is a simple random draw that shows items in arbitrary order. At the center of the project, we focus on estimating the properties and structures of the population of the feature space that represents the multimedia collection. The background we wish to exploit is that of discrete optimization, justified by our view of the collection as discrete points within a high-dimensional space whose topology is unknown and requires modeling. This departs from the traditional stochastic view, where items in the collection are seen as samples of a given distribution (e.g., on features). We show that from solutions to this problem we may derive efficient and adaptive solutions to several problems such as collection sampling, clustering and visualization (project page).
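As one concrete instance of the discrete-optimization view of collection sampling, a greedy farthest-first traversal picks k representative items directly from the discrete point set, so that every item ends up close to some chosen representative. This is a toy stand-in under assumed Euclidean features, not the project's actual formulation.

```python
import numpy as np

def farthest_first_sample(X, k, seed=0):
    """Greedy farthest-first traversal over a discrete collection.

    X : (n, d) feature matrix, one row per multimedia item
    k : number of representative items to select
    Returns the indices of the k chosen items.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    chosen = [int(rng.integers(n))]  # arbitrary starting item
    # Distance of every item to its nearest chosen representative so far
    d = np.linalg.norm(X - X[chosen[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(d.argmax())  # the item currently worst represented
        chosen.append(nxt)
        d = np.minimum(d, np.linalg.norm(X - X[nxt], axis=1))
    return chosen
```

With two well-separated clusters and k = 2, the traversal necessarily picks one representative from each cluster, whichever item it starts from.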
(SNF grant number 200021-113274 / Mar 2007 - Feb 2009)
A central issue in the multimedia and human-computer interaction domains is to create systems that react in a nearly “human” manner. Imagine yourself coming home. You missed a movie on TV. You have never seen it and would like a summary of that movie, more precisely a summary that you are going to like. Alternatively, you would like to retrieve movie segments that give you a particular feeling, such as joy. The project aims at developing a theoretical framework for accessing multimedia data, in particular videos, that takes into account user emotional preferences and behavior. This framework will be validated by a concrete platform for emotion-based access and summarization of videos.
Individual emotional reactions will first be learnt from a selected set of movies presented to a user, by means of several non-invasive physiological sensors. These reactions will constitute individual user profiles describing emotional responses, positive or negative, towards various elements that appear in a movie. A mapping between the video characteristics and the user responses will thus be established; this mapping will constitute the individual user emotional profile. Knowing this user emotional profile, that is, what s/he prefers, dislikes, strongly reacts to, etc., it will be possible to develop a truly personalized platform for interacting with videos. Two application examples are planned. The first is a personalized retrieval tool that will allow searching the movie database on the basis of emotional features and criteria. The second is an emotion-based video summarization tool, which would allow a user to ask for a summary emphasizing some particular emotions, for instance joy. To ensure privacy, the emotional profiles will only be known by the user, who will always retain ultimate control over the system.
This project lies at the frontier between human-computer interaction (HCI), affective computing and emotion research, and content-based processing of multimedia data. It aims at obtaining novel and fundamental results: in HCI, regarding user modeling and emotion-based interaction customization; in emotion research, regarding the determination of higher-level emotions and the coupling between temporal emotion patterns and dynamic movie characteristics; in content-based processing of multimedia data, by significantly enhancing existing techniques through the use of emotional information. The targeted applications will exploit emotion recognition in an innovative way. The work will also permit various further developments in the domain of interaction with multimedia data (project page).
MultiMatch (EU-FP6-STREP 033104 / May 2006 - Oct 2008)
The MultiMatch search engine will be able to:
The project’s R&D work is organized around three activities:
(SNF grant numbers 2100-066648 and 200020-105282 / Oct 2002 - Oct 2006)
In the first phase of this project, we proposed DEVA as an annotation model, designed as an extension of the Dublin Core in the context of RDF. During this phase, we soon identified the need for an intelligent environment capable of handling knowledge arising from differing sources and perspectives. In this respect, we aligned our developments with those of the Semantic Web community and proposed the Semantic Web Knowledge Base (SWKB), a reasoning engine capable of retracting previous conclusions as new contradicting facts are entered. SWKB has been integrated into a practical framework, and early results show the appropriateness of the approach. The objective of this second phase is to continue reinforcing the smooth acquisition and inference of knowledge from users. The main focus is to enable the semi-automatic annotation of multimedia collections for accurate high-level semantic retrieval. We address the following aspects in this phase of the project:
* We wish to enhance interactivity during the annotation process by removing situations where the user may be overwhelmed by a surge of information. This is done at the collection level, where sampling-based and group-based annotation techniques are defined. In this part, we will construct a close relationship with our efforts on multimedia collection visualization and navigation. The general line that we follow is to define an incremental annotation process whereby a collection item incrementally receives semantic information from disparate sources. Such a process is facilitated and reinforced by a knowledge propagation strategy, based on inter-document relationships created either by content-based techniques or by techniques inspired by collaborative filtering.
* We also want to enhance interactivity for the manipulation of ontologies, whose size may easily become unmanageable. By exploiting our reasoning engine, we propose to highlight relevant parts of the ontology online and to filter out irrelevant ones. Also, by exploiting co-occurrences between terms, or between terms and low-level features (in relation to our auto-annotation effort), we wish to construct a recommendation system for multimedia description.
* We also propose to enable a collaborative annotation context in which the annotation task is distributed amongst human and software operators, by allowing multiple descriptions to coexist and complete each other. Conflict resolution should be addressed either by making hard decisions or by allowing different points of view to coexist. We see collaborative annotation as beneficial for improving reliability and objectivity. It also forms a richer base for knowledge expansion and propagation. By exploiting strategies inspired by collaborative filtering, one may define user profiles (or rather determine users' domains of expertise) to attach reliability measures to the respective annotations. We consider this development important in a context where classical multimedia document collections are of unmanageable size.
(EU IST Programme No. 2001-34485 / Mar 2002 - Feb 2005)
(SNF grant numbers 2100-045581, 2000-059152 and 2000-052426 / Apr 1996 - Mar 2002)