The digital era, the Internet, the World Wide Web and powerful new devices for storing, producing and capturing digital information have resulted in massive digital collections of data, such as document archives and image and video libraries. This, in turn, has created a need for methods and procedures to browse and search these digital libraries effectively. Methodological advances have taken us to a point where we can index millions of textual documents in the blink of an eye, yet we still face a formidable task in extracting semantically meaningful content-based cues from an image or a video clip without human intervention. Despite extensive research investment in content-based multimedia retrieval from digital libraries during the last ten years, current state-of-the-art solutions leave much to be desired in terms of usability and utilisation of content-based cues.

Computers are very good at automatically computing low-level features such as colour histograms, but very poor at extracting high-level semantic features such as objects and their meanings, actions and feelings, which would most likely be much more useful than a colour histogram in content-based retrieval. This gap between low-level features and high-level semantic descriptions is often referred to as the semantic gap. Even though low-level features are practical from a computational point of view, relying on them places the responsibility for bridging the semantic gap on the user interface of the retrieval system. Establishing this bridge has turned out to be very difficult, so any step towards reducing the semantic gap represents a significant advance over the current state of the art in content-based retrieval.
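To make the notion of a low-level feature concrete, the following minimal sketch computes a quantised colour histogram and compares two images by histogram intersection. It is an illustration only, not the project's method: the function names, the bin count and the two tiny four-pixel "images" are assumptions made for this example.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Quantise 8-bit RGB pixels into a normalised joint colour histogram."""
    if not pixels:
        return {}
    step = 256 // bins_per_channel  # width of one bin per channel
    counts = Counter()
    for r, g, b in pixels:
        counts[(r // step, g // step, b // step)] += 1
    total = len(pixels)
    return {bin_: n / total for bin_, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum of bin-wise minima of two normalised histograms."""
    return sum(min(value, h2.get(bin_, 0.0)) for bin_, value in h1.items())


# Hypothetical data: the same colours in a different spatial arrangement.
image_a = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 0, 255)]
image_b = [(0, 0, 255), (255, 0, 0), (0, 0, 255), (255, 0, 0)]

similarity = histogram_intersection(
    color_histogram(image_a), color_histogram(image_b)
)
print(similarity)
```

The two pixel lists yield identical histograms even though their layouts differ, which hints at why such a feature is cheap to compute but says nothing about the objects or actions depicted.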

The objective of this research project is to reduce the semantic gap through multidisciplinary research, based on close international and domestic collaboration between researchers with backgrounds in multimedia signal processing, mathematics, information studies and linguistics, and by integrating information from several media types into an efficient multimedia analysis. First, high-level semantic concepts relevant to the corpus and the retrieval task are identified. Second, efficient computational models for representing these concepts are developed. Third, the developed methodology is subjected to rigorous empirical performance evaluation in three complementary ways: 1) performance characterisation of individual algorithms in the form of ‘simple’ experiments (e.g. Matlab simulations), which may have no direct relationship with the intended practical application; 2) participation in the annual TREC Video Retrieval Track, which is by far the most sophisticated existing benchmark in video retrieval, bringing together the world’s leading research groups to jointly tackle this demanding challenge; 3) prototyping of retrieval applications, which are subjected to usability studies that provide feedback to semantic modelling and methodological development.

Selected publications

Rautiainen M & Seppänen T (2005)
Comparison of visual features and fusion techniques in automatic detection of concepts from news video.
Proc. 2005 IEEE International Conference on Multimedia & Expo, Amsterdam, The Netherlands.

Rautiainen M, Ojala T & Seppänen T (2005)
Content-based browsing in large news video databases.
Proc. 5th IASTED International Conference on Visualization, Imaging and Image Processing, Benidorm, Spain.

Väyrynen P & Seppänen T (2003)
Using WordNet in information retrieval.
Informaatiotutkimus 2:52-55. (in Finnish)

Rautiainen M, Penttilä J, Vorobiev D, Noponen K, Väyrynen P, Hosio M, Matinmikko E, Mäkelä SM, Peltola J, Ojala T & Seppänen T (2002)
TREC 2002 Video Track experiments at MediaTeam Oulu and VTT.
Proc. Text Retrieval Conference TREC 2002 Video Track, Gaithersburg, MD, 417-428.

Väyrynen P, Seppänen T, Noponen K & Juuso I (2002)
On the usefulness of linguistic knowledge in different areas of application in language technology.
Informaatiotutkimus 3/2002:59-66. (in Finnish)

