Clustering and Indexing of Multimedia Objects in the MARS System

Friday, February 2, 2001 - 11:00am - 12:45pm
Lind 400
Sharad Mehrotra (University of California)
The goal of the MARS project is the design and development of next-generation information systems that provide seamless access to multimedia information based on its rich internal content. Because retrieving multimedia information solely through textual annotations has many fundamental limitations, we have adopted a vision-centric approach in which objects are represented and retrieved based on low-level visual features (e.g., color, texture, and layout). These visual properties can be extracted automatically from images and video, making the approach scalable to large as well as heterogeneous multimedia collections.

Supporting content-based queries over visual feature representations poses significant challenges to existing practice in database management systems (DBMSs) and information retrieval (IR). Existing IR techniques, which deal primarily with textual information, need to be generalized to support content-based retrieval over multimedia. Furthermore, since visual feature representations define complex non-Euclidean vector spaces, techniques need to be developed to support such complex multidimensional information in DBMSs. Another challenge is integrating multimedia IR techniques with DBMSs. The difficulty arises because existing DBMSs have no native support for storing and processing imprecise information, whereas content-based retrieval is inherently imprecise.

In this talk, I will provide an overview of the progress we have made in addressing some of the above challenges in supporting multimedia information in DBMSs. The focus of the talk will be on the problem of indexing and efficient retrieval of multimedia objects (viz., the dimensionality curse, support for arbitrary distance metrics, and support for novel types of queries, including refined queries, in databases) and on the solutions developed in the context of MARS.