Talk Abstract

Level Search Filtering for IR Model Reduction


University of Tennessee

Information retrieval models such as Latent Semantic Indexing (LSI) have been demonstrated to outperform lexical matching in the presence of synonymy and polysemy. Unfortunately, such rank-reduced models require the decomposition (e.g., singular value decomposition) of large term-by-document matrices, which can prohibit scalable information retrieval. This work demonstrates how information filtering using level search techniques can reduce the computational cost of LSI. For each query, level search extracts a much smaller subset of the original term-by-document matrix, containing on average 25% of the original non-zero entries. When LSI is applied to such subsets, average precision degrades by only 5% due to level search filtering. Moreover, for some document collections an increase in precision has also been observed. Level search can be further enhanced by a "pruning" scheme which deletes terms connected to only one document from the query-specific submatrix. Such pruning has achieved a 65% reduction (on average) in the number of non-zeros, with a precision loss of no more than 5% for most collections.
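The abstract does not give the algorithm in detail, but level search can be read as a breadth-first traversal of the bipartite term-document graph: starting from the query terms, alternately expand to connected documents and connected terms for a fixed number of levels, then restrict the term-by-document matrix to the visited rows and columns. The sketch below illustrates that reading, together with the one-document pruning scheme; the function names, the dict-of-dicts sparse representation, and the `levels` parameter are illustrative assumptions, not the authors' implementation.

```python
def level_search(term_doc, query_terms, levels=2):
    """Extract a query-specific submatrix via level search (BFS on the
    bipartite term-document graph). `term_doc` maps term -> {doc: weight};
    the representation and default of two levels are assumptions."""
    # Build the reverse index doc -> terms once.
    doc_terms = {}
    for t, docs in term_doc.items():
        for d in docs:
            doc_terms.setdefault(d, set()).add(t)

    terms = {q for q in query_terms if q in term_doc}
    docs = set()
    frontier = set(terms)
    for level in range(levels):
        if level % 2 == 0:           # even level: terms -> documents
            reached = set()
            for t in frontier:
                reached.update(term_doc[t])
            frontier = reached - docs
            docs |= reached
        else:                         # odd level: documents -> terms
            reached = set()
            for d in frontier:
                reached.update(doc_terms[d])
            frontier = reached - terms
            terms |= reached

    # Restrict the matrix to the visited terms and documents.
    return {t: {d: w for d, w in term_doc[t].items() if d in docs}
            for t in terms}

def prune(submatrix):
    """Pruning scheme from the abstract: drop terms connected to only
    one document in the query-specific submatrix."""
    return {t: docs for t, docs in submatrix.items() if len(docs) > 1}
```

For example, with terms `car`, `auto`, and `rare` spread over three documents, a two-level search from the query term `car` reaches `auto` (they co-occur in a document) but never touches the document that only `rare` appears in, and pruning then removes `auto` if it is left connected to a single document.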

This work is joint with **Xiaoyan Zhang** and **Padma Raghavan**, University of Tennessee.

Slides for the talk are available in PowerPoint and HTML formats.