Talk Abstract

A Semi-Discrete Matrix Decomposition for Latent Semantic Indexing in Information Retrieval


In Latent Semantic Indexing, a collection of documents is represented by a term-by-document matrix whose (i, j) entry measures the frequency with which term i occurs in document j. This matrix is then approximated by a sum of rank-one matrices obtained from a singular value or related decomposition.
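In symbols (notation ours, not from the abstract): if A is the m-by-n term-by-document matrix, the standard LSI approximation truncates the singular value decomposition after k terms,

```latex
A \;\approx\; A_k \;=\; \sum_{j=1}^{k} \sigma_j\, u_j v_j^T ,
```

where each rank-one term is a scalar singular value multiplying an outer product of two dense real vectors. Storing A_k therefore requires k(m + n) floating-point numbers in addition to the k scalars.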

In this work we investigate a semi-discrete approximation to the matrix, in which each rank-one term is a single scalar multiplying the outer product of two vectors whose entries are restricted to plus one, minus one, or zero. Compression rates are thus much greater than for other decompositions, and the processing of each query is much faster. We compare the methods on three databases and discuss the effects of various frequency measures.
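The abstract does not spell out how the semi-discrete terms are computed. A common construction (our assumption, not a statement of the authors' exact method) is a greedy alternating scheme: at each outer step, pick a scalar d and ternary vectors x, y minimizing the Frobenius norm of the residual R - d x y^T, then subtract that term and repeat. For fixed y the best ternary x keeps a sign-matched prefix of the largest-magnitude entries of R y, which gives a closed-form inner step. A minimal sketch:

```python
import numpy as np

def best_discrete(s):
    """Choose z in {-1, 0, +1}^len(s) maximizing (z . s)^2 / (z . z).
    The optimum keeps a prefix of the entries sorted by |s|, with
    signs matching s (degenerate all-zero s is not handled here)."""
    order = np.argsort(-np.abs(s))
    csum = np.cumsum(np.abs(s)[order])
    J = np.argmax(csum ** 2 / np.arange(1, len(s) + 1)) + 1
    z = np.zeros(len(s))
    z[order[:J]] = np.sign(s[order[:J]])
    return z

def sdd(A, k, inner_iters=20):
    """Greedy semi-discrete decomposition: A ~ sum_j d_j * x_j y_j^T,
    with the entries of every x_j and y_j in {-1, 0, +1}.
    Returns (d, X, Y) with X of shape (m, k) and Y of shape (n, k)."""
    m, n = A.shape
    R = A.astype(float).copy()
    ds, xs, ys = [], [], []
    for _ in range(k):
        # start y with a single nonzero in the column of largest residual norm
        y = np.zeros(n)
        y[np.argmax(np.sum(R ** 2, axis=0))] = 1.0
        for _ in range(inner_iters):
            x = best_discrete(R @ y)    # fix y, choose the best ternary x
            y = best_discrete(R.T @ x)  # fix x, choose the best ternary y
        d = (x @ R @ y) / (np.sum(x ** 2) * np.sum(y ** 2))
        R -= d * np.outer(x, y)         # peel off the rank-one term
        ds.append(d); xs.append(x); ys.append(y)
    return np.array(ds), np.array(xs).T, np.array(ys).T
```

Each term needs only one floating-point scalar plus two ternary vectors (about two bits per entry), which is the source of the compression advantage over the SVD's dense factor vectors; query scoring against the ternary factors reduces to signed additions rather than multiplications.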

This is joint work with Tamara Gibson Kolda.
