Talk abstract:
A Semi-Discrete Matrix Decomposition for Latent Semantic
Indexing in Information Retrieval
Dianne P. O'Leary, University of Maryland
In Latent Semantic Indexing, a set of documents is represented
by a matrix whose entries are measures of frequencies of occurrences
of terms in each document. This matrix is then approximated
by a sum of rank-one matrices determined using a singular value
or related decomposition.
In this work we investigate the use of a semi-discrete approximation
to the matrix in which each rank-one matrix can be represented
by a scalar multiplying two vectors whose entries equal plus
or minus one or zero. Compression rates are thus much greater
than for other decompositions, and the processing of each query
is much faster. We compare the methods on three databases and
discuss the effects of various frequency measures.
This is joint work with Tamara Gibson Kolda.
Back to Workshop Schedule
1996-1997
Mathematics in High Performance Computing
|