Online Text Classification with ATTICS
Monday, April 17, 2000 - 4:15pm - 4:40pm
David Lewis (AT&T Laboratories - Research)
ATTICS is a C++ platform implemented at AT&T Labs for training and use of predictive models on mixed text and nontext data. In spirit it is a hybrid between text retrieval systems such as SMART and machine learning toolkits such as MLC++. The design, data model, and emphasis on online classifier application are unusual for either type of software. The term weighting and supervised learning techniques used in information retrieval were developed in the context of ranked retrieval from relatively static text databases. I will discuss how we implemented these techniques in the online setting of ATTICS, and the research questions this exercise raises.
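To make the contrast between static and online term weighting concrete, here is a minimal sketch of TF-IDF weighting in which document frequencies are updated as each document arrives, rather than computed once over a fixed collection. It does not reflect the ATTICS design or API. The class name `OnlineTfIdf`, the method `weigh_and_update`, the whitespace tokenization, and the smoothed IDF formula log((1 + N) / df) are all assumptions chosen for illustration.

```cpp
#include <cmath>
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical illustration only; not part of ATTICS.
// Keeps running collection statistics so that each incoming document
// is weighted with the counts seen so far (including itself).
class OnlineTfIdf {
public:
    // Tokenizes `text` on whitespace, folds the document into the running
    // statistics, and returns a TF-IDF weight for each distinct term.
    std::unordered_map<std::string, double> weigh_and_update(const std::string& text) {
        // Term frequencies within this document.
        std::unordered_map<std::string, int> tf;
        std::istringstream in(text);
        std::string term;
        while (in >> term) {
            ++tf[term];
        }

        ++num_docs_;

        std::unordered_map<std::string, double> weights;
        for (const auto& entry : tf) {
            // Each distinct term raises its document frequency by one.
            const int df = ++doc_freq_[entry.first];
            // Assumed smoothed IDF: log((1 + N) / df), always positive.
            const double idf = std::log((1.0 + num_docs_) / df);
            weights[entry.first] = entry.second * idf;
        }
        return weights;
    }

    // Number of documents folded into the statistics so far.
    int num_docs() const { return num_docs_; }

private:
    int num_docs_ = 0;
    std::unordered_map<std::string, int> doc_freq_;
};
```

In this sketch the weight given to a term depends on when its document arrives: the same term in the same text receives a different weight early in the stream than it does later, which is one of the ways the online setting departs from ranked retrieval over a static database.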