Massive Data Sets, Data Mining, and Cluster Analysis

Saturday, March 7, 1998 - 3:15pm - 3:40pm
Keller 3-180
Jon Kettenring (Alcatel-Lucent Technologies Bell Laboratories)
Practitioners are facing increasingly large amounts of data to analyze. Many standard approaches fall flat because they are inappropriate or fail to scale. Computer scientists (and others) are promoting data mining as the answer. Is it? One of the basic techniques that is often listed as part of data mining is cluster analysis. Cluster analysis can help, in principle, to break large amounts of data down into manageable chunks. What are the important research issues involved in this particular setting and in the general context of massive data sets and data mining? In this brief talk, I will try to give a quick perspective on all of these topics.