Data Mining for Genomics

Tuesday, April 27, 1999 - 10:30am - 11:15am
Keller 3-180
Vipin Kumar (University of Minnesota, Twin Cities)
Joint work with Sam Han, Mahesh Joshi, and George Karypis, University of Minnesota.

This talk will provide a brief introduction to the field of data mining and its potential applications in discovering new information from genomic data. Data mining is the process of analyzing data, in a supervised or unsupervised manner, to discover useful and interesting information hidden within it. Research in genomics aims to understand biological systems by analyzing their structure as well as their functional behavior. As various genome mapping and sequencing projects reach successful completion, researchers are focusing more on functional genomics. Rapid technological developments are enabling researchers to perform quicker and more cost-effective experiments. For example, recently developed oligonucleotide chips and DNA microarrays use a controlled environment to generate gene expression data under various normal and abnormal conditions in a considerably short time. Experiments of this kind are generating mountains of data at a rapid rate. Analyzing such functional data in combination with structural information would not be possible without automated and efficient computational techniques. In this talk, we will discuss data mining techniques, such as clustering related data items or discovering temporal relationships, that could help genomic researchers gain insight into the functional behavior of genes and correlate structural information with functional information.