Talk Abstract:
Relational Knowledge Discovery: Applications to Text
David Jensen
Research Assistant Professor
Department of Computer Science
University of Massachusetts - Amherst
jensen@cs.umass.edu
Individual documents are situated within a complex web of relations.
Documents are related to other documents through citations,
hyperlinks, and other connections. Documents are also related
to non-document objects such as authors, publishers, and archives.
Finally, the content of documents can often establish relationships
between documents and the people, places, things, and other
topics they discuss. These relations are among the most accessible
and most useful sources of information about a given document.
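
As a minimal illustration of the kind of structure described above, the sketch below stores typed objects (documents, authors, places) and typed links (citations, authorship, mentions) in a small graph. Every name in it (`RelationalGraph`, `co_cited`, the link labels, the sample identifiers) is a hypothetical choice made for this example; it is not the data model of Proximity or of any other system discussed in the talk.

```python
from collections import defaultdict


class RelationalGraph:
    """Typed objects connected by typed, directed links (illustrative only)."""

    def __init__(self):
        self.object_types = {}          # object id -> type label
        self.links = defaultdict(set)   # (source id, link type) -> target ids

    def add_object(self, obj_id, obj_type):
        self.object_types[obj_id] = obj_type

    def add_link(self, source, link_type, target):
        self.links[(source, link_type)].add(target)

    def neighbors(self, source, link_type):
        # .get avoids creating empty entries in the defaultdict on lookup
        return sorted(self.links.get((source, link_type), set()))

    def co_cited(self, doc):
        """Other documents cited by at least one document that also cites `doc`."""
        citing = {
            source
            for (source, link_type), targets in self.links.items()
            if link_type == "cites" and doc in targets
        }
        found = set()
        for source in citing:
            found |= self.links.get((source, "cites"), set())
        found.discard(doc)
        return sorted(found)


# Hypothetical sample data: three documents, one author, one place.
graph = RelationalGraph()
for doc_id in ("paper_a", "paper_b", "paper_c"):
    graph.add_object(doc_id, "document")
graph.add_object("author_x", "author")
graph.add_object("amherst", "place")

graph.add_link("paper_a", "cites", "paper_b")
graph.add_link("paper_a", "cites", "paper_c")
graph.add_link("paper_a", "written_by", "author_x")
graph.add_link("paper_b", "mentions", "amherst")
```

A query such as `graph.co_cited("paper_b")` depends only on link structure, not on document content, which is exactly the kind of information a flat, one-row-per-document representation cannot hold.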
Unfortunately, nearly all current techniques in knowledge discovery
and data mining use extremely limited data representations
and therefore cannot express or analyze rich relational structures.
New techniques are needed to address the growing interest in
mining relational information in text, databases, XML, and other
structured and semi-structured formats.
In this talk, I will discuss the special challenges of representing
and analyzing relational data, with particular attention to the
problem of analyzing data derived from text. I will describe
several systems, focusing on Proximity, a system under development
in my research group at the University of Massachusetts.
Back to IMA "HOT TOPICS" Workshop: Text Mining