Campuses:

Detecting Overlapping Communities Using the Ground-truth Communities

Thursday, March 1, 2012 - 9:00am - 9:45am
Keller 3-180
Jure Leskovec (Stanford University)
Networks are a powerful way to describe and represent social, technological and biological systems. Such systems can all be studied as graphs, where nodes represent entities (e.g., people, web sites) and edges represent interactions (friendships, communication, links). Nodes in networks organize into groups that are commonly referred to as communities, clusters or modules. This is arguably the most important and useful resolution for studying networks as there are many reasons why networks organize into communities. For example, society is organized into groups, families, friendship circles, villages and associations. On the World Wide Web, topically related pages may link more densely among themselves. Therefore, the task of network community detection is to discover such organizational units given an unlabeled network.

Understanding of how network communities are organized in networks has evolved over time. The traditional view of communities was that edges appear with high concentration within the community, and low concentration between these communities. Recently models of overlapping communities, i.e., nodes
can belong to more than one community, have been proposed. The current view of how communities overlap assumes that community overlaps are less densely connected than the groups themselves, i.e., the the red nodes in the overlap are less likely to be connected than pairs of nodes that belong to the same single
community.

Our work directly challenges these assumptions. We study a set of diverse networks where nodes explicitly state their community memberships. We discover that community overlaps are in fact denser than the non-overlapping parts, which is in sharp contrast with the present models of community overlaps. Based
on these observations we develop the model which reliably captures the structure and overlaps of communities in networks. We then develop a model based community detection method and evaluate it on six large networks. Our experiments demonstrate that the proposed community detection method outperforms the
present state of the art methods.
MSC Code: 
91Dxx