The Block Two-Level Erdös-Rényi (BTER) Graph Model
Wednesday, October 26, 2011 - 3:00pm - 4:00pm
Large-scale graph analysis is becoming increasingly prevalent in our quest to understand the massive amounts of data arising from computer communications, social networks, email messages, purchase records, financial transactions, etc. In general, however, large-scale data sets are not available for study because they may contain private data or may relinquish some competitive advantage. Consequently, most researchers must rely on graph models as a proxy for real data. Moreover, even when real data is available, graph models may be utilized to test methodologies at varying scales or under a range of scenarios. Modeling real-world large-scale graphs, at scales upwards of a trillion nodes as proposed by Graph500, is the ultimate objective. In order to scale to large sizes, we require a model whose edges can be generated independently and thus in parallel. On the other hand, large-scale graphs contain many small-scale and even minuscule-scale interactions, generically referred to as community structure, that should be accurately represented in any graph model. Existing graph models either fail to generate edges independently or fail to capture community structure. We present analysis of the Stochastic Kronecker Graph (SKG) model, which is the current generator for the Graph500 benchmark, including a comparison to the Chung-Lu model (also known the edge configuration model). We propose a new approach, the Block Two-Level Erdös-Rényi (BTER) model which naturally contains community structure, even at the minuscule scale, but is able to generate edges independently and thus can scale to trillions of nodes.