IMA Tutorial (part II): Measurement and modeling of the web and related data sets



Focus Areas

One view of the Internet: Inter-Domain Connectivity

Another view of the web: the hyperlink graph

Getting started – structure at the hyperlink level



Breadth-first search from random starts

A picture of ~200M pages

Some distance measurements

Facts (about the crawl)

Analysis of power law

Component sizes

Other observed power laws in the web

More Characterization: Self-Similarity

Ways to Slice the Web

Self-Similarity on the Web

In particular…

Is this surprising?

A structural explanation

The Navigational Backbone

Information Extraction from Large Graphs


Many approaches to this problem

General approach

Web Communities

Communities and cores

Other footprint structures

Subgraph enumeration

Enumerating cores

Results for cores

The cores are interesting

Elementary Schools in Japan


A word on evolution

More bursts

Integrating bursts and graph analysis