Classical information retrieval was built on the assumptions that users are patient, that the corpus they search is clean, trustworthy, and manageable, and that no adversaries are trying to get in the way. We can't rely on any of these assumptions in the era of the web. In this talk, we'll see how leveraging mathematics and large-scale systems leads us to good web information retrieval. Finally, we'll describe an interesting open problem with applications to computing PageRank.