Network topology as a source of biological information

Tuesday, February 28, 2012 - 11:15am - 12:00pm
Keller 3-180
Natasha Przulj (Imperial College London)
Sequence-based computational approaches have revolutionized biological understanding. However, they can fail to explain some biological phenomena. Since proteins aggregate to perform a function instead of acting in isolation, the connectivity of a protein interaction network (PIN) will provide additional insight into the inner working on the cell, over and above sequences of individual proteins. We argue that sequence and network topology give insights into complementary slices of biological information, which sometimes corroborate each other, but sometimes do not. Hence, the advancement depends on the development of sophisticated graph-theoretic methods for extracting biological knowledge purely from network topology before being integrated with other types of biological data (e.g., sequence). However, dealing with large networks is non-trivial, since many graph-theoretic problems are computationally intractable, so heuristic algorithms are sought.

Analogous to sequence alignments, alignments of biological networks will likely impact biomedical understanding. We introduce a family of topology-based network alignment (NA) algorithms, that we call GRAAL algorithms, which produces by far the most complete alignments of biological networks to date: our alignment of yeast and human PINs demonstrates that even distant species share a surprising amount of PIN topology. We show that both species phylogeny and protein function can be extracted from our topological NA. Furtermore, we demonstrate that the NA quality improves with integration of additional data sources (including sequence) into the alignment algorithm: surprisingly, 77.7% of proteins in the baker’s yeast PIN participate in a connected subnetwork that is fully contained in the human PIN suggesting broad similarities in internal cellular wiring across all life on Earth. Also, we demonstrate that topology around cancer and non-cancer genes is different and when integrated with functional genomics data, it successfully predicts new cancer genes in melanogenesis-related pathways.
MSC Code: