Three principles of data science: predictability, stability, and computability

Wednesday, September 14, 2016 - 3:10pm - 4:00pm
Keller 3-180
Bin Yu (University of California, Berkeley)
In this talk, I'd like to discuss the intertwined importance of, and connections among, the three principles of data science named in the title, in the context of data-driven decisions. The ultimate importance of prediction lies in the fact that the future holds the unique and possibly the only purpose of all human activities, in business, education, research, and government alike.
Making prediction its central task and embracing computation as its core, machine learning has enabled wide-ranging data-driven successes. Prediction is a useful way to check against reality. Good prediction implicitly assumes stability between past and future. Stability (relative to data and model perturbations) is also a minimum requirement for interpretability and reproducibility of data-driven results. It is closely related to uncertainty assessment. Obviously, neither the prediction principle nor the stability principle can be employed without feasible computational algorithms, hence the importance of computability. The three principles will be demonstrated through analytical connections and in the context of neuroscience and genomics projects, for which data wisdom is also indispensable.