Reducing Exploration in Personalized Decision-Making

Tuesday, December 4, 2018 - 10:30am - 11:30am
Lind 305
Mohsen Bayati (Stanford University)
A central problem in personalized decision-making is to learn decision outcomes as functions of individual-specific covariates (contexts). Current literature on this topic focuses on algorithms that balance an exploration-exploitation tradeoff to ensure a sufficient rate of learning while optimizing for some objective. However, exploration may be undesirable for highly sensitive individuals (e.g., patients in clinical treatment planning or high-value customers of a platform). In this talk, we first introduce an algorithm that leverages free exploration from the covariates and achieves rate-optimal performance on the objective. Moreover, we show empirically that our algorithm significantly reduces exploration compared to existing benchmarks. Next, we focus on settings where past data on decision outcomes is available. Motivated by the literature on low-rank matrix estimation, we design algorithms that learn decision outcomes by targeting the learning toward similarities shared among patients.