Batched Bandit Problems

Monday, May 18, 2015 - 10:20am - 11:10am
Keller 3-180
Philippe Rigollet (Massachusetts Institute of Technology)
Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic multi-armed bandits under the constraint that the employed policy must split trials into a small number of batches. Our results show that a very small number of batches already yields regret bounds close to minimax optimal. We also evaluate the number of trials in each batch.