multi-armed bandits

Thursday, December 6, 2018 - 3:00pm - 4:00pm
Shipra Agrawal (Columbia University)
Modern online marketplaces feed themselves. They rely on historical data to optimize content and user interactions, and the data generated by these interactions is in turn fed back into the system and used to optimize future interactions. As this cycle continues, good performance requires algorithms capable of learning actively through sequential interactions, systematically experimenting to improve future performance, and balancing this experimentation with the desire to make the decisions with the most immediate benefit.
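The balance between experimenting and taking the immediately best decision can be illustrated with a small simulation. The following is a minimal sketch, assuming a Bernoulli bandit and Beta-Bernoulli Thompson sampling; the abstract does not name a specific algorithm, and the function name `thompson_sampling`, the arm success rates, and the uniform Beta(1, 1) prior are all hypothetical choices made for illustration.

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Run Beta-Bernoulli Thompson sampling; return the pull count per arm.

    true_rates: hypothetical success probability of each arm, hidden from
    the learner and used only to simulate rewards.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(rounds):
        # Draw one plausible success rate per arm from its Beta posterior
        # (assumed uniform Beta(1, 1) prior). Uncertain arms have wide
        # posteriors, so they still win the draw sometimes: exploration.
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1)
            for i in range(n_arms)
        ]
        # Play the arm whose sampled rate is highest: exploitation.
        arm = max(range(n_arms), key=lambda i: samples[i])

        # Simulate a Bernoulli reward and feed it back into the posterior.
        reward = 1 if rng.random() < true_rates[arm] else 0
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
        pulls[arm] += 1

    return pulls


if __name__ == "__main__":
    # Two hypothetical arms; the second is much better than the first.
    print(thompson_sampling([0.1, 0.9], rounds=2000))
```

As the posteriors sharpen, most pulls should concentrate on the better arm, while the weaker arm is still tried occasionally early on.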
Thursday, October 4, 2018 - 4:45pm - 5:30pm
Xi Chen (New York University)
In this talk, we study the dynamic assortment planning problem under various popular discrete choice models, including the multinomial-logit (MNL) model, the nested logit model, and a contextual MNL model. For each arriving customer, the seller offers an assortment of substitutable products, and the customer then makes a purchase decision according to a pre-specified choice model. Since the customers' utility parameters are unknown, the seller needs to simultaneously learn customers' choice behavior and make dynamic assortment decisions based on current knowledge.
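To make the MNL choice model concrete, here is a minimal sketch of its choice probabilities, assuming the standard form in which the no-purchase option has utility normalized to zero. The function names, utilities, and prices are hypothetical, and the utilities are treated as known here, whereas in the problem described above the seller has to learn them.

```python
import math


def mnl_choice_probabilities(utilities, assortment):
    """Return MNL choice probabilities for an offered assortment.

    P(i | S) = exp(v_i) / (1 + sum over j in S of exp(v_j)).
    The key None holds the no-purchase probability (assumed utility 0).
    """
    weights = {i: math.exp(utilities[i]) for i in assortment}
    denom = 1.0 + sum(weights.values())
    probs = {i: w / denom for i, w in weights.items()}
    probs[None] = 1.0 / denom
    return probs


def expected_revenue(utilities, prices, assortment):
    """Expected revenue per customer when offering the given assortment."""
    probs = mnl_choice_probabilities(utilities, assortment)
    return sum(prices[i] * probs[i] for i in assortment)


if __name__ == "__main__":
    # Hypothetical products, utilities, and prices for illustration only.
    utilities = {"a": 0.0, "b": 0.5, "c": -0.3}
    prices = {"a": 10.0, "b": 8.0, "c": 12.0}
    print(mnl_choice_probabilities(utilities, ["a", "b"]))
    print(expected_revenue(utilities, prices, ["a", "b"]))
```

Because the products are substitutes, adding an item to the assortment lowers every other item's choice probability, which is why the offered set matters and not only the individual products.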