reinforcement learning

Thursday, December 6, 2018 - 3:00pm - 4:00pm
Shipra Agrawal (Columbia University)
Modern online marketplaces feed themselves. They rely on historical data to optimize content and user interactions, and the data generated from these interactions is in turn fed back into the system and used to optimize future interactions. As this cycle continues, good performance requires algorithms capable of learning actively through sequential interactions, systematically experimenting to improve future performance, and balancing this experimentation with the desire to make decisions with the most immediate benefit.
Friday, October 5, 2018 - 9:45am - 10:30am
Sharon Di (Columbia University)
Vacant taxi drivers’ cruising in search of the next potential passenger generates additional vehicle miles traveled, adding congestion to the road network and pollution to the environment. This study aims to employ reinforcement learning to model idle e-hailing drivers’ optimal sequential decisions in passenger-seeking. While a few studies have applied a Markov decision process (MDP) to taxi drivers’ searching behavior, these studies focused primarily on modeling traditional taxi drivers’ behavior.
