Where to find the next passenger on e-hailing platform? - A reinforcement learning approach

Friday, October 5, 2018 - 9:45am - 10:30am
Keller 3-180
Sharon Di (Columbia University)
Vacant taxi drivers cruising a road network in search of their next passenger generate additional vehicle miles traveled, adding congestion to the network and pollution to the environment. This study employs reinforcement learning to model idle e-hailing drivers' optimal sequential decisions in passenger-seeking. A few studies have applied Markov decision processes (MDPs) to taxi drivers' searching behavior, but they focused primarily on traditional taxi drivers. Drivers for transportation network companies (TNCs), or e-hailing platforms (e.g., Didi, Uber), behave differently from traditional taxi drivers because they do not need to physically search for passengers. Instead, they reposition themselves so that the matching platform can assign them a passenger. Accordingly, we incorporate these new behavioral features of e-hailing drivers into our model. We then train the model on one month of trajectories from over 15,000 Didi drivers. To validate the model, we run a Monte Carlo simulation of how drivers would perform if they followed the optimal policies derived from it. Two metrics, the rate of return and the taxi utilization rate, show that our model can increase drivers' rate of return by 14% and improve the taxi utilization rate by 26%.
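As a rough illustration of the kind of repositioning MDP described above, the sketch below runs value iteration on a toy problem. It is not the model from the talk: the three zones on a line, the match probabilities, fares, movement cost, and discount factor are all hypothetical, and the episode is assumed to end as soon as the platform matches the driver with a passenger.

```python
def value_iteration(p_match, fare, move_cost=0.5, gamma=0.95, tol=1e-9):
    """Toy repositioning MDP on a line of zones (all inputs are hypothetical).

    State: the zone an idle driver is in.
    Action: stay, or move to an adjacent zone (moving costs `move_cost`).
    After the action, the driver is matched in the target zone z with
    probability p_match[z] and earns fare[z]; the episode then ends.
    Otherwise the driver stays idle in z and decides again.
    """
    n = len(p_match)
    values = [0.0] * n
    while True:
        new_values, policy = [], []
        for s in range(n):
            best_q, best_z = float("-inf"), s
            for z in (s - 1, s, s + 1):
                if not 0 <= z < n:
                    continue  # cannot move off the end of the line
                cost = move_cost if z != s else 0.0
                # Expected return: fare if matched, discounted idle value if not.
                q = -cost + p_match[z] * fare[z] + (1 - p_match[z]) * gamma * values[z]
                if q > best_q:
                    best_q, best_z = q, z
            new_values.append(best_q)
            policy.append(best_z)
        delta = max(abs(a - b) for a, b in zip(values, new_values))
        values = new_values
        if delta < tol:
            return values, policy


# Hypothetical inputs: zone 2 has far more demand than zones 0 and 1.
values, policy = value_iteration(p_match=[0.1, 0.2, 0.9], fare=[10.0, 10.0, 10.0])
print(policy)  # target zone chosen from each starting zone
```

With these made-up numbers, the resulting policy steers an idle driver toward the high-demand zone, which is the qualitative behavior a repositioning policy is meant to capture. A validation step in the spirit of the Monte Carlo simulation mentioned above would sample many episodes under this policy and compare earnings against a baseline such as always staying put.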