Gibbs Sampler

Wednesday, November 7, 2018 - 10:10am - 10:40am
Thomas Murray (University of Minnesota, Twin Cities)
This talk will describe a new approach for optimizing dynamic treatment regimes that bridges the gap between Bayesian inference and Q-learning. The proposed approach fits a series of Bayesian regression models, one for each stage, in reverse sequential order. Each model regresses the remaining payoff, computed under the assumption that optimal actions are taken at all subsequent stages, on the current history and action.
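
The reverse-order fitting described above can be illustrated with a minimal sketch. It is not the speaker's implementation: it assumes two stages, binary actions, scalar histories, simulated data, and a conjugate Gaussian linear model summarized only by its posterior mean (so posterior uncertainty is not propagated between stages). All function and variable names are hypothetical.

```python
import numpy as np

def posterior_mean(X, y, prior_precision=1.0):
    """Posterior mean of the coefficients under a N(0, I / prior_precision)
    prior and unit-variance Gaussian noise (assumed conjugate linear model)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + prior_precision * np.eye(p), X.T @ y)

def features(h, a):
    """Design matrix: intercept, history, action, history-action interaction."""
    return np.column_stack([np.ones_like(h), h, a, h * a])

def predicted_payoffs(beta, h):
    """Predicted payoff at history h for each action in {0, 1}; shape (2, n)."""
    return np.stack([features(h, np.full_like(h, a)) @ beta for a in (0.0, 1.0)])

def fit_two_stage(h1, a1, h2, a2, y):
    # Stage 2 is fit first: regress the final payoff on stage-2 history and action.
    beta2 = posterior_mean(features(h2, a2), y)
    # Remaining payoff assuming the optimal stage-2 action is taken.
    v2 = predicted_payoffs(beta2, h2).max(axis=0)
    # Stage 1: regress that optimal remaining payoff on stage-1 history and action.
    beta1 = posterior_mean(features(h1, a1), v2)
    return beta1, beta2

def optimal_action(beta, h):
    """Action in {0, 1} with the larger predicted payoff at history h."""
    h = np.atleast_1d(np.asarray(h, dtype=float))
    return predicted_payoffs(beta, h).argmax(axis=0)

# Simulated data (an assumption for illustration): the stage-2 action
# helps only when the stage-2 history is positive.
rng = np.random.default_rng(0)
n = 2000
h1 = rng.normal(size=n)
a1 = rng.integers(0, 2, size=n).astype(float)
h2 = h1 + 0.5 * a1 + rng.normal(scale=0.5, size=n)
a2 = rng.integers(0, 2, size=n).astype(float)
y = a2 * h2 + rng.normal(scale=0.5, size=n)

beta1, beta2 = fit_two_stage(h1, a1, h2, a2, y)
```

Calling `optimal_action(beta2, h)` then returns the estimated optimal stage-2 action for a given history, and `optimal_action(beta1, h)` does the same for stage 1. A fuller Bayesian treatment would carry posterior draws, rather than a point estimate, backward through the stages.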
