Thursday, February 6, 2014 - 15:05
1 hour (actually 50 minutes)
In this talk, I will discuss recent research developments in the information relaxation approach to duality in Markov decision processes and controlled Markov diffusions. The main idea of information relaxation is to relax the constraint that decisions be based only on the information currently available, and to impose a penalty that punishes access to future information. Weak duality, strong duality, and complementary slackness results are then established, and the structure of the optimal penalties is revealed. The dual formulation is essentially a sample-path-wise optimization problem, which is amenable to Monte Carlo simulation. The duality gap associated with a suboptimal policy or solution also gives a practical indication of its quality.
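As a rough illustration of the sample-path-wise dual bound and the duality gap, the following minimal sketch uses a toy optimal stopping problem that is not taken from the talk: a symmetric random walk, where the reward is the value of the walk at the stopping time. All names (`simulate_path`, `threshold_policy_reward`, `perfect_information_reward`, `duality_bounds`) and parameter values are hypothetical. The sketch uses the zero penalty, which is a valid but generally loose choice; the optimal penalties mentioned in the abstract would tighten the upper bound.

```python
import random


def simulate_path(horizon, rng):
    """One path of a symmetric +/-1 random walk started at 0 (toy model, an assumption)."""
    x, path = 0, []
    for _ in range(horizon):
        x += rng.choice((-1, 1))
        path.append(x)
    return path


def threshold_policy_reward(path, threshold):
    """Non-anticipative rule: stop the first time the walk reaches `threshold`,
    otherwise stop at the final step. Uses only information seen so far."""
    for x in path:
        if x >= threshold:
            return x
    return path[-1]


def perfect_information_reward(path):
    """Relaxed problem with zero penalty: the whole path is revealed in advance,
    so the best stopping time is chosen path by path."""
    return max(path)


def duality_bounds(n_paths=10000, horizon=20, threshold=2, seed=0):
    """Monte Carlo estimates of a lower bound (feasible policy) and an
    upper bound (information relaxation) computed on the same sample paths."""
    rng = random.Random(seed)
    lower = upper = 0.0
    for _ in range(n_paths):
        path = simulate_path(horizon, rng)
        lower += threshold_policy_reward(path, threshold)
        upper += perfect_information_reward(path)
    return lower / n_paths, upper / n_paths


lower, upper = duality_bounds()
print("lower bound (threshold policy):", lower)
print("upper bound (zero-penalty relaxation):", upper)
print("estimated duality gap:", upper - lower)
```

Because both estimates are computed on the same paths, the lower bound never exceeds the upper bound in this sketch. A small gap would suggest that the threshold policy is close to optimal; a large gap may reflect either a poor policy or a loose penalty.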