The Smoothed Linear Program for Approximate Dynamic Programming

April 17, 2009 @ 11:00 am

Vivek F. Farias (MIT Sloan)

Event Navigation

We present a novel linear program for the approximation of the dynamic programming value function in high-dimensional stochastic control problems. LP approaches to approximate DP naturally restrict attention to approximations that, depending on the context, are upper or lower bounds to the optimal value function. Our program — the `smoothed LP’ — relaxes this restriction in an appropriate fashion while remaining computationally tractable. Doing so appears to have several advantages: o We demonstrate superior bounds on the quality of approximation to the optimal value function afforded by our approach. o Experiments with our approach on a challenging problem (the game of Tetris) show that the approach outperforms a variety of approximate DP algorithms (including the LP approach, TD-learning and policy gradient methods) by a significant margin. Joint work with Vijay Desai and Ciamac Moallemi (Columbia).

MIT Statistics + Data Science Center
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307
617-253-1764

Accessibility

Events

The Smoothed Linear Program for Approximate Dynamic Programming

April 17, 2009 @ 11:00 am

Event Navigation