Stochastics and Statistics Seminar

The Smoothed Linear Program for Approximate Dynamic Programming

April 17, 2009 @ 11:00 am

Vivek F. Farias (MIT Sloan)

We present a novel linear program for the approximation of the dynamic programming value function in high-dimensional stochastic control problems. LP approaches to approximate DP naturally restrict attention to approximations that, depending on the context, are upper or lower bounds to the optimal value function. Our program, the "smoothed LP", relaxes this restriction in an appropriate fashion while remaining computationally tractable. Doing so appears to have several advantages:

  • We demonstrate superior bounds on the quality of approximation to the optimal value function afforded by our approach.
  • Experiments with our approach on a challenging problem (the game of Tetris) show that the approach outperforms a variety of approximate DP algorithms (including the LP approach, TD-learning and policy gradient methods) by a significant margin.

Joint work with Vijay Desai and Ciamac Moallemi (Columbia).
