
Multigrid methods for zero-sum two player stochastic games

Abstract: In this thesis, we present algorithms and numerical results for the solution of large-scale zero-sum two-player repeated stochastic games. In particular, we consider the class of games with perfect information and infinite horizon, within which we treat both games with discounted payoff and games with mean payoff. Our algorithms are mainly based on policy-iteration-type algorithms and multigrid methods, and are implemented in C. They are applied either to the dynamic programming equation of a true finite-state-space zero-sum two-player game or to the discretization of an Isaacs PDE of a zero-sum stochastic differential game.

In a first part, we propose an algorithm which combines policy iteration for discounted games with algebraic multigrid methods to solve the linear systems arising in the policy iterations. We present numerical tests on discretizations of Isaacs equations and variational inequalities. We also present a full multilevel policy iteration, similar to FMG, which substantially reduces the computation time for solving some variational inequalities.

For games with mean payoff, we develop a policy iteration algorithm to solve zero-sum stochastic games with finite state and action spaces and perfect information in the general multichain case (i.e., without any irreducibility assumption on the Markov chains determined by the strategies of the players), following an idea of Cochet-Terrasson and Gaubert (2006). This algorithm relies on a notion of nonlinear spectral projection of the dynamic programming operators of one-player games (which are monotone and convex). We show that the sequence of values and relative values satisfies a lexicographical monotonicity property, which implies that the algorithm terminates. We present numerical results on a variant of Richman games and on pursuit-evasion games.
We also propose new algebraic multigrid algorithms to solve particular singular linear systems that arise, for instance, in the above policy iteration algorithm for zero-sum two-player stochastic games with mean payoff. Furthermore, we introduce a new method to find the stationary probability distribution of an irreducible Markov chain using a stochastic control approach, and present an algorithm which combines Howard policy iterations with algebraic multigrid iterations for singular systems.
Contributor : Sylvie Detournay
Submitted on : Thursday, December 6, 2012 - 12:27:40 PM
Last modification on : Wednesday, March 27, 2019 - 4:08:31 PM
Long-term archiving on : Saturday, December 17, 2016 - 9:33:03 PM


  • HAL Id : pastel-00762010, version 1



Sylvie Detournay. Multigrid methods for zero-sum two player stochastic games. Optimization and Control [math.OC]. Ecole Polytechnique X, 2012. English. ⟨pastel-00762010⟩


