take exponentially many steps, a feat that had remained elusive ever
since Howard [1960] proposed policy iteration algorithms for Markov
decision processes. Fearnley [2010b] has adapted Friedmann's examples
to make Howard's policy iteration take exponentially many iterations also
on Markov decision processes, thereby exhibiting a surprising weakness of
policy iteration and strategy improvement algorithms even on (stochastic)
one-player games. On the other hand, the proposal of Fearnley [2010a] to
consider non-oblivious strategy improvement algorithms is a promising
new way to explore strategy improvement in the quest for polynomial-time
algorithms, despite the damage inflicted on this line of research by Friedmann's
examples.
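To make the object of these lower bounds concrete, here is a minimal sketch of Howard's policy iteration for a finite discounted Markov decision process, the one-player setting in which Fearnley's examples apply. The tabular NumPy interface and all names are illustrative assumptions, not taken from the works cited above.

import numpy as np

def policy_iteration(P, r, gamma=0.95):
    # Howard's policy iteration for a finite discounted MDP (a sketch).
    # P: (A, S, S) transition probabilities; r: (A, S) expected rewards;
    # gamma: discount factor in [0, 1). Returns (policy, value vector).
    A, S, _ = P.shape
    policy = np.zeros(S, dtype=int)            # arbitrary initial policy
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) v = r_pi exactly.
        P_pi = P[policy, np.arange(S)]         # (S, S) transitions under policy
        r_pi = r[policy, np.arange(S)]
        v = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
        # Policy improvement: greedily switch every state whose current
        # action is not among the maximisers of the action values.
        q = r + gamma * (P @ v)                # (A, S) action values
        greedy = q.argmax(axis=0)
        keep = np.isclose(q[policy, np.arange(S)], q.max(axis=0))
        greedy[keep] = policy[keep]            # break ties towards the old policy
        if np.array_equal(greedy, policy):     # no improving switch: optimal
            return policy, v
        policy = greedy

Friedmann's and Fearnley's constructions force exponentially many rounds of exactly this evaluate-and-switch loop.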
Randomised variations of the strategy improvement technique, sometimes
referred to as the Random Facet algorithms, have been proposed for simple
stochastic games by Ludwig [1995] and Björklund and Vorobyov [2007].
They were inspired by a subexponential randomised simplex algorithm
of Matoušek et al. [1996], and they were the first subexponential (randomised)
algorithms for solving parity, mean-payoff, discounted, and simple stochastic
games. A recent result of Friedmann et al. [2010] establishes that the Random
Facet algorithm requires super-polynomial time on parity games.
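The Random Facet recursion itself is brief; the following sketch, assuming oracle functions for the game-specific parts, shows its shape in the style of Ludwig [1995]. The names improves and apply_switch are hypothetical placeholders: in an actual solver they would, respectively, test whether a switch is improving against the opponent's best response, and perform it.

import random

def random_facet(switches, strategy, improves, apply_switch):
    # Random Facet strategy improvement (a sketch, after Ludwig [1995]).
    # switches: set of candidate switches still available to the player;
    # improves(s, e): True iff switch e strictly improves strategy s;
    # apply_switch(s, e): the strategy obtained from s by performing e.
    if not switches:
        return strategy
    e = random.choice(tuple(switches))         # choose a random facet
    # First find the best strategy reachable without ever using switch e.
    best = random_facet(switches - {e}, strategy, improves, apply_switch)
    if improves(best, e):
        # e improves the facet-optimal strategy: perform it and solve again.
        # Values strictly increase with every such switch, so the recursion
        # terminates on finite games.
        return random_facet(switches, apply_switch(best, e),
                            improves, apply_switch)
    return best

Randomising the choice of facet is what yields the subexponential expected bounds, and it is this recursion that the examples of Friedmann et al. [2010] slow down to super-polynomial time on parity games.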
An important corollary of the applicability of strategy improvement algo-
rithms to solving games on graphs is that the search problems of com-
puting optimal strategies in parity, mean-payoff, discounted, and simple
stochastic games are in PLS (Johnson et al. [1988]). Moreover, those search
problems are also known to be in PPAD (Papadimitriou [1994]) because they
can be reduced in polynomial time (Gärtner and Rüst [2005], Jurdziński
and Savani [2008]) to the P-matrix linear complementarity problem
(Cottle et al. [2009]). The latter is in PPAD since it can be solved by Lemke's
algorithm (Lemke [1965], Papadimitriou [1994], Cottle et al. [2009]). It follows
that the problems of computing optimal strategies in games on graphs are
unlikely to be complete for either of the two important complexity classes of
search problems, unless one of them is included in the other.
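For concreteness, the linear complementarity problem underlying these reductions can be stated as follows (the standard formulation of Cottle et al. [2009]):

\[
\mathrm{LCP}(M, q): \quad \text{find } w, z \in \mathbb{R}^n \text{ such that } w = Mz + q, \quad w \geq 0, \quad z \geq 0, \quad z^{\mathsf{T}} w = 0.
\]

The matrix M is a P-matrix when all of its principal minors are positive, in which case the problem has a unique solution for every q; the reductions cited above arrange for that solution to encode the optimal strategies of the given game.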
The reductions from games on graphs to the P-matrix linear complemen-
tarity problem have recently facilitated applications of classical algorithms for
the linear complementarity problem to solving games. Fearnley et al. [2010]
and Jurdziński and Savani [2008] have considered Lemke's algorithm,
the Cottle–Dantzig algorithm, and Murty's algorithm for discounted
games, and they have shown that each of those pivoting algorithms may
require an exponential number of iterations on discounted games. Randomised
versions of these algorithms require further study.