
take exponentially many steps, a feat which had remained elusive ever since Howard [1960] proposed policy iteration algorithms for Markov decision processes. Fearnley [2010b] has adapted Friedmann's examples to make Howard's policy iteration take exponentially many iterations also on Markov decision processes, thus exhibiting a surprising weakness of policy iteration and strategy improvement algorithms, even on (stochastic) one-player games. On the other hand, the proposal of Fearnley [2010a] to consider non-oblivious strategy improvement algorithms is a promising new way to explore strategy improvement in the quest for polynomial-time algorithms, despite the damage inflicted on this line of research by Friedmann's examples.
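As a point of reference, Howard's policy iteration alternates between evaluating the current policy exactly (by solving a linear system) and switching each state to a greedily improving action. The sketch below runs it on a tiny two-state discounted MDP; the transition probabilities, rewards, and discount factor are illustrative inventions, not taken from the text.

```python
import numpy as np

# Illustrative two-state, two-action MDP (made-up numbers).
P = {  # P[a][s, s'] = probability of moving from s to s' under action a
    0: np.array([[0.9, 0.1], [0.2, 0.8]]),
    1: np.array([[0.5, 0.5], [0.7, 0.3]]),
}
R = {  # R[a][s] = expected immediate reward for taking action a in state s
    0: np.array([1.0, 0.0]),
    1: np.array([0.0, 2.0]),
}
gamma = 0.9
n_states = 2

def evaluate(policy):
    """Exact policy evaluation: solve (I - gamma * P_pi) v = r_pi."""
    P_pi = np.array([P[policy[s]][s] for s in range(n_states)])
    r_pi = np.array([R[policy[s]][s] for s in range(n_states)])
    return np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)

def policy_iteration():
    policy = [0] * n_states
    while True:
        v = evaluate(policy)
        # Greedy improvement: in each state pick the best one-step lookahead action.
        improved = [max(P, key=lambda a: R[a][s] + gamma * P[a][s] @ v)
                    for s in range(n_states)]
        if improved == policy:      # no improving switch: policy is optimal
            return policy, v
        policy = improved

policy, value = policy_iteration()
```

On this instance the iteration stabilises after a couple of rounds; the exponential lower bounds discussed above require carefully constructed families of much larger instances.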

Randomised variations of the strategy improvement technique, sometimes referred to as the Random Facet algorithms, have been proposed for simple stochastic games by Ludwig [1995] and Björklund and Vorobyov [2007]. They were inspired by a subexponential randomised simplex algorithm of Matoušek et al. [1996], and they were the first subexponential (randomised) algorithms for solving parity, mean-payoff, discounted, and simple stochastic games. A recent result of Friedmann et al. [2010] establishes that the Random Facet algorithm requires super-polynomial time on parity games.
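The Random Facet idea (fix a randomly chosen vertex's current choice, solve the smaller problem recursively, and recurse again only if switching that choice improves the result) can be sketched abstractly. The interface below, a set of free vertices with binary choices and a value oracle, is a deliberate simplification of the simple-stochastic-game setting, and the toy value function is our own; it merely gives the improving-switch structure the scheme relies on.

```python
import random

def random_facet(free, value, strategy):
    """Ludwig-style Random Facet sketch (hypothetical interface):
    `free` holds the vertices whose binary choice may still be switched,
    and `value` scores a complete strategy (higher is better)."""
    if not free:
        return strategy
    v = random.choice(sorted(free))              # the random "facet"
    rest = free - {v}
    best = random_facet(rest, value, strategy)   # solve with v's choice fixed
    switched = dict(best)
    switched[v] = 1 - switched[v]
    if value(switched) > value(best):
        # v admits an improving switch: take it and solve the new subproblem.
        return random_facet(rest, value, switched)
    return best

# Toy instance: the "value" of a strategy is its agreement with a hidden
# optimum, so every switch towards the optimum is improving.
target = {0: 1, 1: 0, 2: 1, 3: 1}
score = lambda s: sum(s[v] == target[v] for v in s)
found = random_facet(set(target), score, {v: 0 for v in target})
```

The subexponential expected running time comes from the random choice of which vertex to fix, which this sketch retains; the lower bound of Friedmann et al. [2010] shows that even this randomisation does not yield polynomial time on parity games.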

An important corollary of the applicability of strategy improvement algorithms to solving games on graphs is that the search problems of computing optimal strategies in parity, mean-payoff, discounted, and simple stochastic games are in PLS (Johnson et al. [1988]). Moreover, those search problems are also known to be in PPAD (Papadimitriou [1994]) because they can be reduced in polynomial time (Gärtner and Rüst [2005], Jurdziński and Savani [2008]) to the P-matrix linear complementarity problem (Cottle et al. [2009]). The latter is in PPAD since it can be solved by Lemke's algorithm (Lemke [1965], Papadimitriou [1994], Cottle et al. [2009]). It follows that the problems of computing optimal strategies in games on graphs are unlikely to be complete for either of the two important complexity classes of search problems, unless one of them is included in the other.
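For concreteness, the linear complementarity problem LCP(M, q) asks, given a matrix M and vector q, for vectors w, z >= 0 with w = Mz + q and w^T z = 0; when M is a P-matrix (all principal minors positive) the solution is unique. The sketch below solves a hand-picked 2x2 instance by enumerating the complementary bases rather than by Lemke's pivoting; the instance is our own illustration.

```python
import numpy as np
from itertools import product

# A 2x2 P-matrix instance of LCP(M, q), chosen for illustration.
M = np.array([[2.0, 1.0], [1.0, 2.0]])
q = np.array([-1.0, -1.0])

def solve_lcp_brute(M, q):
    """Enumerate complementary bases: in each, a subset of the z-variables
    is basic (nonzero) and the complementary w-variables are zero."""
    n = len(q)
    for basis in product([0, 1], repeat=n):   # basis[i] = 1 means z_i is basic
        B = np.array(basis, dtype=bool)
        z = np.zeros(n)
        if B.any():
            # Set w_i = 0 for basic i and solve the principal subsystem for z.
            z[B] = np.linalg.solve(M[np.ix_(B, B)], -q[B])
        w = M @ z + q
        if (z >= -1e-9).all() and (w >= -1e-9).all():
            return w, z                        # feasible and complementary
    return None

w, z = solve_lcp_brute(M, q)
```

Enumeration takes 2^n steps and is only viable for toy instances; Lemke's algorithm instead pivots between complementary bases, and the exponential examples mentioned below show that its path can also be long on instances arising from games.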

The reductions from games on graphs to the P-matrix linear complementarity problem have recently facilitated applications of classical algorithms for the linear complementarity problem to solving games. Fearnley et al. [2010] and Jurdziński and Savani [2008] have considered Lemke's algorithm, the Cottle-Dantzig algorithm, and Murty's algorithm for discounted games, and they have shown that each of those pivoting algorithms may require an exponential number of iterations on discounted games. Randomised versions of these algorithms require further study.
