Game Development Reference

In-Depth Information

Section 5.3). In such cases, all we can hope for is an e
cient algorithm which

approximates
val
f
(
v
) up to an arbitrarily small given precision.

Question (4) also requires special attention in the case of infinite-state

games, because then even MD strategies do not necessarily admit a finite

description. Since the vertices of infinite-state games typically carry some

algebraic structure, optimal strategies may depend just on some finite infor-

mation gathered by analysing the structure of vertices visited along a run in

a play.

The existing results

In this section we give a short summary of the existing results about turn-

based stochastic games with Borel measurable payoffs. We start with
finite-

state
games.

In general, optimal strategies in finite-state games with the Borel mea-

surable payoff functions introduced in the previous sections exist, do not

need to randomise, and are either memoryless or finite-memory. In some

cases, memory can be 'traded' for randomness (see Chatterjee et al. [2004a]).

The values are rational (assuming that transition probabilities in games are

rational) and computable. The main techniques for establishing these results

are the following:

•
Strategy improvement.
This technique was originally developed for

general stochastic games (see Hoffman and Karp [1966]). An initial strategy

for one of the players is successively improved by switching it at positions

at which the current choices are not locally optimal.

•

Value iteration.
The tuple of all values is computed/approximated

by iterating a suitable functional on some initial vector. For example,

the Bellman functional Γ defined in Section 5.3.1 can be used to com-

pute/approximate the values in games with reachability payoffs.

•

Convex optimisations.
These methods are particularly useful for MDPs

(see Puterman [1994], Filar and Vrieze [1996]). For example, the tuple of

values in MDPs with reachability payoffs is computable in polynomial time

by a simple linear program (see below).

•

Graph-theoretic methods.
Typically, these methods are used to design

e
cient (polynomial-time) algorithms deciding whether the value of a

given vertex is equal to one (see, e.g., Chatterjee et al. [1994]).

The individual payoff functions are discussed in greater detail below.

Reachability.
Turn-based stochastic games with reachability payoffs were

first considered by Condon [1992] where it was shown that the problem of

whether
val
(
v
)
>
2
for a given vertex
v
is in
NP
∩
coNP
. It was also observed

Search Nedrilad ::

Custom Search