Game Development Reference

In-Depth Information

PSPACE
-complete, and the same holds for the problem of whether

val
(
v
) = 1 (see Chatterjee [2007], Hunter and Dawar [2005]).

Mean-payoff and discounted payoff.
In mean-payoff and discounted payoff

turn-based stochastic games, both players have optimal MD strategies (see

Gillette [1957], Liggett and Lippman [1969]), and the problem of whether

val
(
v
)

is in
NP
∩
coNP
. We refer to Filar and Vrieze [1996], Neyman and

Sorin [2003] for more comprehensive expositions of algorithms for mean-payoff

and discounted payoff turn-based stochastic games.

Basic properties of the other quantitative payoffs (in particular,
lim-min

and
lim-max
payoffs) are carefully discussed in Chatterjee et al. [2009].

Finally, we give a brief overview of the existing results about
infinite-state

turn-based stochastic games. There are basically three types of such games

studied in the literature.

≥

•
Recursive stochastic games
, also known as
stochastic BPA games
.

These are games defined over stateless pushdown automata or (equivalently)

1-exit recursive state machines. Roughly speaking, a stochastic BPA game

is a finite system of rules of the form
X

α
, where
X
is a stack symbol

and
α
is a (possibly empty) sequence of stack symbols. The finitely many

stack symbols are split into three disjoint subsets of symbols that 'belong'

to player

→

, or the virtual random player. A
configuration

is a finite sequence of stack symbols. The leftmost symbol of a given

configuration is rewritten according to some rule, which is selected by the

respective player (for every stochastic stack symbol
Y
, there is a fixed

probability distribution over the rules of the form
Y

, player

♦

β
).

A
termination
objective is a special type of reachability objective

where the only target vertex is
ε
(i.e., the empty stack). BPA MDPs and

stochastic BPA games with termination objectives were studied by Etes-

sami and Yannakakis [2005, 2006]. Some of these results were generalised

to
reachability
objectives by Brazdil et al. [2008] and Brazdil et al. [2009a].

BPA MDPs and stochastic BPA games with
positive rewards
were studied

by Etessami et al. [2008]. Here, each rule is assigned some fixed reward

r>
0 which is collected whenever the rule is executed, and the objective of

player

→

is to maximise the expected total reward (which can be infinite).

•
Stochastic games with lossy channels.
A lossy channel system is a

finite-state automaton equipped with a finite number of unbounded but

unreliable (lossy) channels. A transition may change a control state and

read/write from/to a channel. Since the channels are lossy, an arbitrary

number of messages may be lost from the channels before and after each

transition. A probabilistic variant of lossy channel systems defines a proba-

Search Nedrilad ::

Custom Search