
often in order to win. In practice, since one wrong guess makes Player 1 win, this suggests that the probability that Player 2 wins (making infinitely many right guesses) is 0, and thus Player 1 can win with probability 1.
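To see the intuition numerically, consider a hedged toy version of the situation: suppose Player 1 randomises uniformly over two actions each round (an illustrative assumption; the actual game in the text may differ) and Player 2 wins only by guessing correctly every round. The estimated winning probability collapses geometrically with the number of rounds, matching the claim that infinitely many right guesses happen with probability 0.

```python
import random

def player2_win_prob_estimate(num_rounds, trials=100_000):
    """Estimate the chance that Player 2 matches Player 1's
    uniformly random bit num_rounds times in a row (toy model)."""
    wins = 0
    for _ in range(trials):
        # all() short-circuits: one wrong guess ends the trial,
        # mirroring "one wrong guess makes Player 1 win".
        if all(random.randint(0, 1) == random.randint(0, 1)
               for _ in range(num_rounds)):
            wins += 1
    return wins / trials

# Exact value is (1/2)**num_rounds, which tends to 0 as the number
# of rounds grows; over infinitely many rounds Player 2 wins with
# probability 0.
```

The function names and the fair-coin assumption are hypothetical; the point is only that the per-round success probabilities multiply, so the product vanishes in the limit.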

We now formally define a notion of probabilistic winning. First, a randomised strategy for Player 1 is a function $\alpha : (L\Sigma)^* L \to \mathcal{D}(\Sigma)$ where $\mathcal{D}(\Sigma)$ denotes the set of probability distributions over $\Sigma$, i.e., the set of all functions $f : \Sigma \to [0,1]$ such that $\sum_{\sigma \in \Sigma} f(\sigma) = 1$. Intuitively, if Player 1 uses distribution $f$, then he plays each action $\sigma$ with probability $f(\sigma)$. We assume that Player 1 is informed about which actual actions he played. Hence, strategies are mappings from interleaved sequences of states and actions of the form $\rho = \ell_0 \sigma_0 \ell_1 \sigma_1 \dots \sigma_{n-1} \ell_n$ that we call labelled histories. We denote by $L(\rho) = \ell_0 \ell_1 \dots \ell_n$ the projection of $\rho$ onto $L^*$. A strategy $\alpha$ is observation-based if for all pairs of labelled histories $\rho, \rho'$ of the form $\rho = \ell_0 \sigma_0 \ell_1 \sigma_1 \dots \sigma_{n-1} \ell_n$ and $\rho' = \ell'_0 \sigma_0 \ell'_1 \sigma_1 \dots \sigma_{n-1} \ell'_n$ such that for all $i$, $1 \leq i \leq n$, $\mathrm{obs}(\ell_i) = \mathrm{obs}(\ell'_i)$, we have that $\alpha(\rho) = \alpha(\rho')$.
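These definitions translate directly into code. The following Python sketch is purely illustrative (the toy locations, observation map, and helper names are assumptions, not from the text): a randomised strategy is modelled as a function from labelled histories to dictionaries summing to 1, and a helper checks the observation-based condition on one pair of histories.

```python
# Toy model (hypothetical): a labelled history rho is a tuple
# l0, s0, l1, s1, ..., ln with locations at even indices and
# actions at odd indices.

OBS = {"l0": "o0", "l1": "o1", "l2": "o1"}  # l1 and l2 look alike

def obs_projection(rho):
    """Replace every location in rho by its observation."""
    return tuple(OBS[x] if i % 2 == 0 else x for i, x in enumerate(rho))

def alpha(rho):
    """A randomised strategy: returns a distribution f over actions.
    It ignores everything but the observation projection, so it is
    observation-based by construction."""
    dist = {"a": 0.5, "b": 0.5}                 # f : Sigma -> [0, 1]
    assert abs(sum(dist.values()) - 1.0) < 1e-9  # f sums to 1
    return dist

def check_observation_based(strategy, rho1, rho2):
    """The defining condition on one pair of labelled histories:
    same actions and observation-equivalent locations must yield
    the same distribution."""
    if obs_projection(rho1) == obs_projection(rho2):
        return strategy(rho1) == strategy(rho2)
    return True  # the condition only constrains equivalent pairs
```

For example, the histories `("l0","a","l1")` and `("l0","a","l2")` are observation-equivalent under `OBS`, so an observation-based strategy must assign them the same distribution.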

A randomised strategy for Player 2 is a function $\beta : (L \times \Sigma)^+ \to \mathcal{D}(L)$ such that for all labelled histories $\rho = \ell_0 \sigma_0 \ell_1 \sigma_1 \dots \sigma_{n-1} \ell_n$ and all $\sigma \in \Sigma$, for all $\ell$ such that $\beta(\rho\,\sigma)(\ell) > 0$, we have $(\ell_n, \sigma, \ell) \in \Delta$. We extend in the expected way (using projection of labelled histories onto $L^*$ when necessary) the notions of observation-based randomised strategies for Player 2, memoryless strategies, consistent plays, outcome, etc.
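The side condition on Player 2's strategy, that every location drawn with positive probability must be an actual successor under the transition relation, can be checked mechanically. A minimal sketch, with an illustrative $\Delta$ encoded as a set of (source, action, target) triples (the names and the toy relation are assumptions, not from the text):

```python
# Hypothetical transition relation Delta as (source, action, target) triples.
DELTA = {("l0", "a", "l1"), ("l0", "a", "l2"), ("l1", "b", "l0")}

def respects_delta(beta_dist, last_location, action, delta=DELTA):
    """Return True iff every location in the support of beta_dist
    (probability > 0) is a successor of (last_location, action)."""
    return all((last_location, action, target) in delta
               for target, p in beta_dist.items() if p > 0)

# Player 2 resolves the nondeterminism of action 'a' from 'l0':
ok = respects_delta({"l1": 0.3, "l2": 0.7}, "l0", "a")
bad = respects_delta({"l0": 1.0}, "l0", "a")  # (l0, a, l0) not in DELTA
```

Here `ok` holds because both `l1` and `l2` are `a`-successors of `l0`, while `bad` fails because Player 2 tries to move to a non-successor.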

Given strategies $\alpha$ and $\beta$ for Player 1 and Player 2, respectively, and an initial location $\ell_0$, the probability of a labelled history $\rho = \ell_0 \sigma_0 \ell_1 \sigma_1 \dots \sigma_{n-1} \ell_n$ is $\mathrm{P}(\rho) = \prod_{i=0}^{n-1} \alpha(\ell_0 \sigma_0 \dots \sigma_{i-1} \ell_i)(\sigma_i) \cdot \beta(\ell_0 \sigma_0 \dots \sigma_{i-1} \ell_i \sigma_i)(\ell_{i+1})$. The probability of a history $\pi = \ell_0 \ell_1 \dots \ell_n$ is $\mathrm{P}(\pi) = \sum_{\rho \in L^{-1}(\pi)} \mathrm{P}(\rho)$, which uniquely defines

the probabilities of measurable sets of (infinite) plays (Vardi [1985]). The

safety, reachability, and parity objectives, being Borel objectives, are measurable (Kechris [1995]). We denote by $\Pr^{\alpha,\beta}_{\ell}(\varphi)$ the probability that an objective $\varphi$ is satisfied by a play starting in $\ell$ in the game $G$ played with strategies $\alpha$ and $\beta$. A randomised strategy $\alpha$ for Player 1 in $G$ is almost-surely-winning for the objective $\varphi$ if for all randomised strategies $\beta$ of Player 2, we have $\Pr^{\alpha,\beta}_{\ell_I}(\varphi) = 1$. A location $\hat{\ell} \in L$ is almost-surely-winning for $\varphi$ if Player 1 has an almost-surely-winning randomised strategy $\alpha$ in the game $G = \langle L, \hat{\ell}, \Sigma, \Delta \rangle$ where $\hat{\ell}$ is the initial location.
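The probability definitions above can be computed directly for finite histories. The sketch below (toy strategies and names are assumptions, not from the text) computes $\mathrm{P}(\rho)$ as the product over steps, and $\mathrm{P}(\pi)$ by summing over all labelled histories that project onto $\pi$:

```python
from itertools import product

SIGMA = ("a", "b")

def alpha(rho):
    """Toy Player 1 strategy: uniform over actions on any history."""
    return {s: 1 / len(SIGMA) for s in SIGMA}

def beta(rho_sigma):
    """Toy Player 2 strategy: always resolve to location 'l1'.
    (Consistency with a transition relation is not modelled here.)"""
    return {"l1": 1.0}

def prob_labelled_history(rho):
    """P(rho) = prod_i alpha(l0 s0 ... l_i)(s_i) * beta(l0 s0 ... l_i s_i)(l_{i+1}).
    rho alternates locations (even indices) and actions (odd indices)."""
    p = 1.0
    for i in range(1, len(rho), 2):        # odd indices hold actions
        prefix = rho[:i]                    # l0 s0 ... l_i
        sigma_i, next_loc = rho[i], rho[i + 1]
        p *= alpha(prefix)[sigma_i] * beta(rho[:i + 1]).get(next_loc, 0.0)
    return p

def prob_history(pi):
    """P(pi): sum P(rho) over every way of interleaving actions into pi,
    i.e., over the preimage of pi under the projection L."""
    n = len(pi) - 1                         # number of steps
    total = 0.0
    for actions in product(SIGMA, repeat=n):
        rho = [pi[0]]
        for s, loc in zip(actions, pi[1:]):
            rho += [s, loc]
        total += prob_labelled_history(tuple(rho))
    return total
```

With these toy strategies, the history $\pi = \ell_0 \ell_1$ has two labelled preimages (one per action), each of probability $0.5 \cdot 1.0$, so they sum to 1, illustrating how summing over the preimage recovers the probability of the unlabelled history.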

Note that our definition is again asymmetric in the sense that Player 1

has imperfect information about the location of the game while Player 2

has perfect information. While having perfect information does not help

Player 2 in the case of surely-winning, it makes Player 2 stronger in the

probabilistic case. Recent works (Bertrand et al. [2009], Gripon and Serre

[2009]) study a symmetric setting where the two players have imperfect
