Game Development Reference

In-Depth Information

a, b

b

a

a

b

0

1

2

Figure 6.5 A reachability game
G

6.4 Games with imperfect information:

almost-surely-winning

We revisit the 3-coin game. In Exercise 6.11, we have seen that Player 1

does not have an observation-based deterministic winning strategy in this

game when Player 2 is allowed to exchange the position of the coins that are

not toggled. This is because Player 2 can guess the choice that Player 1 will

make in the next round. When a deterministic strategy for Player 1 is fixed,

this information is formally available to Player 2 but this is not realistic in

practice. Player 1 should use a source of randomisation in order to avoid

the possibility that Player 2 can guess the choice she will make in the next

round. Whenever the game is in a configuration with two heads, Player 1

chooses uniformly at random one of the three coins. Clearly the probability

to choose the coin showing tail is

1

3
no matter if Player 2 has decided to

exchange the coins or not at the previous step. Otherwise, she should play

the same coin a second time to make sure she comes back to a configuration

with two heads. She then repeats the same randomised strategy. Every two

rounds, Player 1 has a

1

3
probability to reach the winning configuration.

Note also that she is sure to avoid the losing configuration (all coins on

tails). This simple strategy is thus winning the reachability objective with

probability one. This illustrates the power of randomised strategies in games

with imperfect information.

6.4.1 Playing with randomised strategies

Before going into the formalisation, let us take a look at the example of

Figure 6.3. From the initial location
1
, we have seen that Player 1 has no

surely-winning strategy for reaching
4
. This is because for all strategies
α

of Player 1, there exists a play
π ∈
Outcome
1
(
G, α
) that visits
3
infinitely

often, and therefore never visits
4
. However, the strategy
β
of Player 2 such

that
π
= outcome(
G, α, β
) chooses the successor
ˆ
of
1
in a way that depends

on the next move of Player 1, namely
ˆ
=
2
if
α
plays action
a
next, and

ˆ
=
2
if
α
plays action
b
next. In a concrete implementation of the system,

this means that Player 2 needs to predict the behaviour of Player 1 infinitely

Search Nedrilad ::

Custom Search