

Figure 5.3 An optimal maximising strategy does not necessarily exist

(b) an optimal maximising strategy in v (if it exists) may require infinite memory even if G is finitely branching;

(c) if G has finitely many vertices, then there is a MD strategy which is optimal maximising in every vertex of G.

A counterexample for claim (a) is given in Figure 5.3 (target vertices are grey). Observe that val(v) = 1 but there is no optimal maximising strategy in v.
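Figure 5.3 itself is not reproduced here, so the following sketch assumes a game of the same flavour as the constructions used later in this section: the maximiser may stop at step n and attempt to reach the target with probability 1 − (1/2)^n, or keep postponing forever. The names and probabilities below are illustrative assumptions, not the figure's actual labels.

```python
from fractions import Fraction

# Hypothetical stand-in for a game like Figure 5.3 (assumed structure):
# at step n the maximiser may stop, reaching the target set T with
# probability 1 - (1/2)^n; postponing forever reaches T with probability 0.
def value_of_stopping_at(n: int) -> Fraction:
    return 1 - Fraction(1, 2 ** n)

vals = [value_of_stopping_at(n) for n in range(1, 11)]

# Every concrete stopping strategy achieves strictly less than 1 ...
assert all(v < 1 for v in vals)
# ... but the supremum over all strategies is 1, so val(v) = 1 while no
# single strategy attains it: an optimal maximising strategy does not exist.
print(float(vals[-1]))  # 0.9990234375
```

The point of the sketch is only that the supremum 1 is approached but never attained by any fixed strategy, which is exactly the situation claim (a) describes.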

An example demonstrating that an optimal maximising strategy may require infinite memory (originally due to Brožek [2009]) is given in Figure 5.4. The outgoing transitions of the vertex v̂ are the same as the outgoing transitions of the vertex v in Figure 5.3, and they are not shown in the picture. Observe that val(v̂) = 1 and hence val(e_i) = 1 for all i ≥ 1. Further, we have that val(s_i) = 1 − (1/2)^i for all i ≥ 1, and hence also val(d_i) = 1 − (1/2)^i for all i ≥ 1. From this we get val(v) = 2/3. Also observe that player ☐ has a HD optimal maximising strategy σ which simply ensures that player ♦ cannot gain anything by using transitions d_i → e_i. That is, whenever the vertex v̂ is visited, the strategy σ finds a d_i stored in the history of a play, and starts to behave as an ε-optimal maximising strategy in v̂, where ε < (1/2)^i. Thus, σ achieves a result of at least 2/3 against every strategy of player ♦. However,

for every finite-memory strategy σ of player ☐ there is a fixed constant P_σ < 1 such that P_v^(σ,π)(Reach(T)) ≤ P_σ for every π ∈ Π. Since P_σ < 1, there surely exists j ≥ 1 such that P_σ < 1 − (1/2)^k for all k > j. Now let π be a MD strategy of player ♦ which in d_i selects either the transition d_i → e_i or d_i → s_i, depending on whether i ≥ j or not, respectively. Now one can easily check that P_v^(σ,π)(Reach(T)) < 2/3, which means that σ is not an optimal maximising strategy in v.
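The two numerical claims above, val(v) = 2/3 and the bound P_v^(σ,π)(Reach(T)) < 2/3, can be sanity-checked with exact rational arithmetic. The branching probabilities of v are not stated explicitly in the text, so the sketch below assumes the reading that v moves to d_i with probability (1/2)^i, which is consistent with val(d_i) = 1 − (1/2)^i and yields val(v) = Σ_{i≥1} (1/2)^i (1 − (1/2)^i) = 1 − 1/3 = 2/3.

```python
from fractions import Fraction

def half(i: int) -> Fraction:
    """(1/2)^i as an exact fraction."""
    return Fraction(1, 2 ** i)

N = 200  # truncation depth; the ignored tail of the series is below 2**-199

# val(v) = sum_{i>=1} (1/2)^i * (1 - (1/2)^i) = 1 - 1/3 = 2/3
approx_val = sum(half(i) * (1 - half(i)) for i in range(1, N))
assert abs(approx_val - Fraction(2, 3)) < half(150)

# Against the MD counter-strategy pi described in the text, a finite-memory
# sigma capped by some P_sigma < 1 collects val(s_i) = 1 - (1/2)^i for i < j,
# but at most P_sigma from every d_i with i >= j (where pi moves to e_i):
def value_against_pi(p_sigma: Fraction, j: int, n: int = N) -> Fraction:
    below = sum(half(i) * (1 - half(i)) for i in range(1, j))
    above = sum(half(i) * p_sigma for i in range(j, n))
    return below + above

# e.g. P_sigma = 9/10: any j with 1 - (1/2)^k > 9/10 for all k > j works.
assert value_against_pi(Fraction(9, 10), j=5) < Fraction(2, 3)
```

The strict inequality is driven by a per-term comparison: for every i ≥ j the contribution (1/2)^i · P_σ is strictly smaller than (1/2)^i (1 − (1/2)^i), so the total stays below 2/3, matching the argument in the text.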

Let us note that Claim (b) of Proposition 5.7 does not hold for MDPs. By applying Theorem 7.2.11 of Puterman [1994], we can conclude that if there is some optimal maximising strategy in a vertex v of a (possibly infinitely
