
Brandenburger, Friedenberg, and Keisler [2008] resolve this paradox in the

context of iterated deletion of weakly dominated strategies by assuming that

strategies were not really eliminated. Rather, they assumed that strategies

that are weakly dominated occur with infinitesimal (but non-zero) probability.

Unfortunately, this approach does not seem to help in the context of iterated

regret minimisation. Assigning deleted strategies infinitesimal probability

will not make 97 a best response to a set of strategies where 97 is given very

high probability. Pass and I deal with this problem by essentially reversing

the approach taken by Brandenburger, Friedenberg, and Keisler. Rather

than assuming common knowledge of rationality, we assign successively

lower probability to higher orders of rationality. The idea is that now, with

overwhelming probability, no assumptions are made about the other players;

with probability $\epsilon$, they are assumed to be rational; with probability $\epsilon^2$,

the other players are assumed to be rational and to believe that they are

playing rational players, and so on. (Of course, 'rationality' is interpreted

here as minimising expected regret.) Thus, players proceed lexicographically.

Their first priority is to minimise regret with respect to all strategies; their

next priority is to minimise regret with respect to strategies that a rational

player would use; and so on. For example, in the traveller's dilemma, all the

choices between 96 and 100 minimise regret with respect to all strategies.

To choose among them, we consider the second priority: minimising regret

with respect to strategies that a rational player would use. Since a rational

player (who is minimising regret) would choose a strategy between 96 and

100, and 97 minimises regret with respect to these strategies, 97 is preferred

to the other strategies between 96 and 100. In [Halpern and Pass, 2009], this

intuition is formalised, and a formal epistemic characterisation is provided

for iterated regret minimisation. This characterisation emphasises the fact

that this approach makes minimal assumptions about the strategies used by

the other agent.
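
The traveller's dilemma calculation above can be checked mechanically. The following Python sketch is only an illustration, not the formal definition given in [Halpern and Pass, 2009]. It assumes the standard form of the game (claims from 2 to 100, with a penalty of 2), which this passage does not restate, and it assumes that at each round both the player's alternatives and the other player's strategies are restricted to the surviving set. All function names are hypothetical.

```python
def payoff(a, b, penalty=2):
    """Payoff to the player claiming a when the other player claims b.

    Assumed standard traveller's dilemma rules: equal claims are paid in
    full; otherwise both get the lower claim, with a bonus for the lower
    claimant and a penalty for the higher one.
    """
    if a == b:
        return a
    if a < b:
        return a + penalty
    return b - penalty


def max_regret(a, strategies, penalty=2):
    """Worst-case regret of claim a when both players use `strategies`."""
    return max(
        max(payoff(x, b, penalty) for x in strategies) - payoff(a, b, penalty)
        for b in strategies
    )


def regret_minimisers(strategies, penalty=2):
    """Strategies whose worst-case regret is smallest."""
    regrets = {a: max_regret(a, strategies, penalty) for a in strategies}
    best = min(regrets.values())
    return [a for a in strategies if regrets[a] == best]


def iterated_regret_minimisation(strategies, penalty=2):
    """Repeat the deletion step until nothing more is removed.

    Returns the list of surviving sets, one per round, starting with
    the full strategy set.
    """
    rounds = [list(strategies)]
    while True:
        survivors = regret_minimisers(rounds[-1], penalty)
        if survivors == rounds[-1]:
            return rounds
        rounds.append(survivors)


rounds = iterated_regret_minimisation(range(2, 101))
print(rounds[1])   # survivors of the first round
print(rounds[-1])  # final survivors
```

Under these assumptions, the first round leaves the claims 96 to 100 and the second round leaves only 97, in agreement with the discussion above.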

Of course, an agent may have some beliefs about the strategies used by

other agents. These beliefs can be accommodated by allowing the agent to

start the deletion process with a smaller set of strategies (the ones that he

considers the other players might actually use). The changes required to deal

with this generalisation are straightforward.
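
As a rough illustration of how prior beliefs might enter, the sketch below performs only the first deletion step, with the other player's strategies restricted to a believed set. It again assumes the standard traveller's dilemma (claims from 2 to 100, penalty 2); the function names are hypothetical, and the sketch does not attempt to reproduce how beliefs carry through later rounds.

```python
def payoff(a, b, penalty=2):
    """Payoff to the player claiming a when the other player claims b
    (assumed standard traveller's dilemma rules)."""
    if a == b:
        return a
    if a < b:
        return a + penalty
    return b - penalty


def minimise_regret_given_beliefs(own, believed, penalty=2):
    """Claims in `own` with the smallest worst-case regret, where the
    worst case ranges only over the claims in `believed`."""
    own = list(own)
    believed = list(believed)

    def worst_regret(a):
        return max(
            max(payoff(x, b, penalty) for x in own) - payoff(a, b, penalty)
            for b in believed
        )

    regrets = {a: worst_regret(a) for a in own}
    best = min(regrets.values())
    return [a for a in own if regrets[a] == best]


# No prior beliefs: every claim of the other player is considered.
no_beliefs = minimise_regret_given_beliefs(range(2, 101), range(2, 101))

# A (hypothetical) agent who is sure the other player will claim 100.
sure_of_100 = minimise_regret_given_beliefs(range(2, 101), [100])

print(no_beliefs)
print(sure_of_100)
```

With no restriction the first step leaves 96 to 100, as before; an agent who is certain the other player claims 100 is left with 99 alone, since undercutting by one is then the unique best response.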

Example 8.4
The role of prior beliefs is particularly well illustrated in

the finitely repeated prisoner's dilemma. Recall that always defecting is the

only Nash equilibrium of FRPD; it is also the only strategy that is

rationalizable, and the only one that survives iterated deletion of weakly dominated

strategies. Nevertheless, in practice, we see quite a bit of cooperation. We
