empirically tested among game theorists (with p = 2), they typically did not
play anywhere close to 2. Becker, Carter, and Naeve [2005] asked members of
the Game Theory Society (presumably, all experts in game theory) to submit
a strategy for the game. Fifty-one of them did so. Of the 45 that submitted
pure strategies, 33 submitted a strategy of 95 or higher, and 38 submitted a
strategy of 90 or higher; only three submitted the 'recommended' strategy
of 2. The strategy that performed best (in pairwise matchups against all
submitted strategies) was 97, which had an average payoff of $85.09. The
worst average payoff went to those who played 2; it was only $3.92.
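
To make the pairwise-matchup scoring concrete, here is a minimal Python sketch. It assumes the standard Traveler's Dilemma payoff rule (both players claim an amount between 2 and 100; both receive the lower claim, with the lower claimant getting a bonus of p and the higher claimant paying a penalty of p), and the strategy pool below is a made-up stand-in, not the actual submissions from Becker, Carter, and Naeve [2005].

    # Pairwise-matchup scoring for the Traveler's Dilemma (p = 2).
    # The pool of strategies is hypothetical, for illustration only.

    def td_payoff(mine, theirs, p=2):
        """Payoff to the player claiming `mine` against a player claiming `theirs`."""
        if mine == theirs:
            return mine
        low = min(mine, theirs)
        return low + p if mine < theirs else low - p

    def average_payoff(strategy, pool, p=2):
        """Average payoff of `strategy` in pairwise matchups against `pool`."""
        return sum(td_payoff(strategy, other, p) for other in pool) / len(pool)

    pool = [100] * 10 + [99, 98, 97, 96, 95] * 3 + [90, 85, 80, 50, 2]
    for s in (2, 97, 100):
        print(s, round(average_payoff(s, pool), 2))

Against a pool concentrated near 100, a claim like 97 undercuts most opponents and collects the bonus, while a claim of 2 can never earn more than 2 + p = 4; this is the mechanism behind the tournament outcome reported above.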
Another sequence of experiments by Capra et al. [1999] showed, among
other things, that this result was quite sensitive to the choice of p. For low
values of p, people tended to play high values, and to keep playing them when
the game was repeated. By way of contrast, for high values of p, people
started much lower, and converged to playing 2 after a few rounds of repeated
play. The standard solution concepts (Nash equilibrium, rationalizability,
etc.) are all insensitive to the choice of p; for example, (2, 2) is the only
Nash equilibrium for all choices of p > 1.
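
The insensitivity claim is easy to check by brute force. The sketch below (reusing td_payoff from the previous sketch) enumerates all pure strategy profiles and confirms that (2, 2) is the only pure Nash equilibrium for several values of p > 1; it is a verification sketch, not part of the original text.

    # Brute-force verification that (2, 2) is the only pure-strategy
    # Nash equilibrium, regardless of the value of p > 1.

    def pure_nash_equilibria(p, lo=2, hi=100):
        claims = range(lo, hi + 1)
        eqs = []
        for a in claims:
            for b in claims:
                # (a, b) is an equilibrium if neither player can gain
                # by unilaterally deviating.
                a_ok = all(td_payoff(a2, b, p) <= td_payoff(a, b, p) for a2 in claims)
                b_ok = all(td_payoff(b2, a, p) <= td_payoff(b, a, p) for b2 in claims)
                if a_ok and b_ok:
                    eqs.append((a, b))
        return eqs

    for p in (2, 3, 25, 50):
        print(p, pure_nash_equilibria(p))  # prints [(2, 2)] for each p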
Arguably, part of the problem here is that Nash equilibrium presumes
that players know what other players are doing. I now consider a solution
concept that Rafael Pass and I [2009] recently introduced, iterated regret
minimisation, that attempts to deal with this problem, and exhibits the
same qualitative behaviour as that observed in experiments in many games of
interest. (The discussion in the rest of this section is taken almost verbatim
from [Halpern and Pass, 2009].)
The idea of minimising regret was introduced (independently) in decision
theory by Niehans [1948] and Savage [1951]. To explain how we use it in a
game-theoretic context, I first review how it works in a single-agent decision
problem. Suppose that an agent chooses an act from a set A of acts. The
agent is uncertain as to the true state of the world; there is a set S of possible
states. Associated with each state s
A is the utility u ( a, s )
of performing act a if s is the true state of the world. For simplicity, we take
S and A to be finite here. The idea behind the minimax regret rule is to
hedge the agent's bets, by doing reasonably well no matter what the actual
state is.
Formally, for each state s, let u*(s) be the best outcome in state s; that is,
u*(s) = max_{a ∈ A} u(a, s). The regret of a in state s, denoted regret_u(a, s),
is u*(s) − u(a, s); that is, the regret of a in s is the difference between the
utility of the best possible outcome in s and the utility of performing act a
in s. Let regret_u(a) = max_{s ∈ S} regret_u(a, s). For example, if
regret_u(a) = 2, then in each state s, the utility of performing a in s is
guaranteed to be within 2 of the utility of the best possible outcome in s.
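
As an illustration, here is a small Python sketch of the minimax-regret rule just defined, applied to a made-up utility table; the acts, states, and utilities are purely illustrative.

    # Minimax regret, following the definitions above:
    #   regret_u(a, s) = u*(s) - u(a, s)
    #   regret_u(a)    = max over s in S of regret_u(a, s)
    # The utility table U is a made-up example.

    U = {  # U[a][s] = u(a, s)
        "a1": {"s1": 10, "s2": 0},
        "a2": {"s1": 6,  "s2": 5},
        "a3": {"s1": 2,  "s2": 7},
    }
    S = ["s1", "s2"]

    def u_star(s):
        """u*(s): the best outcome achievable in state s."""
        return max(U[a][s] for a in U)

    def regret(a, s):
        return u_star(s) - U[a][s]

    def max_regret(a):
        return max(regret(a, s) for s in S)

    # The minimax-regret rule picks the act with the smallest worst-case
    # regret: here a2 (max regret 4, versus 7 for a1 and 8 for a3).
    print(min(U, key=max_regret))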