Games theory

Although often discredited, Games Theory can still shed light on human behaviour. What follows is an extract from a work-in-progress:

Game Theory

Games Theory is the study of situations where two or more players interact in accordance with a set of rules, and in particular where such situations arise in our political and economic life. Surprisingly it is fairly recent, generally considered to date from 1928 when John von Neumann’s paper Zur Theorie der Gesellschaftsspiele (‘On the Theory of Parlour Games’) was first published. This was followed in 1944 by Theory of Games and Economic Behavior, which von Neumann co-authored with Oskar Morgenstern.

There are essentially two types of games. Zero-sum games are those where one player’s gain is exactly matched by the other’s loss. Typical examples include tic-tac-toe (also known as noughts and crosses), chess, poker and football. Rather more interesting are non-zero-sum games, where outcomes include the possibility of both players improving their situation. Perhaps the most important non-zero-sum game in the economic world is the act of shopping. Sellers only sell their goods when they regard the money being offered as having greater value than the goods themselves. Buyers only buy when they regard the goods as being worth more than the money asked. If both sides can agree a price then both walk away from the deal as winners.

By far the most famous of non-zero-sum games is The Prisoner’s Dilemma. This was first formulated by Merrill Flood and Melvin Dresher of the RAND Corporation in 1950 (Poundstone, 1993, p. 103). They found the game sufficiently interesting to show it to Albert Tucker, chairman of mathematics at Princeton University, who devised a story to illustrate the game for a lecture he was giving to a group of psychologists at Stanford University. The story has been retold in many ways, but typically goes like this.

Two suspects are picked up by the police near the scene of a murder and placed in separate cells, without a chance to communicate with each other. The police officer then says to each of them, “As things stand, we do not have enough evidence to convict either of you for murder, so if you both remain silent we will have no choice but to convict you of the lesser offence of carrying a concealed weapon, which carries a three-year sentence. However, if you are prepared to testify against your partner, and he remains silent, then we will let you go free while your partner goes down for the full ten years. Think carefully though, because I’ve offered exactly the same deal to your partner, and if you both accuse each other then we’ll make sure you both go down for five years.”

Imagine yourself in this situation. If you maintain the code of silence then the best you can expect is a three-year sentence, but if your partner testifies against you then you risk getting ten years. On the other hand, if you betray your partner then the worst you’ll get is five years, and there is a chance you could get off with no sentence at all. Whatever your partner does, you serve less time by testifying: none rather than three years if he stays silent, five rather than ten if he talks. The obvious choice is to testify against your partner.

The problem is that your partner is doubtless following the same line of reasoning, so the most likely outcome is that you will both serve five years. If, on the other hand, you had both kept silent then you would have done better by each serving just three years. Hence The Prisoner’s Dilemma.

Incidentally, the choice by both suspects to testify against each other is known as the ‘Nash Equilibrium’ after John Nash who put forward the concept in his Ph.D. dissertation, written in 1950 under Tucker’s supervision. A Nash Equilibrium occurs when none of the players in a game can improve their own outcome by unilaterally changing their strategy. John Nash went on to become the subject of the 2001 film A Beautiful Mind.
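The definition can be checked mechanically against the sentences in the story. What follows is only a sketch in Python: the names SENTENCE, MOVES and is_nash_equilibrium are invented for the illustration, and the only data used are the three, five and ten year terms quoted by the police officer.

```python
from itertools import product

# Years in prison for (first suspect's move, second suspect's move),
# taken from the story: (years for first suspect, years for second suspect).
SENTENCE = {
    ("silent", "silent"): (3, 3),
    ("silent", "testify"): (10, 0),
    ("testify", "silent"): (0, 10),
    ("testify", "testify"): (5, 5),
}
MOVES = ("silent", "testify")


def is_nash_equilibrium(a, b):
    """True if neither suspect can shorten their own sentence by switching alone."""
    years_a, years_b = SENTENCE[(a, b)]
    a_cannot_improve = all(SENTENCE[(alt, b)][0] >= years_a for alt in MOVES)
    b_cannot_improve = all(SENTENCE[(a, alt)][1] >= years_b for alt in MOVES)
    return a_cannot_improve and b_cannot_improve


equilibria = [pair for pair in product(MOVES, MOVES) if is_nash_equilibrium(*pair)]
print(equilibria)  # [('testify', 'testify')]
```

Mutual silence fails the test because either suspect could walk free by switching, which is exactly the dilemma.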

The Prisoner’s Dilemma is of such interest because of the number of social, political and economic situations that appear to accord with its rules. The most notorious example has to be the strategy pursued by the American and Soviet governments which resulted in the Cold War.

The trigger event was America’s detection of the Soviet Union’s first nuclear test, carried out in Kazakhstan in 1949. At this point, both sides knew the other was capable of building nuclear weapons powerful enough to annihilate their enemy. Each side had a choice: to build a nuclear arsenal or not. The best outcome for all concerned would have been for neither side to build nuclear weapons. However if one side built an arsenal and the other did not, then the side without the arsenal would be in the power of the other and risk being annihilated. The Nash Equilibrium was, therefore, for both sides to build their arsenals and rely on Mutually Assured Destruction (MAD) to prevent a war. (Poundstone, 1993, p. 129)

Thankfully that is not the end of the story. Right from the start, games theory proved unreliable in predicting people’s behaviour. Flood, Dresher and Nash frequently found that the people they recruited for their experiments failed to follow the rational strategies suggested by their theories. (Poundstone, 1993, p. 102)

The reason becomes apparent when one considers two players confronting each other in a series of such games, rather than a single instance. In such circumstances each player is aware of how the other behaved in previous games, and that their current response may affect their opponent’s behaviour in future encounters. Two work colleagues, for example, are less likely to betray each other if they are concerned about their continuing relationship after the game is over.

Such a series of games is referred to as Iterated Prisoner’s Dilemma. Perhaps the most important contribution here was the publication in 1980 of the details of a number of computer tournaments held by Robert Axelrod of the University of Michigan. (Axelrod, The Evolution of Co-operation, 1990, pp. 27-54)

For each tournament, Axelrod created an environment in which players could exist as game strategies defined by software programs. He then arranged for each strategy to play every other strategy in repeated games of The Prisoner’s Dilemma, as well as against a copy of itself and against a player that simply responded randomly. In each game the strategy could choose to defect or cooperate. If both strategies cooperated they were each awarded three points. If both defected they each got just one point. If one defected and the other cooperated then the defector got five points while the cooperating strategy got none. Axelrod arranged for each set of strategies to play in the region of 200 games against each player, but made sure that players could not predict how many iterations would be played in each round so they could not make use of an ‘end game’ strategy. The only information available to each strategy was the history of the current round.
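The scoring rules just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Axelrod’s own code: the function names and the three toy strategies are invented here, and the match length is fixed at 200 games for simplicity, whereas in the tournament players could not predict it.

```python
import random

# Points for (my move, opponent's move); "C" = cooperate, "D" = defect.
POINTS = {("C", "C"): 3, ("D", "D"): 1, ("D", "C"): 5, ("C", "D"): 0}


def always_defect(own_history, other_history):
    return "D"


def always_cooperate(own_history, other_history):
    return "C"


def random_player(own_history, other_history):
    return random.choice("CD")


def play_match(strategy_a, strategy_b, rounds=200):
    """Play repeated games; each strategy sees only the history of this match."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        score_a += POINTS[(move_a, move_b)]
        score_b += POINTS[(move_b, move_a)]
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


print(play_match(always_defect, always_cooperate))     # (1000, 0)
print(play_match(always_cooperate, always_cooperate))  # (600, 600)
```

A full tournament would simply call play_match for every pairing of entries, including each entry against itself and against random_player, and add up the scores.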

Some 62 strategies were submitted to the more significant second tournament, with entries coming from six countries and participants ranging from a computer hobbyist to professors of mathematics, economics and evolutionary biology. Surprisingly, the strategy that gained the highest score, submitted by Professor Anatol Rapoport of the University of Toronto, was extremely simple. Called Tit-for-Tat, the strategy has just two rules: cooperate in the first game, and then do whatever your opponent did in the previous game. If your opponent defected then you defect; if your opponent cooperated then you cooperate.
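The two rules translate almost word for word into code. The Python below is a sketch written for this extract, not Rapoport’s submission; the helper respond_to and the sample sequence of opponent moves are made up purely to show the behaviour.

```python
def tit_for_tat(opponent_history):
    """Cooperate in the first game, then copy the opponent's previous move."""
    if not opponent_history:
        return "C"
    return opponent_history[-1]


def respond_to(opponent_moves):
    """Return Tit-for-Tat's moves against a fixed sequence of opponent moves."""
    replies = []
    for game in range(len(opponent_moves)):
        # Only the opponent's moves in earlier games are visible.
        replies.append(tit_for_tat(opponent_moves[:game]))
    return "".join(replies)


print(respond_to("CCDCDDC"))  # CCCDCDD
```

Each defection by the opponent is answered exactly once, in the following game, and cooperation resumes as soon as the opponent cooperates again.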

What is striking about the result is that, despite the rational response to a single round of The Prisoner’s Dilemma being defection, Tit-for-Tat is essentially a ‘nice’ rule in that it will not be the first to defect. Indeed in the first tournament the eight ‘nice’ rules received by Axelrod secured the eight highest scores. Of course ‘nice’ rules will do well against each other as they will consistently pick up three points on each game, while ‘nasty’ rules risk getting a lot of zero or one scores, along with the occasional lucky five. However the persistent supremacy of ‘nice’ rules over even very sophisticated ‘nasty’ rules was unexpected.

In order to further test the resilience of Tit-for-Tat, Axelrod went on to design a sequence of tournaments which would start with an even distribution of strategies amongst the players, but in each successive tournament the number of players adopting any given strategy would be proportional to the score it had achieved in the previous tournament. If one strategy gained twice the score of another in the first tournament then it would be represented by twice as many players in the second, and so forth. This could be seen as analogous to the way in which we learn from the behaviour of more successful colleagues.

In this ‘survival of the fittest’ scenario the more successful strategies would be expected to spread throughout the whole population as each tournament was played, which is indeed what happened: although some of the more exploitative strategies did well in the early tournaments, these later died out as the less successful strategies on which they preyed became extinct. By the 50th tournament the ‘nice’ strategies were the only ones whose populations were increasing, with Tit-for-Tat clearly in the lead. Axelrod went on to demonstrate that Tit-for-Tat can become the dominant strategy in a population even if it starts out only adopted by five percent of the players. (Axelrod, The Evolution of Co-operation, 1990, p. 63)
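The mechanism of successive tournaments can be imitated with a very small population. The Python sketch below makes several assumptions that are not Axelrod’s: only three toy strategies take part, every match lasts exactly 200 games, and a strategy’s share of the next generation is its current share multiplied by its average score against the current population. All names are hypothetical.

```python
POINTS = {("C", "C"): 3, ("D", "D"): 1, ("D", "C"): 5, ("C", "D"): 0}


def tit_for_tat(own, other):
    return other[-1] if other else "C"


def always_defect(own, other):
    return "D"


def always_cooperate(own, other):
    return "C"


STRATEGIES = {
    "tit_for_tat": tit_for_tat,
    "always_defect": always_defect,
    "always_cooperate": always_cooperate,
}


def match_score(a, b, rounds=200):
    """Points earned by strategy a over one match against strategy b."""
    hist_a, hist_b, total = [], [], 0
    for _ in range(rounds):
        move_a, move_b = a(hist_a, hist_b), b(hist_b, hist_a)
        total += POINTS[(move_a, move_b)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return total


# Score of every strategy against every strategy, itself included.
SCORE = {(x, y): match_score(STRATEGIES[x], STRATEGIES[y])
         for x in STRATEGIES for y in STRATEGIES}


def next_generation(shares):
    """New shares: current share times average score, rescaled to sum to 1."""
    average = {x: sum(shares[y] * SCORE[(x, y)] for y in shares) for x in shares}
    total = sum(shares[x] * average[x] for x in shares)
    return {x: shares[x] * average[x] / total for x in shares}


shares = {name: 1 / 3 for name in STRATEGIES}
for generation in range(50):
    shares = next_generation(shares)
print({name: round(share, 3) for name, share in shares.items()})
```

In this toy population always_defect scores best in the very first tournament by exploiting always_cooperate, then fades as its prey becomes scarcer, leaving tit_for_tat with the largest share.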

Subsequent work has thrown some doubt on Axelrod’s results. Experiments by Bruce Linster (Linster, 1992) and others have shown that Tit-for-Tat is not as resilient as Axelrod suggested. However Axelrod’s experiments remain significant as they demonstrate that there is a wide range of conditions in which a cooperative approach is the best strategy in a competitive or even hostile environment, and indeed can evolve without the need for altruism on the part of the players, or any outside influence. It was for this reason that Axelrod entitled his 1984 book The Evolution of Co-Operation.

At this point it is important to remember that games theory is the study of games in terms of their rules and strategies. Many have assumed that games theory can help explain and even control human behaviour, to the extent that some have adopted the concept of ‘Economic Man’ (or ‘Homo Economicus’) to represent the human being as a rational decision-making machine that will respond to situations in ways that games theory can predict. However that is a dangerous approach.

While human beings do indeed make decisions all the time, often in situations analogous to Prisoner’s Dilemma, such decisions are usually influenced by factors that do not figure in the calculations of games theorists, having more to do with things like status, religion or psychology. Real human beings are far less predictable than Homo Economicus would suggest, and indeed right from the beginning, in a set of experiments carried out at RAND in 1952 and 1954, it was apparent that the interaction between the subjects owed far more to personality than to any rational attempt to maximise opportunity. Reflecting on this, John Nash recently stated:

“I realise what I had said at some time may have over-emphasised rationality, and I don’t want to over-emphasise rational thinking on the part of humans. Human beings are much more complicated than the human being as a businessman. Human behaviour is not entirely motivated by self interest… Game theory works in terms of self interest, but some game theory concepts could be unsound. There’s over dependence on rationality. That is my enlightenment.” (Curtis, 2007)

Nevertheless there are numerous real-life examples that demonstrate the evolution of cooperation, and Axelrod’s work does provide some insight into how they might arise. Perhaps the most moving is that supplied by German and British troops during the First World War, which is the subject of a whole chapter in The Evolution of Co-operation.

The earlier part of the war saw German and British units facing each other for long periods of time in trenches separated by several hundred metres of ‘no-man’s-land’ across the fields of France. At a national level, the War could be viewed as a zero-sum game, with losses (whether of territory or human lives) on one side equating to gains on the other. However at the unit level, these periods could be viewed as iterated games of The Prisoner’s Dilemma, and indeed there are many recorded examples of informal cooperation breaking out between enemy units. Axelrod quotes an eyewitness as stating:

“In one section the hour of 8 to 9am was regarded as consecrated to ‘private business’, and certain places indicated by a flag were regarded as out of bounds by the snipers on both sides.” (Morgan, 1916, pp. 16-17)

Such examples persisted until British High Command introduced a new policy involving random raids on enemy trenches. As these raids were ordered and monitored externally, the fragile unspoken agreements that had built up were quickly destroyed.

There are also many examples from the natural world, where the participants are responding instinctively rather than rationally. Cooperation between family members can be explained in terms of the perpetuation of genetic material, but the mechanisms by which cooperative behaviour between different species could evolve are harder to understand – particularly where one of the participants would make a perfectly good meal for the other. Typical examples come from the marine world where larger fish, such as sharks or rays, allow smaller fish to approach and clean away parasites and food debris. Usually the ‘client’ signals that it is ready to be cleaned by arriving at a specific ‘cleaning station’ and adopting a particular posture. The smaller ‘cleaner’ fish will even enter the mouths of their clients with impunity. (Grutter, 2005)

So what are the conditions under which a cooperative strategy can become successful? The principal condition is what Axelrod called ‘the shadow of the future’. As we have seen, the best strategy when faced with a single round of The Prisoner’s Dilemma is to defect: cooperation can only become a winning strategy when both players are sufficiently concerned about the outcome of future encounters. As Axelrod put it:

“In an iterated game, a player can use a strategy that relies on the information available so far to decide at each move which choice to make. Since the players do not know when the game will end, they both have an incentive and an opportunity to develop cooperation based upon reciprocity. The shadow of the future provides the basis for cooperation, even among egoists.” (Axelrod, On Six Advances in Cooperation Theory, 2000, pp. 130-151)

What we are talking about here is, of course, the evolution of trust. A concern about future interactions is vital. If the participants are unlikely to meet again – as is the case in the original Prisoner’s Dilemma story – then the rational response is betrayal and distrust. In this situation, cooperation can only occur if the players behave in a genuinely altruistic manner. If, on the other hand, future interactions with the same participants are likely, then cooperation is more likely to flourish.
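How strong the shadow of the future needs to be can be put into numbers. The Python sketch below assumes that after every game another one follows with probability w, and uses the tournament points of five, three, one and zero. It compares three possible ways of playing against Tit-for-Tat; the symbol w, the function names and the choice of these three replies are assumptions made for the illustration.

```python
# Tournament points: temptation, reward, punishment, sucker's payoff.
T, R, P, S = 5, 3, 1, 0


def value_against_tit_for_tat(w):
    """Expected total points against Tit-for-Tat when each further game
    happens with probability w (0 <= w < 1)."""
    return {
        # 3 + 3w + 3w^2 + ...
        "always cooperate": R / (1 - w),
        # 5 in the first game, then mutual defection: 5 + w + w^2 + ...
        "always defect": T + w * P / (1 - w),
        # 5, 0, 5, 0, ... : 5 + 0w + 5w^2 + ...
        "alternate defect and cooperate": (T + w * S) / (1 - w * w),
    }


def best_reply(w):
    values = value_against_tit_for_tat(w)
    return max(values, key=values.get)


print(best_reply(0.3))  # alternate defect and cooperate
print(best_reply(0.9))  # always cooperate
```

With these points the balance tips at w = 2/3: when another encounter is less likely than that, exploiting Tit-for-Tat pays better than cooperating with it, and when it is more likely, cooperation earns more than either of the other two replies.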

For example, even today there are isolated villages where few inhabitants bother to lock their doors when they venture out. Although they may feel this is because they live amongst ‘good honest folk’, it is in fact because repeated interaction has resulted in an understanding that they are all better off if they maintain that trust. However this can change when outsiders move into the community, particularly if they are transient visitors. Axelrod cites research showing that interactions between gypsies and residents are fraught with distrust on both sides. The ‘shadow of the future’ is less strong here, making betrayal a safer response. (Axelrod, The Evolution of Co-operation, 1990, p. 115) It also changes as the community gets larger and repeated interactions between the same participants become less likely. Only a fool would leave his door unlocked in a big city.

Of course business would not be possible without trust between participants. Most successful business deals involve significant mutual trust, precisely because both sides are expecting and indeed desirous of future interactions. However once the likelihood of future interaction diminishes, as when a company goes bankrupt, then its owners can no longer expect the same cooperation:

“The great enforcer of morality in commerce is the continuing relationship, the belief that one will have to do business again with this customer, or this supplier, and when a failing company loses this automatic enforcer, not even a strong-arm factor is likely to find a substitute.” (Mayer, 1974, p. 280)

The ‘shadow of the future’ is particularly important when it comes to resolving racial or ethnic tension. What game theory demonstrates is that cooperation is best cultivated by increasing the likelihood of future interaction between participants, and making such interaction important to the parties concerned. Where ethnic tensions exist this is best achieved by integrating the two sides as much as possible – ensuring that children from both groups attend the same schools, for example. Solutions that involve partition, such as that between Israel and Palestine, are doomed to failure.

Another important factor is that iterated games of The Prisoner’s Dilemma can only foster cooperation if both players are aware of the history of their past interactions. This can only be the case if participants are able to recognise each other whenever they meet. Human beings generally have a good memory for faces, but the same is not necessarily the case in the lower reaches of the animal kingdom. This is why exhibitions of cleaning behaviour amongst fish tend to take place at specific cleaning stations, and involve specific signals on the part of the client. The ‘cleaner’ may not recognise a specific client as someone it has cleaned before, but it has learnt to recognise the situation as one where it can safely interact with its client.

This can have unfortunate consequences. To demonstrate, Axelrod imagines a scenario where players of Iterated Prisoner’s Dilemma are distinguished only by the wearing of a green or a blue label. In this scenario, players adopt a ‘nice’ strategy with those wearing the same colour and a ‘nasty’ strategy against those wearing a different colour. He goes on to show that such a situation is stable, even though both groups would score higher if they cooperated. He also demonstrates that the smaller group will achieve a lower score than the larger group, in proportion to the discrepancy in size. (Axelrod, The Evolution of Co-operation, 1990, p. 146) The implication is that conflicts that centre around easily visible labels such as race or ethnicity will not resolve themselves.
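The disadvantage of the smaller group follows from simple arithmetic. The Python sketch below is not Axelrod’s model: it assumes that players meet at random in proportion to group size, and that each game earns the tournament’s three points for mutual cooperation with one’s own colour and one point for mutual defection with the other colour. The function name and the 80/20 split are invented for the example.

```python
def average_points(own_group_share):
    """Average points per game for a player whose own colour makes up
    the given share of the population (a value between 0 and 1)."""
    same_colour = 3   # mutual cooperation within the group
    other_colour = 1  # mutual defection across groups
    return own_group_share * same_colour + (1 - own_group_share) * other_colour


print(round(average_points(0.8), 2))  # majority player: 2.6
print(round(average_points(0.2), 2))  # minority player: 1.4
```

The situation is stable under these assumptions because a lone player who cooperates with the other colour meets defection and scores nothing instead of one point, while universal cooperation would have given everyone three points in every game.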

REFERENCES

Axelrod, R. (1990). The Evolution of Co-operation. Penguin Books.

Axelrod, R. (2000, July). On Six Advances in Cooperation Theory. Analyse & Kritik, 22, pp. 130-151.

Curtis, A. (Director). (2007). The Trap: What Happened to Our Dream of Freedom (Episode 2) [Motion Picture].

Grutter, A. (2005). Cleaning mutualism in the sea. In K. Rohde (Ed.), Marine Parasitology. CSIRO Publishing.

Linster, B. (1992). Evolutionary Stability in the Infinitely Repeated Prisoners’ Dilemma Played by Two-state Moore Machines. Southern Economic Journal, 58, pp. 880-903.

Mayer, M. (1974). The Bankers. New York: Ballantine Books.

Morgan, J. (1916). Leaves from a Field Note Book. London: Macmillan.

Poundstone, W. (1993). Prisoner’s Dilemma. Oxford University Press.
