
Monday, January 26, 2015

Game theory post 5 of N: the joy and madness of repeated games

One thing about strategic interactions is that humans tend to repeat them.  For example, participants in a market may engage in trades over and over, neighbors may make the same decisions with respect to borders, common resources, etc. over and over, even some litigants in a particularly litigious industry may find themselves facing one another in court over and over (ahem, cough, cough, AppleandGoogleandSamsungandMicrosoftandAllTheRest). Unsurprisingly, game theorists have developed a body of knowledge for dealing with repeated games—that is, games that can be divided into subgames which are played over and over.

There are two categories of repeated games: finitely repeated, and indefinitely or infinitely repeated games. And as it turns out, they behave very differently. Generally speaking, finitely repeated games tend to behave (at least formally) sorta more-or-less like one-shot games; and we would intuitively expect that to be true, for a finitely repeated strategic form game is just the same thing as a longer game written in extensive form. But things go really wild when you move to the indefinite/infinite category.

To illustrate, let’s think about the prisoners’ dilemma again. Here’s one thing you might think about the finitely repeated PD: “Hey, wait a minute, maybe now cooperation can be sustained! After all, if the players cooperate in the first round, maybe they’ll learn to trust one another and continue to cooperate in future rounds, especially if they both understand that this trust will be destroyed if they don’t cooperate. Or, equivalently, if someone stabs the other player in the back, the stabbee can be expected to punish the stabber by defecting in future rounds.” (These kinds of strategies have all kinds of flashy names among game theorists: there’s “tit-for-tat,” the strategy of cooperating unless your opponent/partner defected in the previous round, in which case you defect; there’s “grim trigger,” cooperating until your opponent/partner defects even once, then defecting forever…)
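For the concretely minded, here is a minimal Python sketch of the two strategies just named. The payoff numbers (T=5, R=3, P=1, S=0) and the helper names `play` and `always_defect` are assumptions made up for illustration; any payoffs with T > R > P > S would do.

```python
# Illustrative payoffs (an assumption, not taken from this series):
# temptation > reward > punishment > sucker's payoff.
T, R, P, S = 5, 3, 1, 0
PAYOFFS = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}


def tit_for_tat(my_history, their_history):
    """Cooperate unless the other player defected in the previous round."""
    return "C" if not their_history else their_history[-1]


def grim_trigger(my_history, their_history):
    """Cooperate until the other player has ever defected, then defect forever."""
    return "D" if "D" in their_history else "C"


def always_defect(my_history, their_history):
    """Hypothetical baseline opponent used only for comparison."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play the stage game `rounds` times; return both move histories and total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFFS[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return hist_a, hist_b, (score_a, score_b)
```

Calling `play(tit_for_tat, tit_for_tat, 10)` shows two conditional cooperators cooperating throughout, while `play(grim_trigger, always_defect, 5)` shows grim trigger getting burned once and then defecting for the rest of the game. Note that this only simulates fixed strategies; it says nothing yet about whether rational players would choose them.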

As it turns out, in the finitely repeated PD, that’s just not true. (Again, people sometimes behave differently in the real world, but we ought to get our purely instrumentally rational and strategic starting point down before we start worrying about when and why observed reality deviates from it.*) Suppose there are ten rounds to the game, and imagine you’re a player trying to figure out whether this cooperation strategy will work. Here’s how your internal monologue could go:

Ok, there are ten rounds here. If we both cooperate in the first round, then the threat of future defection should keep everyone on the straight and narrow in the future. But what constrains us in round 10? After all, in round 10, there’s no future round in which I can threaten the other player with punishment; accordingly, defection is a strictly dominant strategy in round 10, and we should predict it no matter what. If I cooperate in round 10, I’m just a sucker. So we’ll both defect in round 10. But then, wait a minute. If defection is definitely going to happen in round 10, then in round 9 there’s no realistic (credible) threat of punishment either. You can’t threaten someone with an act you’re going to take anyway. So defection is strictly dominant in round 9 too. But then what constrains us in round 8? …

This, of course, is just a more intuitively expressed version of the notion of backward induction, given in the previous post. And we can see that it’s aptly named, for the reasoning process in cases like this actually looks kind of like mathematical induction: if the conclusion at this point compels the same conclusion at the next point in the sequence, then we’re warranted in making inferences all the way down. Unsurprisingly, the only subgame perfect equilibrium of the finitely repeated PD is mutual defection at every round. And this is a general fact about finitely repeated games whose one-shot version has a unique Nash equilibrium (see the proof on pg. 10 of these lecture slides, which also give an excellent math-ier presentation of the stuff I’m describing here): the Nash equilibrium of the one-shot game, repeated over every round, is the unique subgame perfect equilibrium of the repeated game.
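The unraveling argument can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not a general solver: the payoffs (T=5, R=3, P=1, S=0) are made up, only pure strategies are checked, and the shortcut of adding a constant continuation value to every cell is legitimate only because play in the later rounds has already been pinned down uniquely.

```python
from itertools import product

# Illustrative payoffs (an assumption): T > R > P > S.
T, R, P, S = 5, 3, 1, 0
ACTIONS = ("C", "D")
STAGE = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}


def pure_nash(payoffs):
    """Return every pure-strategy Nash equilibrium of a two-player game."""
    equilibria = []
    for a, b in product(ACTIONS, repeat=2):
        a_is_best = all(payoffs[(a, b)][0] >= payoffs[(alt, b)][0] for alt in ACTIONS)
        b_is_best = all(payoffs[(a, b)][1] >= payoffs[(a, alt)][1] for alt in ACTIONS)
        if a_is_best and b_is_best:
            equilibria.append((a, b))
    return equilibria


def backward_induction(rounds):
    """Solve the finitely repeated game from the last round back to the first."""
    continuation = (0, 0)  # nothing is left to play after the final round
    path = []
    for _ in range(rounds):  # first pass is the last round, then the one before it...
        # Later play is already determined, so it adds the same constant to every
        # cell and cannot change anyone's incentives in the current round.
        augmented = {
            profile: (u[0] + continuation[0], u[1] + continuation[1])
            for profile, u in STAGE.items()
        }
        equilibria = pure_nash(augmented)
        assert len(equilibria) == 1, "the shortcut needs a unique equilibrium"
        profile = equilibria[0]
        path.append(profile)
        continuation = augmented[profile]
    path.reverse()  # report rounds in chronological order
    return path, continuation
```

With these numbers, `backward_induction(10)` returns mutual defection in all ten rounds, which is exactly the internal monologue above run to its conclusion.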

But when we get into infinitely repeated games, then everything goes out the window. We don’t need to get into the mathematics of it, but just think about the same logic in the context of the PD again: all of a sudden, there’s no end point to carry out backward induction from. Because of that small change in the facts, punishment for prior defection (or reward for prior cooperation) is a realistic prospect at every single round: the rounds never stop (or they stop at an unpredictable time), so players always have a threat to make against one another. Conditional strategies like grim trigger and tit-for-tat suddenly start to be plausible, and the prospect of sustained cooperation again appears on the horizon.

In fact, as it turns out, there is a series of results known collectively as “the folk theorem” that suggest that infinitely repeated games have infinitely many subgame perfect equilibria. Roughly, any pattern of play can be sustained in equilibrium, under two conditions: 1) players can’t discount the future too heavily; 2) the strategy set in question has to yield per-round payoffs for each player better than those that can be obtained in the one-shot Nash equilibrium.
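To make the first condition a bit more concrete, here is a small worked example for the simplest case: both players using grim trigger in the infinitely repeated PD. The payoffs (T=5, R=3, P=1) and the function names are assumptions for illustration. With discount factor delta, cooperating forever is worth R/(1 - delta), while defecting once and then living with mutual defection is worth T + delta*P/(1 - delta); cooperation holds up when the first is at least the second, which rearranges to delta >= (T - R)/(T - P).

```python
from fractions import Fraction

# Illustrative payoffs (an assumption): temptation > reward > punishment.
T, R, P = 5, 3, 1


def cooperate_forever(delta):
    """Discounted value of mutual cooperation in every round: R + delta*R + ..."""
    return R / (1 - delta)


def defect_once(delta):
    """Value of defecting today (T) and facing mutual defection (P) ever after."""
    return T + delta * P / (1 - delta)


def critical_discount_factor():
    """Smallest delta at which grim trigger deters a one-time defection."""
    return Fraction(T - R, T - P)


def grim_trigger_sustains_cooperation(delta):
    """True when no player gains by deviating from the cooperative path."""
    return cooperate_forever(delta) >= defect_once(delta)
```

With these numbers the threshold is 1/2: players who weight tomorrow at 0.6 of today keep cooperating, and players who weight it at 0.4 do not. This checks only a deviation from the cooperative path; the punishment itself is credible because mutual defection is the one-shot Nash equilibrium.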

On the one hand, this is great. It allows us to explain how things like sustained cooperation can be possible in strategic contexts where there’s an incentive to defect. For example, it can be used to explain how reputation mechanisms work in markets to keep people honest. One of the most influential papers in the economic history of law, Milgrom, North & Weingast 1990, essentially uses a more complex (because multiplayer) version of the indefinitely repeated PD to model how decentralized commercial enforcement institutions work.

However, while the folk theorem is useful in that sense for backward-looking explanation, it’s bad news for prediction: given that there are an infinite number of behavior patterns that are supportable in equilibrium in such situations, how do you predict which ones will show up?  It ain’t easy.  (Many game theorists just wave their hands around and say “focal points!”—about which more later.)  If you’re a Popperian falsificationist about your philosophy of science, of course, then blowing up prediction is also a good way to blow up backward-looking explanation…but you probably shouldn’t be a Popperian falsificationist.  

So.  Anyway.  That’s quite enough of that.  I’m writing this post as the blizzard of our nightmares moves into Princeton (where I’m holed up this year), so perhaps soon we’ll see real-life applications of the finitely repeated PD as civilization breaks down, looters descend, hyenas emerge from the woods to drag away the weak, &c. Memo to fellow denizens of the impending weather apocalypse: I have a fixed cooperative disposition!  Honest!  Please don’t eat me! First. 


* Teaser: sometimes players with a fixed disposition to be “irrational,” such as a disposition to play tit-for-tat or grim trigger, can actually do better when they play with one another. In a context where doing better is selected for, players with such dispositions can prosper. See Axelrod, The Evolution of Cooperation and Skyrms, The Evolution of the Social Contract (previewed in a freely available Tanner Lecture http://tannerlectures.utah.edu/_documents/a-to-z/s/Skyrms_07.pdf ); also see basically all of evolutionary game theory, which I think I’ll probably post about at some point even though it is more advanced material than most of the stuff in this series, just because I find it delightful.


Posted by Paul Gowder on January 26, 2015 at 03:29 PM in Games | Permalink

