
Saturday, January 24, 2015

Game theory post 4 of N: extensive form games, a deep dive

How about some Saturday game theory over brunch?

The one-round strategic form games of the previous post are the simplest possible presentation of some actual game theory. Now I want to put on my political scientist hat and dig into a slightly less simple, but much beloved, game.

We might call this the “punishment game.” It imagines a boss or a dictator or a parent giving commands to a subordinate or a subject or a child, where the boss prefers her commands be obeyed, and the subordinate prefers not to obey; if the subordinate defies the command, the boss has the power to inflict punishment at a personal cost. The following illustration (now with actual numbers, for clarity!) captures the situation, with the subordinate’s payoffs listed first; discussion is after the fold. (Sorry for the ugliness; remember how I said that I’m horrible at graphics?)

[Figure: Punishment game]

Let’s look at obedience here. Remember that a full strategy includes a specification of the moves that will be made at every possible decision point, even if they won’t be reached in equilibrium. This fact will be important in a moment.

So suppose the subordinate plays the strategy “always obey” and the boss plays the strategy “never punish.” It’s easy to see that this isn’t an equilibrium: given that the boss is playing never punish, the subordinate can do better by switching the strategy to “always defy.” By contrast, the strategy pair “always obey, always punish” IS a Nash equilibrium: the subordinate does worse by deviating (getting smacked), and the boss is indifferent because no matter what she does, she gets the 10 payoff in the left-most terminus of the picture.
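For readers who like to check this sort of thing by brute force, here is a minimal Python sketch that enumerates every strategy pair and tests whether either player could gain by deviating alone. The payoffs are the ones mentioned in this post, with one exception: the subordinate's payoff when punished is not stated in the text, so the -5 below is purely an assumption (any value below 0 gives the same result). The function and variable names are mine, for illustration only.

```python
from itertools import product

# Payoffs are (subordinate, boss). ASSUMPTION: -5 is a made-up value; the post
# only implies that being punished is worse for the subordinate than obeying.
ASSUMED_PUNISHED_PAYOFF = -5

SUB_MOVES = ("obey", "defy")
BOSS_PLANS = ("punish", "refrain")  # what the boss does if defied


def payoffs(sub_move, boss_plan):
    """Return (subordinate, boss) payoffs for one strategy pair."""
    if sub_move == "obey":
        return (0, 10)  # the boss's plan is never triggered
    if boss_plan == "punish":
        return (ASSUMED_PUNISHED_PAYOFF, -1)
    return (10, 0)


def is_nash(sub_move, boss_plan):
    """True if neither player gains by changing only their own strategy."""
    sub_now, boss_now = payoffs(sub_move, boss_plan)
    sub_ok = all(payoffs(alt, boss_plan)[0] <= sub_now for alt in SUB_MOVES)
    boss_ok = all(payoffs(sub_move, alt)[1] <= boss_now for alt in BOSS_PLANS)
    return sub_ok and boss_ok


nash_equilibria = [pair for pair in product(SUB_MOVES, BOSS_PLANS) if is_nash(*pair)]
print(nash_equilibria)
```

Under that assumption, the check confirms that [obey; never punish] fails and [obey; always punish] passes. It also turns up a second Nash equilibrium, [defy; never punish].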

But there’s a certain unintuitiveness to this equilibrium. Suppose you’re the subordinate. You might reasonably think: “my boss has this strategy of always punishing, but it’s irrational for her to have that strategy: if I defy, she does worse by punishing than she does by just letting me slide. So why shouldn’t I just defy?” In other words, the boss’s threat to punish isn’t credible, because it’s too costly for her to actually carry it out. So, intuitively, we ought not to predict that the players will actually end up in the [obey; punish] equilibrium.

The technique game theorists have come up with to eliminate threats that are not credible from our prediction pool is a refinement to Nash equilibrium called “subgame perfect equilibrium.” A loose description of that solution concept is that a strategy set is subgame perfect if it is a Nash equilibrium of every subgame of the original game. Here, the [obey; punish] strategy set is a Nash equilibrium of the subgame that begins when the subordinate obeys, but is not a Nash equilibrium of the subgame that begins when the subordinate defies. Accordingly, it isn’t a subgame perfect equilibrium. (All subgame perfect equilibria are also Nash equilibria.)

The easy way to find subgame perfect equilibria is a process known as “backward induction.” Essentially, what you do is look at the last decision each player can make in each line of play and figure out what is best; then you count the payoffs from that decision as the payoffs for the choice that leads to it in the prior step, and keep going until you’ve solved the whole thing. (We call these decision points “nodes.”)

That’s a little abstract, but it will become clear when applied to the example. Think of the boss’s decision: if the subordinate has defied, she may either punish or refrain from punishing; her payoff from punishing is -1, and her payoff from refraining is 0. She can be expected not to punish. Given that, we can impute the subordinate’s payoff at that node: if he chooses to defy, he can expect a payoff of 10, based on the boss’s most rational response; compare this with his payoff of 0 if he obeys. From this, we can conclude that the only subgame perfect equilibrium is subordinate always defies, boss never punishes. And that’s the prediction we ought to make.
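The same reasoning can be sketched as a short recursive procedure in Python. This is only an illustration under stated assumptions: the tree encoding and the names are hypothetical, and the subordinate's payoff when punished (-5 here) is an assumed value, since the text gives only the boss's payoff of -1 on that branch.

```python
# A leaf is a payoff pair (subordinate, boss).
# A decision node is (index of the player who moves, {move: subtree}).
SUBORDINATE, BOSS = 0, 1

# ASSUMPTION: the post does not state this payoff; any value below 0 works.
ASSUMED_PUNISHED_PAYOFF = -5

tree = (SUBORDINATE, {
    "obey": (0, 10),
    "defy": (BOSS, {
        "punish": (ASSUMED_PUNISHED_PAYOFF, -1),
        "refrain": (10, 0),
    }),
})


def backward_induction(node, history=()):
    """Solve from the leaves up.

    Returns (payoffs, plan), where plan maps the history of moves leading to
    each decision node to the move chosen there.
    """
    if not isinstance(node[1], dict):
        return node, {}  # leaf: nothing left to decide
    player, branches = node
    plan = {}
    best_move, best_payoffs = None, None
    for move, child in branches.items():
        child_payoffs, child_plan = backward_induction(child, history + (move,))
        plan.update(child_plan)
        # Strict comparison: on a tie, the first listed move is kept.
        if best_payoffs is None or child_payoffs[player] > best_payoffs[player]:
            best_move, best_payoffs = move, child_payoffs
    plan[history] = best_move
    return best_payoffs, plan


outcome, plan = backward_induction(tree)
print(outcome)  # payoffs on the predicted path
print(plan)     # the move chosen at every decision node
```

Note that the plan records a choice for the boss's node as well as the subordinate's, which is exactly the "full strategy" idea from earlier: a move at every decision point.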

Note how this is a really interesting problem for lawyers, for it suggests that punishment, like the sort that the legal system deploys, can be irrational. The obvious example is consumer contract enforcement: it can easily be irrational to enforce a consumer contract, because the costs of doing so are so high relative to the small payoffs; a mass dealer in consumer goods and services can in principle look down the game tree and breach its contracts with impunity, at least in the absence of something like fee-shifting, a class action mechanism, statutory damages, etc. to give plaintiffs a sufficient incentive to punish it. This model is a concise explanation of those features of our legal institutions. It’s also a favorite model of political scientists, mainly because of its obvious relevance to, e.g., international relations problems of deterrence.

Standard solutions to the problem: 1) Repetition---if the boss deals with the subordinate many times (an indefinite number, actually), we can sometimes find subgame perfect equilibria in which punishment happens thanks to its deterrent effect (more on repeated games later); 2) precommitment---if, for example, the boss can hand over the job of punishing to an independent agent (like, say, a judge!) who does not incur the costs to do so, this might make the threat credible. But there’s lots and lots to say about credible threat models; this is really just a teaser to show why we might want to say some of it.


Posted by Paul Gowder on January 24, 2015 at 01:41 PM in Games | Permalink

