## Friday, December 9, 2011

## Wednesday, July 27, 2011

### Could You Be a Crackpot?

Of course not! That's why I invented the following barometer for identifying crackpots, so you'll see how much you're not a crackpot. This won't take long. Just read the following essay, then answer the question afterward.

You have always, from an early age, been fascinated by the workings of the universe. For longer than you can remember, the suspicion that something is amiss in the prevailing scientific theory has gnawed at you, but only recently, in your maturity, have you recognized how to rectify its flaws.

This is no patchwork repair; it is a fundamental revision in how to perceive the universe. You have discovered this shift by virtue of your stark insight, uncluttered by years of outdated academic instruction. You have been labelled a crackpot, but this has always been the mark of the true genius. Witness Galileo: He had to withstand the prejudices and superstitions of his time, even amongst his colleagues, in his pursuit of the truth! And you, like Galileo, are no crackpot.

Your breakthrough must be distributed personally, as part of a grass-roots campaign, in order to avoid the censorship that is part and parcel of the mainstream scientific press. But you have no doubt that you will succeed; you are brave enough to face the naysayers who have shouted down other talented scientists who lacked your resolve. Witness Einstein: His original mindset differed from the battered viewpoint that jealous criticism reduced him to! You can quote effectively from his writings to support this.

The prevailing scientific theory requires (in fact, relies crucially on) a body of mathematics that was clearly invented *ad hoc*: for the purpose of showing exactly that for which it was invented. This circular line of reasoning is a basic but crushing flaw. Your alternative, for those who have the vision to properly appreciate it, is conversely a work of great elegance and beauty, which resonates with rather than challenges your acute intuition.

When at last it is properly recognized, your breakthrough will enable great boons for humanity. You will be offered vast rewards for your work, but you will turn most of them down, asking only to continue your scientific investigations without the further distraction of interacting with the scientific community. Those who doubted you will permit this reluctantly, out of respect for your talent, but all the while will admire your work from afar.

OK, now that you've read that (*haven't you?!*), here is your one question: On a scale of 0 to 10 (0 indicating complete disagreement and 10 indicating complete agreement), how well does the foregoing essay represent your viewpoint?

Scoring:

10: You are a crackpot.

0-9: You are not a crackpot. *True* crackpots are in for a pound if they're in for a penny.

## Wednesday, June 22, 2011

### The Myth of Common Sense

How many times have you heard someone say, "It's just common sense"? I just heard it myself (well, read it in an e-mail) two days ago, and it was in relation to something that I might argue wasn't really common sense at all—at least, not beforehand. With the benefit of hindsight, it became common sense. (If you're inordinately curious, it had to do with how hot a playground surface could get on a cool day. Pretty hot, as it turns out.)

Right now, as I write this in late June 2011, if you google "it's just common sense," you get the following things that are supposed to be common sense:

- Domestic drilling for oil
- Creation science (I think—the post wasn't entirely coherent)
- The necessity of broad-ranging budget cuts
- Wearing a bicycle helmet reduces the risk of injury
- The use of backscatter scanners (so-called "naked X-rays")
- Avoiding texting while driving
- Essentially any Republican viewpoint on fiscal policy
- Showing discretion on social networks
- Allowing schoolteachers to bring guns to class (!)
- Using alternative medicine

Now, one thing I expected was that there would be a split between things that were asserted to be common sense in a descriptive way (that is, people do commonly agree on them, or would if they were asked), and those that were asserted to be common sense in a *prescriptive* way (that is, people *should* agree on them). And indeed there is, but the split was fairly unbalanced: I'd say that out of the ten hits I listed above, just three (the bicycle helmet one, texting while driving, and discretion on social networks) were even close to the descriptive sense, and the bicycle helmet one only alluded to common sense to set up the contrasting finding that apparently, it *doesn't* reduce the risk of injury. (Very interesting, by the way. But a post for another time.)

The remainder were all prescriptive; in general, they even conceded that a large segment of the population (be it liberals, non-religious people, gun-control advocates) were opposed to their viewpoint, but they then went on to say that these people were mistaken, and they were mistaken because they went against common sense. In most cases, they don't really explain *why* their viewpoints were common sense; it was enough to say simply that they were.

And that demonstrates the appeal of saying that something is common sense: **It removes the burden of proof from the person making the assertion, and places it on anyone who disagrees with it.** Essentially, it abdicates any responsibility for backing up your position. More than that, it demeans anyone who disagrees, as they obviously lack common sense (whatever *that* might be).

Granted, it's always been a bit hazy exactly who has the burden in any particular case. The negation of an assertion is, of course, another assertion, so who really has the burden of proof? A convenient rule of thumb is that anyone who goes against the conventional wisdom position (the descriptive common sense, basically) assumes that burden, but there are, I'm sure, plenty of exceptions to that. But I argue that in any borderline case, where there's some dispute as to who has the burden of proof, *both* sides should assume that burden.

So when someone writes that something is "just common sense," it almost always turns out (and I'm being as generous as I can here) that they don't exactly know why they hold their position. Or won't say. Or it's just too much trouble to actually work out and explain what their position is. To which I'd say, "Well, then, why are you wasting your time explaining your common-sense position?"

To its credit, the bicycle helmet post actually points this out. From *Doug's Darkworld*:

"It’s just common sense" is probably one of the most seductive and deadly false arguments out there. When someone says "it’s just common sense" what they are really saying is "reality conforms to my idea of what makes sense."

I would say there's other cases, but that is a big reason that people say something is common sense. It puts me in mind of a point made by Michael Shermer. Shermer's a skeptic of possibly the most compelling kind: a recovering occultist (an anti-skeptic, if you will). He wrote a book in 1997 entitled *Why People Believe Weird Things*; he updated it five years later, most significantly including a new chapter entitled "Why *Smart* People Believe Weird Things." In it, he argues the following thesis: Smart people believe weird things because they are skilled at defending beliefs they arrived at for non-smart reasons.

I can't emphasize strongly enough what a transforming revelation that was for me. For much of my life, I'd encounter people holding what (to me, at least) appeared to be some kooky position or another, and my reaction was nearly always something along the lines of "How can you possibly believe that?" And that was a rhetorical question; I wasn't really interested in how they came to believe that, I just wanted to point out that it was a nonsensical position. As you might expect, I eventually came to realize that most people didn't particularly take kindly to that sort of question, so I stopped saying it. But I still thought it.

Shermer's thesis, however, made me start asking that question again, but internally, and this time at face value: Why *do* they hold that position? It's very often not for the reason they espouse. (For instance, most of the common sense cases, I pointed out, are not in fact commonly held.) Maybe it's because of their own personal experience; people tend to overvalue personal experience. Maybe it's because of their religious or cultural upbringing. Or maybe it's a position that has to be taken in order to avoid cognitive dissonance with something that they've done. It's an interesting intellectual exercise, and sometimes I can work it out without coming straight out and asking them, "Well, why *do* you think it's common sense?"

But the other lesson is important, too: When someone says something is common sense, *and* that you should act in such-and-such a way because of it, it's vital not to adopt that common-sense attitude, if you don't already agree with it. It can be surprisingly compelling, if you're not careful (after all, who wants to be demeaned?), and you may sooner rather than later find yourself espousing the same position, seeing as it's "just common sense."

## Monday, April 18, 2011

### When Does It Start?

So I'm driving to work the other day, and I'm stuck behind this car whose driver has decided that today, freeways shall be traversed at the speed of 42 mph. (In reality, I suspect this decision applies to most days, but I'm trying to be conservative here.) And it's almost impossible to pass him, because the stream of cars passing both of us is too dense and too much faster than we are to enter safely.

Eventually, I manage, and at a relatively safe moment, I cast a quick sidelong glance at him and affirm that he's northward of 80 years old. Now, there's a lot of talk that drivers that old should be looked at fairly hard and often to establish that they're able to drive safely, but I'm actually not thinking about that. What I'm thinking about is, at what point did he become a 42 mph kind of driver on the freeway? Was he *always* like that, or did he start out as what most of us would consider an ordinary kind of driver, and over time got slower and slower? I mean, maybe there are places where 42 mph is considered sort of daring, and that's where he grew up.

At this point in the discussion, someone invariably pipes up and mentions that such drivers are in fact safer than those driving at some higher speed, say, 80 mph, on the assumption that 80 mph is just inherently less safe. There's something to be said for that point of view, in that there's less time to avoid impacts if you're driving at a higher speed, and any impacts you do end up in are more dangerous. But that's only part of the picture.

The reality of the situation is that although (in Los Angeles) the freeway traffic occupies a continuum of speeds, most of the traffic (perhaps 90 to 95 percent) falls between 60 and 80 mph. And your risk of impact depends primarily on how often you encounter cars travelling at that range of speed. A long time ago, in grad school, I spent a little time figuring out how often you encounter cars on the road: either passing slower cars, or being passed by faster ones. And what I found was that the details of the speed distribution of cars matter very little. There are only four parameters of interest: the density of cars on the road, the percentage of cars you're faster than, the average speed of those cars that you're faster than, and the average speed of those cars that are faster than you.

That means that you could pretty much figure out the rate at which both my 42 mph driving friend and the hypothetical 80 mph driver would encounter cars by simply assuming everybody else was driving 70 mph. Our superannuated man behind the wheel would encounter cars nearly three times more often than the 80 mph driver, and encounter them at nearly three times the relative rate of speed. To be sure, the combined energy of an actual collision would be greater for 70 mph and 80 mph than it would be for 42 mph and 70 mph, but the increased frequency of encounters and the much shorter period of time drivers would have to avoid them would, I think, more than compensate for that.
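To make that arithmetic concrete, here is a rough sketch in Python. It assumes, as above, that everyone else is driving 70 mph, and that encounters per hour scale with traffic density times relative speed. The function name and the density of one car per mile are placeholders of mine, not measured values.

```python
def encounter_profile(my_speed, traffic_speed=70.0, density=1.0):
    """Relative speed (mph) and encounters per hour against uniform traffic.

    density is a hypothetical cars-per-mile figure; it cancels out of
    any ratio between two drivers on the same road.
    """
    relative_speed = abs(traffic_speed - my_speed)
    return relative_speed, density * relative_speed


slow_rel, slow_rate = encounter_profile(42.0)
fast_rel, fast_rate = encounter_profile(80.0)

print(f"42 mph driver: closing speed {slow_rel:.0f} mph")
print(f"80 mph driver: closing speed {fast_rel:.0f} mph")
print(f"encounter-rate ratio: {slow_rate / fast_rate:.1f}")
```

Under those assumptions the closing speeds are 28 mph and 10 mph, so the ratio is 2.8 for both the frequency of encounters and the relative speed, which is the "nearly three times" figure.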

All in all, I think if you want to drive slower to be safer, you're better off driving 60 mph, or whatever the lower end of speeds is for your road of choice.

Labels:
questionable sanity,
traffic

## Tuesday, March 1, 2011

### A Different Kind of Search

Because I have a bad habit of imagining all sorts of bizarre, improbable scenarios, I dreamt up this one: Suppose I woke up in a strange town, with no idea of who I was or where I came from, or indeed of any of my past. How do I find out?

Of course, I might go to the police or something like that, but perhaps they'd be no more helpful than myself. So I imagined I might start a blog, mostly unlike this one, where hopefully someone would recognize me. I'd start out by saying I'm most likely a missing person. Over time, as I remembered more about my past, hopefully, I'd start putting that down in the blog. Ordinary blog stuff I'd post much as anyone would, but all the identifying information I'd collect in a single post, accumulating edits, so it'd be more easily indexed by Google.

But what kind of stuff would be useful identifying information?

It occurred to me that the usual "obvious" stuff isn't necessarily all that useful in a case like this. My height and weight are not really specific enough to be a useful filter if someone else were looking for me. There are about 100,000 active missing persons in the U.S. (as of the end of 2009, according to the FBI); there must be hundreds that are my height and weight, or thereabouts. A picture would be more distinguishing, but it's hard to search for a picture of someone if all that's known is that they're missing.

So I think I'd try to post stuff that's more distinctively me. Maybe I'd somehow realize that I like to play jazz piano, or enjoy recreational mathematics, or have a rather deep interest in sports and statistics. Maybe I'd recollect some piece of poetry I'd memorized (or even composed). Along with any other more personal tidbits, I'd put them all in my Google flypaper for my old identity.

OK, obviously, I'm surpassingly unlikely to ever need to do anything like this (and it's unclear whether or not this planning would be something I'd remember if I ever did), but then it also occurred to me: Why don't we post this kind of information when we're looking for someone?

When a missing person poster is put up in my neighborhood, it always gives out some basic statistics for them, but the information is mostly generic: a photo, name, date of birth, height, weight, date missing, etc. If a person is amnesiac or doesn't wish to be found, that information is nearly useless—even the photo, since the person may have changed appearance quite dramatically.

What about favorite activities? What about peculiar habits or reflexes? What about a recording of the person's voice? Even people who have forgotten who they are, or wish others would, would not easily change such essential characteristics of themselves. And with the Web as adjunct for missing persons posters, putting up a voice recording is simplicity itself. I'm not suggesting that anything be put up that the family not be comfortable with, but lots of this stuff would violate privacy but little, and would (it seems to me) provide substantial aid in finding the missing person. I'm sure it's done to some limited extent—some posters do list some minor personal details—but a casual survey of local posters shows precious little of it. Can anyone explain why this isn't done more, if only on a voluntary basis? I know if a family member went missing, I'd want the poster to have as much trivial (i.e., non-security) detail as possible on it.

## Thursday, February 17, 2011

### A Little Learning (Game Theory, Part Deux)

Here, as promised, is the dangerous thing.

Suppose you're getting a sequence of playing cards, and you're trying to figure out some statistics for the playing cards. At first, the cards seem utterly random, but after a while, a pattern emerges: There are slightly more black face cards than red ones, and there are slightly more red low-rank cards than black ones. You're a statistician, so you can quantify the bias—measure the correlation coefficient between color and rank, estimate the standard error in the observed proportions, and so forth. There are rigorous rules for computing all these things, and they're quite straightforward to follow.

Except, you're playing gin rummy, and the reason you're receiving a biased sequence of cards is that you're trying to collect particular cards. If you change your collection strategy, you'll affect the bias. You may have followed all the statistical rules, but you've forgotten about the context.

It might seem entirely obvious to you, now that I've told you the whole story, what the mistake is, and how to avoid it, but I contend that a wholly parallel thing is happening in sports statistics. I'm going to talk about basketball again, because I'm most familiar with it, but the issue transcends that individual sport, yes?

I've previously touched upon this, but this time, with the first post on game theory as background, I'm actually going to go through some of the analysis. Again, we won't be able to entirely avoid the math, but I'll try to describe in words what's going on at the same time. If calculus makes you squeamish, feel free to skip the following and move down to the part in bold.

In our simple model, the offense has two basic options: have the perimeter player shoot the ball, or pass it into the post player, and have him shoot the ball. The defense, in turn, can vary its defensive pressure on the two players, and it can do that continuously: It can double team the perimeter player aggressively, double the post player off the ball, or anything in between. We'll use the principles of game theory to figure out where the Nash equilibrium for this situation is.

We'll denote the defensive strategy by b, for on-ball pressure: If b = 1, then all of the pressure is on the perimeter ball-handler; if b = 0, all of it's on the post player. An intermediate value, like b = 1/2, might mean that the defense is equally split between the two of them (man-to-man defense on each), but the exact numbers are not important; the important thing is that the defensive strategy varies smoothly, and its effects on the offensive efficiency also vary smoothly.

Each of the two offensive options has an associated efficiency, which represents how many points on average are scored when that player attempts a shot. We'll call the perimeter player's efficiency r, and the post player's efficiency s. As you might expect, both efficiencies depend on the defensive strategy, so we'll actually be referring to the efficiency functions r(b) and s(b). The perimeter player is less efficient when greater defensive pressure is placed on him, naturally, so r(b) is a decreasing function of b. On the other hand, the post player is more efficient when greater defensive pressure is placed on the perimeter player, so s(b) is an increasing function of b.

Now let's look at this situation from a game theory perspective. Will the Nash equilibrium of this system involve pure strategies, or mixed strategies? (A pure defensive strategy in this instance consisting of either b = 0 or b = 1.) Right away, we can eliminate the pure strategies as follows: If the offense funnelled all of its offense through one of those players, and the defense knew it, they would muster all their defensive pressure on that player. On the other hand, if the defense always pressured one of the players, and the offense knew it, they would always have the other player shoot it. Since those two scenarios are incompatible with one another, the Nash equilibrium must involve mixed strategies. Our objective, then, is to figure out what those mixed strategies are.

The offensive mix, or strategy, we'll represent by p, the fraction of time that the perimeter player shoots the ball. The rest of the time, 1-p, the post player shoots the ball. The overall efficiency function of the offense, as a function of defensive strategy b, is then

Q(b) = p r(b) + (1-p) s(b)

The objective of the defense, in setting its defensive strategy, will be to ensure that the offense cannot improve its outcome by varying its strategy p. That is, it will set the value b such that the partial derivative of Q with respect to p (not b) is equal to 0:

∂Q/∂p = r(b) - s(b) = 0

which happens when r(b) = s(b)—in other words, when the efficiencies of the two options are equal. The offense, in setting its strategy p, will aim to zero out the partial derivative of Q with respect to b:

∂Q/∂b = p r'(b) + (1-p) s'(b) = 0

which happens when

p = s'(b) / [s'(b) - r'(b)]

where b is taken to be the point where the two efficiency curves meet, since the offense knows the defense will play there.
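For anyone who would rather see numbers than derivatives, here is a minimal sketch of the same calculation in Python. The two linear efficiency curves are made up purely for illustration (nothing here is fitted to real basketball data), and the helper names `crossing_point` and `slope` are my own.

```python
def r(b):
    """Perimeter player's efficiency (made-up numbers): falls with ball pressure."""
    return 1.2 - 0.4 * b


def s(b):
    """Post player's efficiency (made-up numbers): rises with ball pressure."""
    return 0.8 + 0.3 * b


def crossing_point(r, s, lo=0.0, hi=1.0, tol=1e-12):
    """Bisect on r(b) - s(b), which decreases in b, to find where r(b) = s(b)."""
    if r(lo) < s(lo) or r(hi) > s(hi):
        raise ValueError("the curves do not cross between lo and hi")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if r(mid) > s(mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def slope(f, b, h=1e-6):
    """Central-difference estimate of f'(b)."""
    return (f(b + h) - f(b - h)) / (2 * h)


# Defense: choose b so that the two options are equally efficient.
b_star = crossing_point(r, s)
# Offense: p = s'(b) / [s'(b) - r'(b)], evaluated at that b.
p_star = slope(s, b_star) / (slope(s, b_star) - slope(r, b_star))

print(f"defensive pressure b = {b_star:.4f}")
print(f"common efficiency    = {r(b_star):.4f}")
print(f"offensive mix p      = {p_star:.4f}")
```

With these invented curves the defense settles at b = 4/7 (about 0.571), both options score about 0.971 points per attempt, and the perimeter player takes 3/7 (about 0.429) of the shots.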

But let's not worry about the offensive strategy; the important thing to take away is that at the Nash equilibrium, the defense will adjust its pressure until the efficiencies of the two offensive options are equal. Let's show what that looks like graphically.

We'll see here how game theory tells us what should be common sense: If the current defensive strategy were somewhere else than at the Nash equilibrium—say, if it were further to the left—the offense could improve its outcome by shifting more of its offensive load to the perimeter player, since he's the more efficient option on the left side of the graph. The reverse holds on the right side of the graph. Only at the point where they cross is the offense powerless to improve its situation by changing its offensive mix, which is exactly the outcome the defense wants.

As a corollary, the exact location of the Nash equilibrium depends vitally on the efficiency functions of the offensive components. If, for instance, one of the efficiency functions drops, the observed efficiency of the offense (that is, the efficiency measured by statistics) will also drop. Let's take a look at that graphically:

In this figure, the efficiency function of the post player, represented by s(b), has dropped. This has the effect of sliding the Nash equilibrium point down and to the right, which indicates increased ball pressure and a decrease in the observed efficiency of both the post player and the perimeter player. It's important to recognize that the efficiency function of a player refers to the entire curve, from b = 0 to b = 1, but when we gather basketball statistics, we merely get the observed efficiency, the value of that curve at a single point—the point where the team strategies actually reside (in this case, the Nash equilibrium).

Consider: Why might the efficiency function of the post player drop, as depicted above? It might be because the backup post player came in. It might be because a defensive specialist post player came in. In short, it might be because of a variety of things, none of which have to do with the perimeter player and his efficiency function—and yet the perimeter player's observed efficiency (whether we're talking about PER, or WP48, or whatever) drops as a result.

There's nothing special about the perimeter player in this regard; we would see the same effect on the post player if the perimeter player (or his defender) were swapped out. In general, the observed efficiency of a player goes up or down owing, in part, to the efficiency function of his teammates.
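The effect is easy to reproduce with toy numbers. The sketch below assumes linear efficiency curves with invented coefficients; the perimeter player's curve is deliberately left untouched, and only the post player's curve is lowered.

```python
def equilibrium(r0, r1, s0, s1):
    """Crossing of r(b) = r0 + r1*b and s(b) = s0 + s1*b.

    Returns (b, efficiency) at the point where the two lines meet.
    """
    b = (r0 - s0) / (s1 - r1)
    return b, r0 + r1 * b


# Perimeter player's curve is identical in both scenarios (made-up numbers).
R0, R1 = 1.2, -0.4

b_before, eff_before = equilibrium(R0, R1, s0=0.8, s1=0.3)
b_after, eff_after = equilibrium(R0, R1, s0=0.7, s1=0.3)  # post curve drops by 0.1

print(f"before: b = {b_before:.3f}, observed efficiency = {eff_before:.3f}")
print(f"after:  b = {b_after:.3f}, observed efficiency = {eff_after:.3f}")
```

In this toy case the equilibrium pressure moves from about 0.571 to about 0.714, and the observed efficiency of both players falls from about 0.971 to about 0.914, even though nothing about the perimeter player's own curve has changed.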

We see here an analogy to the distinction, drawn in economics, between demand and quantity demanded. Suppose we see that sales of a particular brand of cheese spread have dropped over the last quarter. That is to say, the quantity demanded has decreased. Does that necessarily mean that demand itself has dropped? Not necessarily. It could be that a new competing brand of cheese spread has arrived on the market. Or, it could be that production costs of the cheese spread have increased, leading to a corresponding increase in price. Both of these decrease the quantity demanded, but only the former represents a decrease in actual demand. Demand is a function of price; quantity demanded is just a number. If all we measure is quantity demanded, and we ignore the price, we haven't learned all we need to carry on our business. As economists, we would be roundly criticized (and rightly so) for neglecting this critical factor.
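A tiny numerical illustration of that distinction, using a hypothetical linear demand curve whose intercept, slope, and prices are all invented:

```python
def quantity_demanded(price, intercept=100.0, slope=10.0):
    """Hypothetical linear demand curve: units sold per week at a given price."""
    return max(intercept - slope * price, 0.0)


baseline = quantity_demanded(3.0)                    # original curve, original price
price_rise = quantity_demanded(4.0)                  # same curve, higher price
new_rival = quantity_demanded(3.0, intercept=90.0)   # shifted curve, same price

print(baseline, price_rise, new_rival)
```

Both the price rise and the new rival take the number from 70 down to 60, so the number alone cannot tell the two situations apart; only the curve can.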

We are, in the basketball statistics world (and that of sports statistics in general), at a point where all we measure is the number. We don't, as a rule, measure the function. We apply our statistical rules with rigor and expect our results to acquire the patina of that rigor. But we mustn't be hypnotized by that patina and forget what we are measuring. If our aim is to describe the observed situation, then the number may be all we need. But if our aim is to describe some persistent quality of the situation, as must be the case if we are attempting to (say) compare players, or if we are hoping to optimize strategies, then we are obligated to measure the function. Doing so is very complex indeed for basketball; there is an array of variables to account for, and we have at present only the most rudimentary tools for capturing them. It is OK to punt that problem for now. But in the meantime, we must not delude ourselves into thinking that by measuring that one number, we have all we need to carry on our business.

Suppose you're getting a sequence of playing cards, and you're trying to figure out some statistics for the playing cards. At first, the cards seem utterly random, but after a while, a pattern emerges: There are slightly more black face cards than red ones, and there are slightly more red low-rank cards than black ones. You're a statistician, so you can quantify the bias—measure the correlation coefficient between color and rank, estimate the standard error in the observed proportions, and so forth. There are rigorous rules for computing all these things, and they're quite straightforward to follow.

Except, you're playing gin rummy, and the reason you're receiving a biased sequence of cards is that you're trying to collect particular cards. If you change your collection strategy, you'll affect the bias. You may have followed all the statistical rules, but you've forgotten about the context.

It might seem entirely obvious to you, now that I've told you the whole story, what the mistake is, and how to avoid it, but I contend that a wholly parallel thing is happening in sports statistics. I'm going to talk about basketball again, because I'm most familiar with it, but the issue transcends that individual sport, yes?

I've previously touched upon this, but this time, with the first post on game theory as background, I'm actually going to go through some of the analysis. Again, we won't be able to entirely avoid the math, but I'll try to describe in words what's going on at the same time. If calculus makes you squeamish, feel free to skip the following and move down to the part in bold.

In our simple model, the offense has two basic options: have the perimeter player shoot the ball, or pass it into the post player, and have him shoot the ball. The defense, in turn, can vary its defensive pressure on the two players, and it can do that continuously: It can double team the perimeter player aggressively, double the post player off the ball, or anything in between. We'll use the principles of game theory to figure out where the Nash equilibrium for this situation is.

We'll denote the defensive strategy by b, for on-ball pressure: If b = 1, then all of the pressure is on the perimeter ball-handler; if b = 0, all of it's on the post player. An intermediate value, like b = 1/2, might mean that the defense is equally split between the two of them (man-to-man defense on each), but the exact numbers are not important; the important thing is that the defensive strategy varies smoothly, and its effects on the offensive efficiency also vary smoothly.

Each of the two offensive options has an associated efficiency, which represents how many points on average are scored when that player attempts a shot. We'll call the perimeter player's efficiency r, and the post player's efficiency s. As you might expect, both efficiencies depend on the defensive strategy, so we'll actually be referring to the efficiency functions r(b) and s(b). The perimeter player is less efficient when greater defensive pressure is placed on him, naturally, so r(b) is a decreasing function of b. On the other hand, the post player is more efficient when greater defensive pressure is placed on the perimeter player, so s(b) is an increasing function of b.

Now let's look at this situation from a game theory perspective. Will the Nash equilibrium of this system involve pure strategies, or mixed strategies? (A pure defensive strategy in this instance consists of either b = 0 or b = 1.) Right away, we can eliminate the pure strategies as follows: If the offense funnelled all of its offense through one of those players, and the defense knew it, they would muster all their defensive pressure on that player. On the other hand, if the defense always pressured one of the players, and the offense knew it, they would always have the other player shoot it. Since those two scenarios are incompatible with one another, the Nash equilibrium must involve mixed strategies. Our objective, then, is to figure out what those mixed strategies are.

The offensive mix, or strategy, we'll represent by p, the fraction of time that the perimeter player shoots the ball. The rest of the time, 1-p, the post player shoots the ball. The overall efficiency function of the offense, as a function of defensive strategy b, is then

Q(b) = p r(b) + (1-p) s(b)

The objective of the defense, in setting its defensive strategy, will be to ensure that the offense cannot improve its outcome by varying its strategy p. That is, it will set the value b such that the partial derivative of Q with respect to p (not b) is equal to 0:

∂Q/∂p = r(b) - s(b) = 0

which happens when r(b) = s(b)—in other words, when the efficiencies of the two options are equal. The offense, in setting its strategy p, will aim to zero out the partial derivative of Q with respect to b:

∂Q/∂b = p r'(b) + (1-p) s'(b) = 0

which happens when

p = s'(b) / [s'(b) - r'(b)]

where b is taken to be the point where the two efficiency curves meet, since the offense knows the defense will play there.
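To make the two conditions concrete, here is a minimal Python sketch. The efficiency curves `r` and `s` below are made-up straight lines chosen purely for illustration (nothing above specifies actual curves), and the helper names `crossing` and `slope` are likewise hypothetical. The sketch finds the defensive pressure where the curves meet, then plugs the slopes at that point into the formula for p.

```python
# Hypothetical efficiency curves, in points per shot (assumed for illustration).
def r(b):
    """Perimeter player: efficiency falls as on-ball pressure b rises."""
    return 1.2 - 0.4 * b

def s(b):
    """Post player: efficiency rises as on-ball pressure b rises."""
    return 0.8 + 0.4 * b

def crossing(r, s, lo=0.0, hi=1.0, tol=1e-9):
    """Bisect for the b where r(b) = s(b).

    Assumes r > s at lo and r < s at hi, so the curves cross once in between.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if r(mid) > s(mid):
            lo = mid   # perimeter still better: crossing lies to the right
        else:
            hi = mid   # post at least as good: crossing lies to the left
    return (lo + hi) / 2

def slope(f, b, h=1e-6):
    """Central-difference estimate of f'(b)."""
    return (f(b + h) - f(b - h)) / (2 * h)

# Defense: choose b so that r(b) = s(b).
b_star = crossing(r, s)

# Offense: p = s'(b) / [s'(b) - r'(b)], evaluated at the crossing.
p_star = slope(s, b_star) / (slope(s, b_star) - slope(r, b_star))

print(f"b* = {b_star:.3f}, p* = {p_star:.3f}, efficiency = {r(b_star):.3f}")
```

With these particular lines, the curves meet halfway, and the offense splits its shots evenly; different curves would of course move both numbers.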

But let's not worry about the offensive strategy; the important thing to take away is that at the Nash equilibrium, the defense will adjust its pressure until the efficiencies of the two offensive options are equal. Let's show what that looks like graphically.

We'll see here how game theory tells us what should be common sense: If the current defensive strategy were somewhere else than at the Nash equilibrium—say, if it were further to the left—the offense could improve its outcome by shifting more of its offensive load to the perimeter player, since he's the more efficient option on the left side of the graph. The reverse holds on the right side of the graph. Only at the point where they cross is the offense powerless to improve its situation by changing its offensive mix, which is exactly the outcome the defense wants.

As a corollary, the exact location of the Nash equilibrium depends vitally on the efficiency functions of the offensive components. If, for instance, one of the efficiency functions drops, the observed efficiency of the offense (that is, the efficiency measured by statistics) will also drop. Let's take a look at that graphically:

In this figure, the efficiency function of the post player, represented by s(b), has dropped. This has the effect of sliding the Nash equilibrium point down and to the right, which indicates increased ball pressure and a decrease in the observed efficiency of both the post player and the perimeter player. It's important to recognize that the efficiency function of a player refers to the entire curve, from b = 0 to b = 1, but when we gather basketball statistics, we merely get the observed efficiency, the value of that curve at a single point—the point where the team strategies actually reside (in this case, the Nash equilibrium).
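Here is a small numerical sketch of that shift, again in Python. The straight-line curves and the size of the drop (0.1 points per shot) are assumptions made up for this example, and `equilibrium` is a hypothetical helper. The point is only that the perimeter player's curve `r` never changes, yet his observed efficiency does.

```python
# Hypothetical curves (assumed for illustration, not measured data).
def r(b):
    """Perimeter player's efficiency function; unchanged throughout."""
    return 1.2 - 0.4 * b

def s_before(b):
    """Post player's efficiency function before the drop."""
    return 0.8 + 0.4 * b

def s_after(b):
    """Post player's efficiency function after the whole curve drops by 0.1."""
    return s_before(b) - 0.1

def equilibrium(r, s, tol=1e-9):
    """Return (b, efficiency) where r(b) = s(b), found by bisection on [0, 1].

    Assumes r starts above s at b = 0 and ends below it at b = 1.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if r(mid) > s(mid):
            lo = mid
        else:
            hi = mid
    b = (lo + hi) / 2
    return b, r(b)

b0, eff0 = equilibrium(r, s_before)
b1, eff1 = equilibrium(r, s_after)

print(f"before: b = {b0:.3f}, observed efficiency = {eff0:.3f}")
print(f"after:  b = {b1:.3f}, observed efficiency = {eff1:.3f}")
```

In this toy setup the equilibrium slides from b = 0.5 to b = 0.625, and the observed efficiency of both players falls from 1.0 to 0.95, even though only the post player's function moved.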

Consider: Why might the efficiency function of the post player drop, as depicted above? It might be because the backup post player came in. It might be because a defensive specialist post player came in. In short, it might be because of a variety of things, none of which have to do with the perimeter player and his efficiency function—and yet the perimeter player's observed efficiency (whether we're talking about PER, or WP48, or whatever) drops as a result.

There's nothing special about the perimeter player in this regard; we would see the same effect on the post player if the perimeter player (or his defender) were swapped out. In general, the observed efficiency of a player goes up or down owing, in part, to the efficiency functions of his teammates.

We see here an analogy to the distinction, drawn in economics, between demand and quantity demanded. Suppose we see that sales of a particular brand of cheese spread have dropped over the last quarter. That is to say, the quantity demanded has decreased. Does that necessarily mean that demand itself has dropped? Not necessarily. It could be that a new competing brand of cheese spread has arrived on the market. Or, it could be that production costs of the cheese spread have increased, leading to a corresponding increase in price. Both of these decrease the quantity demanded, but only the former represents a decrease in actual demand. Demand is a function of price; quantity demanded is just a number. If all we measure is quantity demanded, and we ignore the price, we haven't learned all we need to carry on our business. As economists, we would be roundly criticized (and rightly so) for neglecting this critical factor.

We are, in the basketball statistics world (and that of sports statistics in general), at a point where all we measure is the number. We don't, as a rule, measure the function. We apply our statistical rules with rigor and expect our results to acquire the patina of that rigor. But we mustn't be hypnotized by that patina and forget what we are measuring. If our aim is to describe the observed situation, then the number may be all we need. But if our aim is to describe some persistent quality of the situation—as must be the case if we are attempting to (say) compare players, or if we are hoping to optimize strategies—then we are obligated to measure the function. Doing so is very complex indeed for basketball; there is an array of variables to account for, and we have at present only the most rudimentary tools for capturing them. It is OK to punt that problem for now. But in the meantime, we must not delude ourselves into thinking that by measuring that one number, we have all we need to carry on our business.

Labels:
basketball,
game theory,
probability,
statistics

## Monday, February 14, 2011

### Matching Up in Hyperspace (or, Thirty Dancing)

Maybe it's because I've been writing about basketball a lot, but I thought today I'd do something a little different before continuing on, as promised, with a second game theory post.

A while ago, I remember reading an analogy about why it is that oil and water don't mix. (I don't remember where I read it, though, so if you recognize it, please tell me.) Is it that water molecules only "like" water molecules, and oil molecules only "like" oil molecules? Not at all—they all like water molecules!

A water molecule is often drawn as H-O-H, but that drawing is a bit misleading. The hydrogen atoms are actually attached at an angle, as below.

This one looks a bit like a Japanese cartoon character, if you ask me. At any rate, this asymmetry, top to bottom (as drawn here), means that we can speak of an oxygen end (the bottom) and a hydrogen end (the top). What's more, because of the way that electrons are arranged in each atom, the oxygen atom tends to draw electrons away from the hydrogen atoms. The oxygen end, so to speak, has more electrons hanging around it than the hydrogen end. Since electrons are negatively charged, the water molecule has a positive pole (the hydrogen end) and a negative pole (the oxygen end), and we say that the water is a polar molecule.

Water molecules attract each other because they are polar. The positively charged hydrogen end of one attracts the negatively charged oxygen end of another. In steam, the gaseous form, this is almost impossible to make out, because the molecules are too far apart and energetic, bouncing around far too wildly to show any mutual attraction. However, in ice, the solid form, the attraction is much more obvious.

It's a bit hard to tell which hydrogen atoms are associated with each oxygen atom, but that's because in ice, the bonds are a bit confused. Even so, however, it's clear that we don't have water molecules bonding together oxygen-to-oxygen, or hydrogen-to-hydrogen. They only attach oxygen-to-hydrogen (in the hexagonal arrangement that yields those lovely snowflakes), because the molecules are polar that way. That's the way water molecules "like" each other.

Liquid water is intermediate between ice and steam. The molecules aren't fixed in place to each other as they are in ice, but neither are they bouncing wildly as they are in steam. Instead, they wander amongst each other, like people milling about in a crowd. And as they wander around, they stick to each other a bit, on account of their polarity. They attach and cohere, which makes water bead up, among other things.

What about oil molecules? Oil molecules tend to be symmetric in such a way that there is no clear polar end as there is in water. As a result, they are much less polar than water molecules are. Nonetheless, being weakly polar (under appropriate circumstances), they "like" other polar molecules, too. So why don't they attach to the water molecules, too?

The reason is that there is only so much room for molecules to attract each other. And here's where the analogy I mentioned earlier comes into play. You often find, at a school, that the most popular kids date other most popular kids (when they date), and the least popular kids date other least popular kids (again, when they date). Why is that? Is it that the least popular kids aren't attracted to the most popular kids? Well, it might sometimes be because of that, but often, they are attracted to the most popular kids; that is, after all, part of what makes someone most popular.

What gets in the way, however, is that the most popular kids, like most others perhaps, are also attracted to the most popular kids, and since such pairings satisfy both attractions, they get paired first. Then the next most popular kids pair up with other next most popular kids; they get paired next. And so on down the line. Or so the story goes.

Of course, it isn't quite that neat and clean with kids, but it is a reasonable approximation of what happens when you combine oil and water. They don't mix because the most popular water molecules hook up with other most popular water molecules, while the least attractive oil molecules are left hooking up with each other.

So much for oil and water. But now let's go back to that analogy, which as it so happens is what I really wanted to talk about. (The rest of that science was just for show?!) It doesn't ring true because we all know couples where we think, "Wow, she paired up with him?" How does that happen? It happens because people aren't one-dimensional.

Suppose all people were one-dimensional. Then you could rate each person with a number x—say, from 0 to 100. (I hate it when things are rated from 1 to 100. What's middle-of-the-road on such ratings? 50.5?) In such a case, if you have two 100's, wouldn't they choose each other above all others? You couldn't easily see a 100 pairing with a 25, if there's another 100 to choose from. Under such circumstances, the nth highest-rated male would always match up with the nth highest-rated female. Just like the kids at our hypothetical school.

Note: For reasons I despise (expositional convenience, basically), I'm writing this out heterosexually. Let it be clear that this isn't mandated in any way, and I'm aware of that. This treatment unfortunately makes it easiest for me to separate out two groups and draw what amounts to a bipartite graph between them. Sorry!

We might say, callously, that only one pair of people would say they feel completely satisfied with the pairing; everyone else is "envious" in the sense that there's someone else with whom they would rather have paired up. That's inevitable with one-dimensional people.

So let's give people another dimension: Let them now be rated with two numbers (x, y). Now, there is no universal and complete ordering on people. We might agree that if someone has both numbers higher than someone else, they are more appealing, but there is no universally accepted way to compare two people with one number higher and one number lower. This is akin to the problem with PER. It's entirely possible that everyone could be envy-free.

Here's what I mean. Suppose you have three males and three females. The three males are (60, 30), (50, 50), and (30, 60). So are the three females. Now there's no way you can say that the (60, 30) male is inherently superior to the (50, 50) male, or vice versa. The same is true of any other two males, or any two females. To decide amongst the alternatives, one needs a discriminating function of some sort. Let's say your function is 2x+2y. Then you would rank your three choices 180, 200, and 180, and you would choose the (50, 50) over either the (60, 30) or the (30, 60). If, on the other hand, your function was 3x+y, you would rank your choices 210, 200, and 150, and you'd choose the (60, 30) over the other two. Finally, if your function was x+3y, you'd pick the (30, 60) first. So it's possible for each of the alternatives to be first in someone's eyes.
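For anyone who wants to check the arithmetic, here is a short Python sketch that scores the three candidates under each of the three discriminating functions. The names `candidates`, `score`, and `favorite` are just labels I've made up for this illustration.

```python
# The three (x, y) candidates from the example above.
candidates = [(60, 30), (50, 50), (30, 60)]

def score(person, weights):
    """Apply a linear discriminating function a*x + c*y to one person."""
    (x, y), (a, c) = person, weights
    return a * x + c * y

def favorite(weights):
    """Return the candidate that this discriminating function rates highest."""
    return max(candidates, key=lambda person: score(person, weights))

# Weights (a, c) for the three functions 2x+2y, 3x+y, and x+3y.
for name, weights in [("2x+2y", (2, 2)), ("3x+y", (3, 1)), ("x+3y", (1, 3))]:
    scores = [score(person, weights) for person in candidates]
    print(f"{name}: scores {scores}, favorite {favorite(weights)}")
```

Each of the three candidates comes out on top under exactly one of the three functions.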

Of course, to be a completely satisfactory pairing, both sides of the pairing must feel they got the best catch. But consider the (60, 30) male. Being a high-x kind of guy, he naturally values x more than y, perhaps, and his discriminating function will reflect that. (Some people, all they care about is x.) He might be exactly the sort of guy with a function like 3x+y, and would therefore pick the (60, 30) female. She, thinking likewise, would pick the (60, 30) male back. Likewise, the (50, 50) people might pair up with each other as mutually optimal choices, and the (30, 60) people too. It doesn't have to match that way, of course; it just has to match one-to-one. Maybe the (60, 30)'s love the (30, 60)'s, for instance, and vice versa.

On the other hand, this matching leaves someone who's (40, 40) out in the cold, because no discriminating function will rate them ahead of everybody else. Whoever matched up with them would always be upset that they didn't at least match up with ol' (50, 50).

It boils down to who's on the orthogonally convex hull. The hull is made up of everyone who isn't universally worse than some other option. An illustration of this in two dimensions should hopefully make it clear why it's called the hull:

It's called an orthogonally convex hull because everyone is contained in it. [Note: I had previously written just convex hull here, but it later occurred to me that what I mean is an **orthogonally** convex hull.] Everyone on the hull could be someone's optimal choice; everyone else would be a consolation prize. It's possible that everyone would be on the hull, but it's unlikely, given a random selection of people.
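A minimal Python sketch of the hull test follows. It assumes that "universally worse" means strictly lower in every dimension; that reading, and the names `beaten_by` and `hull`, are my own choices for illustration. It uses the three example people plus ol' (40, 40).

```python
def beaten_by(a, b):
    """True if person b is strictly higher than person a in every dimension."""
    return all(bi > ai for ai, bi in zip(a, b))

def hull(people):
    """Everyone who is not beaten in every dimension by some other person."""
    return [a for a in people if not any(beaten_by(a, b) for b in people)]

people = [(60, 30), (50, 50), (30, 60), (40, 40)]
print(hull(people))  # (40, 40) drops out: (50, 50) beats it in both dimensions
```

The same two functions work unchanged for tuples of any length, so nothing here is specific to two dimensions.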

Let's not be too hasty, though. There's an interesting dependency between dimensionality and being on the hull. In one dimension, exactly one person is on the hull (barring ties); everyone else is beneath him or her. In two dimensions, it's a bit more complex, but suppose you had a hundred people, evenly spread out between (0, 0) and (100, 100). On average, maybe five people would be on the hull. (The actual average is the sum 1 + 1/2 + 1/3 + ... + 1/100.)

Now let's increase it to three dimensions. If you have a hundred people spread out between (0, 0, 0) and (100, 100, 100), on average about 14 people would be on the hull. All 14 could be the optimal choice for some prospective mate. As the number of dimensions goes up (and the number of possible discriminating functions, too!), the percentage of people on the hull also goes up. With four dimensions, the average number of people on the hull is 28; with five, it's 44; with six, 59—more than half! Ten dimensions are sufficient to push it up to 94, and by the time you have, oh, let's say thirty dimensions, the odds are about a hundred thousand to one in favor of every last person being on the hull (and about ten million to one in favor of any particular person being on it). Remember, it isn't necessary to have a highest value in any of the dimensions to be on the hull; all you need is for there to be no one who is higher than you in all of the dimensions. As the number of dimensions goes up, it becomes awfully unlikely that anyone will beat you in every single dimension. We can have an entirely envy-free matching, all with the help of increased dimensionality.
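The averages for small numbers of dimensions can be computed exactly rather than simulated. The Python sketch below assumes each person's ratings are independent and uniformly spread, and relies on a standard recurrence for the expected number of undominated points: E(n, 1) = 1 and E(n, d) = E(1, d-1)/1 + E(2, d-1)/2 + ... + E(n, d-1)/n. Whether that matches the exact model behind the figures quoted above is my assumption; the function name `expected_on_hull` is hypothetical.

```python
def expected_on_hull(n, d):
    """Expected number of people on the hull among n people rated in d dimensions.

    Assumes independent, continuously distributed ratings, and uses the
    recurrence E(n, 1) = 1, E(n, d) = sum over k = 1..n of E(k, d - 1) / k.
    """
    row = [1.0] * (n + 1)          # row[k] = E(k, 1); index 0 is unused
    for _ in range(d - 1):
        new_row = [0.0] * (n + 1)
        running = 0.0
        for k in range(1, n + 1):
            running += row[k] / k  # partial sum gives E(k, d) one dimension up
            new_row[k] = running
        row = new_row
    return row[n]

for d in range(1, 7):
    print(d, round(expected_on_hull(100, d), 2))
```

For two dimensions this reduces to the sum 1 + 1/2 + 1/3 + ... + 1/100 mentioned earlier, which is a handy sanity check on the recurrence.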

OK, this may seem completely crazy, and I wouldn't blame you for calling shenanigans. Who would actually go and rank people using a set of thirty numbers? But this is exactly what one of those on-line dating sites advertises it does. Well, not exactly; it actually claims to use 29 dimensions. Why 29? I would imagine because it sounds somewhat more scientific than thirty. But beyond that, I think that they use as many as 29 because it makes it almost inevitable that you'll be on the hull, that there'll be someone who you find optimal (or very nearly so), for whom you will likewise be optimal (or very nearly so). And although I think that's partly a marketing gimmick, I think there's some truth to it, too; if there weren't, the human race would have died out long ago.

I mean, how else does Ric Ocasek land Paulina Porizkova? For real, I mean!

Labels:
chemistry,
mathematics,
probability,
romance

## Tuesday, February 8, 2011

### A (Kind of) Gentle Introduction to Game Theory

Prefatory to more basketball talk, I want to take a bit of time out to describe what I think is a rather elegant area of mathematics: game theory. Even the name is elegant—simple and to-the-point. As its name implies, game theory is the study of game strategies and tactics from a mathematical point of view. Rather than describe its foundations and move on from there, as a textbook would, I'm going to leap right in and use game theory in a couple of simple situations, which I hope will be a less obscure way of conveying what it's all about.

Suppose that David and Joshua are playing a friendly game of Global Thermonuclear War. At some point, both players have to decide whether to launch an attack or not. If David attacks and Joshua does not, then David wins and earns 5 points (this is a game, after all) and Joshua earns 1 point for being peaceable. Conversely, if Joshua attacks and David does not, Joshua earns 5 points and David earns just 1 point. If both attack, the Earth is rendered a wasteland and neither side earns any points; if both sides do not attack, everybody wins and both sides earn 6 points. The foregoing can be summarized in tabular form, as below.

David's payoffs are shown in blue, Joshua's in red. Let's run through a simple game-theoretical analysis of GTW. If we focus on just the blue numbers (David's payoffs), we see that if Joshua attacks, David's best strategy is to stand down (1 > 0). If Joshua stands down, David's best strategy is, again, to stand down (6 > 5). No matter what Joshua does, in short, David should stand down.

Moving over to the red numbers, we come to a similar conclusion for Joshua's strategy: No matter what David does, Joshua is better off standing down (either 1 > 0, or 6 > 5). As a result, both sides stand down; the only way to win is, indeed, not to play. Whew!
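The same check can be written out mechanically. This Python sketch encodes the payoffs exactly as described in the text (the dictionary layout and the helper names `best_reply_david` and `best_reply_joshua` are my own, for illustration) and asks what each player's best reply is to each of the opponent's moves.

```python
MOVES = ["attack", "stand down"]

# payoffs[(david_move, joshua_move)] = (David's points, Joshua's points)
payoffs = {
    ("attack", "attack"): (0, 0),
    ("attack", "stand down"): (5, 1),
    ("stand down", "attack"): (1, 5),
    ("stand down", "stand down"): (6, 6),
}

def best_reply_david(joshua_move):
    """David's highest-scoring move, given what Joshua does."""
    return max(MOVES, key=lambda m: payoffs[(m, joshua_move)][0])

def best_reply_joshua(david_move):
    """Joshua's highest-scoring move, given what David does."""
    return max(MOVES, key=lambda m: payoffs[(david_move, m)][1])

for move in MOVES:
    print(f"If Joshua plays '{move}', David should play '{best_reply_david(move)}'")
    print(f"If David plays '{move}', Joshua should play '{best_reply_joshua(move)}'")
```

Every line of output says "stand down," which is just what the table analysis concluded.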

But maybe we shouldn't all relax quite yet. Suppose that we were to adjust the payoff matrix (which is what we call that table up there). Heads of state often get a little nationalistic, and they may well decide that a world without the enemy is better after all, even if we do have to suffer from a little radioactive fallout. At the same time, perhaps it is better to go out fighting and take out the enemy, even as we ourselves are getting wiped out. Then, possibly, the payoff matrix would look like this:

The numbers have changed only a little, but the conclusion is quite different: This time, no matter what Joshua does, David is better off attacking, and no matter what David does, Joshua is better off attacking. The upshot is that both sides end up attacking and wiping each other off the map. The rest of you, I hope you look forward to serving your cockroach overlords...

In the two examples above, the eventual solution has the property that the strategy for both sides was the best they could do, no matter what the opponent did. Such a solution is called a Nash equilibrium, after John Forbes Nash, who won a Nobel Prize in economics for his work in such games. (Yes, the Beautiful Mind guy.) In fact, in each case, the winning strategy was a single choice: either "always attack" or "always stand down."

That is not always the case. Consider a rather less violent game of football (well, somewhat less violent). On a crucial third down, the Steelers can choose to run the ball, or throw the ball; the Packers, on the other hand, can choose to defend the pass or defend the run.

Here's how we might model things. We'll let the payoff be simply the chance, the probability, that the Steelers make a first down. We'll also say that if the Steelers pass, they make a first down 60 percent of the time when the Packers defend the run, but only 20 percent of the time when they defend the pass. If the Steelers run, they make a first down 50 percent of the time when the Packers defend the pass, but only 30 percent of the time when they defend the run. Here's the payoff table:

(We've included the Packers' payoffs in red, although they can be derived from the Steelers' payoff by subtracting from 100 percent.) This time, matters are not as clear cut: For obvious reasons, the Steelers' best option depends on what the Packers do, and the Packers' best option depends on what the Steelers do. But, for our purposes, they both have to show their hands at the same time. What does game theory have to say about this kind of situation?

To figure that out, we'll have to consider a new kind of strategy, called a mixed strategy. A mixed strategy (as opposed to the pure strategies we considered above) is simply one that chooses each of the options with a certain probability. For instance, one possible mixed strategy the Steelers could employ is to run the ball half the time, and pass it the other half. Similarly, the Packers could defend the pass 60 percent of the time, and defend the run 40 percent of the time. There are an infinite number of different mixed strategies both teams could employ. How do we figure out what mixed strategies are actually the best for each side?
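To see what a pair of mixed strategies is worth, here is a small Python sketch that computes the Steelers' overall chance of a first down for the two example mixes just mentioned (pass half the time; defend the pass 60 percent of the time). The four probabilities are the ones in the payoff table; the function name `first_down_chance` is made up for this illustration.

```python
# Chance of a first down for each (Steelers play, Packers defense) pairing.
chance = {
    ("pass", "defend pass"): 0.20,
    ("pass", "defend run"): 0.60,
    ("run", "defend pass"): 0.50,
    ("run", "defend run"): 0.30,
}

def first_down_chance(p_pass, q_defend_pass):
    """Overall first-down probability when the Steelers pass with probability
    p_pass and the Packers defend the pass with probability q_defend_pass,
    each side choosing independently of the other."""
    steelers = {"pass": p_pass, "run": 1 - p_pass}
    packers = {"defend pass": q_defend_pass, "defend run": 1 - q_defend_pass}
    return sum(
        steelers[play] * packers[defense] * chance[(play, defense)]
        for play in steelers
        for defense in packers
    )

print(round(first_down_chance(0.5, 0.6), 4))  # 0.39
```

So with those two particular mixes, the Steelers convert 39 percent of the time. Whether either side could do better by changing its mix is the question taken up next.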

Here's where game theory gets a bit hairy (hence the "kind of" in the title). Essentially, what the Steelers want to do is to make their strategy "resistant" against the Packers, in the sense that no matter what the Packers do, they can't damage the Steelers' chances of making their first down. And the Packers want to set their strategy so that no matter what the Steelers do, they can't improve their chances of making the first down. Such a situation, where neither side can do any better without the other side changing what they do, is also a Nash equilibrium. The brilliant thing that Nash did, which earned him that Nobel Prize, was to show that in such games, there is always a set of mixed or pure strategies that yields a Nash equilibrium.
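For readers who would rather see numbers than calculus, here is a Python sketch of one way to locate such an equilibrium for the payoff table above. It uses the indifference idea directly: each side picks the mix that makes the other side's two options equally good. The variable names and the closed-form expressions for a two-by-two game are my own shorthand, not a general-purpose solver, and they assume neither side has a pure strategy that is always best.

```python
# First-down chances: rows are Steelers plays, columns are Packers defenses.
pass_vs_pass_d, pass_vs_run_d = 0.20, 0.60
run_vs_pass_d, run_vs_run_d = 0.50, 0.30

denom = pass_vs_pass_d - pass_vs_run_d - run_vs_pass_d + run_vs_run_d

# Steelers pass with probability p, chosen so the Packers do equally well
# (or badly) whether they defend the pass or defend the run.
p = (run_vs_run_d - run_vs_pass_d) / denom

# Packers defend the pass with probability q, chosen so the Steelers do
# equally well whether they pass or run.
q = (run_vs_run_d - pass_vs_run_d) / denom

# Resulting first-down chance; by construction it is the same for a pass or a run.
value_if_pass = q * pass_vs_pass_d + (1 - q) * pass_vs_run_d
value_if_run = q * run_vs_pass_d + (1 - q) * run_vs_run_d

print(f"p = {p:.3f}, q = {q:.3f}")
print(f"first-down chance: {value_if_pass:.3f} (pass), {value_if_run:.3f} (run)")
```

With these numbers the Steelers pass a third of the time, the Packers defend the pass half the time, and the first-down chance settles at 40 percent no matter which play is called.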

What follows is pretty heavy mathematical stuff. If you don't want me to go all calculus upside your head, feel free to skip it and go to the conclusion in bold, below. Here's what we do. We characterize the Steelers' strategy by p, the probability that the Steelers pass the ball, and the Packers' strategy by q, the probability that they defend the pass. From the Steelers' perspective, there are four distinct possibilities:

The numbers have changed only a little, but the conclusion is quite different: This time, no matter what Joshua does, David is better off attacking, and no matter what David does, Joshua is better off attacking. The upshot is that both sides end up attacking and wiping each other off the map. The rest of you, I hope you look forward to serving your cockroach overlords...

In the two examples above, the eventual solution has the property that the strategy for both sides was the best they could do, no matter what the opponent did. Such a solution is called a Nash equilibrium, after John Forbes Nash, who won a Nobel Prize in economics for his work in such games. (Yes, the Beautiful Mind guy.) In fact, in each case, the winning strategy was a single choice: either "always attack" or "always stand down."

That is not always the case. Consider a rather less violent game of football (well, somewhat less violent). On a crucial third down, the Steelers can choose to run the ball, or throw the ball; the Packers, on the other hand, can choose to defend the pass or defend the run.

Here's how we might model things. We'll let the payoff be simply the chance, the probability, that the Steelers make a first down. We'll also say that if the Steelers pass, they make a first down 60 percent of the time when the Packers defend the run, but only 20 percent of the time when they defend the pass. If the Steelers run, they make a first down 50 percent of the time when the Packers defend the pass, but only 30 percent of the time when they defend the run. Here's the payoff table:

(We've included the Packers' payoffs in red, although they can be derived from the Steelers' payoff by subtracting from 100 percent.) This time, matters are not as clear cut: For obvious reasons, the Steelers' best option depends on what the Packers do, and the Packers' best option depends on what the Steelers do. But, for our purposes, they both have to show their hands at the same time. What does game theory have to say about this kind of situation?

To figure that out, we'll have to consider a new kind of strategy, called a mixed strategy. A mixed strategy (as opposed to the pure strategies we considered above) is simply one that chooses each of the options with a certain probability. For instance, one possible mixed strategy the Steelers could employ is to run the ball half the time, and pass it the other half. Similarly, the Packers could defend the pass 60 percent of the time, and defend the run 40 percent of the time. There are an infinite number of different mixed strategies both teams could employ. How do we figure out what mixed strategies are actually the best for each side?

Here's where game theory gets a bit hairy (hence the "kind of" in the title). Essentially, what the Steelers want to do is to make their strategy "resistant" against the Packers, in the sense that no matter what the Packers do, they can't damage the Steelers' chances of making their first down. And the Packers want to set their strategy so that no matter what the Steelers do, they can't improve their chances of making the first down. Such a situation, where neither side can do any better without the other side changing what they do, is also a Nash equilibrium. The brilliant thing that Nash did, which earned him that Nobel Prize, was to show that in such games, there is always a set of mixed or pure strategies that yields a Nash equilibrium.

What follows is pretty heavy mathematical stuff. If you don't want me to go all calculus upside your head, feel free to skip it and go to the conclusion in bold, below. Here's what we do. We characterize the Steelers' strategy by p, the probability that the Steelers pass the ball, and the Packers' strategy by q, the probability that they defend the pass. From the Steelers' perspective, there are four distinct possibilities:

- Steelers pass (p), Packers defend the pass (q): Payoff is 20 percent.
- Steelers pass (p), Packers defend the run (1-q): Payoff is 60 percent.
- Steelers run (1-p), Packers defend the pass (q): Payoff is 50 percent.
- Steelers run (1-p), Packers defend the run (1-q): Payoff is 30 percent.

Putting it all together, we get an expression for the Steelers' payoff S:

S = 0.2 pq + 0.6 p(1-q) + 0.5 (1-p)q + 0.3 (1-p)(1-q)

S = 0.3 + 0.3 p + 0.2 q - 0.6 pq
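As a quick sanity check on the algebra, here is a minimal Python sketch that computes the expected payoff directly from the four cases and compares it with the collapsed form. The function and variable names are my own, chosen for illustration; only the payoff numbers come from the table above.

```python
# Payoff table from the text: chance the Steelers make the first down.
# Keys are (Steelers' play, Packers' defense).
PAYOFF = {
    ("pass", "pass"): 0.2,
    ("pass", "run"): 0.6,
    ("run", "pass"): 0.5,
    ("run", "run"): 0.3,
}

def steelers_payoff(p, q):
    """Expected first-down probability when the Steelers pass with
    probability p and the Packers defend the pass with probability q."""
    return (PAYOFF[("pass", "pass")] * p * q
            + PAYOFF[("pass", "run")] * p * (1 - q)
            + PAYOFF[("run", "pass")] * (1 - p) * q
            + PAYOFF[("run", "run")] * (1 - p) * (1 - q))

def steelers_payoff_simplified(p, q):
    """The collapsed form S = 0.3 + 0.3 p + 0.2 q - 0.6 pq."""
    return 0.3 + 0.3 * p + 0.2 * q - 0.6 * p * q

# The two forms should agree on a grid of mixed strategies.
grid = [i / 10 for i in range(11)]
max_gap = max(abs(steelers_payoff(p, q) - steelers_payoff_simplified(p, q))
              for p in grid for q in grid)
print(f"largest difference between the two forms: {max_gap:.1e}")

# The example mixed strategies from earlier: pass half the time,
# defend the pass 60 percent of the time.
print(f"S(0.5, 0.6) = {steelers_payoff(0.5, 0.6):.2f}")
```

Any difference between the two forms should be nothing more than floating-point rounding.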

We want to find the value of p that makes the partial derivative of S with respect to q equal to 0. That is, we need the value of p that makes it utterly irrelevant what the Packers do with their q (as it were).

∂S/∂q = 0.2 - 0.6 p = 0

which happens when p = 1/3. We can do a similar expression for the Packers' payoff P from their side of the payoff matrix:

P = 0.8 pq + 0.4 p(1-q) + 0.5 (1-p)q + 0.7 (1-p)(1-q)

P = 0.7 - 0.3 p - 0.2 q + 0.6 pq

and then the optimal strategy for the Packers is dictated by

∂P/∂p = -0.3 + 0.6 q = 0

which happens when q = 1/2. So the Nash equilibrium happens when the Steelers pass the ball 1/3 of the time and run the ball 2/3 of the time, and the Packers defend the run 1/2 of the time, and defend the pass 1/2 of the time. Under these strategies, the Steelers make their first down 40 percent of the time, and nothing the Steelers do on their own can increase it, and nothing the Packers do on their own can decrease it (given the payoff table). That's what makes it a Nash equilibrium.
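If you'd rather check the result than take the calculus on faith, here is a small sketch using exact fractions. The closed-form expressions for p and q come from setting the two partial derivatives of the bilinear payoff to zero (since P = 1 - S, the condition ∂P/∂p = 0 is the same as ∂S/∂p = 0). The names `a`, `b`, `c`, `d`, `p_star`, and `q_star` are labels I've made up for this illustration.

```python
from fractions import Fraction as F

# Steelers' first-down probabilities from the payoff table, as exact fractions.
a = F(2, 10)  # Steelers pass, Packers defend the pass
b = F(6, 10)  # Steelers pass, Packers defend the run
c = F(5, 10)  # Steelers run,  Packers defend the pass
d = F(3, 10)  # Steelers run,  Packers defend the run

def payoff(p, q):
    """Steelers' expected first-down probability S(p, q)."""
    return a*p*q + b*p*(1 - q) + c*(1 - p)*q + d*(1 - p)*(1 - q)

# dS/dq = p(a - b - c + d) + (c - d) = 0  gives the Steelers' p.
# dS/dp = q(a - b - c + d) + (b - d) = 0  gives the Packers' q.
denom = a - b - c + d
p_star = (d - c) / denom
q_star = (d - b) / denom
value = payoff(p_star, q_star)
print(p_star, q_star, value)

# At p_star the payoff is the same for every q the Packers might pick,
# and at q_star it is the same for every p the Steelers might pick.
probes = [F(k, 4) for k in range(5)]
assert all(payoff(p_star, q) == value for q in probes)
assert all(payoff(p, q_star) == value for p in probes)
```

The printed values should match the 1/3, 1/2, and 40 percent (2/5) found above.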

Notice that none of the pure strategies work as well as the mixed strategies do. If the Steelers always ran the ball on third down, and the Packers knew that, they would just defend the run and limit the Steelers to making first downs 30 percent of the time. It's even worse if the Steelers passed all the time; they'd make a first down only 20 percent of the time. Conversely, if the Packers always defended the run, and the Steelers knew that, they'd just pass all the time and make their first down with 60 percent efficiency. And so on.
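To make the "and so on" concrete, here is a rough sketch that asks what each side can guarantee itself if the opponent knows its strategy and responds as well as possible. Because S is linear in each probability, the opponent's best reply is always a pure strategy, so it's enough to check the two extremes. The helper names are mine, and the grid scan at the end is just a crude numerical search, not a proof.

```python
def S(p, q):
    """Steelers' first-down probability, S = 0.3 + 0.3 p + 0.2 q - 0.6 pq."""
    return 0.3 + 0.3 * p + 0.2 * q - 0.6 * p * q

def steelers_guarantee(p):
    """Worst case for the Steelers if the Packers know p.
    S is linear in q, so the Packers' best reply is a pure defense."""
    return min(S(p, 0.0), S(p, 1.0))

def packers_guarantee(q):
    """Worst case for the Packers (the highest S) if the Steelers know q."""
    return max(S(0.0, q), S(1.0, q))

for label, p in [("always run", 0.0), ("always pass", 1.0), ("pass 1/3", 1 / 3)]:
    print(f"Steelers {label}: at least {steelers_guarantee(p):.2f}")

for label, q in [("always defend run", 0.0), ("always defend pass", 1.0),
                 ("defend pass 1/2", 0.5)]:
    print(f"Packers {label}: Steelers held to at most {packers_guarantee(q):.2f}")

# Scan pass probabilities in steps of 1/300 for the best guaranteed payoff.
grid = [k / 300 for k in range(301)]
best_p = max(grid, key=steelers_guarantee)
print(f"best guaranteed payoff for the Steelers is near p = {best_p:.3f}")
```

The scan should land on p = 1/3, where the Steelers' guarantee peaks at 40 percent, the same answer the derivative gave.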

The salient thing to take from all this, though, which I'll get into in my next post, is that although, at the Nash equilibrium, the Steelers' odds of making the first down don't depend on the Packers' strategy, their best strategy does depend on how good the Packers are at defending the various options (which is represented by the payoff matrix). Although that is hardly earth-shattering in this particular case, we'll see that it has interesting repercussions when trying to rate individual player achievement.

Labels:
football,
game theory,
mathematics,
probability

## Friday, January 28, 2011

### How to Be Wrong, With Statistics!

Please, just stop it. You're hurting me.

Anyone who understands statistics at all cannot dispute that Kobe Bryant does not perform well statistically, in the clutch. But anyone who understands statistics well cannot dispute that the current statistics are woefully under-equipped to discern who is the clutchiest player in the league.

Look: Nothing happens in a vacuum. We look at crunch-time statistics because it's the most exciting part of the game, when it happens. But it's only one way to condition a play.

What do I mean by condition? I mean "to restrict the characteristics of." With respect to comparing players on their clutchiosity, the objective should be to condition the crunch-time plays sufficiently that we are comparing apples to apples, and oranges to oranges. And here, as with many other aspects of basketball, we simply don't have the statistics to do it at our disposal.

For instance, suppose that we wish to compare two players, A and B. Suppose that A's offensive efficiency (points per possession) is greater than B's, with less than 24 seconds on the clock and the team tied or down no more than three points. Does that mean that A is clutchier than B?

Not at all. If B has stiffs for teammates, compared to A, then he's likely going to be faced with tighter individual defense than A, and likely earn a lower offensive efficiency than A. That's a couple of instances of "likely" in there, but the point doesn't have to be ironclad; it just has to be plausible, even probable. We just don't know enough to conclude with anything approaching certainty that A is clutchier, because we haven't conditioned on the teammates. (Or the defense, for that matter.)

Observe that this is mostly independent of what statistic you use to measure clutchiness. Suppose, instead, that you decide to use win probability increment. A player's ability to increase his team's likelihood of winning is still going to be affected by his teammates: If he passes, they will have a lower probability of scoring; if he doesn't, the defense can afford to defend him more tightly.

Of course, maybe you're OK with this kind of quality vacillating with things like which teammates a player has. But personally, I think such a measure has a certain ephemeral aspect that we don't usually associate with clutchiness.

The problem is, how can you possibly condition on the kind of teammates that a player has? Players don't change teammates the way they change their clothes (or at least they shouldn't). So what do you do?

Here's my gentle suggestion: Stop trying to answer these abstract questions statistically. I've been using outlandish forms of the word "clutch" to underscore this, in case you haven't noticed, but my point is serious. Use statistics to answer the questions they can. As the field advances, we'll be able to answer more of these questions, but in the meantime, use the same method we've been using all along: subjective observation. Western civilization didn't break down before we had PER. Nothing hinges on who people outside the game think is clutch. And mostly, stop pretending to any degree of certainty in the matter, just because a number is attached to it.

EDIT: Since I'm a fan of Kobe Bryant, one might reasonably wonder whether or not I've got a built-in bias against crunch-time statistics, since almost all of them (except perhaps a raw count of shots made in crunch time, as opposed to efficiency) point to quite a few players as being superior in the clutch. Obviously, I can't deny said bias. Quite possibly I would not be making these same arguments, or making them with quite the same degree of vehemence, if those statistics showed Bryant in a better light.

That being said, however, I don't think the question of using statistics to examine clutchitude should be predicated on how well they accord with conventional wisdom (where Bryant is, indeed, king of clutch). In my opinion, there are quite compelling fundamental arguments that straightforward linear classifiers such as PER or offensive efficiency or wins produced, conditioned on crunch time or not, are simply not reliable indicators of individual performance, and those arguments would remain valid regardless of whether I espoused them, or of whom they revealed to be the top performers, in crunch time or in the game overall.

Labels:
basketball,
Kobe Bryant,
probability,
questionable sanity,
statistics

## Wednesday, January 5, 2011

### Voter Mixing Equals Criterion Mixing

I'm going to talk about basketball and probability again. Wasn't that obvious from the title of this post?

It's apparently never too early to talk about the MVP award for the NBA. We're coming up on the halfway point of the season, and writers have been tracking the MVP candidates for, oh, about half a season. Nobody takes them seriously until about now, though.

One side effect of the question being taken seriously is that some wag will point out that the MVP is not—and has never been—defined precisely. In fact, I can't find anywhere where it's been defined at all by the NBA, precisely or otherwise. That leaves the voters (sportswriters and broadcasters, mostly, plus a single vote from NBA fans collectively) to make up their own definition, a situation that said wag invariably finds ludicrous.

Well, here's one wag who finds this situation perfectly acceptable. Desirable, even.

Listen: There is no way that everybody will ever agree on a single criterion for being the "most valuable player." Most valuable to whom? The team? The league? The fans? Himself? (I can think of a few players who certainly aim to be most valuable to themselves.) And what kind of value? Wins? Titles? Highlights? Basketball is entertainment, after all. There are just too many different ways to evaluate players.

Instead, we might imagine that some writers would get together at some point and define MVP as a mixture of criteria. For instance, the title of MVP could be based in equal parts (or unequal parts, for that matter) on individual output, contributions to team success, and entertainment value.

Except, I'd argue that that is exactly what we've been doing for all these years. We have all these voters, all of whom have differing ideas of what the MVP does (or should) stand for. Some people think it should be based on individual statistics (Hollinger's Player Efficiency Rating, or PER, is a current favorite). Some people think it should be based, at least in part, on team success, so team wins are an input to the decision (a 50-win minimum is a popular threshold). Still others dispense with explicit criteria altogether and vote based on reputation or flash.

Well, if exactly the same number of voters take each of those different perspectives on MVP, then we will have an MVP based in equal parts on individual output, contributions to team success, and entertainment value. And if more voters lean on individual output than on entertainment value, then the MVP make-up will show that same leaning. Voter mixing equals criterion mixing!
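Here's a toy illustration of that claim, under a big simplifying assumption: each voter hands every candidate a score under that voter's one preferred criterion, and the scores are simply added up. Real MVP ballots don't work this way, so treat this as a sketch of the idea rather than a model of the actual vote. The candidates, scores, and voter counts are all invented for the example.

```python
from fractions import Fraction

# Hypothetical scores (0-10) for three made-up candidates under three criteria.
SCORES = {
    "individual output": {"Player X": 9, "Player Y": 7, "Player Z": 6},
    "team success":      {"Player X": 5, "Player Y": 9, "Player Z": 7},
    "entertainment":     {"Player X": 6, "Player Y": 5, "Player Z": 9},
}

# Hypothetical electorate: how many voters subscribe to each criterion.
VOTERS = {"individual output": 50, "team success": 40, "entertainment": 30}

candidates = ["Player X", "Player Y", "Player Z"]
total_voters = sum(VOTERS.values())

# Voter mixing: tally the ballots one voter at a time.
tally = {name: 0 for name in candidates}
for criterion, count in VOTERS.items():
    for _ in range(count):
        for name in candidates:
            tally[name] += SCORES[criterion][name]

# Criterion mixing: one blended score per candidate, with each criterion
# weighted by the share of voters who hold it.
mixture = {
    name: sum(Fraction(VOTERS[c], total_voters) * SCORES[c][name] for c in SCORES)
    for name in candidates
}

for name in candidates:
    average_vote = Fraction(tally[name], total_voters)
    print(f"{name}: average ballot {float(average_vote):.3f}, "
          f"blended criterion {float(mixture[name]):.3f}")

print("winner:", max(tally, key=tally.get))
```

Under this additive assumption the two columns agree exactly, and shifting voters from one camp to another shifts the blend by the same proportion.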

What's more, this criterion mixing is automatic. No committee needs to be formed, and the exact mixture evolves as the voter population evolves. If someday team success becomes more important to the basketball cognoscenti, then it'll automatically have a larger impact on MVP selection. No redefinition is necessary.

Can this equivalence be demonstrated on any kind of formal level? In something as complex as basketball, my guess is not. But it's close enough, and intuitive enough, that I think it just doesn't make sense to gripe about the MVP lacking a precise definition. As long as each voter comes to their own decision about what it stands for, we'll get the mix that we should.
