Thursday, March 21, 2013

Mad as March


Note: This post has been updated to correct some of the probability figures, and to mention UMBC's defeat of Virginia in the 2018 tournament.
 
Hey, it's March, it's mad, it's March Madness!


Which means that there's math, too.

I was inspired to write math this time (as opposed to all those other times) by a short video in which some mathematics guy explained why filling out a perfect bracket (ignoring the "First Four" and focusing only on the 64 teams in the "real" tournament) is so hard.  According to him, it's because there are 63 games, each one eliminating one of the 64 teams, each of which has to be prognosticated correctly in a perfect bracket.  The number of possible brackets is therefore 2 to the 63rd power, or about 9 times 10 to the 18th power.  And so the odds of filling out a perfect bracket is 1 in that enormous number.  Even if everyone in the whole world filled out a bracket, the odds are still a billion to one against anyone getting it all right.

As viewers too numerous to list pointed out (correctly), this line of reasoning is entirely bogus because it assumes that each of the 63-game sequences is equally likely.  Of course they aren't.  Higher seeds are more likely to win their games than lower seeds.  In particular, in the 28 years they've had 64 teams in the tournament, no 16 seed has ever beaten a 1 seed.  That doesn't mean it's impossible, or that it'll never happen, only that it's very unlikely on a game to game basis.  Eventually, though, it's inevitable, provided the tournament goes on year after year. (ETA 2024-03-29: Sure enough, it's happened twice in the last several years. In 2018, the 16th seed University of Maryland Baltimore County Retrievers beat the top seed Virginia Cavaliers by twenty, 74–54, and in 2023, the 16th seed Fairleigh Dickinson Knights beat the top seed Purdue Boilermakers by the more sedate score of 63–58. Both Cinderellas went on to lose their second-round matchups, though.)

The upshot is that certain brackets are more likely to be correct than others, and I've even heard tell that people have filled out perfect brackets in the past.  So I wondered to myself: What are the odds of someone filling out a perfect bracket?

To estimate that (because this really isn't something you can determine empirically), I had to construct a model for simulating tournaments.  This has to be done because although there are plenty of statistics for how often a 5 seed beats a 12 seed (because that pairing always meets four times in the first round, once for each of the four regions), there aren't going to be statistics for how often a 5 seed beats a 14 seed, because that could only happen in the regional finals, after both teams have defeated three teams (mostly teams better than they are), which is highly unlikely.  In fact, I'm not sure that it's ever happened.  I needed a model that would be reasonably simple to calibrate, quick to evaluate, and was generally applicable to any pair of seeds.

The model I decided upon, fairly quickly, works as follows.  Each team has a certain "strength," which depends only on its seeding.  Then, if teams A and B meet, one with strength SA and the other one strength SB, the probabilities of each of them winning are given by

P(A wins) = SA / (SA+SB)
P(B wins) = SB / (SA+SB)

Fairly straightforward. (ETA 2024-01-21: This turns out to be the Bradley-Terry model. I'm sure it's been independently reinvented many times, since it's such a natural idea.) I calibrated it by finding statistics on how often teams of different seeds made it to various stages of the tournament.  I was a bit surprised, actually, by some of the statistics. I had imagined that the chances of winning the first-round game would decrease very little from the 1 seed down to about the 4 seed or so, and then accelerate quickly down to about the 13 seed, and then decrease very slowly again.  (I was aware that there were some seeds that historically have won more often than you might expect, such as the 12 seed, but I assumed those were statistical anomalies that one could reasonably expect to show up in only thirty years of history.  In particular, I did not want to assume that 12 seeds, for instance, were magically better than 11 seeds, or even 5 seeds.  I assumed seeds properly reflected relative talent.)

But no such pattern appeared.  Instead, the probability of winning the first round seems to decrease fairly steadily from the 1 seed down to the 16 seed.  The 12 seed teams do seem to win slightly more than you might expect, but they win no more often than do 11 seeds.  So here, at least, the 12 seed bump wasn't great.  (EDIT: Ha!  Both 12 seeds playing on the first day of the 2013 tournament won: the Oregon Ducks, and my California Golden Bears.  Further EDIT: And now 12th-seeded Ole Miss.  And 13th-seeded La Salle, for that matter.)

While I was trawling for these statistics, by the way, I also came upon the assertion that the way teams were seeded (and in particular, not re-seeded after each round) placed a penalty on the middle seeds, so that the 12 and 13 seeds, it was claimed, were in some cases likely to advance further than the 8 and 9 seeds, for instance.  I thought it might be interesting to see if that came out of the model.

Anyway, as a result of these statistics, I came up with the following strengths for the 16 seeds:

1, 1/2, 1/3, 1/4, 1/5, 1/6, 1/7, 1/8, 1/9, 1/10, 1/11, 1/13, 1/17, 1/25, 1/41, 1/73

You may notice that these strengths imply that a 16 seed will win the first round 1/74 of the time.  (But they never win, I can hear you saying.  Nonsense.  You're only saying that because no one ever has. ETA: It happened! In 2018, the 16 seed University of Maryland, Baltimore County defeated the 1 seed Virginia. As of this writing, before the 2023 tournament, I think that makes the 16 seed one for 148, or exactly half as often as predicted by my wholly ex recto model.)  As you can see, the top 11 seeds have strengths that decrease harmonically—that is, as the reciprocal of the seed.  After that...well, maybe if you're dorky enough, you'll see what the pattern is.

To be sure, I don't really know how accurate this model is, largely because (as I mentioned previously) statistics for many of the match-ups just don't exist, or at least don't exist in sufficient quantity.  But since I'm taking the results with a grain of salt (I only need rough order-of-magnitude estimates), it only has to be moderately accurate for it to do what I want.

Anyway, my simulation engine runs through each of the 32,768 possible regional brackets (each of the regions is identically seeded) and determines how likely each bracket is, and how far each seed got in that bracket.

The results were sort of interesting.  In the first round, there are no notable oddities, which makes sense because the better teams always play against worse teams, so the higher a team is seeded, the more likely it is to win the first round.

1 seed wins First Round with probability 0.986486
2 seed wins First Round with probability 0.953488
3 seed wins First Round with probability 0.892857
4 seed wins First Round with probability 0.809524
5 seed wins First Round with probability 0.722222
6 seed wins First Round with probability 0.647059
7 seed wins First Round with probability 0.588235
8 seed wins First Round with probability 0.529412
9 seed wins First Round with probability 0.470588
10 seed wins First Round with probability 0.411765
11 seed wins First Round with probability 0.352941
12 seed wins First Round with probability 0.277778
13 seed wins First Round with probability 0.190476
14 seed wins First Round with probability 0.107143
15 seed wins First Round with probability 0.046512
16 seed wins First Round with probability 0.013514

(Remember to take all those digits with a sizable grain of salt.)  The second round is where it gets interesting.  Consider a moderately lower seed, like the 12 seed.  If it happens to win its first round, its second round game will be against either the 4 seed or the 13 seed.  It's likely to be against the 4 seed, but not overwhelmingly so; it will be the 13 seed about 19 percent of the time.  In that latter case, the 12 seed will actually be a slight (57 percent) favorite in the second round.  But even if it meets up against the 4 seed, it will be an underdog, but not a prohibitive one.  In such match-ups, the 12 seed beats the 4 seed about 24 percent of the time.

Compare that to one of the middle seedssay, the 8 seed.  If it reaches the second round, it has to play against either the 1 seed or the 16 seed.  Not only is the 1 seed overwhelmingly likely to win its first round match-up, but it is also a much stronger opponent for the 8 seed than the 4 seed was for the 12 seed.  As I mentioned above, this happens because the teams are not re-seeded after each round, with the best of the surviving teams playing the worst of the surviving teams, the second best playing the second worst, and so on.  The moral of this story is that as far as the Sweet Sixteen is concerned, the pundits are right: The middle seeds are (slightly) cursed!

1 seed reaches Sweet Sixteen with probability 0.882035
2 seed reaches Sweet Sixteen with probability 0.763414
3 seed reaches Sweet Sixteen with probability 0.632753
4 seed reaches Sweet Sixteen with probability 0.496767
5 seed reaches Sweet Sixteen with probability 0.366148
6 seed reaches Sweet Sixteen with probability 0.248486
7 seed reaches Sweet Sixteen with probability 0.148009
8 seed reaches Sweet Sixteen with probability 0.064476
9 seed reaches Sweet Sixteen with probability 0.052084
10 seed reaches Sweet Sixteen with probability 0.080832
11 seed reaches Sweet Sixteen with probability 0.093788
12 seed reaches Sweet Sixteen with probability 0.082892
13 seed reaches Sweet Sixteen with probability 0.054193
14 seed reaches Sweet Sixteen with probability 0.024973
15 seed reaches Sweet Sixteen with probability 0.007745
16 seed reaches Sweet Sixteen with probability 0.001405

Now that I think about it, I think it's this bump for the 10 through 13 seeds that accounts for their "upset" reputation, more than their success in the first round.

The trend continues, though with decreased intensity, in the third round, in order to reach the Elite Eight (although notice that there's a bump at both the 6 seed and the 10/11 seed)...

1 seed reaches Elite Eight with probability 0.732698
2 seed reaches Elite Eight with probability 0.510341
3 seed reaches Elite Eight with probability 0.302688
4 seed reaches Elite Eight with probability 0.127560
5 seed reaches Elite Eight with probability 0.081095
6 seed reaches Elite Eight with probability 0.081461
7 seed reaches Elite Eight with probability 0.056441
8 seed reaches Elite Eight with probability 0.025441
9 seed reaches Elite Eight with probability 0.019169
10 seed reaches Elite Eight with probability 0.024748
11 seed reaches Elite Eight with probability 0.020596
12 seed reaches Elite Eight with probability 0.009123
13 seed reaches Elite Eight with probability 0.004812
14 seed reaches Elite Eight with probability 0.002918
15 seed reaches Elite Eight with probability 0.000807
16 seed reaches Elite Eight with probability 0.000101

...but interestingly, it's essentially gone by the time one reaches the Final Four.

1 seed reaches Final Four with probability 0.535913
2 seed reaches Final Four with probability 0.222277
3 seed reaches Final Four with probability 0.106313
4 seed reaches Final Four with probability 0.053660
5 seed reaches Final Four with probability 0.030044
6 seed reaches Final Four with probability 0.018613
7 seed reaches Final Four with probability 0.011601
8 seed reaches Final Four with probability 0.006982
9 seed reaches Final Four with probability 0.004848
10 seed reaches Final Four with probability 0.003929
11 seed reaches Final Four with probability 0.003042
12 seed reaches Final Four with probability 0.001761
13 seed reaches Final Four with probability 0.000753
14 seed reaches Final Four with probability 0.000221
15 seed reaches Final Four with probability 0.000039
16 seed reaches Final Four with probability 0.000004

There is a bit of an inflection in the curve at about the 10 seed, but it's still monotonically decreasing across all seeds.

Now, one might think that this is just an artifact of the strengths I chose, somewhat arbitrarily, and that either the trend itself, or its vanishing by the Final Four, might not arise with a different set of strengths.  For what it's worth, I thought that myself, and tried running the simulations with different sets of strengths.

As it turns out, with any reasonable set of strengths that I chose, the trend of the middle seeds being somewhat worse off at the Sweet Sixteen than the moderately lower seedsthat trend might arise or not, but if it did, it always disappeared by the Final Four.  I didn't find any set of strengths which retained the trend all the way through to the Final Four.  Heuristically, I think this is because by the time you get to the regional finals, you have to play the best teams in the entire bracketyou can't avoid anyone for sure.  So if your goal is to reach the Final Four, then have no fear: I don't think the lack of re-seeding significantly hurts you if you're in the middle seeds.  Your answer may be different, of course, if your goal is simply to make it to the Sweet Sixteen.

Oh yes, the question that first stimulated this exploration: What are the odds of filling out a perfect bracket?  Well, given the strengths as I initially had them, the most likely bracket is the one where all the favorites win every game.  For each region, this happens about one time in 123; since there are four regions in the entire bracket, the odds on the ultimate "chalk" bracket are that raised to the fourth power, or about one time in 225 million. That leaves the Final Four. Since any seeds can meet there, we'll simplify things and just assume that all three games can go any way, leading to an eight-fold increase in the number of overall tournament brackets, or 1.8 billion.

Some people, not accustomed to thinking through probability, will point out that no bracket has ever turned out perfectly chalk (which is true) and suggest that a slightly off-chalk bracket is individually more likely (which isn't).  One must not confuse "most likely" with "likely."  The reason that each year the tournament turns out slightly off-chalk is that there are enormously more off-chalk brackets than chalk ones.  (Strictly speaking, of course, there is only one perfectly chalk bracket.)  The fact that these off-chalk brackets are slightly less likely than the pure chalk bracket is more than compensated for by their superior numbers.

Anyway, there are a lot of brackets filled out each year, so given that the odds on the chalkier ones are probably not too much worse than 1 in 1.8 billion, it wouldn't be surprising if somewhere along the way, someone did end up filling out a perfect bracket. As far as we know, however, it hasn't happened.

Note: By the way, if you've only a passing familiarity with sports, you may wonder what I mean by "chalk."  Chalk is a sports betting term that refers to favorites winning; the more favorites win, the chalkier the outcome.  I've heard that it came from betting lines on horse races, which were written up in chalk, but I've no good idea if that's actually so.  Anyone?

EDIT: Here's a good write-up on how the term came to be.

Monday, March 4, 2013

A Harry Potter Puzzler

I recently had occasion to put together a bit of a puzzle for a fellow math fiend on the occasion of his birthday.  If you enjoy Harry Potter and recreational mathI know that's a limited bunch, but it occurred to me this blog readership might just select a little for thatyou may well like this small offering:
Gryffindor and Slytherin played a Quidditch match.
  1. Alicia Spinnet scored the first goal of the match, giving Gryffindor its one and only lead of the match until the moment that Harry caught the Snitch, winning the match for Gryffindor. The match was, however, tied three times (other than at 0-0).
  2. Gryffindor never scored two goals in succession; Slytherin always scored at least one goal in between. Furthermore, each pair of Gryffindor goals was separated by a different number of Slytherin goals.
  3. Ginny Weasley and Katie Bell each scored a hat trick (three goals) for Gryffindor.
  4. Each Gryffindor score was assisted by the Chaser who scored the previous Gryffindor goal. Each Slytherin score was also assisted by the Chaser who scored the previous Slytherin goal.
  5. Katie Bell's second goal came one hour into the match. Slytherin scored twice as many goals after that point than they did before it. Interestingly, Graham Montague scored all of his goals before that point, Cassius Warrington scored all of his after that point, and Adrian Pucey scored more than the other two combined.
Who scored the last goal of the match, and what was the final score after Harry ended it?

(Muggles—non-magical folk—may find it helpful to know that in Quidditch, goals are worth 10 points apiece.  Play continues until a player catches the Golden Snitch, which is worth 150 points and ends the match.  There are no other ways to score points.  The team with more points at the end of the match wins.  Assists are passes from one player to a different player leading directly to that second player scoring a goal.  Finally, each team has three and only three Chasers, who score all the goals; the other players are two Beaters, a Keeper, and a Seeker.)