Bracketology: Odds of Deep Runs or Early Outs

Submitted by dmonet on

With all the talk about seedings and potential national championship runs, I've been curious about what the odds look like for Michigan getting to the Final Four / Championship Game / Winning it all.

I decided to take ESPN's bracket and calculate the odds using KenPom's Pyth score to simulate the bracket.  The percentage in each column reflects the odds of the team reaching that round given the opponents they are likely to face along the way.

The percentages are calculated using a composite of a team's odds of winning against all possible opponents they might face in a given round rather than just against the team they are most likely to see.  I think this gives a better overall picture.
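For anyone who wants to reproduce the approach, here is a minimal sketch of the composite calculation described above. It assumes the head-to-head odds come from the log5 formula applied to each team's Pyth rating, which is the usual way to turn two Pythagorean ratings into a win probability; the post does not say which formula was actually used. The function names and the four-team ratings are hypothetical and only there for illustration.

```python
from typing import List


def log5(pyth_a: float, pyth_b: float) -> float:
    """Chance that team A beats team B under the log5 formula.

    Assumes both ratings are strictly between 0 and 1.
    """
    return (pyth_a - pyth_a * pyth_b) / (pyth_a + pyth_b - 2 * pyth_a * pyth_b)


def advancement_odds(pyth: List[float]) -> List[List[float]]:
    """Return odds[r][i]: the chance that team i wins its first r + 1 games.

    `pyth` lists the ratings in bracket order, so teams 0 and 1 meet first,
    the winner meets the winner of teams 2 and 3, and so on.
    The number of teams must be a power of two.
    """
    n = len(pyth)
    reach = [1.0] * n  # every team is certain to play its first game
    odds = []
    block = 1  # size of the sub-bracket each team has already come through
    while block < n:
        new = [0.0] * n
        for i in range(n):
            # Possible opponents are all teams in the sibling sub-bracket.
            start = ((i // block) ^ 1) * block
            for j in range(start, start + block):
                # Weight each head-to-head result by the opponent's chance
                # of actually being there.
                new[i] += reach[j] * log5(pyth[i], pyth[j])
            new[i] *= reach[i]
        odds.append(new)
        reach = new
        block *= 2
    return odds


# Hypothetical four-team bracket, not taken from the table in this post.
example = advancement_odds([0.95, 0.60, 0.80, 0.75])
for round_number, column in enumerate(example, start=1):
    print(round_number, [f"{100 * p:.1f}%" for p in column])
```

A handy sanity check on a calculation like this is that each column adds up to the number of teams still alive in that round: 32 for the Round of 32 column in a 64-team bracket, 2 for the Title Game column, and 1 for Champion.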

Here's what I have:

    Odds of making it to round:
Seed Team Round of 32 Sweet 16 Elite 8 Final 4 Title Game Champion
  SOUTH (North Texas)            
1 Kansas 94.2% 49.6% 39.7% 23.8% 8.1% 3.9%
16 Southern 5.8% 0.4% 0.1% 0.0% 0.0% 0.0%
8 Wisconsin 38.3% 16.5% 12.1% 6.1% 1.6% 0.6%
9 Pittsburgh 61.7% 33.5% 27.3% 16.8% 6.0% 3.0%
               
5 North Carolina St. 56.8% 32.7% 7.8% 2.5% 0.4% 0.1%
12 La Salle 43.2% 22.1% 4.2% 1.1% 0.1% 0.0%
4 New Mexico 65.4% 33.1% 7.3% 2.2% 0.3% 0.1%
13 Louisiana Tech 34.6% 12.1% 1.6% 0.3% 0.0% 0.0%
               
6 UCLA 51.1% 20.3% 5.9% 1.7% 0.3% 0.1%
11 Iowa St. 48.9% 18.9% 5.3% 1.5% 0.2% 0.0%
3 Michigan St. 83.3% 56.0% 24.5% 10.7% 2.6% 0.9%
14 Harvard 16.7% 4.8% 0.7% 0.1% 0.0% 0.0%
               
7 Mississippi 65.1% 20.3% 10.1% 3.6% 0.7% 0.2%
10 Memphis 34.9% 7.2% 2.5% 0.6% 0.1% 0.0%
2 Syracuse 93.9% 71.4% 50.8% 29.1% 10.0% 4.8%
15 Northeastern 6.1% 1.1% 0.2% 0.0% 0.0% 0.0%
               
  WEST (Los Angeles)            
1 Florida 98.7% 91.2% 78.5% 67.0% 54.6% 41.7%
16 Montana 1.3% 0.2% 0.0% 0.0% 0.0% 0.0%
8 Georgetown 48.9% 4.1% 1.5% 0.5% 0.1% 0.0%
9 Colorado 51.1% 4.4% 1.6% 0.5% 0.2% 0.0%
               
5 Wichita St. 70.1% 29.4% 4.5% 1.9% 0.7% 0.2%
12 Maryland 29.9% 7.3% 0.6% 0.1% 0.0% 0.0%
4 Ohio St. 76.7% 53.5% 12.4% 7.1% 3.6% 1.6%
13 Akron 23.3% 9.8% 1.0% 0.3% 0.1% 0.0%
               
6 Nevada Las Vegas 64.1% 25.9% 10.4% 1.7% 0.6% 0.2%
11 Illinois 35.9% 10.3% 2.9% 0.3% 0.1% 0.0%
3 Miami FL 87.6% 60.7% 33.4% 8.3% 4.0% 1.6%
14 Davidson 12.4% 3.1% 0.5% 0.0% 0.0% 0.0%
               
7 Kansas St. 37.9% 11.3% 3.9% 0.5% 0.1% 0.0%
10 Kentucky 62.1% 24.8% 11.4% 2.0% 0.7% 0.2%
2 Arizona 91.6% 62.5% 37.3% 9.7% 4.8% 2.0%
15 Niagara 8.4% 1.4% 0.2% 0.0% 0.0% 0.0%
               
  MIDWEST (Indianapolis)            
1 Michigan 97.0% 82.3% 61.1% 38.1% 22.3% 9.8%
16 Charleston Southern 3.0% 0.5% 0.0% 0.0% 0.0% 0.0%
8 Notre Dame 49.5% 8.5% 3.0% 0.7% 0.2% 0.0%
9 North Carolina 50.5% 8.8% 3.1% 0.8% 0.2% 0.0%
               
5 Creighton 68.3% 47.4% 19.2% 8.8% 3.7% 1.1%
12 Southern Mississippi 31.7% 16.3% 4.2% 1.2% 0.3% 0.1%
4 Oregon 66.1% 27.2% 7.7% 2.5% 0.7% 0.1%
13 Stephen F. Austin 33.9% 9.1% 1.6% 0.3% 0.1% 0.0%
               
6 San Diego St. 56.3% 17.1% 7.2% 2.4% 0.7% 0.2%
11 Belmont 43.7% 11.2% 4.1% 1.1% 0.3% 0.1%
3 Louisville 89.9% 68.9% 46.4% 26.5% 14.8% 6.2%
14 Stony Brook 10.1% 2.7% 0.6% 0.1% 0.0% 0.0%
               
7 Virginia Commonwealth 67.8% 28.6% 10.6% 3.7% 1.3% 0.3%
10 Oklahoma 32.2% 8.7% 2.0% 0.4% 0.1% 0.0%
2 Gonzaga 92.4% 61.5% 29.1% 13.3% 5.9% 1.9%
15 Long Beach St. 7.6% 1.1% 0.1% 0.0% 0.0% 0.0%
               
  EAST (Washington, D.C.)            
1 Duke 94.9% 70.7% 46.0% 26.9% 14.7% 6.0%
16 Western Illinois 5.1% 0.7% 0.1% 0.0% 0.0% 0.0%
8 Baylor 44.5% 11.7% 4.4% 1.4% 0.4% 0.1%
9 Colorado St. 55.5% 16.9% 7.1% 2.7% 0.9% 0.2%
               
5 Minnesota 85.4% 58.9% 29.2% 15.0% 7.0% 2.4%
12 Temple 14.6% 4.3% 0.7% 0.1% 0.0% 0.0%
4 Cincinnati 78.0% 32.6% 11.9% 4.6% 1.5% 0.4%
13 Lehigh 22.0% 4.1% 0.6% 0.1% 0.0% 0.0%
               
6 Marquette 55.9% 31.3% 8.2% 2.4% 0.6% 0.1%
11 Middle Tennessee 44.1% 22.2% 4.9% 1.2% 0.3% 0.0%
3 Butler 70.5% 37.1% 9.7% 2.8% 0.7% 0.1%
14 Valparaiso 29.5% 9.4% 1.3% 0.2% 0.0% 0.0%
               
7 Missouri 39.5% 7.9% 4.0% 1.1% 0.3% 0.0%
10 Oklahoma St. 60.5% 16.3% 9.9% 3.5% 1.1% 0.2%
2 Indiana 96.1% 75.2% 62.0% 38.1% 22.0% 9.6%
15 Florida Gulf Coast 3.9% 0.5% 0.1% 0.0% 0.0% 0.0%

Let me know what you think.

dmonet

January 29th, 2013 at 4:37 PM ^

I totally agree that Florida's chances are pretty ridiculous and that they'll come down as the year goes on. I'll also have to check the math to make sure all the formulas are correct, but I'm pretty sure they are. Just thought it would be interesting to look at how the odds broke down given this is the first time in a long time Michigan has been poised for a deep tourney run. This just emphasizes how much luck is involved. Also, if there is any interest in seeing this regularly, it's pretty easy for me to update.

notYOURmom

January 29th, 2013 at 4:26 PM ^

The "percent chance to be in title game" column adds up to waaaaaaay more than 100%. How can that be? I realize there are two teams in the title game, but any given team can only be on one side of the bracket; they don't get a chance at the other side.

Trebor

January 29th, 2013 at 4:32 PM ^

Well, there's a 100% chance a team emerges from the South/West half of the bracket and a 100% chance that a team emerges from the Midwest/East half of the bracket. So the total should be 200% (but only 100% for the top half and 100% for the bottom half, which by my quick Excel math is the case).

oriental andrew

January 29th, 2013 at 5:15 PM ^

The column headings are a little misleading. OP titled them such that each header indicates the next round the winner would advance to. For instance, Kansas vs. Southern is the first round, but the winner would advance to the round of 32, per the column header. So the Title Game column indicates that the winner in that column would advance to the title game. The last column indicates that the winner would be the Champion.

 

Nick

January 29th, 2013 at 4:52 PM ^

One caveat is that these probabilities of reaching each specific round are determined by the pairings in this bracket.

The problem with that is teams like Wisconsin and Pitt are underseeded compared to their rank on KenPom.

So then Kansas gets matched up with the winner of Wisc/Pitt, a matchup that is unlikely to happen (as is any specific matchup at this point).

The result is Kansas has under a 50% chance of winning two games in this study, solely because of their tough matchup in this mock bracket.

It would be more representative to run a simulation with each team's Pythagorean rating vs. an average Pyth rating for the opponent from each 'pod' in prior tournaments.

dmonet

January 29th, 2013 at 4:58 PM ^

Definitely. I don't have access to that data but I'd definitely be interested in seeing that. I started out just playing with the bracket and it ended up looking like something interesting so I threw it up here. What you mention would do a much better job of answering the question that I was originally curious about. I could also look at what the odds would be using a snake setup based strictly on KenPom or based on the average KenPom ranking for each seed per the ESPN bracket.

Nick

January 29th, 2013 at 5:10 PM ^

Just pointing out that with this methodology, the matchups can create some unintuitive results.

People should view these odds as a decent picture of how this specific bracket would play out, but not exactly take them as representative of how the average bracket out of a huge sample of unique brackets would play out.

StateSmells

January 29th, 2013 at 5:03 PM ^

The 16 seeds have 1, 3, 5 and 6 percent chance of beating their respective 1 seeds. 

If that was at all realistic and roughly representative of probabilities for past years, we should have seen a 16 beat a 1 by now.

swan flu

January 29th, 2013 at 5:14 PM ^

1) It's based on specific KenPom match-ups, not previous performance.

2) Don't fall victim to the gambler's fallacy. If you flip a coin 100 times and it lands 'heads' 60 times, the probability of the next coin flip being 'heads' is 50%, not 60% as your point of view seems to indicate.

LJ

January 29th, 2013 at 6:34 PM ^

I don't think he was using the gambler's fallacy--he was just saying that if an average 16 would really beat an average 1 about 3% of the time, it would be very unlikely to have gone so many tournaments without a 16 ever beating a 1.  Thus, the numbers here are probably being too generous to the 16 seeds, likely for the reason explained by the comment below (the 16 seeds will actually be worse than these teams).
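To put a rough number on that argument, here is a small sketch of the arithmetic. It assumes every 1-vs-16 game is independent and has the same upset chance, which is a simplification, and it takes 112 games as the sample (28 tournaments from 1985 through 2012 with four such games each). The function name is made up for this example.

```python
def chance_of_no_upsets(p_upset: float, games: int) -> float:
    """Chance of seeing zero upsets in `games` independent tries."""
    return (1.0 - p_upset) ** games


# Assumption: 28 tournaments (1985-2012) x 4 one-vs-sixteen games each.
GAMES = 112

# The four upset chances StateSmells read off the table, rounded.
for p in (0.01, 0.03, 0.05, 0.06):
    print(f"upset chance {p:.0%}: no upsets in {GAMES} games "
          f"{chance_of_no_upsets(p, GAMES):.1%} of the time")
```

At a flat 3% per game, a streak of 112 games with no upset would happen only about 3% of the time, which is the sense in which these 16 seeds look too strong.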

rdlwolverine

January 29th, 2013 at 5:19 PM ^

In the actual tournament, the 16 seeds will not be as good as the 16 seeds in this bracket. Some low major schools will win their conference tournaments and grab autobids from teams that are higher ranked on KenPom. Those teams will have much lower chances of pulling the upset than those used here.

swan flu

January 29th, 2013 at 5:09 PM ^

The nit-picking geek in me is annoyed at so many 0.0% probabilities... they really should be <0.1%, since it is NOT mathematically impossible for a 16 seed to win three games.
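One way to get that display is sketched below. The helper name is hypothetical and this is not how the original table was produced; it only shows the rounding rule being asked for.

```python
def format_odds(p: float) -> str:
    """Format a probability (0 to 1) as a percentage with one decimal.

    Nonzero chances that would round down to 0.0% are shown as <0.1%.
    """
    text = f"{100.0 * p:.1f}%"
    if text == "0.0%" and p > 0.0:
        return "<0.1%"
    return text


print(format_odds(0.417))   # a large chance prints normally
print(format_odds(0.0004))  # a tiny but nonzero chance
print(format_odds(0.0))     # a true zero stays 0.0%
```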

LSAClassOf2000

January 29th, 2013 at 5:37 PM ^

If I read this correctly, the table projects about a 14% chance that the champion comes from the South bracket and about a 48% chance that it comes from the West bracket. The Midwest and East brackets would sit around 19% each.
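Those regional figures can be checked by adding up the Champion column region by region. The sketch below copies that column from the table in the original post, in bracket order; the variable names are only for illustration.

```python
# Champion-column percentages copied from the table, in bracket order.
champion_pct = {
    "South":   [3.9, 0.0, 0.6, 3.0, 0.1, 0.0, 0.1, 0.0,
                0.1, 0.0, 0.9, 0.0, 0.2, 0.0, 4.8, 0.0],
    "West":    [41.7, 0.0, 0.0, 0.0, 0.2, 0.0, 1.6, 0.0,
                0.2, 0.0, 1.6, 0.0, 0.0, 0.2, 2.0, 0.0],
    "Midwest": [9.8, 0.0, 0.0, 0.0, 1.1, 0.1, 0.1, 0.0,
                0.2, 0.1, 6.2, 0.0, 0.3, 0.0, 1.9, 0.0],
    "East":    [6.0, 0.0, 0.1, 0.2, 2.4, 0.0, 0.4, 0.0,
                0.1, 0.0, 0.1, 0.0, 0.0, 0.2, 9.6, 0.0],
}

region_totals = {name: round(sum(col), 1) for name, col in champion_pct.items()}
for name, total in region_totals.items():
    print(f"{name}: {total}%")

# The grand total lands slightly off 100% because each entry is rounded.
print("All regions:", round(sum(region_totals.values()), 1))
```

The sums come out to 13.7% for the South, 47.5% for the West, 19.8% for the Midwest and 19.1% for the East.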

It is interesting that - in this projection - one bracket would be that much more imbalanced than the other three by appearances. Even though there are only 3-4 teams in each bracket which would have a >1% chance of being the champion, nobody has quite the easy run like Florida would in this scenario. The two that would come close, at least until later rounds, would be Syracuse and Louisville... perhaps Michigan and Indiana too.

We can compare some of the results with TeamRanking's algorithm. They only discuss the projections for the Sweet Sixteen and the Final Four as well as the  Champion, but the ten most likely to win it all, in their view, are as follows:

 

Team Sweet 16 Final 4 Champ%
Florida 83% 50% 24.70%
Indiana 76% 38% 14.90%
Duke 67% 26% 7.20%
Kansas 68% 27% 7.00%
Louisville 65% 24% 6.80%
Syracuse 63% 21% 5.10%
Michigan 62% 20% 4.70%
Gonzaga 56% 15% 3.10%
Arizona 57% 16% 3.00%
Pittsburgh 42% 12% 2.70%

Interesting to think about this, however, and thanks to the OP for sharing this work.

Nick

January 29th, 2013 at 5:37 PM ^

and I have no freaking clue what numbers they are using to conclude Indiana is 3 to 4 times more likely than Michigan to win the tournament.

It stems from them projecting Indiana to win the Big Ten and get a #1 seed much more often than Michigan, but given the performances of the two teams so far, I have no clue why they project Indiana to get a better seed than Michigan.

Michigan's advanced stats are about equal to Indiana and inferior to only Florida.  I would imagine they're over-weighting defensive efficiency or something, but regardless I think their formula is whack.

swan flu

January 29th, 2013 at 5:52 PM ^

I don't know what formula they used, but your point about them projecting Indiana to win the Big Ten by a slight margin could be the culprit; it could be unfairly magnifying the small difference between the two teams. In a sense they could be using the same data point multiple times, which would result in falsely high confidence? Or I could be monstrously mistaken; it has been a few years since I took stats.

Nothsa

January 29th, 2013 at 5:45 PM ^

KenPom's numbers for Florida are ridiculous, though that's because Florida is absolutely blowing the doors off the SEC. The Gators played a pretty solid preseason slate and, aside from two pretty close non-home losses, have been crushing opponents. The last time they won by less than 15 was... well, last March. They've won 16 games by at least 15 (and often 20+) points.

Are they really that dominant? Hard to say until tournament time, given the quality of the SEC, but this guy is going by Pomeroy's numbers, and that's why they have the big numbers.

DH16

January 29th, 2013 at 6:29 PM ^

I wonder if there was any way to do this with past tournaments, with the probability all calculated before the tournament started, and then after the tournament see how well the predictions held up. I think it would be interesting to see how well these TeamRanking predictions hold up, and other things like how improbable some of the cinderellas are or if some teams perpetually choke away good odds/make a habit of winning with bad odds.

snarling wolverine

January 29th, 2013 at 6:39 PM ^

Given that no #16 seed has ever beaten a #1 (and there have been over 100 of these 1-vs-16 matchups since 1985), shouldn't the #1 seeds each have like a 99.9% chance of winning the first round?  

 

Yeoman

January 30th, 2013 at 5:31 AM ^

KenPom has Florida 7 1/2 points better than the next best team (Michigan), which looks to be the biggest spread there's ever been between #1 and #2 in their eleven years of data.

At Massey, that difference is a little less than 2; Sagarin has it at about 2 1/2 (Sag has Indiana and Louisville ahead of Michigan). I'm using the Power and Predictor ratings here, not the versions with an ELO result built in, since those are the versions comparable to KenPom and they're the ones established as the best predictors.

To put it another way, KenPom doesn't think anyone in the country would be less than a 7-point underdog to Florida; Sagarin thinks there are 16 schools within seven points, Massey thinks there are four.

That seems to be a consistent trend through the years, as far as I can tell. KenPom consistently thinks the strongest teams have distanced themselves by larger spreads from the teams below them.

Does anyone know enough about the systems to know why this would be? Or have any data on how the various systems have done with their predictions, historically?