OK, so that happened. The great thing about the Big Ten standings is that they don't take 19-point first-half deficits into account. :-) Unsurprisingly, the performance in West Lafayette reduced KenPom's probabilities of Michigan victory in each of the next three games; combined with an Illinois victory over Nebraska, the remaining schedule looks a couple of percentage points harder than it did a few days ago.
Having said that, GRIII's layup turned a 74% (pregame) probability of victory into 100%, which more than makes up for the slightly smaller future percentages. Here's an updated chart of Michigan's expected final record, again to two significant figures:
Staee hasn't played; their chart is nearly unchanged:
Wisky, now alone in third place:
|11-7 or worse||13%|
Ohio, still up half a game on Iowa:
|11-7 or worse||67%|
|11-7 or worse||79%|
(Nebraska, unsurprisingly, has been eliminated from title contention.)
Combining the various scenarios, we get the following chance of winning the title with the given record (i.e., the rows add up to 100% and represent the probability of each outcome if Michigan achieves the record listed).
|Record||Outright Title||Shared Title||No Title|
Multiplying by the data in the first table -- the chance that Michigan achieves each of these records -- gives us an 83% chance of an outright title (up from 75% on Sunday) and a 15% chance of a shared title, for a whopping 98% chance of hanging a Big Ten championship banner for 2014.
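For anyone who wants to see the "multiplying" step spelled out, here is a minimal Python sketch of it. It is only an illustration of the law of total probability, not the actual spreadsheet behind the tables. The function name `title_chances` is made up, and the input numbers are placeholders chosen for easy arithmetic rather than the figures from the tables above. The only grounded assumptions are that winning out means an outright title and that 14-4 guarantees at least a share.

```python
def title_chances(record_probs, conditional):
    """Combine P(record) with P(outcome | record) via the law of total probability."""
    totals = {"outright": 0.0, "shared": 0.0, "none": 0.0}
    for record, p_record in record_probs.items():
        for outcome, p_outcome in conditional[record].items():
            # Weight each row of the conditional table by the chance of that record.
            totals[outcome] += p_record * p_outcome
    return totals


# PLACEHOLDER inputs for illustration only, NOT the numbers from the post's tables.
record_probs = {"15-3": 0.5, "14-4": 0.5}
conditional = {
    "15-3": {"outright": 1.0, "shared": 0.0, "none": 0.0},  # winning out = outright title
    "14-4": {"outright": 0.5, "shared": 0.5, "none": 0.0},  # hypothetical split
}

result = title_chances(record_probs, conditional)
print(result)
```

Each row of the conditional table sums to 100%, so the three totals sum to 100% as long as the record probabilities do.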
With the victory over the Boilermakers, Michigan has also locked up a first-round bye in the Big Ten tournament. (I'll spare you the details, but suffice it to say that even if they lose out, Michigan's worst possible finish is a tie for fourth, and the winning record against the top teams in the B1G standings would pay dividends in any tiebreaker).
Clinching/elimination scenarios for the remainder of the week and weekend:
As always, Go Blue!
(Edit: corrected a typo - "play dividends" -> "pay dividends")
makes my brain ache the morning after a comeback road win.
But it's a good ache.
98% chance...all you need to know.
Where can I find the odds of winning each of the remaining games? Which is the toughest contest left?
Minny is still fighting to make the tournament, while Illinois' and Indy's bubbles will probably have burst by the time we fight them.
But winning at Illinois is probably going to be harder, especially since they are playing at their best (in conference).
I got these from Ken Pomeroy's site at www.kenpom.com ($). As of today, he has Michigan at 82% to beat Minnesota, 63% to win at Illinois, and 86% to beat Indiana. These numbers may fluctuate from day to day depending upon what each team does and, to a lesser extent, what everyone's opponents do.
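If you want to turn those three numbers into a chance of each final record, a rough Python sketch follows. It assumes the three games are independent, which is a simplification, and the function name `win_distribution` is just something made up for this example.

```python
from itertools import product


def win_distribution(win_probs):
    """Return {number_of_wins: probability}, assuming the games are independent."""
    dist = {k: 0.0 for k in range(len(win_probs) + 1)}
    # Enumerate every win/loss combination and add up its probability.
    for outcome in product([True, False], repeat=len(win_probs)):
        p = 1.0
        for won, p_win in zip(outcome, win_probs):
            p *= p_win if won else 1.0 - p_win
        dist[sum(outcome)] += p
    return dist


# Win probabilities quoted above: Minnesota, at Illinois, Indiana.
remaining = [0.82, 0.63, 0.86]
dist = win_distribution(remaining)
for wins in sorted(dist, reverse=True):
    print(f"{wins}-{len(remaining) - wins} the rest of the way: {dist[wins]:.1%}")
```

Under that independence assumption, the chance of winning out is simply 0.82 x 0.63 x 0.86, or about 44%.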
I appreciate the post. I find this sort of thing useful, if only to clue me in on how much butt-ache I should feel if we fall flat in a particular game.
I'll hang up now and listen to the show.
Thanks for updating this... I was hoping you would. Looking forward to the next one after (hopefully) the next Michigan win.
I'm glad you've enjoyed the posts. I definitely plan to have an update this weekend. In a perfect world, the posts keep shrinking as teams get eliminated. :-)
Hypothetically (but in no way wanting this): Is there a head-to-head tie breaker if we tie MSU? Say we both go 14-4, do we win the B10 championship since we are 2-0 against MSU? Not sure what the B10's rules are on this.
At that point it's just a co-champion situation. But we would be the top team in the BTT at that point because of the tie-breaker.
Makes sense. Thanks for the info.
I fear this will be unpopular, but I find these just utterly over-mathing it and really kind of silly. No offense to the OP, who is just providing information, but I don't see the world this way.
All three games remaining will be specifically challenging.
Minnesota is desperate and they have a very athletic team. If you watch the first game at the barn, it is very hard to figure out how we're going to win as you watch through the meat of that game. They have Andre Hollins back, they have some new contributors who have emerged, Walker can be really troublesome down low - that is a very capable, very athletic team.
Indiana? Is anyone really thinking that's a gimme? Really? They present specific matchup problems for us - I mean, I'd rather take us on our home court than them, but still, difficult game.
And Illinois also has some athletes and has gotten on a mini-run. And hates us. Winning on their court is an obvious challenge. And they also have particular players who have emerged late and have turned that into a somewhat different, more dangerous team.
These games need to be played, and sweated through, and won, and no amount of math is going to push us over the finish line.
This is the sort of thing where I'd rather let it play out, root game for game, and fervently hope for the best. I know we are supposed to like math here, but in this instance, I think it's missing the reality of the games ahead. Basically, we've got three athletic teams left, each with reason to play us very hard.
Why should we let math and tons of data give us a sense of what's likely to happen, when we can instead use our (largely biased / unprofessional) own intuition.
No predictive analysis is going to be perfect, but with all due respect, I trust KenPom a little more than you.
"Why should we let math and tons of data give us a sense of what's likely to happen, when we can instead use our (largely biased / unprofessional) own intuition."
I'm not surprised that J took no offense, but predictably, some others did.
Here's the deal, as I see it - these percentages are marginally more useful with three games remaining than they were with four - we have three games remaining, and our opponents have... some games of their own. Comparing the three remaining games on each schedule and the chances of each of the leaders getting nicked in any of these games, I mean, it's of some interest to look at and it tells you... something - but math simply can't or does not yet reflect the whole story here.
Because it's not just intuition that told you that the math didn't fully account for the trouble we might have with Purdue on their court. The Johnsons are very athletic but limited skills players. Quite simply, their numbers against other teams will not accurately reflect their potential numbers against the type of team we tend to put on the floor. Because we tend to be limited athletically on defense. It's not a coincidence that the Johnsons have gone off against us - they are very confident they can get whatever shot they want against our defense, and I guarantee you those guys circle the UM games on the calendar.
Against an athletic defender, they are a bit screwed - a guy who can check them step for step will require advanced basketball skills to score against, and they don't have those. They rely on being able to blow by their defender, or on a short game of floaters and shorter jumpers which you can only get if you can consistently go around the man defending you. Which they can do against our teams, because on ball defense isn't a particular strength of ours.
Ken Pom will never - or certainly does not yet - reveal some of the specifics inside the numbers like that particular quirk of the matchup. Teams are just represented as a set of overall numbers, and while these numbers are being improved on each year, I'm pretty sure they don't yet reflect something like, say, Noah Vonleh's relative success against post players 6' 8" and below vs. success against post players above 6' 8".
Btw, I have prepared statistics professionally for a political think tank of ex-McKinsey employees, so I am not math averse. I just tend to think that slavish devotion to some of these stats-based analyses is misguided, and it is important to see the limitation in the numbers. As one small example, the % likelihood of a win thing they used to run (and probably still run) on ESPN for MLB games - those were freaking absurd. They didn't even take into account things like the numbers of the closer who was responsible for the remaining outs - and yet people slavishly watch the little win % thing go up and down because it's comforting and it's easy.
I think it's a question of what you expect/want from the numbers.
You're right that these kinds of probabilities are far from perfect. They don't account for many potentially important variables, some like the ones you mention, and the algorithms could be messy to begin with. (Truthfully, I've been more wary than most about KenPom ever since his Wisconsin problem, and I'm skeptical of his top 5 right now.)
Still, they probably aren't terribly far off. Knowing what I know about Michigan (likely more than KenPom's formula) and Michigan's opponents (likely less than KenPom's formula), I'd come up with win probabilities pretty close to the ones he's estimating. We could both be way off, but at the end of the day, who cares? These numbers are fun to look at and tell us something that's pretty objective even if it's also pretty flawed. As a human observer of my favorite team, I'm flawed and not objective, so I appreciate these kinds of insights.
"I think it's a question of what you expect from the numbers."
I mean, that.
All I'm asking for is a reasonable relationship with the predictive limitations, which you and danross (below) both acknowledge.
I just think it's possible to go overboard in the level of devotion to such analysis because it's comforting.
As danross points out, the 27 games that came before - it's non-mathy, but the pressure on each team and the way they respond to the pressure of having three games to decide the title - limits the predictive match between those 27 games and the three remaining games with the title on the line.
Like I said, I'm specifically not math averse - I used to trade options for a living - but I'm still someone who after all this time believes that the last three outs of a nine inning game are simply not the same as the 24 outs that came before. Which has solid logic behind it - teams don't play those last three outs the way they played the previous 24 - which means that the math must adjust to reflect the different approach by batters batting in that inning, along with different managerial strategies for that inning that also necessarily present different variables.
I haven't checked recently, so perhaps the math has started to catch up to these realities, but I know that for a solid decade, people who hadn't actually thought about the math slavishly insisted that the ninth inning was the same as any other inning. Now you can make the argument that other outs in the game are more important, or that taken as a whole, the idea of the closer is overvalued, but your math is incomplete if you don't have a full reflection of all the variables that should go into making your math an accurate representation of reality. Which is the true goal of statistics.
I also have a math/analytics background and it's interesting to me that there seem to be two personality types who are wary of stats-based predictions. The first are people who hate numbers and/or don't get them at all. The second are people who love numbers and/or spend a lot of time thinking about them. From my experience, it's the group in the middle - some training in statistics, a sense that stats are valuable - that tends to put the most faith in these kinds of analyses.
I'm in 100% agreement with you (and danross) about the limitations. I think there's a serious role for human eyes and context in all of this. I also think there's a role for KenPom-style analysis, so I'm glad that's out there, too.
Man, that is so dead on, in my opinion.
I just think that it's like, you use math as a guide, and then your next question has to be: "okay, now why is the math wrong?".
And I think that often has to do with putting a human eye on it, and then trying to see if you can tinker with actual math to represent the additional factors that may occur to you in a further analysis.
I realize that my initial response may have come across as math-hating to some, but I like to think I'm in exactly the latter group that you describe.
At the point where you're considering specific limitations in the Kenpom algorithm to justify your belief in a higher level of uncertainty (and backing it up with your own professional statistics credentials), aren't you being just as mathy as the OP? Certainly you're being at least as slavish about the precise application of the statistical method (on an entertainment site) as the OP is to Kenpom's numbers.
In any case, what you're really asking for (and are not yourself providing) are not really different numbers, just bigger error bars - the fact that Kenpom can't account for everything means that a margin around the nominal probability would be warranted.
Also, you yourself are abusing statistics a bit by using a single sample to argue for the invalidity of the prediction. After all, Michigan only had a 74% win probability, according to Kenpom, at the start of the Purdue game. A Michigan loss would not have made him 100% wrong, any more than the eventual Michigan win means that he should have predicted a 100% win probability.
Somehow, there's some fundamental misunderstanding in what you think it is that I've said.
Basically, I wanted to spark a discussion - which has generally been a success. A small number of people are downvoting without, it seems to me, reading the larger context I've laid out in the rest of my posts, but that can be collateral damage when you look to engage and get a conversation going.
It's this part of your post I'm struggling with: "Also, you yourself are abusing statistics a bit by using a single sample to argue for the invalidity of the prediction."
I don't see how you're getting that. Nowhere am I suggesting that the outcome of the Purdue game is some sort of mathematical proof that Ken Pom is flawed. That would be pretty kooky. And we actually won. So it would be truly strange.
I'm just showing how one prediction can be incomplete, and I would hope you would do the extrapolation that is implicit in what I'm saying. The implicit point is that, although Ken Pom might predict well if the concern is 100 subsequent games, certain games will present more missing variables than others. With three games left (not 100), these missing variables can carry substantial weight, and there may be reason to see the three remaining games as specifically less likely to hew to the mean.
So, as you yourself have said, bigger error bars, which becomes more of a concern when there are only a handful of games where you get to flip that coin.
Obviously, I certainly hope things go according to plan, but personally, there's not a whole lot of alleviation of stress. I still feel that the three remaining games present specific challenges, and that Ken Pom may be a bit off in evaluating the specific challenges they may present.
Although I suppose I will concede the point if your complaint is with my statement that this was "over-mathing" it. That sort of is what I was trying to say (if I'm right that the predictive capability has limits over these three games, then look with your eyes, as well), but it sort of isn't what I was trying to say (my challenge does involve looking inside the numbers supporting the math).
FWIW I didn't downvote you. My point is I don't really get why you felt the need to write the equivalent of a couple pages of text to basically say "statistics don't tell everything", because I don't think anyone actually disagrees with you on that point.
The OP just wanted to have a little fun playing with the Kenpom numbers, not promoting "slavish devotion" to the predictions, and honestly you seemed to be taking it too seriously. We're not talking about cancer drug trials here. I mean, your points regarding some areas you think Kenpom doesn't handle well are interesting, but I don't think you needed to wrap it in an overall critique of the purpose of statistics in sports, which turned into a pretty major threadjack.
This is supposed to be fun, and while this OP seemed to take your points well, in general I think we ought to encourage more unpaid writers to share their number crunching with the blog, and your critique could be discouraging.
Anyway not trying to be mean or anything, just noting your original couple posts came off as a bit persnickety given the topic.
No offense taken. As a Michigan fan, I never take any win for granted -- I think we've all learned that the hard way, unfortunately. :-( I take some comfort in the fact that Pomeroy's system, which has, in the past, proven to be reasonably good at judging the strength of various teams, gives us a good chance to win the games.
The non-math version is as follows: Win out and get the first outright Big Ten title since 1986. Win two out of three and guarantee at least a share of the title, and then root for Iowa or Ohio (ugh) to beat Staee. Win one out of three, and there's still a chance that we'll back into a share of the title, but we'll all be sad anyway.
Of course, that's not really enough for a diary. ;-)
Yep, no one's going to read my explanation of why the ken pom analysis may have specific limitations. That was more predictable than anything ken pom can come up with - I guarantee you that. It's good to like math - it's bad to be intolerant of anyone who questions analysis or points out the limitations of a specific method.
Thank you, at least, for not taking offense. I knew it would be unpopular to question, but the people who take math seriously and use it for a living will all tell you it is essential to question and challenge what is presented.
Thank you for your in-depth replies. I've given you a couple of upvotes to balance things out a little bit. :-)
I should be very clear about this -- upsets happen all the time. It is a common fallacy to extrapolate probability into certainty: "KenPom says Michigan has an 82% chance to win this game; therefore, Michigan will win." That's one of the reasons I've done this analysis, by the way -- the numbers are more meaningful when applied to the remainder of the season than they are when applied to an individual game (i.e., I have more faith in the win/loss histogram than I do in each individual game prediction).
Fans use win probability data because it's fun. Factoring the strength of the batter and the closer into the equation sounds good, but often the sample sizes are so small that you get less error simply by using the averages. (One of my pet peeves is the idea that a batter "owns" a pitcher, or vice versa, based upon a sample of 20 or 30 at bats).
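To put a rough number on the small-sample problem, here is a back-of-the-envelope Python sketch. It uses the normal approximation to a binomial proportion, which is crude at this sample size. The .300 average and the helper name `batting_avg_interval` are assumptions picked purely for illustration.

```python
import math


def batting_avg_interval(avg, at_bats, z=1.96):
    """Approximate 95% interval for a batting average (normal approximation)."""
    # Standard error of a proportion: sqrt(p * (1 - p) / n).
    se = math.sqrt(avg * (1.0 - avg) / at_bats)
    return avg - z * se, avg + z * se


# Hypothetical matchup: a batter hitting .300 over 25 at bats against one pitcher.
low, high = batting_avg_interval(0.300, 25)
print(f"Plausible true average: {low:.3f} to {high:.3f}")
```

With only 25 at bats, the interval runs from roughly .120 to .480. That range is wide enough to cover both "owns him" and "can't touch him," which is why the season-long averages are often the safer input.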
If I had access to data about how teams fared against defensive schemes similar to Michigan's (which, unfortunately, seems to be "Let's play H-O-R-S-E!" far too often), I would love to incorporate it. :-)
Anyway, I will happily concede that none of this analysis will mean anything if Michigan doesn't go out and get the wins it needs down the stretch. But it's fun to talk about nonetheless.
Thanks for your willingness to question the sacred cows!
It can lead to some flak, but I'm usually going to stick my neck out when it comes to questioning any orthodoxy.
As to the 9th inning, you're certainly going to have sample size problems - acute when dealing with specialists and pinch hitters (who are also specialists) - but, at a minimum, the fact that umpires can squeeze the zone in the 9th inning changes things, and then you have certain teams that are particularly good at working the zone and turn those screws up in the 9th, and so then your analysis should at least reflect, say, an instance where a closer who has a tendency to struggle throwing strikes faces a team that has a higher walk rate in, say, innings 7-9 than the rest of the league.
And, unless things have recently changed, it has been my understanding that this sort of thing has not been reflected in the numbers that are presented.
And that's definitely what I'm saying in terms of fans using win probability because it's fun (and kind of easy (I mean, hey, no analysis necessary! Just go to the win %age!)).
For our team, I'm afraid it's not just a defensive scheme - it's also that we don't put five defenders on the court who can defend on the ball. Yogi Ferrell presents similar overall quickness problems for us, although unlike the Johnsons, he also has advanced skills. Walton had better pack a lunch for that game. I love Spike to death, but I have my doubts about his ability to stay in front of and with Yogi (but man, what a heady play that charge was at the end of regulation. Can't say enough good things about that kid).
I love the math, stats and defendable arguments you see here so frequently, and I love that someone took the time to do the math on our odds. Great OP.
Having said that, I also agree with the notion behind your possibly-unpopular comment despite spending an entire career in the analytics business. What models like KenPom, Sagarin, et al are great at is evaluating a lot of data to get to an empirical ranking of how teams have performed relative to each other. One should predict the same thing as KenPom does every time merely because anything else is guessing.
Statistical models aren't as great though at predicting the very next outcome based on the last 27 because the forecasted sample set is small. If we had 27 other games to play against the remaining teams on our schedule, the KenPom outcome is likely to be a lock. But we don't. Look no further than yesterday's gut-churner - obviously did not unfold like KenPom would have predicted.
The brass ring here is the first outright Big Ten title since 1986. The players know that, the coaches know that, and so do the rest of us. KenPom can't factor in the pressure on these guys and the myriad other factors that can lead to let-downs, etc. It also can't factor that into how hard Staee is going to play to see if they can prevent it. That is why we play the games.
Ultimately, it is pretty comforting to know that if we play the way we have played for the last 27 games we should expect to be the only school hanging a banner this year. What a testament to the team and its coaching staff that they are in this position. Beat the Gophers!
Thank you for your post. Please see my follow-up response above, where I discuss some of what Ken Pom is simply not going to reflect with its presentation of teams as a set of numbers based on averages from games that may have varying degrees of relevance to the specific matchup on a given night.
I don't have a full career worth of background as you do, but like you, I have prepared statistics as my job - for some guys with pretty hefty analytical credentials. And it is, at least in some part, my familiarity with the use of numbers to reflect and present real world realities that makes me sensitive to the inherent limitations.
We as fans can do and say whatever we want regarding outcomes, scenarios, and possibilities. We are not the team. It is not as if Nik Stauskas is going to fire up mgoblog today, read this diary, and text everyone on the team, "HEY YOU GUYS WE ALREADY WON ALL WE GOTTA DO IS SHOW UP WOOOOOO!!"
Michigan has one of the greatest coaching staffs in all of college basketball. JB and co. will not let their guys look ahead, especially after last night. We all know that anyone can beat anyone on any given night in this conference. Relax and enjoy the ride!
For B1G seeding purposes I think we want to have MSU and Wisconsin as the 2 and 3 seeds to give Michigan the best shot at winning.
Michigan is playing for NCAA seeding now, keep winning and they can get to the 2-line!
As much as UM seems to have State's number this year, I wouldn't necessarily like our chances of winning three in a year against them. Wisconsin is still Wisconsin, so I don't want any part of that. My ideal BTT bracket (based on some sense of reality with the standings) would be...
|4. Ohio State|
|11. Penn State|
Would be great to have Nebrasketball jump Iowa for the 5 seed, and have us play the winner of OSU and NEB, keeping MSU, Wisconsin, and Iowa all on the other side of the bracket. Last night's Nebraska loss didn't help though...
While I know every remaining game will be a dog fight, I am particularly nervous about Indiana. The matchup against Yogi Ferrell and Noah Vonleh scares me.
Indiana at Assembly: 12-3 (3-3)
Indiana on the road: 2-7 (2-6)
I'd upvote if I could!
I'm sorry to ask, but where did the STAEE moniker really come from? Was that photo of the chest-painted Sparties misspelling it real or photoshopped? I ask because I want to know if it's true or not, as that would truly be hilarious.
Someone had spraypainted it on a car in Ann Arbor, if I recall correctly. Not sure if it was real.
I'd start here -
They can knock down shots if given space and we're not the best defensive team out there. Most importantly, we're playing them at their place on only 2 days of rest after playing run-n-gun Minnesota. We're going to be tired as hell. I hope we put an early, big lead on the Gophers so we can get Stauskas, LeVert, and Robinson some rest. Bigs, too. I'd like to see us play with a big lead and a lot more Irvin and Bielfeldt.
Everyone made a fuss about our 5 games in 13 days, and I get it, because that was also about who we faced in that gauntlet, but we've got 3 games in 6 days here and 4 games in 9 days. Every team, iirc, has had to deal with this, but it still seems kind of ridiculous.
Amazing to think that we're guaranteed to match or surpass last year's B1G win total. I don't think the league is quite as strong at the top as it was last year but still, without Burke, THJ and McGary, that's a remarkable accomplishment.
So with the Ohio State loss, they are completely eliminated, correct? This means we can ignore them winning for the rest of the season? The only game we care about with them is Michigan State, correct?
And then there were three. With OSU and Iowa losing, it's down to Michigan, MSU, and Wisconsin.
I'd imagine that our KenPom probabilities actually got worse tonight, though, now that some of MSU's remaining schedule looks easier (Iowa & OSU) and ours looks harder (Indiana).
KenPom agrees with your analysis; Staee's probability of winning out goes from 16% to 18%; Michigan's stays the same, at 44% (within the limits of rounding, anyway). Accordingly, the estimates of Michigan's title chances drop slightly. I didn't run the complete analysis, but the only scenario where Iowa or Ohio's record mattered was the chance of an outright title should Michigan lose out, which was never very likely in the first place. Take yesterday's estimates, subtract a percentage point or two, and you're likely right on.*
Having said that, the part of my fan psyche that says "never take anything for granted" is happy to be able to scratch two more contenders from the list. :-) Michigan is now guaranteed no worse than the #3 seed in the Big Ten tournament.
* There is a significant chance that the error inherent in these percentages is large enough that the actual chances, whatever they are, haven't changed at all. But, for want of better data... :-)