I like me some stats, boy howdy, but there's a few things I'm not sure about. One is applying Pythagorean wins to football. For those of you who don't know the name of Data's brother, some smart baseball types realized that baseball teams pretty much try to score runs all the time. This means you can predict future performance better with run differential than record.
It works in basketball, too, because basketball teams pretty much try to score baskets all the time. A team leading may try to suck a possession or two out of the game by stalling late, but that effect is extremely minor. It works in hockey because hockey teams pretty much try to score goals all the time. A team leading late will take fewer risks but that effect is minor, too. Futz with the exponents and it's cool.
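For anyone who hasn't seen the formula itself, here is a minimal sketch of it. The function names are made up for illustration, and the exponents are commonly cited values rather than anything from this post (2 is the classic baseball figure, about 2.37 is often used for football, about 13.91 for basketball), so treat them as assumptions. The season totals at the bottom are hypothetical, not real data.

```python
def pythagorean_win_pct(points_for: float, points_against: float,
                        exponent: float = 2.0) -> float:
    """Expected winning percentage from points scored and points allowed."""
    if points_for < 0 or points_against < 0:
        raise ValueError("point totals must be non-negative")
    if points_for == 0 and points_against == 0:
        return 0.5  # no scoring at all: no information either way
    scored = points_for ** exponent
    allowed = points_against ** exponent
    return scored / (scored + allowed)


def pythagorean_wins(points_for: float, points_against: float,
                     games: int, exponent: float = 2.0) -> float:
    """Expected wins over a season of `games` games."""
    return games * pythagorean_win_pct(points_for, points_against, exponent)


# Assumed exponents (commonly cited, not taken from the post).
EXPONENTS = {"baseball": 2.0, "football": 2.37, "basketball": 13.91}

# Hypothetical 12-game football season: 360 points scored, 300 allowed.
expected = pythagorean_wins(360, 300, games=12, exponent=EXPONENTS["football"])
print(f"expected wins: {expected:.2f}")
```

"Futzing with the exponents" just means changing the `exponent` argument: the bigger it is, the more a given scoring ratio gets translated into expected wins.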
You can do this for football as well, but Lloyd/Tresselball observers can tell you that football teams do not try to score points all the time. This is because football has more state (primarily the line of scrimmage) than the other sports, and that state is simultaneously applicable to offense and defense. There is never any reason to not score in baseball or basketball. In football, trying to score is riskier than running three isos up the middle and punting, in a way that missing a jumper is not. Because of this, plus lots of personnel turnover and wildly varying schedules, I don't think raw Pythagorean wins is a particularly useful predictive device. It does correlate some. I just don't like it. I acknowledge this is a Murray Chass sort of criticism.
I bring it up because BHGP has a long post featuring Pythagorean wins that eventually kind of discards the concept by way of praising Northwestern for consistently exceeding expectations. There's a table I'll post a bit later showing eight years of Big Ten performance versus expectations followed up by this:
The fact that most teams have such consistent "luck," when coupled with the fact that close wins and losses appear to be the strongest factor in where a team appears on the list, means this list may not be a measure of "luck," per se, but rather the simple ability to win close games. Since such ability is presumably based in large part on things like on-field experience, efficient playcalling, and clock management, the list could be considered a measure of a coach's in-game ability. Is it any wonder that the conference's biggest late-game buffoon and a geriatric who doesn't even wear a headset sit at the bottom of the list? …
It's also a credit to Pat Fitzgerald and the late Randy Walker at Northwestern. Even in its worst years, NWU has outperformed its pythagorean expectations. In every year included in this study, Northwestern had a positive overall pythagorean margin, and in all but one the LOLcats had a positive margin in conference play.
There is an objection to this based on stock-picking monkeys.
Seriously. In 1999, a six-year-old female monkey named Raven threw darts at a selection of tech stocks that subsequently returned 213 percent. This was a bubble environment, but even in that context her performance was impressive: 22nd amongst thousands of funds. If you had 256 monkeys do that every year, half of the survivors would be discovered to be frauds each year by not beating the market, but you would expect that at the end of that eight-year period there would be one very lucky monkey who beat the market for eight consecutive years.
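The monkey arithmetic is easy to check. This is a rough sketch that assumes every monkey is an independent coin flip each year, with a 50% chance of beating the market, which is a simplification. The function names are hypothetical.

```python
def expected_streaks(pool_size: int, years: int, p_beat: float = 0.5) -> float:
    """Expected number of monkeys that beat the market every year for `years` years."""
    return pool_size * p_beat ** years


def pool_for_one_streak(years: int, p_beat: float = 0.5) -> float:
    """Pool size at which you expect exactly one unbroken streak of `years` years."""
    return 1.0 / p_beat ** years


for pool in (64, 256):
    for years in (6, 8):
        print(f"{pool} monkeys, {years} years: "
              f"{expected_streaks(pool, years):.2f} expected streaks")

print(f"pool needed for one eight-year streak: {pool_for_one_streak(8):.0f}")
```

Halving every year means a pool of 64 is down to one expected survivor after six years, and it takes 256 to expect one survivor after eight.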
Any normally distributed set of data is going to have a lucky monkey and Ron Zook. I present a lucky monkey and Ron Zook:
Wins – Pythagorean expectation, 2002-2010
|Rank||Team||Ov +/-||Conf +/-|
Except… that is not a normally distributed lucky monkey. In conference (which is a more interesting number to me because nonconference schedules are so unbalanced), Northwestern accounts for nearly 70% of the deviation from perfectly Pythagorean records by itself. Lloydball advocates Michigan, OSU, and Wisconsin follow in order, and BHGP points out that Michigan State would be the second luckiest monkey if only the Dantonio era—more MANBALL—was considered. There seems to be something non-monkey there.
But I'm uncertain if that's good or bad if you're a fan. Does this mean manball is good at closing out games, as BHGP suggests the chart shows? It's a possibility. The other possibility (24-21 vs SDSU, 10-7 vs Utah, falling behind by 14 in the Orange Bowl before suddenly remembering David Terrell exists, etc.) is that Lloydball-type play shuts off the offense once it gets a narrow lead or until it falls behind significantly, thus leading to a lot of tight games generally slanted towards wins.
The most haunting stat from the Carr era is this: Carr was actually more likely to win a game if he entered the fourth quarter with a narrow deficit than a narrow lead. Since the point of football is to win more games, period, not more games than you were expected to based on the final score, the excellence of your coaching is bound up with your record. Exceeding expectations as Ohio State means your manball is working (until you get into a championship game). Doing so as Michigan, but never beating Ohio State, means something different.
There's too much weird stuff tied up in scoring points in football to draw many conclusions from a look at just margins. Primarily this comes down to wanting to score, which is a complicated decision based largely on your faith in the defense. This is hard when your defense is good-ish (Michigan) but not when it's terrible (Northwestern) or awesome (Ohio State). OSU and Northwestern rarely make the wrong decisions because theirs are obvious. Michigan (and Iowa, and Penn State) fans are haunted by the decisions that turned out wrong.
BONUS GUESS ON NORTHWESTERN: Why would the Wildcats consistently exceed expectations? Guess: they feature in games with lots of points. Their spread has been as consistently effective as their secondary has been flailing, so a lot of Northwestern games feature large scores. If NW is consistently winning 42-35 that will look different to the formula than OSU grinding out 17-10 wins.
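To put rough numbers on that guess, here is a sketch comparing two hypothetical teams that win every game by the same seven points, one 42-35 and one 17-10. It assumes the standard Pythagorean form with an exponent of 2.37, a commonly cited football value that is not from the BHGP post, and the helper name is made up.

```python
def pythagorean_win_pct(points_for: float, points_against: float,
                        exponent: float = 2.37) -> float:
    """Expected winning percentage; 2.37 is an assumed football exponent."""
    scored = points_for ** exponent
    allowed = points_against ** exponent
    return scored / (scored + allowed)


# Two hypothetical teams, each winning every game by seven points.
shootout = pythagorean_win_pct(42, 35)  # high-scoring games
grinder = pythagorean_win_pct(17, 10)   # low-scoring games

print(f"42-35 team expected win pct: {shootout:.3f}")
print(f"17-10 team expected win pct: {grinder:.3f}")
print(f"gap: {grinder - shootout:.3f}")
```

Under those assumptions the 42-35 team comes out around .61 and the 17-10 team around .78. The formula cares about the ratio of points, not the margin, so the shootout team gets less credit for the same seven-point wins and has more room to "exceed expectations."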
BONUS LOCALLY RELEVANT SECTION: FWIW, only one Michigan team shows up at the margins. If you think about it you'll probably figure it out:
Of course, using the full schedule allows for statistical variance based on strength of non-conference scheduling. If we look solely at Big Ten play, as close to a level playing field as we can get, Sparty still wins. It's just not 2010 Sparty:
|Rank||Team||Py +/-|
|1||2008 Michigan State||+2.16|
|2||2004 Northwestern||+1.77|
|3||2010 Michigan State||+1.69|
|4||2004 Michigan||+1.63|
|5||2009 Northwestern||+1.53|
That 2008 Spartan squad went 9-4 (6-2) despite a total margin of victory of +28 and an in-conference margin of -7. In fact, 2008 Michigan State was one of just five teams since 2002 to post a winning record in the Big Ten despite being outscored in conference play.
The 2004 Michigan team that went to the Rose Bowl despite deploying a freshman quarterback, thanks to things like nailcoeds.exe, outperformed Pythagorean expectation significantly. You might be all like "a HA!" because Michigan slumped to 7-5 the next year, but they went 11-2 the year after that. There's just so much noise.