Putting this season into context Part I: Season Score Metric v2.0

Submitted by taistreetsmyhero on November 19th, 2017 at 12:18 PM

This is Part I of the Season Score Metric v2.0 diary series. Part II is here.

Last week, I set out to create a "Season Score" metric in an effort to provide meaningful context, beyond mere win-loss record, for comparing the last few seasons under Harbaugh to the Michigan glory years. The result was this diary and a scoring metric that was well received for its novelty, but thoroughly critiqued for its huge limitations. To summarize, here is that scoring metric:

(Wins x Win%) + Quality Wins - Bad Losses + (Wins vs. Opponents with Winning Records x Win% vs. Opponents with Winning Records) + 5 if Beat Ohio + 3 if Beat MSU + 3 if Win Bowl + 5 if B1G Champion

And here are the major caveats:

  • The relative strength of OSU and MSU in a given year is ignored
  • A Quality Win vs. the #1 team is treated the same as a win vs. the #25 team
  • Shared vs. outright B1G Titles are treated the same
  • All advanced stats are ignored

The responses to the diary also pointed out that the metric was missing the major components of expectations and improvement. It rankled at least one person that Harbaugh's seasons were rated relatively poorly, as there was no credit given to the huge turnaround he managed so quickly.

As a result, the Season Score metric created a couple of wonky rankings and didn't pass the eye test for several seasons. As user Lumpers pointed out:

"...There is no way the 1985 team with that defense (3 straight shutouts during the season), a 10-1-1 record and a victory over Nebraska in the Fiesta bowl is not one of the top 10 seasons in M history over the past 49."

And so, I decided to go back to the drawing board and try to address some of these issues. If you want to get right into the new metric, feel free to skip ahead or check out the new metric for yourself. However, before I dive in, I just wanted to quickly touch on the merit of a season score metric in general.

Is a successful season more like a fine wine or an S&P ranking?

~~Chuck says, "The answer is always wine."~~

I was initially surprised by responses claiming that season success is an inherently subjective matter. User ChiBlueBoy summarized:

"I also appreciate trying to put some numbers to something that will always be subjective (in Jr. High I created a mathematical formula to determine if someone was "attractive," so the desire to quantify the subjective resonates with me)."

My first reaction was to scoff at this comparison. Perhaps that was only because I had spent a decent chunk of time making the scoring metric, but I viewed the idea as more like S&P's attempt to make objective measures of a team's offense and defense. Just as it is valid to say that a running play is objectively "successful" if it gains at least 4-5 yards on first down, it is valid to say that objective components of a "successful" season include beating our rivals and winning the Big Ten title.

I think ChiBlueBoy mentioned the key contention:

"In the end, all of this is very subjective, and I imagine that each of us would come up with a different formula."

Very true. In fact, I've done just that and made an entirely new metric. However, I would counter that some formulas are objectively better than others. And, if you continue reading, I think you'll agree that this new metric is objectively better than the previous one. Nevertheless, the point still stands to a very real degree. This new and improved metric doesn't account for everything and changing the metric weights here or there would alter the rankings significantly.

Ultimately, the most powerful utility of this metric is therefore comparing seasons between tiers, rather than within. All else being equal, a season that ends with beating OSU is objectively better than one that ends with losing to OSU. But, was the 1980 Michigan team better than the 1985 squad? Well, Chuck says that exercise is as subjective as comparing fine wines.

I think an apt analogy would be that this metric is like PER for basketball players. It provides meaningful context to make comparisons between players and gives objective evidence to say that LeBron James is better than Reggie Jackson. But PER alone can't answer who is better between, say, Steph Curry and James Harden.

Season Score Metric v2.0

The new metric is computed using the sum of several individual component scores as follows:

Season Score = Expectation & Improvement Score + Wins Score - Loss Score + OSU Score + MSU Score + Bowl Score + B1G Champ Score + National Champ Score
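Expressed as code, the composition above is just a sum with the loss term subtracted. Here is a minimal Python sketch (function and parameter names are mine, not from the diary); each component is computed as described in the sections that follow:

```python
def season_score(e_and_i, wins, losses, osu, msu, bowl, b1g, natl):
    """Sum the component scores; the Loss Score subtracts from the total."""
    return e_and_i + wins - losses + osu + msu + bowl + b1g + natl
```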

In Part I, I'll go through each component metric individually. Part II will look closer at this season's score and try to project the future based on previous responses to down years in Michigan history.

Expectation & Improvement Score

Had MGoBlog existed all the way back in 1969, this would have been a much easier exercise. Unfortunately, there is no easily accessible database of well-informed predictions for how the Bo, Moeller, and Carr teams would perform going into the season. What we do have is preseason polls. Now, I acknowledge that the preseason AP poll is almost entirely pure conjecture. But it does give some standard measure of how Michigan was expected to perform relative to its peers every season. And we would all agree that it was disappointing when the preseason #5 2007 Michigan team finished #18, and that the 1997 season was especially amazing considering they were ranked #14 going into the season.

So, long story long, the expectation component is derived from the preseason AP ranking. I did this by taking Michigan's final AP rankings over the last 49 years (using record rank when the team was unranked by the AP) and graphing them against each season's winning percentage:

I then plugged the preseason AP ranking into the formula from that fit line to get Expected Win%, and then multiplied by the number of games in a season to get Expected Wins. The Expectation Score is thus:

Actual Wins x Actual Win% - Expected Wins x Expected Win%

As an example, Michigan was ranked #7 going into last season, which correlates to 10.4 wins. This season, Michigan was ranked #11 going into the season, which correlates to 8.8 wins (for the regular season). IMO, these pass the eye test.
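The rank-to-wins step can be sketched as follows. The diary doesn't give the fit-line equation, so the linear coefficients below are hypothetical stand-ins chosen only to roughly reproduce the two examples in the text (#7 preseason over a 13-game season, #11 over a 12-game regular season):

```python
# Hypothetical linear fit: preseason AP rank -> expected winning percentage.
# The real diary fit comes from 49 years of data; these coefficients are
# illustrative stand-ins, not the author's actual numbers.
def expected_win_pct(preseason_rank, intercept=0.917, slope=-0.0167):
    """Map a preseason AP rank to an expected winning percentage."""
    return intercept + slope * preseason_rank

def expected_wins(preseason_rank, games):
    """Expected Wins = Expected Win% times the number of games."""
    return expected_win_pct(preseason_rank) * games

print(round(expected_wins(7, 13), 1))   # ~10.4, matching the #7 example
print(round(expected_wins(11, 12), 1))  # ~8.8, matching the #11 example
```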

The Improvement Score was based on a much simpler formula:

Wins x Win% - Previous Season Wins x Previous Season Win%

Finally, the Expectation and Improvement Scores are combined and weighted:

  • (Expectation Score + Improvement Score) / 3
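Putting the two formulas and the weighting together (function names are mine):

```python
def expectation_score(wins, win_pct, exp_wins, exp_win_pct):
    """Actual Wins x Actual Win% minus Expected Wins x Expected Win%."""
    return wins * win_pct - exp_wins * exp_win_pct

def improvement_score(wins, win_pct, prev_wins, prev_win_pct):
    """This season's Wins x Win% minus the previous season's."""
    return wins * win_pct - prev_wins * prev_win_pct

def e_and_i_score(expectation, improvement):
    """The two components are summed, then weighted down by a factor of 3."""
    return (expectation + improvement) / 3
```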

Wins Score

My previous metric looked only at "Quality Wins," which were defined as "wins against opponents that were ranked (in the AP poll) when they played Michigan AND finished with a winning record, OR wins against opponents that finished the season ranked (In the AP poll)."

The problem was that it weighted a win against the #1 team the same as against the #25 team, and gave no boosts for beating the #26 or #27 team.

This new metric scraps the idea of Quality Wins and instead just looks at the winning percentage of defeated opponents with the Michigan game removed from their record:

Wins Score = Wins x Win% x Opponent Win % (Mich game removed)
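One plausible reading of that formula, assuming "Opponent Win %" means the average across all defeated opponents (ties in old records are ignored for simplicity; helper names are mine):

```python
def opp_win_pct_without_mich(opp_wins, opp_losses):
    """Opponent win% with their loss to Michigan removed from the record."""
    games = opp_wins + opp_losses - 1  # drop the Michigan game
    return opp_wins / games if games > 0 else 0.0

def wins_score(wins, win_pct, beaten_opponents):
    """beaten_opponents: list of (wins, losses) full-season records."""
    if not beaten_opponents:
        return 0.0
    avg_opp = sum(opp_win_pct_without_mich(w, l)
                  for w, l in beaten_opponents) / len(beaten_opponents)
    return wins * win_pct * avg_opp
```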

Losses Score

The previous metric included "Bad Losses," which were defined as "losses against opponents that were not ranked (in the AP poll) when they played Michigan AND were not ranked (in the AP poll) at the end of the season."

This again relied too heavily on AP rankings. The new metric scraps the idea of Bad Losses and instead looks at the loss percentage of opponents that defeated Michigan (again with the Michigan game removed):

Losses Score = Losses x Loss% x Opponent Loss % (Mich game removed)
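The mirror image of the Wins Score sketch, under the same averaging assumption (here the opponent's win over Michigan is the game removed; names are mine):

```python
def opp_loss_pct_without_mich(opp_wins, opp_losses):
    """Opponent loss% with their win over Michigan removed from the record."""
    games = opp_wins + opp_losses - 1  # drop the Michigan game
    return opp_losses / games if games > 0 else 0.0

def losses_score(losses, loss_pct, losing_opponents):
    """losing_opponents: list of (wins, losses) records of teams that beat Michigan."""
    if not losing_opponents:
        return 0.0
    avg_opp = sum(opp_loss_pct_without_mich(w, l)
                  for w, l in losing_opponents) / len(losing_opponents)
    return losses * loss_pct * avg_opp
```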

Rival Score

The previous metric gave 5 points for an OSU victory and 3 points for a MSU victory and 3 for a Bowl Game win.

This totally ignored the fact that a win against a 3-9 MSU team is not as impressive as winning the Rose Bowl.

The new metric accounts for the rival and bowl opponent winning percentage, and is broken down as follows:

  • OSU Score: If win, then +3 x OSU Win % (Mich game removed); If loss, then -3 x OSU Loss % (Mich game removed)
  • MSU Score: If win, then +1.5 x MSU Win % (Mich game removed); If loss, then -1.5 x MSU Loss % (Mich game removed)
  • Bowl Score: If win, then +1.5 x Bowl Opp Win % (Mich game removed); If loss, then -1.5 x Bowl Opp Loss % (Mich game removed)
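The three bullets share one shape, so they can be sketched with a single helper (names are mine; `opp_pct` should be the opponent's win% with the Michigan game removed after a Michigan win, or their loss% with the Michigan game removed after a Michigan loss):

```python
def rival_score(won, opp_pct, weight):
    """Positive weight x opponent win% on a win; negative weight x loss% on a loss."""
    return weight * opp_pct if won else -weight * opp_pct

def osu_score(won, opp_pct):
    return rival_score(won, opp_pct, 3.0)

def msu_score(won, opp_pct):
    return rival_score(won, opp_pct, 1.5)

def bowl_score(won, opp_pct):
    return rival_score(won, opp_pct, 1.5)
```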

B1G Champ Score

Whereas the previous metric awarded 5 points for every Big Ten title, the new metric differentiates between shared and outright titles:

B1G Champ Score: If outright, then 3; If shared by multiple teams, then 3 / # of teams sharing; If Division Champ but lose Conference Championship, then 1.5
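As code, one reading of those rules (function and argument names are mine):

```python
def b1g_champ_score(outright=False, teams_sharing=0, division_champ_only=False):
    """3 for an outright title, 3 split evenly for a shared title,
    1.5 for a division title without the conference championship."""
    if outright:
        return 3.0
    if teams_sharing > 1:
        return 3.0 / teams_sharing
    if division_champ_only:
        return 1.5
    return 0.0
```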

National Champ Score

Finally, the last component accounts for winning the national championship, with theoretical playoff wins along the way:

National Champ Score: 5 points for National Championship; 3 x Playoff Opp Win % (Mich game removed) for every Playoff victory
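A final sketch for this component (names are mine; `playoff_opp_pcts` holds the win%, Michigan game removed, of each playoff opponent Michigan defeated):

```python
def nat_champ_score(won_title, playoff_opp_pcts):
    """5 for the title itself, plus 3 x opponent win% per playoff victory."""
    score = 5.0 if won_title else 0.0
    score += sum(3.0 * pct for pct in playoff_opp_pcts)
    return score
```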

Top Ten Seasons

| Year | Coach   | Season Score | Record | Final AP | Preseason AP | E&I Score | Wins Score | Loss Score | OSU  | MSU   | Bowl  | B1G | Nat. Champ |
|------|---------|--------------|--------|----------|--------------|-----------|------------|------------|------|-------|-------|-----|------------|
| 1997 | Carr    | 23.03        | 12-0   | 1        | 14           | 3.39      | 6.82       | 0.00       | 2.50 | 0.95  | 1.36  | 3   | 5          |
| 1980 | Bo      | 12.52        | 10-2   | 4        | 11           | 1.38      | 4.11       | 0.09       | 2.45 | 0.45  | 1.23  | 3   | 0          |
| 1985 | Bo      | 12.42        | 10-1-1 | 2        | 29           | 2.66      | 5.14       | 0.02       | 2.45 | 0.95  | 1.23  | 0   | 0          |
| 2011 | Hoke    | 10.78        | 11-2   | 12       | 34           | 2.97      | 5.50       | 0.11       | 1.50 | -0.35 | 1.27  | 0   | 0          |
| 1971 | Bo      | 10.40        | 11-1   | 6        | 4            | 0.85      | 4.09       | 0.02       | 2.00 | 0.90  | -0.41 | 3   | 0          |
| 1989 | Bo      | 10.23        | 10-2   | 7        | 2            | 0.00      | 4.27       | 0.04       | 2.18 | 1.09  | -0.27 | 3   | 0          |
| 2003 | Carr    | 10.13        | 10-3   | 6        | 4            | -0.44     | 4.11       | 0.17       | 2.75 | 1.00  | -0.13 | 3   | 0          |
| 1991 | Moeller | 10.07        | 10-2   | 6        | 2            | 0.00      | 4.47       | 0.03       | 2.18 | 0.45  | 0.00  | 3   | 0          |
| 1988 | Bo      | 10.01        | 9-2-1  | 4        | 9            | 0.41      | 3.23       | 0.02       | 1.20 | 0.82  | 1.36  | 3   | 0          |
| 1986 | Bo      | 9.74         | 11-2   | 8        | 3            | 0.07      | 5.01       | 0.10       | 2.50 | 0.90  | -0.14 | 1.5 | 0          |

The 2011 Hoke season sticks out like a sore thumb, not only because of the doom that followed, but also because that team wasn't really that great. It is the only one of these seasons being propped up by the Expectation & Improvement Score. That said, the 2011 team did exceed expectations, made a huge improvement over the previous year, and was certainly an exceptional season (for both positive and negative reasons). Outside of that year, this metric performs much better than the previous one on the eye test.

The 2011 Hoke Season Score does serve as a reminder that this metric does not measure the true quality of a team. The 2006 Carr team was an all-time quality team, but it missed out on glory and thus doesn't match up to these seasons in terms of overall success. The 2016 Harbaugh team was certainly better than the 2015 team, but the 2016 team underperformed relative to expectations, whereas the 2015 team greatly exceeded expectations and made huge improvements relative to the 2014 Hoke disaster.

Performances by Coach

As I explained earlier, the greatest utility of this metric is to break the seasons into tiers. This is the breakdown of the seasons into quartiles by Season Score, with Final AP rank as the eyeball test:

The biggest oddball is the 1975 Michigan squad, which finished 8-2-2 and ranked #8. The Season Score, however, puts it in the lowest quartile. That seems valid to me, given that they came into the season ranked #2, lost to OSU, lost the Orange Bowl, didn't get any share of a Big Ten title, and only beat a bunch of body bags. Sounds pretty bad to me, but I wasn't alive then, so I invite any MGoHistorians to add their perspective.

With those season quality quartiles established, we can look at how each coach has performed:

I hope this provides a little more perspective for people. Even Bo had down years. When you look at Season Scores over time, you get an even better sense that Michigan's success has ebbed and flowed. It looks like a company stock:

And while the stock may seem to be trending down recently, the trend is hardly significant:

I'll end Part I by saying that we should all know that football teams have their ups and downs. In Part II, I'll show that a key to the program's success in the glory days under Bo, Moeller, and Carr was that we were able to roll with the dong punches and snake-bitten seasons and respond with great ones. And I'll look back at similar down seasons to project the future.



November 19th, 2017 at 1:41 PM ^

I find this stuff to be immensely interesting, but I do wonder if it sometimes paints with a rather broad brush situations that can deviate for any number of unquantifiable reasons. For example, that 2003 team scored really high in this metric, but I don't remember that being a particularly good team, just one that had enough NFL talent to blow out bad teams (they scored 460 points that year, which I think is a modern Michigan scoring record). By comparison, the 2006 team scored fewer points but was probably a better team; they lost only to #1 OSU and a USC team still coming off a string of unprecedented success. They just didn't win a conference title or a bowl game, so their score is dinged.

I credit you for trying to quantify a very difficult metric, though.  And like all formulas, it can be tinkered with a bit as more information becomes available.


November 19th, 2017 at 3:04 PM ^

The 2006 team was an all-time Michigan team in terms of quality, but it was also a typical snake-bitten Michigan squad that came up just short of any meaningful success.

The 2016 team was clearly better than the 2015 team, but it underperformed expectations and was also inches short of glory, whereas the 2015 team vastly outperformed expectations and made a huge improvement over the last year.

I would also argue that those differences in quality are very measurable, but it just takes a lot of work to incorporate that into a metric like this. I do agree, though, that adding more components to take into account quality of the team would be interesting, as quality of a team should certainly be a component of the success.

Like I said, this metric is definitely better than the previous one, but certainly not perfect or all-encompassing.

Also, sorry for the rude hot take on your Best and Worst post. I wrote that at 3AM while I was jet lagged after flying across the country.


November 19th, 2017 at 3:05 PM ^

Yeah, I don't disagree, only that I think tying success to conference and bowl wins can be a bit misleading. Like, that 2003 team lost more games and to worse teams, but OSU wasn't quite a battlestation and UM snuck in a share of a title. But I think the 2006 team had more overall success.

Still, minor point to argue. I liked this all around.

No worries about the comment. I like people who disagree, and you made strong points. I don't finish this stuff until late at night, so I totally get the snark.


November 19th, 2017 at 5:33 PM ^

Dude, you are crazy. In 2003 we were outright Big Ten champs and beat OSU. That is a way better year than 2006. 2006 was a good team, but they couldn't close out a meaningful win. Sure, in 2003 we lost to Oregon and Iowa on the road early in the year, but I'll take championship trophies and beating OSU over the alternative.


November 19th, 2017 at 10:12 PM ^

Hard to say. The 2003 team arguably outplayed Iowa and Oregon except for the horrible special teams play. We were more competitive against a better 2003 USC team that was probably the best team in the country; remember that they were #1 but kept out of the MNC game because of the BCS structure. They should have been in the MNC game and were definitely a better team than OU or LSU that year. Certainly the 2006 USC team was really good too, but they were a two-loss team, not as good as the 2003 team, and they beat us more handily. And OSU's obliteration by Florida might suggest that the B1G wasn't all that hot that year, given how both OSU and we were handled in the bowls. The offense was certainly better in 2003, and obviously the defense in 2006. I think that 2003 team is somewhat like 2016: the record is a little misleading, and they were probably better in a pure football sense than their record showed.


November 20th, 2017 at 12:11 AM ^

In 2003, OSU had one loss and was ranked like #2 in the country when we beat them.  And they were the defending national champs.  That was a massive win.  The 2006 team probably was better, but 2003 was a more successful season because we beat OSU, won the Big Ten outright, and went to the Rose Bowl.  


November 19th, 2017 at 3:56 PM ^

I know why you are most likely doing it as well. People go crazy, and it's like their whole life revolves around Michigan football; if we are not perfect and winning every possible category, we are a failure.

Thanks for taking the time to put this info up.  It is for sure interesting to look at and compare.