100% hot nerd action
Tourney sponsor reminder: HomeSure Lending is that. NMLS 1161358.
This began as a tool I made to fill out my brackets, then a few years ago I shared it and it became a thing. Much of the data are from Kenpom, though this year I also included ThePowerRank.com’s rankings, which Ed determines by expected margin of victory over an average opponent. He and Kenpom wound up pretty close, but it’s a bit more data when you’re deciding things like "which should-be-a-6-seed do I choose in this 7-10 matchup?" Alex Cook will have a thing later today that shows which teams got screwed the most in this year’s rather wacky seeding. Spoiler: Maryland and Minnesota shouldn’t be over the BTT championship participants.
The Tool The Tool The Tool:
To use this you:
- Follow this link to make a copy of the spreadsheet.
- Select the two teams you want to compare.
The site will be pulled from Team 1, fyi, so if you pull a match that doesn’t exist you’ll still get the distance each team will have to travel to their real site.
Thanks also go to the guy who wrote a Google Sheets script to pull drive times with a formula.
In the beginning, it seemed like things might change. Michigan’s defense has been giving up more shot attempts than their offense has been generating from the drop, but the freshman class seemed to inject a bit more tenacity into Michigan’s forechecking. Opponents held the puck for long stretches, but it seemed that the prime scoring chances ceded by defenses in years past, the ones right in front of the net, may have been corrected. At least, that’s what this writer naively believed.
We’re now a bit past the midway point in the season and, thanks to some meticulous stat tracking, we have data to lean on that suggests the unchecked-man-in-front-of-the-net problem has not been remedied. An idea that’s gained popularity over the last few years among NHL advanced stats wonks is separating shot attempts by the area of the ice they come from. Those analysts have found what one might expect: more goals are scored from the area in front of the net than from the edges of the zone. Below we have scoring chance by shooting location via a Chance article by A.C. Thomas:
Based on information like the above, analysts have started to call the area with the two darkest shades of green the “home plate” area. The success rate above is based on NHL data, but the idea can be carried over to college hockey. With that in mind, David has been tracking shot attempts (in the Corsi sense; shots on goal+misses+blocked shots) all season. (Special thanks to Orion Sang and Mike Persak of the Daily for frequently providing us with shot charts.) Now that we’re past the midpoint of the season and solidly into Big Ten play, it seems that there’s enough data to see how Michigan’s defense has fared. It’s, uh…well, there’s a reason I called myself “naïve” above.
[After THE JUMP: cheery fun stuff]
You will probably have to create your own copy but then you can type in any two teams and make a comparison. Thank you to Kenpom for the data and helpful Google Sheets script writers for helping me calculate distances. Drive times are calculated as 1.3 minutes per mile.
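For the curious, the drive-time conversion is simple enough to sketch in a few lines of Python. This is only an illustration of the 1.3-minutes-per-mile rule; the function name is made up here, and the spreadsheet itself does this with a Sheets formula, not Python.

```python
# Flat rate stated in the post: 1.3 minutes of driving per mile.
MINUTES_PER_MILE = 1.3


def drive_time_hours(miles: float) -> float:
    """Convert a driving distance in miles to estimated hours on the road.

    Hypothetical helper for illustration only.
    """
    return miles * MINUTES_PER_MILE / 60.0
```

So a 300-mile trip comes out to 6.5 hours under this rule, which works out to an average speed of about 46 mph.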
To get a copy:
Follow this link and play around with it on Google Sheets.
Follow this link to the spreadsheet.
Go to "File" and "download as". Choose a format and the rest is up to you.
To use it, just enter the two teams you're trying to compare and the round (it will return wonky stuff if those two teams aren't able to meet there). It'll show you things like Off and Def rank on Kenpom and a win confidence scaled so that an average 1 seed is 100% to beat an average 16 seed. It'll also bring up the site of the game and, new this year, the distance for each team in driving hours. Lastly, it'll show any injuries I knew about when I made it last night.
No, the Upon Further Review series is not comprehensive. Most years are absent Ohio State and bowl games (including last year), and 2014 checked out after Indiana. That said, I challenge you to find a greater cache of free data than Brian's masterful charting of Michigan plays going back to the DeBord Throws Rock age.
Every so often I pull all that into a massive Excel file and try to learn things like how spread the offense was, favorite plays, etc. Let's dive in, shall we?
What're those pie charts at top? They show the relative efficiency (by yards per play on standard downs) and the mix of Michigan's backfield formation choices. By "standard downs" I mean 1st and 2nd downs when the offense wasn't trying to do a clock thing or go a super-long or super-short distance. So no garbage time, no two-minute drills, no goal line, and no going off on Bowling Green and Delaware State. The idea is to show which offense they got in when they had the full gamut to choose from, and how many yards it got when the goal presumably was to get as many yards as possible.
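If you want to try the same cut on your own charting data, here is a rough Python sketch of the standard-downs filter and the yards-per-play math. Every field name below is hypothetical (the real work lives in an Excel file with its own columns), and the situational flags are assumed to have been tagged by whoever charted the plays.

```python
from collections import defaultdict
from dataclasses import dataclass

# Opponents the post leaves out of the sample.
EXCLUDED_OPPONENTS = {"Bowling Green", "Delaware State"}


@dataclass
class Play:
    # Hypothetical fields; the actual spreadsheet columns are not shown in the post.
    formation: str
    down: int
    yards: float
    opponent: str
    garbage_time: bool = False
    two_minute_drill: bool = False
    goal_line: bool = False
    extreme_distance: bool = False  # super-long or super-short yardage


def is_standard_down(play: Play) -> bool:
    """True for 1st/2nd-down plays outside the excluded situations."""
    if play.down not in (1, 2):
        return False
    if play.opponent in EXCLUDED_OPPONENTS:
        return False
    return not (
        play.garbage_time
        or play.two_minute_drill
        or play.goal_line
        or play.extreme_distance
    )


def yards_per_play_by_formation(plays):
    """Average yards per standard-down play, keyed by backfield formation."""
    totals = defaultdict(lambda: [0.0, 0])  # formation -> [yards, play count]
    for play in plays:
        if is_standard_down(play):
            totals[play.formation][0] += play.yards
            totals[play.formation][1] += 1
    return {formation: yards / count for formation, (yards, count) in totals.items()}
```

The formation mix in the pie charts would just be the play counts from the same filtered set divided by the total.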
Nothing very surprising there. Rodriguez ran his shotgun offense, Borges inherited Denard and Devin and still managed to jam them halfway into an under-center offense in three years. Then Nussmeier ran his zone melange single-back thing. Harbaugh did what Hoke always dreamed of doing, and the offense climbed back to about where Hoke's offense was with a senior (but oft-injured) Denard.
[Hit THE JUMP for each year's most charted play, visualized Hennecharts, how many TEs Harbaugh used, how many rushers defenses sent, and LOOOOOTS of charts.]
Taco-ranked starters are far more likely than Glasgows [Fuller]
Every year, as college football recruiting becomes the only football thing left to pay attention to until spring, we are suddenly struck by an army of pundits so arrogantly attached to their "recruiting stars don't matter" narratives that they don't bother to care that math is against them.
Michigan typically gets taken to the woodshed in these articles for recurrently not matching recruiting expectations with on-field results. This discrepancy does exist beyond the normal J.T. Turners that everybody gets, and for various interrelated reasons: attrition spikes, spottily shoddy coaching, program instability, recruiting shortfalls. Anecdotally, there are examples I can point to, especially in the early aughts, when an otherwise two-star athlete was bumped to a three-star because Michigan offered. That explains less about how Wisconsin and Michigan State thrive on 2- and 3-stars, and more about how Michigan has recruited very few guys under a consensus 3-star.
However, every time we find a new way to compare recruiting data to performance data, we consistently discover that recruiting stars handed out by the services correlate to better players. No, a 5-star isn't an instant superstar, but the 25-30 five-stars each season are consistently found to be about twice as likely to meet some performance metric (NFL draft, All-conference, team success, etc.) as the pool of 200-odd four-stars, who are consistently more likely to meet performance thresholds than the 400-odd three-stars, etc.
Today I present a new metric for proving it: starts.
[Example of raw data, via UM Bentley Library.]
ALL the Starts
My project over Christmas was to take the data from Bentley's team pages (example at right), scrub the hell out of it, and produce a database of who started what years, at what positions, at what age, with what recruiting hype, etc.
A few weeks back I released the initial results of my starts data. We noticed there were a lot of problems in that. I went back and did a lot of fixing, mostly just finding more weird errors in the Bentley pages I'd culled the data from, sometimes emailing the guys themselves to ask things like "Was there a game in 2001 that either you or B.J. didn't start?"
I think I've got it cleaned up now; at least the total number of starts for each season matches 22 players per game.
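That sanity check is easy to express in code. A minimal Python sketch, assuming the cleaned data for one season is a mapping of player name to starts (the function and argument names are made up for illustration):

```python
STARTERS_PER_GAME = 22  # 11 on offense plus 11 on defense


def starts_mismatch(starts_by_player: dict, games_played: int) -> int:
    """Recorded starts minus expected starts for one season.

    Zero means the season balances; a nonzero value means some start is
    missing or double-counted. Hypothetical helper, not the actual process.
    """
    return sum(starts_by_player.values()) - STARTERS_PER_GAME * games_played
```

A balanced total doesn't prove every individual start is right, only that the errors that remain cancel out.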
Recruiting By Starts
Starting in 1996 we start getting relatively uniform star rankings for recruits, though I had to translate Lemming rankings and such into stars (he had position rankings and national lists that line up with what we call recruits today). So I took the average of available star ratings of all players to appear on Michigan's Bentley rosters from the Class of 1996 through the Class of 2010, and put 'em against the number of starts generated. Guess what: recruiting actually matters.
|2- or 2.5-stars||29||271||9.3|
Even with Michigan's notorious luck, the 5-stars were expected to give you about two seasons of starts, compared to the 8 or 9 games you'll get out of a 2- or 3-star. That is significant, and offers a bit more evidence toward the general statement about recruiting stars: the higher the star rating, the more likely he is to be a good college football player, though at best you're at 50-50.
As for walk-ons, I've linked to the list of the 217 guys in that time period who made the Bentley rosters and weren't special teamers, in case you doubt me. The Order of St. Kovacs have accomplished great things for Michigan, but turning up one of those guys anywhere other than fullback has been rare indeed.
I'm going to try to use the starts data above to get predictive. The scatter plot of the 1996-2010 group was pretty linear so I'm just going to plug in a linear equation:
Expected Starts on Avg M Team = Stars x 5.30 - 6.35
And that gives us a reasonable number of Michigan starts to expect from a class based on its rankings:
For the Class of 2011-2014 projections, I just guessed by hand, so those projections are going to be increasingly inaccurate once I'm predicting 2017 starters and whatnot.
The chart above has two stories to tell: 1) The strength of a recruiting class is strongly correlated to the value that class will produce in starters, and 2) the damage done by attrition to the 2005 and 2010 classes created ripple effects for several classes afterwards.
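To make the linear fit concrete, here is a small Python sketch that applies it to individual recruits and sums over a class. The slope and intercept come straight from the equation above; the function names and the sample star ratings in the usage note are invented for illustration.

```python
# Coefficients from the linear fit in the post.
SLOPE = 5.30
INTERCEPT = -6.35


def expected_starts(stars: float) -> float:
    """Expected career starts on an average Michigan team for one recruit."""
    return SLOPE * stars + INTERCEPT


def expected_class_starts(star_ratings) -> float:
    """Sum of expected starts over every recruit in a class."""
    return sum(expected_starts(stars) for stars in star_ratings)
```

For example, `expected_starts(5)` gives about 20 starts and `expected_starts(3)` about 9.5. The line goes negative below roughly 1.2 stars, so it shouldn't be stretched to unrated players or walk-ons.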
An Average Michigan Team:
By some quick averages I was able to get an average makeup of a starting 22. I took the average number of starts by experience (i.e. year in the program) for the classes of 1995-2010, adjusted those numbers for a 13-game schedule, then divided by 13 games to get a baseline for what the starting 22 ought to look like, to compare against the years of interest.
|Senior / RS Jr||7||5||4||8||8||6||8||9|
|Junior / RS So||6||10||5||4||7||7||5||6|
|Soph / RS Fr||3||3||6||2||1||2||3||4|
|AVG starter age||3.55||3.27||3.18||3.82||3.50||3.27||3.77||3.50|
By this measure the last two teams look extraordinarily young, about as young as the 2008 team or younger. The 2012 team by contrast seems like a wasted opportunity. FWIW I counted Devin, not Denard, as the quarterback, or it would have been even older. That fits the narrative: 2012 was a wasted opportunity, as a line with three 5th year seniors (two of whom were long-term productive starters) plus Lewan and Schofield was coached into one of the worst offensive lines in memory.
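For anyone who wants to run the same number on another roster, "AVG starter age" can be read as a weighted average of year in the program. Below is a Python sketch under that assumption. The 5/4/3/2/1 weights and the labels for the fifth-year and true freshman groups are my assumptions, since those rows don't appear in the table as shown.

```python
# Assumed weights: year in the program for each experience group.
YEAR_IN_PROGRAM = {
    "5th-year senior": 5,   # label assumed; row not shown in the table
    "Senior / RS Jr": 4,
    "Junior / RS So": 3,
    "Soph / RS Fr": 2,
    "True freshman": 1,     # label assumed; row not shown in the table
}


def average_starter_age(starters_by_class: dict) -> float:
    """Weighted average year-in-program across a starting lineup."""
    total_starters = sum(starters_by_class.values())
    weighted_years = sum(
        YEAR_IN_PROGRAM[group] * count for group, count in starters_by_class.items()
    )
    return weighted_years / total_starters


# First column of the table (7, 6, 3), plus ASSUMED counts of 5 fifth-year
# seniors and 1 true freshman, chosen so the lineup totals 22 starters.
example_lineup = {
    "5th-year senior": 5,
    "Senior / RS Jr": 7,
    "Junior / RS So": 6,
    "Soph / RS Fr": 3,
    "True freshman": 1,
}
```

With those assumed counts, `average_starter_age(example_lineup)` rounds to 3.55, which lines up with the first column of the table.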