I had too many thoughts on my mind after enjoying the thrilling win at ND, so I decided to jot them down in one diary entry.
It is so enjoyable for me to watch Michigan play this season for many reasons, but none more important than our offense and the predictability of our defense. This team really reminds me of the 2000-2001 team under Lloyd Carr, quarterbacked by Drew Henson and starring A-Train. That team had a great high-powered offense and a very generous defense. It got so bad that we gave up 50-some points to Northwestern while scoring 50-some points against them (no need to bring back memories of unnecessary fumbles and the nightmares after that game). That team was not supposed to be bad on defense, but it was (it had flashes of brilliance with 2 shutouts and 2 near shutouts against bad offenses). This team, on the other hand, is expected to be miserable on defense but has actually been more than half decent up to this point.
If you think about it, Denard had 500+ yards against Notre Dame, accounting for most of the offense, but we ended up with only 28 points. Add to that 3 turnovers created by the defense and a couple of critical stops and voilà, we have a W.
Just to draw some statistical comparisons between this year’s team and that 2000-2001 team:
The 2000-2001 team had the following averages:
Points scored per game = 33.7
Points against per game = 19.08 (note that they had 2 shutouts against IU and MSU and 2 near shutouts against Bowling Green and Rice)
Turnover Margin = +1.08 (Big Ten Leaders)
Passing Efficiency = 155.3 (Big Ten Leader and Michigan Record)
(I looked around and couldn’t find historical data for yards per game on offense and defense online; please help with this.)
Just to refresh your memory, that offense had Drew Henson, A-Train, David Terrell, Jeff Backus and Steve Hutchinson. And the defense had a star in Victor Hobson.
This year’s 2010 stats 2 games in:
Yards per game = 502.5 (14th nationally)
Yards against = 439 (101st nationally)
Points scored per game = 29 (62nd nationally)
Points against per game = 17 (39th nationally)
Turnover Margin = + 2
I know that the current season stats are not really representative of much, considering they come from just 2 games against probably average teams, but they are the only indication we have for the rest of the season. The alarming number is actually the points per game, which is low compared to yards per game; that means we’re getting a lot of yards but not scoring. We are 14th nationally in yards but 62nd nationally in scoring, and that is with a positive turnover margin, although missed field goals contribute to that statistic. On a positive note, though, we are 39th nationally in points against for now, a sign of a bend-but-don’t-break defense.
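To put a rough number on that gap between yardage and scoring, here is a quick back-of-the-envelope sketch using only the 2010 averages listed above. The "points per 100 yards" ratio is an illustrative measure made up for this comparison, not an official stat.

```python
# 2010 per-game averages quoted above (two games in).
yards_for, points_for = 502.5, 29
yards_against, points_against = 439, 17


def points_per_100_yards(points, yards):
    """Illustrative ratio: points scored for every 100 yards gained."""
    return 100 * points / yards


offense_rate = points_per_100_yards(points_for, yards_for)
defense_rate = points_per_100_yards(points_against, yards_against)

print(f"Offense scores {offense_rate:.2f} points per 100 yards")
print(f"Defense allows {defense_rate:.2f} points per 100 yards")
```

On these two games the defense gives up fewer points per yard than the offense produces, which is one way to read "bend but don't break" off the numbers.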
In the past, my fear with Michigan was actually the part where Michigan had the ball; I was always scared of fumbles, interceptions, and three-and-outs. I used to feel more comfortable when the defense was on the field. Today, it's a different story. Michigan has a pretty dependable offense that is actually aiming to run up the score and hoping to score on every drive (contrary to most of Lloyd Carr's offenses; btw, I love Carr) and a defense that is expected to give up scores but hopes to do so on long drives rather than short ones, so the offense gets some time to rest up and get ready for the next attack. (Just to clarify this point, it's much more comfortable for me to watch this offense play than many of the offenses of the past 10 seasons. Exceptions on offense would be the 2000 team and the 2006 team.) I finally know the feeling that UCLA fans had in 1998: big offense, no defense! This really plays into the rule that 'the best defense is a good offense'.
As for watching the defense, it's actually kind of fun: 2 games, the same outcome. The defense has proven that it has a bend-but-don't-break-quickly mentality. They don't mind giving up 500000 yards a game as long as they don't give up more than 25 to 28 points a game. The offense has to pick up from there and score 30 to 35 points a game (the more the better). For this season at least, we can't expect to win a game if we score anything less than 21 points, if not more. Which is great for us, the fans: it means the offense's plan is to score on each drive and not really slow down until they are up by 50 late in the fourth quarter, because a young defense can give up so many points in a 10-minute span, it's crazy!!
Only 2 games have passed and we are all optimistic now, and that is ok; I love to be optimistic for Michigan, actually. The team actually deserves national respect. If we can pull off big wins in the next 3 games, Michigan could be in a much better position to deal with the Big Ten teams than last season. I hope they do, and I hope we win the close ones this season. I love this offense, this coach and even the attitude of the fans this year. It is the best start of a season for me since 1997, because it's unexpected. In general, it's fun to have a dependable offense and a predictable defense; that way there isn’t much frustration for the fans. So this team has a similar offense to the 2000 team, but doesn’t have the expectations on defense that that team had.
All I can say right now to all the fans out there is simply Go Blue and Hail to the Victors. Good luck Maize and Blue, and keep the W's coming.
Apologies if this has already been posted (I'm pretty sure it hasn't, but you never know)
Yesterday the WSJ put up a gigantic statistical database of all D-1 FBS college football teams. You can filter by teams/conferences, and also filter by whatever category you want using the check boxes.
Basically, somebody put all this stuff from the NCAA database into Excel and it's realllllly pretty.
[Editor's note: frontpaged for obvious reasons. A scheduling mix-up with Brian caused this to get buried earlier, so I'm bringing it back near the top. [How much] Will Michigan rue the loss of Brandon Graham? - Tim]
[Note: I have 2006 fully loaded into the database now, and it will be included in all future multi-year studies along with 2007-2009.]
We can all agree that sacks and interceptions are good things for the defense and bad things for the offense. But how does a viable pass rush or a ball-hawking secondary affect the performance of the opposing offense on plays where there isn’t a sack or a pick? Likewise, how does an offensive line that gives up sacks regularly, or a mistake-prone quarterback, correlate with an offense’s performance on other plays?
Sacks and interceptions have very similar direct impacts on games. From 2006-2009 in games between two D1 teams in competitive game situations (the “universe” for this and most of my analysis) the average defensive unit produced 2.3 ppg worth of sacks and 2.0 ppg worth of interceptions. Sacks have a slightly higher direct value than interceptions (interception returns and fumble returns on sacks are not included), but does either of these correlate to a better defensive performance overall?
Chart time? Let’s make it a double.
Not entirely surprisingly, the better a defense is at producing sacks and interceptions, the better it is on downs where neither occur.
For every point per game that a defense generates due to sacks, the overall pass rush generates 1.2 ppg of additional value. Interceptions are also powerful, but not as much so. Each ppg of value a defense generates through interceptions is worth 0.9 ppg of additional value.
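For readers who want to see what "1.2 ppg of additional value per ppg of sacks" means mechanically, it is the slope of a fitted line through the team-by-team points in the chart. Below is a minimal sketch of an ordinary least-squares slope. The four data points are invented purely for illustration and are not taken from the real database, and the fit actually used for the charts may differ.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# HYPOTHETICAL team data, invented for illustration only:
# x = ppg of value from sacks, y = ppg of defensive value on all other pass plays.
sack_value = [1.0, 2.0, 3.0, 4.0]
other_value = [-1.5, -0.4, 1.0, 2.1]

slope = least_squares_slope(sack_value, other_value)
print(f"Each ppg from sacks goes with about {slope:.2f} ppg of additional value")
```

A slope near zero, as described for the offensive charts, would mean the x-axis stat tells you almost nothing about performance on the remaining plays.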
This analysis serves to confirm what most football fans already know. Teams that can create interceptions and sacks are going to be better defensive teams. Whether a strong pass rush/secondary creates pressure on other downs or if strong pass rushes and secondaries are a common occurrence on great defenses is irrelevant. As most of you probably know, defenses that are good at these two things are also good on other downs. So why is this interesting…
The story becomes very different when you look at offenses. The conventional wisdom that was supported for defenses is largely blown up on the offensive side of the ball. Sacks and interceptions may be indicators of great defenses, but they are not symptoms of bad offenses.
The slopes of these two charts are about 20% of those of the corresponding defensive charts and are virtually flat. On offense, the number of sacks and interceptions is largely independent of performance. There is obviously the immediate negative effect of the play, but giving up sacks or throwing interceptions shows virtually no correlation to success or failure on other downs.
What does it mean?
For one side of the ball it merely quantifies conventional wisdom. Good pass defenses get interceptions and sack the quarterback, and teams that get interceptions and sack the quarterback are often good pass defenses, even on other plays. The value they create on those other plays is roughly equal to the value created by the big plays.
On offense, it’s a very different story. Interceptions and sacks will always be bad plays for an offense, but their rate of incidence is not strongly correlated to performance on other downs. In fact, if given the choice between a quarterback who threw a lot of picks the prior year but was generally successful otherwise and a quarterback who was very safe but not all that productive, my guess is you will be better off going for the quarterback with the picks.
Special thanks go to Ty and The Lions in Winter, who have been working on a similar line of reasoning for the Lions’ revamped defensive line.
Potential Future Diaries
Just some ideas I am kicking around or have half started. Let me know what you think about these or any other things you would like to see.
- A follow-up piece on fourth downs digging deeper into how the decision making changes based on the relative strengths of the offense and the opponent’s defense
- A broader look at “luck”, looking back over the last four years.
- When are extra yards not worth it? The secret dead zones of football.
- Probably not for several months, but a big season preview is in the works.
- Something Carr vs. Rodriguez. Now that I have the 2006-2007 seasons of data, I have two years of each to compare the two more directly.
- How the best players of the last four years (TEBOW!!!) progressed over the years. Maybe a companion piece on Michigan defenders.
- Any other suggestions? An article a week means I need all the ideas I can get, I’m not afraid to beg!
[Ed: This week's Mathlete column expands on fourth down decision-making. I haven't seen a graph anywhere near as clear as those included below about how shifting the parameters of the offenses and defenses in question makes major impact on what a correct decision is. This is not a situation where you can just read the decision off a chart. Feel and personal preference will always play a role. It's a complex decision.]
Last week I wrote on the value of special teams but a very interesting side topic arose: fourth down decision making. It started with this chart:
About which I remarked:
The going for it actually peaks between 30 and 35 as more coaches don’t really know what to do so they just go for it.
So I decided to look and see what the decision chart should look like on an expected points basis.
Anything close to two different colors is a virtual toss-up. Any gains near a color transition are negligible and not worth noting, but there are very real gains to be made in the heart of the yellow section, where coaches are taking their offenses off of the field far too quickly.
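For anyone curious how a chart like this gets built, the comparison at each spot on the field boils down to picking the option with the highest net expected points. The sketch below is a generic expected-value framing and not necessarily the exact model behind the chart. Every probability and expected-points input is a hypothetical number chosen for illustration.

```python
def fourth_down_values(p_convert, ep_after_convert, opp_ep_after_fail,
                       opp_ep_after_punt, p_field_goal, opp_ep_after_miss,
                       opp_ep_after_kickoff):
    """Net expected points for each fourth-down choice.

    Opponent expected points are subtracted, since whatever they are
    expected to score on the next drive counts against the decision.
    """
    go = p_convert * ep_after_convert - (1 - p_convert) * opp_ep_after_fail
    punt = -opp_ep_after_punt
    kick = (p_field_goal * (3 - opp_ep_after_kickoff)
            - (1 - p_field_goal) * opp_ep_after_miss)
    return {"go for it": go, "punt": punt, "field goal": kick}


def best_choice(values):
    """Name of the option with the highest net expected points."""
    return max(values, key=values.get)


# HYPOTHETICAL inputs for a 4th-and-2 around the opponent's 35.
values = fourth_down_values(
    p_convert=0.55, ep_after_convert=3.5, opp_ep_after_fail=1.5,
    opp_ep_after_punt=0.5, p_field_goal=0.40, opp_ep_after_miss=1.6,
    opp_ep_after_kickoff=1.0)

for option, value in values.items():
    print(f"{option}: {value:+.2f}")
print("Best:", best_choice(values))
```

When two options come out within a few hundredths of a point of each other, that is the "virtual toss-up" zone near a color boundary.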
A couple of quick rules of thumb:
- Don’t punt on the opponent’s side of the field.
- Really consider going for it on 4th down after crossing your own 40.
- Field goals only make sense if there are more than 5 yards to go and you are between the 10 and 30 yard lines. If you’re in opponent territory and these two criteria aren’t true, you should be going for it.
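The three rules of thumb above can be written down as a tiny lookup, shown here as a rough sketch. The function name and the yards-to-goal convention are assumptions, and the default of punting when deep in your own territory is also an assumption, since the rules above do not cover that case.

```python
def rule_of_thumb(yards_to_goal, yards_to_go):
    """Apply the three rules of thumb above.

    yards_to_goal: distance to the opponent's goal line (1-99).
    yards_to_go: distance needed for a first down.
    """
    if yards_to_goal < 50:  # opponent's side of the field: never punt
        if yards_to_go > 5 and 10 <= yards_to_goal <= 30:
            return "field goal"
        return "go for it"
    if yards_to_goal < 60:  # between your own 40 and midfield
        return "consider going for it"
    # ASSUMPTION: the rules are silent on deeper territory; default to a punt.
    return "punt"


print(rule_of_thumb(25, 8))   # long yardage inside the 10-30 band
print(rule_of_thumb(40, 8))   # opponent's 40, outside the field goal band
print(rule_of_thumb(55, 2))   # your own 45
```

This ignores team strength and game situation entirely, which is exactly what the objections below dig into.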
I know this is not the first time a topic like this has been presented; David Romer was mostly criticized for his paper on the topic a couple years back (thanks for the reminder, Colin). [Ed: Not around here.] Of course, there was the great Patriot debate last season when the Patriots elected to go for it on 4th and 2 with the lead in their own territory. Even though the majority of the arguments against this work amount to "people like David Romer and The Mathlete don’t know anything about football and just live in their parents’ basement," I did want to look at the main objections and see if they had any validity.
Objection 1: Does not account for “quick change” momentum
Below you’ll see a chart of the expected points on a drive based on field position, and how teams have actually fared. I also included drives obtained by turnover as comparison to the other “quick change” drive source.
There could be a case that drives started on a short field due to a 4th down stop generate more points than normal drives, but the small sample size reduces how strongly that argument can be made. From 2007-2009, the total points accounted for on drives obtained by 4th down stops (2523) is less than the projected points would be for any drives starting at the same field position (2580). This difference is statistically meaningless, which is very damaging to the idea that "momentum" helps the opposing offense after their defense gets a fourth down stop.
Adding in the turnovers does nothing to build a case for momentum after big defensive stops or turnovers. The turnover-started drive line tightly hugs the average line. As a whole, the turnover expected points line is slightly higher than the average line, but only by enough to generate an extra touchdown every 50 drives. That's about one every two years or so.
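To show just how small these gaps are, here is the arithmetic spelled out using the figures quoted above. The only added assumption is that "an extra touchdown" is counted as 7 points.

```python
actual_points = 2523      # points on drives after 4th down stops, 2007-2009
projected_points = 2580   # projection for average drives from the same spots

shortfall = projected_points - actual_points
shortfall_pct = 100 * shortfall / projected_points

# ASSUMPTION: a touchdown is counted as 7 points.
extra_points_per_turnover_drive = 7 / 50

print(f"4th down stop drives: {shortfall} points ({shortfall_pct:.1f}%) below projection")
print(f"Turnover drives: {extra_points_per_turnover_drive:.2f} extra points per drive")
```

So drives after a fourth down stop actually came in a couple of percent under projection, and the turnover bump is a small fraction of a point per drive.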
Although it can often feel like there is a big momentum swing after a big stop or turnover, there is scant evidence that it is more than our memories selecting the most traumatic or exhilarating scenes to hold onto. [Ed: for an example of this human tendency to ascribe meaning to unusual events where there is none, see any of the zillion "hot hand" studies.]
Objection 2: It assumes all offenses and defenses are average
To get a gauge on what “good” can mean in comparison to average, I plotted the best offense and best defense of the last three years against the average team’s expected points per drive.
As a rough approximation, the best offense is about 1 point per drive better than average and the best defense makes offenses about a point worse per drive.
Scenario 1: Good offense
If your offense is as good as Florida, you should never punt against an average defense. Maybe if you are deep in your own territory, but only in the most extreme situations. This assumes that a new first down gives the Florida offense an extra point over an average team in expected value and a 10 percentage point increase in the likelihood that they convert.
A punt is conceding any chance of scoring and an offense this good should not give up that right so easily. This is the basic philosophy behind the vaunted no-punting HS coach in Arkansas. His team isn’t necessarily good because he doesn’t punt. He doesn’t punt because his offense is good. Why waste another scoring opportunity?
Scenario 2: Going against a good defense
Playing against a good defense changes the dynamic extensively but it does not mean forgoing the fourth down attempt altogether. With a reduced likelihood of success on 4th down and a reduced payout if the conversion is successful, the 4th down attempt still is an optimal strategy more than is currently utilized. Even against a top national defense, you should still not punt in opponent territory. The field goal becomes a more viable option against the stronger defense and punting becomes a much better idea all the way out to midfield.
[Ed: I think this is moving towards correct strategy since it takes a caveman or a seriously long-yardage situation for someone to punt from inside the opponent's 40 these days. That range from midfield to the opponent 40 is a spot we might see move towards fourth-down aggression in the next few years.
Also note that conventional current strategy gets way less wrong once you ramp up the ability of the defense. If we jacked it up even farther, it might get to the point where punting from the 36 (or even on third down) is a good idea. The flaws in strategy here are leftovers from an era when punting was actually the best option. Thinking has not kept pace with scoring since.]
Scenarios 3/4: Good defense or opponent good offense
The conventional wisdom is that if you trust your defense, you don’t go for it on fourth down. [Ed: In my experience the conventional wisdom is remarkably malleable on this point. If you have a good D and the announcer agrees with the call, the good D will be cited as a reason why.] In reality, the strength of your own defense (or the strength of the opposing offense) is largely irrelevant to the decision. Fourth down decisions are all about offensive opportunity. A 4th down decision to punt is the decision to take the ball out of your offense’s hand, leaving the relative impacts on your defense to negate each other. A 4th down failure puts your defense in a worse situation, but it doesn’t guarantee points for the other team; a good defense is still a major asset in stopping or limiting the other team with good field position. A punt doesn’t guarantee that the other team is going to be stopped, but a good defense makes it more likely. In the end, it’s still all about the offense.
Objection 3: Does not account for game specific situations
This objection does ring true, but its application is much narrower than most people believe. The main flaw with the expected points model is that for most of the game all points are largely equal, but at the end of the game, a field goal or even time can become crucially important. If a field goal can tie a game, take the lead, or move said lead from one possession to two (or vice-versa), the decision-making process suggested above can shift radically. This could mean punting near midfield to prevent a short field goal drive for the other team or taking a field goal instead of going for it on fourth in field goal range.
These situations are rare, however, and only come into effect in the fourth quarter. When there are likely to be even 2-3 additional possessions, the expected points model still holds up.
Another potential game situation not accounted for above is the presence of a high quality field goal kicker. A very accurate field goal kicker will move the blue field goal “bubble” in the above charts down, making field goals more practical in short yardage situations. An above average kicker from long range will move the bubble left. Even a great kicker won’t make kicking inside the 5 practical in very many situations.
Conclusion: In Which Romer Is Re-Iterated
Teams need to be using kickers and punters less and their offenses more, especially teams with good offenses. If you have a good offense, bringing out the punter should only be done in long distance situations or when deep in your own territory. Scoring touchdowns is the valuable thing in football, and giving away a quarter of your plays to kick on fourth down greatly reduces your ability to score them. The gain in field position from a punt is worth less than it is currently perceived to be, and the momentum obtained from a quick change of possession appears to be slight at best and most likely non-existent.
One final thought I haven’t been able to quantify yet: if you switch to a fourth down mindset, what opportunities does it open up in play calling during the first three downs of a series? Planning on four plays for a first down instead of three would surely have some value for an offense to adjust and re-optimize their play calling, and the total offensive value could become even greater.
Note: apparently Brian Burke at Advanced NFL Stats and I have been having some of the same offseason thoughts as he just put up another piece on 4th down decision making, and this after we both introduced similar defensive player evaluation metrics within a month of each other.
Coach Schiano here. You might remember me from such fine diaries as MGoStatistics, Visualizing the Hennechart (aka the Hennegraph), and some other forgotten gems (the last being a drug-induced haloscan rant of epic proportion). Or you might not. But at least those stats got some front page love from blogmaster-in-chief Brian, despite the purported "diss". PYTHON RULES!
In last week's post, we summarized some word counts over the years to definitively show that Brian is awesome, which he is. What left a bad taste, however, was the weak attempt at the end of that diary to summarize word usage via a single Wordle. Yes, Wordle is awesome, but no one Wordle can this blog describe, as someone famous once said; probably not somebody associated with Wordle, though.
Thus we bring you a deeper analysis of the blog via the simple tool of Word Frequency Analysis (WFA). By simply counting how many times a word is used, great insight into this blog and its content can be achieved. Or, at least, mild amusement can your way be brought. Minimally, sentences can in Yoda style be written.
The results below come from (somewhat arbitrary) comparisons of the frequencies of different words. The conclusions come from my brain. Thus, the former can be trusted, and the latter should likely be dismissed. But hopefully each analysis is clear: a table, with a list of (frequency, word) pairs, where frequency is the number of times that particular word appeared in mgoblog over its entire lifetime, 2004 until present.
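For the curious, the counting itself is about as simple as it sounds. The sketch below is a guess at the general approach rather than the actual script behind these tables. The tokenizing rule, the function names, and the sample text are all made up for illustration.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count case-insensitive occurrences of each word in text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def frequency_table(text, words_of_interest):
    """Return (frequency, word) pairs, most frequent first."""
    counts = word_frequencies(text)
    pairs = [(counts[word.lower()], word) for word in words_of_interest]
    return sorted(pairs, reverse=True)


# HYPOTHETICAL sample text standing in for the full blog archive.
sample = "Football, football, football. Some hockey. Hockey! A little basketball."

for freq, word in frequency_table(sample, ["football", "hockey", "basketball"]):
    print(freq, word)
```

Each table below is then just `frequency_table` called with a different hand-picked list of words.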
And now, for the results! Brace yourselves, this gets ugly.
First, we analyze how often particular sports are mentioned:
Now, an analysis of how often various places are mentioned:
Now we study the popularity of various coaches:
You might find yourself wondering about the dominant mgoblog receiver. If so, we give you the receiver analysis:
Who is mgoblog's favorite running back? Well, this was an easy one to guess:
Onto the quarterback competition:
And now we study two particular schools of football philosophy: Lloydball and Tresselball.
Speaking of football philosophy, we also study the dominance of the spread:
Now we move on to more important matters, like the study of mascot names:
Finally, if you'll indulge, we'll get into some slightly more off-topic terms. Let's start with food. What about the food preferences of mgoblog? Sadly, not much data here, making us wonder if Brian eats very much or is rather some kind of blog-creating Cyborg sent from our future UofM overlords to get us through these rough times (possible, no? hmm? HMMM?). But from what we could find:
Being a blog of international repute, mgoblog also mentions some people of differing nationalities:
Brian also uses his fair share of saltier language. For example:
"I suppose it is possible that Germany is a plant biology major and spends his time before the snap screaming "I gonna sprout all up in your ass, mothafucka*" at the quarterback, but it seems unlikely." Classic.
Sorry, one last set of bad words:
Just keep moving folks, keep moving. And let them never be mentioned again. Speaking of which:
We end with some fairly random studies. First, a gender study yielded the following information about the different types of "boys" mentioned on the blog:
And we conclude with some word counts that we noticed "coincidentally" ended up at the same frequency. Or did they?????
Summer is upon us, and with it, a bit of a lull in our mgoblogging fervor - there are simply not as many sports to talk about. The great wait for the football season begins.
With this in mind, what better time to celebrate this very blog in some bizarre and uniquely mgobloggish way? Hence I present: MGoStats, a statistical look at this blog over the years since its inception.
It began on December 4th, 2004, with the following post at 6:30am by some guy named Brian:
An inauspicious beginning, to say the least, but thus mgoblog was born. In the years since, we have all come here for a multitude of reasons: to celebrate the highs, commiserate during the lows, but mostly for one single reason, which is to hear what one Brian Cook has to say about all matters Michigan Football (and occasionally other sports).
So I found myself wondering: how much has Brian said over the years? A couple of Python scripts later, I had some answers. I wrote a trivial script to download the entire blog (old pages are available through links of the form http://www.mgoblog.com/?page=X, where higher X values link to older pages), and then a less trivial script to parse the downloaded content into a more manageable form. The Python SGML parser is amazing, for those of you who care about such things.
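For those who do care about such things, here is a rough sketch of what the two steps might look like today. It is not the original script: the SGML parser (`sgmllib`) was removed in Python 3, so this version swaps in the standard library's `html.parser` instead. The class and function names are invented, and it only counts visible text rather than separating articles from comments.

```python
from html.parser import HTMLParser

BASE_URL = "http://www.mgoblog.com/?page={}"  # URL pattern quoted above


def page_url(page_number):
    """Build the URL for one archive page (higher numbers are older)."""
    return BASE_URL.format(page_number)


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def count_words_and_chars(html):
    """Return (word_count, character_count) for the visible text of a page."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())
    return len(text.split()), len(text)


# HYPOTHETICAL page standing in for a downloaded archive page.
sample_html = ("<html><body><h1>Hello Michigan</h1><p>Go Blue!</p>"
               "<script>var x = 1;</script></body></html>")
print(page_url(3))
print(count_words_and_chars(sample_html))
```

The download step would fetch each `page_url(n)` in turn (for example with `urllib.request.urlopen`) and feed the result to `count_words_and_chars`; it is left out here so the sketch runs without a network connection.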
What I found follows below. Note: there may be some errors, but I believe the results to be in the right ballpark.
Perhaps the single most amazing fact is that Brian himself has written something on the order of 3 million words (or typed about 17 million characters) over about 3600 articles. Wow! That's a lot of content, from his hands to our eyeballs.
|Who||Articles||Words (Millions)||Characters (Millions)|
The table shows these sums, as well as the sums across all contributed articles (including ones from Tim, TomVH, formerlyanonymous, and anyone else who has made the front page). It might be interesting to see how these counts (number of articles, number of words, number of comments made by users) play out on a week-by-week basis. So interesting one could even make a ... chart? Chart. Or actually, Charts.
The first chart I present is the number of articles published per week over the entire existence of mgoblog.
From the chart, one can observe some interesting facts. First, from mgoblog we should expect about 14 articles per week on average over the course of a year. Second, that number is notably higher in the fall (no surprise), and lower in the spring. Finally, and perhaps most interestingly, one can see the growth of the mgoblog community in the orange bars, which represent articles written by somebody other than Brian; this content, which now represents a significant portion of mgoblog, picked up halfway through last year and has continued to get stronger. Brian's efforts at making the blog more than just himself are clearly having an impact.
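The weekly bucketing behind a chart like this can be sketched in a few lines. This is only an illustration of the idea: the `(date, author)` record format, the function name, and the sample records are assumptions, and the real parsed data surely looks different.

```python
from collections import Counter
from datetime import date


def articles_per_week(articles):
    """Count articles per ISO (year, week), split into Brian vs. everyone else."""
    by_brian, by_others = Counter(), Counter()
    for published, author in articles:
        iso = published.isocalendar()
        week = (iso[0], iso[1])  # (ISO year, ISO week number)
        if author == "Brian":
            by_brian[week] += 1
        else:
            by_others[week] += 1
    return by_brian, by_others


# HYPOTHETICAL (date, author) records, invented for illustration.
articles = [
    (date(2009, 9, 14), "Brian"),
    (date(2009, 9, 16), "Brian"),
    (date(2009, 9, 17), "Tim"),
    (date(2009, 9, 21), "Brian"),
]

by_brian, by_others = articles_per_week(articles)
for week in sorted(set(by_brian) | set(by_others)):
    print(week, "Brian:", by_brian[week], "Others:", by_others[week])
```

Plotting the two counters as stacked bars would give the blue and orange split described above.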
The second chart just shows the number of words on a per week basis:
The graph reflects the same trends seen above, but in word counts. Even early on, Brian was producing above 10,000 words per week during football season, and last year during the same season, we were spoiled with over 30,000 words per week about the sport and team we love.
Finally, I show the number of comments per article:
The big effect in this graph is the lack of comments before the switch to the new blog infrastructure (i.e., the Haloscan era). The other effect is the growth of the community: the difference in the number of comments in Fall '08 and Fall '09 is likely a sign of the increased importance of this site as a place for the broad UM football community. Aside: the one early outlier which has a large number of comments (Fall '06) is just full of a bunch of comment spam: Unverified Voracity 99 Bonus Guest. Who knows why it's there, but Brian should probably remove those comments.
I was also interested in what the longest articles were, but that should have been obvious: UFRs. Here are the ten longest articles (by number of letters in the article):
- 10. Upon Further Review: Defense vs Notre Dame (by Brian on September/16/2009, 48949 letters long)
- 9. Upon Further Review: Defense vs Iowa (by Brian on October/14/2009, 49477 letters long)
- 8. Upon Further Review: Defense vs Indiana (by Brian on September/30/2009, 49913 letters long)
- 7. Upon Further Review: Offense vs Iowa (by Brian on October/15/2009, 50279 letters long)
- 6. Upon Further Review: Offense vs Illinois (by Brian on November/5/2009, 50421 letters long)
- 5. Upon Further Review: Defense vs Purdue (by Brian on November/11/2009, 51002 letters long)
- 4. Upon Further Review: Offense vs Purdue (by Brian on November/12/2009, 51279 letters long)
- 3. Upon Further Review: Offense vs Notre Dame (by Brian on September/17/2009, 51572 letters long)
- 2. Upon Further Review: Offense vs Western Michigan (by Brian on September/10/2009, 51616 letters long)
- 1. Upon Further Review: Offense vs Indiana (by Brian on October/1/2009, 51721 letters long)
If you remove the UFRs from the list, these ten get the longest billing. A number of previews and various other summaries show up:
- 10. Michigan 2007, Part II: Defense (by Brian on August/31/2007, 28513 letters long)
- 9. Michigan State: Sometimes The Bar Eats You (by Brian on August/13/2007, 28636 letters long)
- 8. Purdue 2007: You're Killing Your Father, Larry (by Brian on August/23/2007, 29656 letters long)
- 7. Purdue 2008: Tiller On A Treadmill (by Brian on July/31/2008, 29964 letters long)
- 6. Illinois Preview: Redact This (by Brian on August/9/2007, 30014 letters long)
- 5. Michigan Preview 2005: A Tale Of Two Units, Part I (by Brian on August/30/2005, 30163 letters long)
- 4. Offense Unit By Unit, 2008 (by Brian on August/26/2008, 33989 letters long)
- 3. Michigan Preview Part I: Offense (by Brian on August/29/2006, 34844 letters long)
- 2. Penn State Preview: Stupefying (by Brian on July/20/2007, 35006 letters long)
- 1. Michigan 2007, Part I: Offense (by Brian on August/30/2007, 38809 letters long)
Most-Commented Upon Articles
I was also interested in the most commented-on articles. They were:
Nothing gets people rev'd up like the Offense's Units, or RAWK MUSIC, I guess.
Finally, I was generally curious as to what words show up in the blog. Sounds like a case for a ... chart? Nope. But close, a wordle:
The word cloud here shows a list of the most popular words used in this blog, with some editing done by y.t. to remove words like "the" (actually the most popular word on the site) and so forth.
Anyhow, that's all for now. An amazing amount of content, built up over the years on the backs of UFRs and other regular features we all know and love. Thanks Brian for all the hard work - it is truly staggering to see the sheer verbiage that has powered the site over the years.