In the aftermath of the CMU game, I’ve seen a few comments about running backs that go something like this: “If you took out X’s long run, his YPC would have only been Y, so he really wasn't that effective,” or variations thereof. This got me thinking a little about the limitations of using YPC to summarize running back performance, so I've put together a couple ways of looking at running back performance against Central.
First off, sample size concerns are rampant. Statisticians frown on many, many things, but they take particular umbrage when you do anything with a really small sample (read: less than 30). But, like our beloved coaches, we live in the real world where we have to make decisions based on incomplete information; so we continue on despite the limitations of the dataset.
Strength of competition is also suspect. We don't know for sure how good CMU will be, but we do know they were outscored by fifty points in the only game they've played this year. They may not be great.
Yards per carry is calculated by summing all rushing yards for a player and dividing by the number of carries, making it an average (or sample mean). A sample mean is a very useful way of summarizing data with one nagging flaw: it is particularly vulnerable to outliers. The median, on the other hand, as the most central value, can be interpreted as a more typical expectation for a dataset. One extremely high or low value will have virtually no impact on the value of the median. Here's an example: Derrick Green's YPC for the CMU game was 6.1, two whole yards higher than Toussaint's 4.1. But Green's median carry of 3 is an entire yard shorter than Toussaint's 4. The YPC might lead you to conclude Derrick Green was a better bet for getting yards than Toussaint, but the median says at least 50% of Toussaint's carries went for 4 or more yards in comparison with Green's 3 or more yards. If you needed four yards for a first down, you might want to give it to Toussaint. That's potentially valuable information not contained in the YPC. Then there's the pesky fact that TD runs have a maximum length. If we're two yards out from the end zone, that's the maximum the player can get for that carry. This artificially lowers the YPC of a player who gets the ball over the line; in particular Toussaint's YPC would probably have been higher.
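To make the outlier point concrete, here is a minimal Python sketch. The two carry logs are made up for illustration (they are not the CMU play-by-play), and `summarize` is just a helper name I picked. It shows how one long run inflates the YPC while barely moving the median.

```python
from statistics import mean, median

# Hypothetical carry logs (yards per rush); NOT the actual CMU play-by-play.
one_big_run = [1, 2, 2, 3, 3, 4, 30]
steady = [2, 3, 4, 4, 5, 5, 6]

def summarize(carries):
    """Return (yards per carry, median carry) for a list of rush lengths."""
    return round(mean(carries), 2), median(carries)

print(summarize(one_big_run))       # (6.43, 3)
print(summarize(steady))            # (4.14, 4)
print(summarize(one_big_run[:-1]))  # (2.5, 2.5) once the 30-yarder is removed
```

Dropping the single 30-yard run cuts the made-up back's YPC from 6.43 to 2.5, while his median only slips from 3 to 2.5.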
The table below contains a few measures of central tendency for the players who had at least 3 carries (three is still too small, but a line had to be drawn somewhere and Rawls' touchdown seemed to merit his inclusion in this list). Rawls gets no standard deviation because three is a small number.
QB Devin Gardner wins the YPC sweepstakes with a blistering 7.4 YPC bolstered by a median carry of 6 yards. I would advocate getting this man some more carries, but that's a) already happening and b) potentially troublesome for our passing game. Regardless, Gardner does a good job here no matter what metric you use: no negative yardage, a great longest run and two touchdowns on only 7 carries. At least for this game, our shiny "more passing-oriented" quarterback was our most effective running back, which speaks a bit to the value of athleticism at that position.
Among the running backs, Toussaint and Green duke it out for maximal effectiveness depending on which measure you use. Green wins on YPC, longest run, and least negative minimum run. Toussaint had the higher median, the most touchdowns, and the most carries. Rawls has the highest median of the RBs, but since he only had three carries, sample size tells us to pay no heed.
____ Yards and a Cloud of Dust
Hearkening back to the days of Three Yards and a Cloud of Dust (TYaaCoD), I wanted to know who was more reliable if you need three yards every time you rush. The table below contains the percentage of carries on which each player gained at least three yards, embodying the spirit of slightly-in-jest Schembechlerian Michigan Football.
Personally, though, I find three yards slightly lacking. If you run three yards every rushing play and you rush every play, you end up facing 4th and 1 every series. Our Fearless Leader would still go for it on fourth down every time (Heil Hoke!), but it's not an optimal situation to find yourself in. What you really want is someone who can pick up 3.5 yards or so every play, so you get a new set of downs after every three. The play-by-play is unhelpful in this regard, however, only listing integer values for yards. So I also calculated the Four Yards and a Cloud of Dust (FYaaCoD) metric, which is how the table below is sorted. If you get four yards every carry, you can go on rushing forever.
I did make a slight modification to the success rates of both metrics: I counted a touchdown as a success regardless of how many yards the play was because there is no further to go.
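For anyone who wants to replicate the two metrics, here is a rough Python sketch of the success-rate calculation with the touchdown adjustment described above. The function name `success_rate` and the sample carries are my own illustration, not the actual play-by-play.

```python
def success_rate(carries, threshold):
    """Share of carries gaining at least `threshold` yards.

    Each carry is a (yards, is_touchdown) pair; a touchdown counts as a
    success regardless of length, matching the adjustment described above.
    """
    if not carries:
        raise ValueError("need at least one carry")
    hits = sum(1 for yards, is_td in carries if is_td or yards >= threshold)
    return hits / len(carries)

# Hypothetical carries, not taken from the CMU play-by-play.
sample = [(5, False), (3, False), (1, True), (-2, False), (4, False), (2, False)]

tyaacod = success_rate(sample, 3)  # 3+ yards or a TD
fyaacod = success_rate(sample, 4)  # 4+ yards or a TD
print(f"TYaaCoD {tyaacod:.0%}, FYaaCoD {fyaacod:.0%}")  # TYaaCoD 67%, FYaaCoD 50%
```

Note that the one-yard touchdown counts as a success under both thresholds, which is the whole point of the adjustment.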
|Row Labels||Total Yds||Carries||TYaaCoD||FYaaCoD|
For TYaaCoD, you would want the following players rushing in order: 1. Green 2. Gardner 3. Rawls 4. Toussaint 5. Smith 6. Johnson. All players are between 50% and 75% successful at getting 3 yards against CMU, which is heartening. Moving to FYaaCoD, you would want 1. Gardner 2. Rawls 3. Toussaint 4. Green 5. Johnson 6. Smith.
There's some shuffling when you move to FYaaCoD: Derrick Green drops from first to fourth, and Smith falls to sixth at a slightly disappointing 29% success rate. Rawls still has only three carries, but two of them pass the FYaaCoD test, so he has a terrific success rate of 67%, almost as good as Devin Gardner, who had over twice as many carries. Devin's ability to scramble is probably for real. Toussaint's actual strength as a running back comes through a bit more on the FYaaCoD metric. On his 14 carries, he hit 4+ yards 57% of the time, and he often surpassed four. That increases the chance of success on future plays, as the distance to the first down marker is smaller.
I thought about running the same analysis with passing yards, but it didn't feel right since yards per catch vary widely based on the play. Your wideout running the deep route will end up with more yards per target than the slot ninja you toss the bubble screens to. That is more schematic than based on individual skill. It is true that running plays are also not all created equal. But every running play starts behind the line of scrimmage and heads as far as possible into enemy space, making comparison a reasonable exercise.
Any statistical summary is just that: a summary. We lose information when we look at average, median, min, max, total yds, TYaaCoD, FYaaCoD, etc. that is available to us in the actual dataset. Our lizard brains just can't process significant amounts of data in numerical form in any reasonably quick fashion. But there is one thing we are great at: reading charts. So I've assembled the information from each rushing effort for everyone with 3+ rushes in order from least yards gained to most. I've colored the touchdowns Highlighter Yellow™ so you can include/exclude them from your mental calculations as needed.
For recent time's sake, Drake Johnson. Fare thee well, 2013 Drake. We hardly knew ye.
A. We were completely misguided to push for Devin-Gardner-to-wide-receiver last year when his natural position is clearly running back. The fact that QB's get an extra blocker has no bearing on this.
B. At this exact moment in time, the staff's decision to go 1. Toussaint 2. Green 3. The Field. is pretty justified. We saw flashes of brilliance from both of them—maybe even more from Green—but Toussaint overall had a better day. If Green sheds a few pounds and picks up just a hair more speed in the process, though—and I think we all expect that to happen— he could become the clear #1 even by mid-October. De'Veon Smith is not yet ready for world-beating, but he did display that vaunted balance. Hold off on judgment on him at this point.
C. Charts are indeed fun to look at.
D. Norfleet had one rushing effort for 38 yds, which I didn't include in this analysis because dividing by zero is difficult and because his YPC would make Brian cry.
So I was curious how each team in the B1G was doing Home vs Away, and decided to do a quick compilation of some data. Here are the resulting charts (H means the home team won, A means the away team won, N means the game does not happen, and U means the game is still upcoming):
And a chart with just the records by team:
As most of us are aware, Michigan is the only team with a perfect record at home in the B1G so far this year. For as bad as Michigan has played on the road this year, Minnesota has the biggest difference in win percentage for home/road games. Michigan comes in at #2, and Iowa at #3.
The average B1G team wins 64% of their home games to just 36% of their away games. I'd be curious to see how that compares to other conferences, but I'm too lazy right now to try to dig up the data for that.
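If you want to run the same home/road split yourself, something like the Python sketch below would do it. The team names and results are placeholders rather than the real B1G schedule, and `home_road_records` and `pct` are helper names I made up; only the H/A/U codes come from the chart legend above.

```python
from collections import defaultdict

# Hypothetical results: (home team, away team, code) using the chart codes:
# "H" = home team won, "A" = away team won, "U" = upcoming (not yet played).
games = [
    ("Team A", "Team B", "H"),
    ("Team B", "Team C", "A"),
    ("Team C", "Team A", "H"),
    ("Team A", "Team C", "H"),
    ("Team B", "Team A", "H"),
    ("Team C", "Team B", "U"),
]

def home_road_records(games):
    """Return {team: {"home": [wins, games], "road": [wins, games]}}."""
    rec = defaultdict(lambda: {"home": [0, 0], "road": [0, 0]})
    for home, away, code in games:
        if code not in ("H", "A"):
            continue  # skip upcoming or unscheduled games
        rec[home]["home"][0] += code == "H"
        rec[home]["home"][1] += 1
        rec[away]["road"][0] += code == "A"
        rec[away]["road"][1] += 1
    return dict(rec)

def pct(wins_games):
    """Win percentage from a [wins, games] pair, or None with no games."""
    wins, played = wins_games
    return wins / played if played else None

records = home_road_records(games)
for team, r in sorted(records.items()):
    print(team, "home:", pct(r["home"]), "road:", pct(r["road"]))

# League-wide share of completed games won by the home team.
played = [code for _, _, code in games if code in ("H", "A")]
league_home_pct = played.count("H") / len(played)
print(league_home_pct)  # 0.8 for this made-up schedule
```

Since every completed game has exactly one winner, the pooled home and road win percentages always add up to 100%, which is why the 64% and 36% figures mirror each other.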
For the chart and graphomaniacs out there in the MGoUniverse:
You may need spectacles to be able to read some of the print, but it's well worth the prescription eyewear.
Let me first say I am not an RR hater. In fact, I wanted him to succeed as badly as anyone, and am appalled at the crap he has had to put up with, and at the unwillingness of so many fans to acknowledge that he had poorly stocked (not unstocked) cupboards at so many position groups upon his arrival.
That said, I am just as frustrated as anyone else at the current mess.
Fact is, as has been posted elsewhere today, the 2008 and 2009 offenses scored more points in the first halves of Big Ten games than the 2010 offense did. That is incredible. To wit:
|Year||PF, 1st halves vs B10||PA, 1st halves vs B10||M turnovers, 1st halves vs B10|
The second halves show stark improvement this year. But by the end of the 3rd quarter in the MSU, Wisc, Iowa, PSU and OSU games, most or all of the necessary damage had been done, and each opponent took its foot partially or completely off the gas in the 4th quarter unless pressed (Iowa and PSU), in which case it merely got the clinching score it needed.
|Year||PF, 2nd halves vs B10||PA, 2nd halves vs B10||M turnovers, 2nd halves vs B10|
Sure, there are myriad ways to interpret these stats. Few of them reflect well on the 2010 team, or RR.
You can never win or lose a game in the first half, but you can come close. A game's dynamic changes completely if a team gets out to a three-score lead.
I've looked at the play-by-plays and drive charts closely for this year's team, and for the 2008 team. And yes this year's team is a yard-gaining machine. The record-holder in M history -- well, or at least as far back as the late 1930s, when official NCAA stats started being kept. Indeed, 500 yards a game is impressive. On paper.
It is far less impressive when so many of those yards are gained between the 20s, or at least don't make it all the way in.
For instance, here is a look at how our first-half drives in Big Ten play (save half-ending kneel-downs) went:
|TDs||FGs||Missed FGs||Punts||Downs||Fumble lost||Interception|
(For those adding up, these TDs and FGs add up to 93; the fumble return vs Purdue brings the number to 100. And one of the first-half turnovers occurred on a KOR vs Wisc, hence the fumbles lost and INTs immediately above add up to 11, not 12).
There were many long first-half drives in Big Ten play that ended badly -- in fumbles, interceptions, on downs, or missed field goals. These mistakes effectively rendered all those yards gained on those drives moot. They're no more helpful to the scoring cause than punt yards. Because, really, when the 08 team kept punting from around its own 40, the other team would get the ball at around its 20 without having been scored on. The only difference with this year's team making so many mistakes in the first half is that the other team would acquire the ball at about the same location on the field, but instead of after a punt, rather after an M turnover, or on downs, or after a missed FG. There is no difference on the scoreboard.
A mistake-prone team renders its gaudy yard totals moot with its mistakes.
23 turnovers (whole game) in Big Ten play last year, and 23 turnovers in Big Ten play this year. That's almost 3 per game.
Ain'ta gonna cut it.