I did not make this headline up
In the aftermath of the CMU game, I’ve seen a few comments about running backs that go something like this: “If you took out X’s long run, his YPC would have only been Y, so he really wasn't that effective,” or variations thereof. This got me thinking a little about the limitations of using YPC to summarize running back performance, so I've put together a couple of ways of looking at running back performance against Central.
First off, sample size concerns are rampant. Statisticians frown on many, many things, but they take particular umbrage when you do anything with a really small sample (read: fewer than 30 observations). But, like our beloved coaches, we live in the real world where we have to make decisions based on incomplete information, so we continue on despite the limitations of the dataset.
Strength of competition is also suspect. We don't know for sure how good CMU will be this year, but we do know they were outscored by fifty points in the only game they've played so far. They may not be great.
Yards per carry is calculated by summing all rushing yards for a player and dividing by the number of carries, making it an average (or sample mean). A sample mean is a very useful way of summarizing data, with one nagging flaw: it is particularly vulnerable to outliers. The median, on the other hand, as the most central value, can be interpreted as a more typical expectation for a dataset. One extremely high or low value will have virtually no impact on the value of the median.
Here's an example: Derrick Green's YPC for the CMU game was 6.1, two whole yards higher than Toussaint's 4.1. But Green's median carry of 3 is an entire yard shorter than Toussaint's 4. The YPC might lead you to conclude Derrick Green was a better bet for getting yards than Toussaint, but the median says at least 50% of Toussaint's carries went for 4 or more yards, compared with 3 or more yards for Green. If you needed four yards for a first down, you may want to give it to Toussaint. That's potentially valuable information not contained in the YPC.
Then there's the pesky fact that TD runs have a maximum length. If we're two yards out from the end zone, that's the maximum the player can get for that carry. This artificially lowers the YPC of a player who gets the ball over the line; in particular, Toussaint's YPC would probably have been higher without that cap.
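To make the mean-versus-median point concrete, here is a minimal Python sketch. The two carry logs are made-up numbers chosen to show the effect, not the actual CMU play-by-play, and `summarize` is just a helper name I picked for illustration.

```python
from statistics import mean, median

# Hypothetical carry logs (yards gained on each carry).
# These are NOT the real numbers from the CMU game.
back_a = [1, 2, 3, 3, 4, 30]   # one long run inflates the average
back_b = [3, 4, 4, 4, 5, 5]    # steadier gains, no breakaway

def summarize(carries):
    """Return (YPC rounded to one decimal, median carry) for a list of carries."""
    return round(mean(carries), 1), median(carries)

ypc_a, med_a = summarize(back_a)
ypc_b, med_b = summarize(back_b)
print(f"Back A: YPC {ypc_a}, median {med_a}")
print(f"Back B: YPC {ypc_b}, median {med_b}")
```

In this invented case Back A wins on YPC while Back B wins on median, which is the same kind of split the Green and Toussaint numbers show.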
The table below contains a few measures of central tendency for the players who had at least 3 carries (three is still too small, but a line had to be drawn somewhere and Rawls' touchdown seemed to merit his inclusion in this list). Rawls gets no standard deviation because three is a small number.
QB Devin Gardner wins the YPC sweepstakes with a blistering 7.4 YPC bolstered by a median carry of 6 yards. I would advocate getting this man some more carries, but that's a) already happening and b) potentially troublesome for our passing game. Regardless, Gardner does a good job here no matter what metric you use: no negative yardage, a great longest run and two touchdowns on only 7 carries. At least for this game, our shiny "more passing-oriented" quarterback was our most effective running back, which speaks a bit to the value of athleticism at that position.
Among the running backs, Toussaint and Green duke it out for maximal effectiveness depending on which measure you use. Green wins on YPC, longest run, and least negative minimum run. Toussaint had a higher median, the most touchdowns, and the most carries. Rawls has the highest median of the RBs, but since he only had three carries, sample size tells us to pay no heed.
____ Yards and a Cloud of Dust
Hearkening back to the days of Three Yards and a Cloud of Dust (TYaaCoD), I wanted to know who was more reliable if you need three yards every time you rush. The table below contains the percentage of each player's carries that gained at least three yards, embodying the spirit of slightly-in-jest Schembechlerian Michigan Football.
Personally, though, I find three yards slightly lacking. If you run three yards every rushing play and you rush every play, you end up facing 4th and 1 every series. Our Fearless Leader would still go for it on fourth down every time (Heil Hoke!), but it's not an optimal situation to find yourself in. What you really want is someone who can pick up 3.5 yards or so every play, so you get a new set of downs after every three. The play-by-play is unhelpful in this regard, however, only listing integer values for yards. So I also calculated the Four Yards and a Cloud of Dust (FYaaCoD) metric, which is how the table below is sorted. If you get four yards every carry, you can go on rushing forever.
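The down-and-distance arithmetic above is easy to check. This is a rough sketch that assumes every carry gains exactly the same number of yards and every series starts at 1st and 10; the function name is my own invention.

```python
def yards_to_go_on_fourth(yards_per_carry, to_go=10):
    """Yards still needed after three identical carries (0 means a new set of downs)."""
    return max(0, to_go - 3 * yards_per_carry)

for ypc in (3, 3.5, 4):
    print(f"{ypc} yards per carry -> {yards_to_go_on_fourth(ypc)} to go on 4th down")
```

Three yards a pop leaves 4th and 1, while 3.5 or 4 yards moves the chains in three plays.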
I did make a slight modification to the success rates of both metrics: I counted a touchdown as a success regardless of how many yards the play gained, because there is nowhere further to go.
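For anyone who wants to reproduce the two metrics, here is a minimal sketch of the success-rate calculation, including the touchdown rule. The `(yards, is_td)` layout and the sample carries are assumptions for illustration, not the real play-by-play or the spreadsheet I actually used.

```python
def success_rate(carries, threshold):
    """Share of carries gaining at least `threshold` yards.

    `carries` is a list of (yards, is_td) pairs. A touchdown counts as a
    success no matter how short the run was.
    """
    if not carries:
        raise ValueError("need at least one carry")
    hits = sum(1 for yards, is_td in carries if is_td or yards >= threshold)
    return hits / len(carries)

# Hypothetical carries: a 2-yard TD, then gains of 5, 1, and 3 yards.
carries = [(2, True), (5, False), (1, False), (3, False)]
tyaacod = success_rate(carries, 3)  # successes: the TD, the 5, and the 3
fyaacod = success_rate(carries, 4)  # successes: the TD and the 5
print(f"TYaaCoD {tyaacod:.0%}, FYaaCoD {fyaacod:.0%}")
```

Note that the 2-yard touchdown passes both tests even though it is short of either threshold.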
|Player||Total Yds||Carries||TYaaCoD||FYaaCoD|
For TYaaCoD, you would want the following players rushing in order: 1. Green, 2. Gardner, 3. Rawls, 4. Toussaint, 5. Smith, 6. Johnson. All players are between 50% and 75% successful at getting 3 yards against CMU, which is heartening. Moving to FYaaCoD, you would want 1. Gardner, 2. Rawls, 3. Toussaint, 4. Green, 5. Johnson, 6. Smith.
There's some shuffling when you move to FYaaCoD: Derrick Green drops from first to fourth, and Smith falls to sixth at a slightly disappointing 29% success rate. Rawls still has only three carries, but two of them pass the FYaaCoD test, so he has a terrific success rate of 67%. That is almost as good as Devin Gardner, who had over twice as many carries. Devin's ability to scramble is probably for real. Toussaint's actual strength as a running back comes through a bit more on the FYaaCoD metric. On his 14 carries, he hit 4+ yards 57% of the time, and he often surpassed four. That increases the chance of success for future plays, as the distance to the first down marker is smaller.
I thought about running the same analysis with passing yards, but it didn't feel right since yards per catch vary widely based on the play. Your wideout running the deep route will end up with more yards per target than the slot ninja you toss the bubble screens to. That is more schematic than based on individual skill. It is true that running plays are also not all created equal. But every running play starts behind the line of scrimmage and heads as far as possible into enemy space, making comparison a reasonable exercise.
Any statistical summary is just that: a summary. We lose information when we look at average, median, min, max, total yds, TYaaCoD, FYaaCoD, etc. that is available to us in the actual dataset. Our lizard brains just can't process significant amounts of data in numerical form in any reasonably quick fashion. But there is one thing we are great at: reading charts. So I've assembled the information from each rushing effort for everyone with 3+ rushes in order from least yards gained to most. I've colored the touchdowns Highlighter Yellow™ so you can include/exclude them from your mental calculations as needed.
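If you want to build this kind of view yourself without a spreadsheet, a plain-text stand-in is enough to get the idea. This is only a sketch: the carries are invented, `text_chart` is a name I made up, and a "TD" tag replaces the highlighter color.

```python
def text_chart(carries):
    """Return one line per carry, shortest to longest; 'TD' marks a touchdown.

    `carries` is a list of (yards, is_td) pairs. Runs for a loss get no bar.
    """
    lines = []
    for yards, is_td in sorted(carries):
        line = f"{yards:>3} {'#' * max(yards, 0)}".rstrip()
        if is_td:
            line += " TD"
        lines.append(line)
    return lines

# Hypothetical carries, not taken from the actual game.
sample = [(4, False), (-1, False), (2, True)]
print("\n".join(text_chart(sample)))
```

Sorting first is what makes the shape readable at a glance: a back whose bars ramp up steadily looks very different from one with a pile of short runs and a single spike.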
For recent time's sake, Drake Johnson. Fare thee well, 2013 Drake. We hardly knew ye.
A. We were completely misguided to push for Devin-Gardner-to-wide-receiver last year when his natural position is clearly running back. The fact that QBs get an extra blocker has no bearing on this.
B. At this exact moment in time, the staff's decision to go 1. Toussaint, 2. Green, 3. The Field is pretty justified. We saw flashes of brilliance from both of them, maybe even more from Green, but Toussaint overall had a better day. If Green sheds a few pounds and picks up just a hair more speed in the process (and I think we all expect that to happen), he could become the clear #1 even by mid-October. De'Veon Smith is not yet ready for world-beating, but he did display that vaunted balance. Hold off on judgment on him at this point.
C. Charts are indeed fun to look at.
D. Norfleet had one rushing effort for 38 yds, which I didn't include in this analysis because dividing by zero is difficult and because his YPC would make Brian cry.