Dilithium Quantified

Submitted by MCalibur on

[Author note: This thing is long and pretty technical. That said, I think there will be sufficient payoff and value for you, the reader. Still, be ye warned.]

Have you ever wished there were a convenient way to rate rushers the same way we rate passers? Sure, passer rating has its weaknesses (all mathematical formulas do), but despite its issues, I've come to appreciate passer rating as a very useful framework to evaluate a player/team when it comes to passing the ball. In the same way that finding a corner piece to a jigsaw puzzle helps you figure out its entire quadrant, once you have an idea of what to expect from the passing game you can leap to other touchstones to determine what to expect from the running game. A rusher rating would be just the sort of touchstone needed to really start messing around for those of us who are so inclined. This diary lays out what I think should work for these purposes.

To recap some of my previous work: passer rating combines four important factors (completion percentage, yards per attempt, interception rate, and touchdown rate) and blends them into one number. For rushing stats, the information needed to come up with an analogous metric was hard to come by until cfbstats.com came along. Tons of fascinating and useful data, for free. God bless the internet.

To come up with the rating, I looked only at positions that would be considered normal rushers (QB, RB, TB, FB, HB, SB, WR) and only at players with an average YPC greater than zero. If you can't meet those criteria, then you can't represent a normal rusher, thus sayeth the me. Other positions register rushing attempts, but allowing the odd rush by a punter to color your view of what normal looks like would be dumb. See the chart below for more information. Also, if a guy averages negative YPC, uh, find something else to do, kthx. No other filter was applied, but some math wonk tricks were, and I'll talk about those as we go.

[Chart: data discrimination]

Parameter Mapping

Completion Percentage → Gain Percentage: Parsed play-by-play data is necessary to generate a replacement for completion percentage. I opted to go for Gain Percentage: the percentage of attempts that resulted in more than zero yards. I figured the basic goal of a pass is to complete it (brilliant insight, I know) and the basic goal on a rush attempt is to gain positive yardage, so any gain of more than zero yards is mission accomplished. This parameter is as much about team skill as it is about player skill, but the same can be said for Completion Percentage.
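For the code-inclined, here is a minimal sketch of the Gain Percentage idea. It assumes you already have a list of per-carry yardage pulled out of parsed play-by-play; the function name and the sample carries are made up for illustration and are not real data.

```python
def gain_percentage(carry_yards):
    """Share of rushing attempts that gained more than zero yards."""
    if not carry_yards:
        raise ValueError("need at least one carry")
    gains = sum(1 for yards in carry_yards if yards > 0)
    return gains / len(carry_yards)


# Hypothetical carries for one player: yards gained on each attempt.
sample_carries = [3, 0, -2, 7, 12, 1, 0, 4, 5, -1]
print(f"Gain%: {gain_percentage(sample_carries):.3f}")
```

Note that a carry for exactly zero yards counts as a miss here, same as an incompletion would for a passer.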

Yards Per Attempt: Direct analogue.

Interception Rate → Fumble Rate: The direct analogue would of course be fumbles lost per attempt, but that's not the right way to do it IMO. The luck factor that influences whether or not the team actually loses possession has nothing to do with the fact that bringing possession into question is a terrible idea. So all fumbles, whether lost or not, are counted in the calculation.

There is also a bit of mathematical wonkiness deployed here. Mike Hart is famous (at least around here) for his deftness at protecting the rock. It was awesome: 991 carries, 5 loose balls, 3 losses of possession. That was an aiight career, but these guys were kinda, sorta, maybe, better (!) at protecting the rock:

Player            Team          Att  FMB
Jacquizz Rodgers  Oregon State  789    1
Javon Ringer      MSU           843    3
Montee Ball       Wisconsin     924    4
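To put those careers on the same scale, here is a quick sketch that turns the carry and fumble counts quoted above into fumble rates. The counts come straight from the text; the helper name is just for illustration.

```python
# Career carries and total fumbles (lost or recovered) quoted in the text above.
careers = {
    "Mike Hart": (991, 5),
    "Jacquizz Rodgers": (789, 1),
    "Javon Ringer": (843, 3),
    "Montee Ball": (924, 4),
}


def fumble_rate(attempts, fumbles):
    """All fumbles per rushing attempt, regardless of who recovered."""
    return fumbles / attempts


# Print from stickiest hands to least sticky.
for name, (att, fmb) in sorted(careers.items(), key=lambda kv: fumble_rate(*kv[1])):
    rate = fumble_rate(att, fmb)
    print(f"{name:<18} {rate:.2%}  (one fumble every {att / fmb:.0f} carries)")
```

Hart comes out at roughly half a percent, which is excellent, and the other three still beat it.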

OK, so the wonkiness…a lot of people who register meaningful rushing attempts do so at a pretty low level of opportunity. Even stud RBs often split carries with other backs: Eddie Lacy siphoned off carries from Mark Ingram before becoming the man, and T.J. Yeldon did the same to Eddie Lacy. So in order for fumbles to make sense for players that get meaningful carries in low doses, we need to consider the question: at which point does a low fumble rate cross the threshold from wait-and-see to holy-crap-check-that-dude-for-stickem?

ABBACASTATS,BRUH!

[Chart: P(Fumble) vs. number of carries]

What we have here is a chart comparing the observed percentage (red dots) and the mathematical probability (blue line) that a player will have at least 1 fumble versus the number of carries he has registered. The red dots are binned in increments of 1 carry, so the sample sizes out past 150 are pretty thin, but if bigger bins were used you'd see a scatter of points that more closely follows the mathematical fit, because… math. The blue line was derived using logistic regression.

The weirdness at zero for the mathematical expectation might be concerning, as it suggests that there's a 20% chance you've fumbled despite not having a single carry to your credit. However, that is just an artifact of the data. It is possible to fumble on your one and only carry, as actual observations show. What the math does, though, is consider the sample size of the observations and then find the best fit possible to the overall dataset. There are ways of dealing with that issue, but…I'd rather talk about football. Also, KISS. This is good enough for my intended purpose.
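For anyone who wants to try the curve-fitting step themselves, here is a rough sketch of a one-variable logistic regression in plain Python, fit by Newton-Raphson. To be loud about it: the observations below are made-up toy numbers, not cfbstats data, and this is not necessarily how the blue line in the chart was produced. It only shows the general technique.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, iterations=25):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p          # gradient of the log-likelihood
            g1 += (y - p) * x
            h00 += w             # entries of the (negated) Hessian
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * step = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Made-up toy observations (NOT cfbstats data): career carries and whether
# the player has fumbled at least once (1) or not (0).
carries = [1, 2, 3, 5, 8, 10, 12, 15, 20, 25, 30, 40, 50, 60, 75, 90, 110, 130, 160, 200]
fumbled = [0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1]

b0, b1 = fit_logistic(carries, fumbled)
for n in (0, 16, 50, 100, 200):
    print(f"{n:>3} carries: P(at least one fumble) = {sigmoid(b0 + b1 * n):.2f}")
```

Even on toy data the fitted curve gives a nonzero probability at zero carries, which is the same artifact described above: the intercept is whatever makes the overall fit best, not a statement about guys who never touched the ball.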

Anyway, the point of doing all that is it allows me to apply what I'll call the Ghost Protocol. Basically, I take that curve, subtract it from 1, and add the resulting value (a phantom fumble) to the player's fumble total. As the number of carries increases, the effect of the phantom fumble recedes, thus leveling the playing field and letting us evaluate players with low sample sizes as best we can. The result of this bit of data manipulation is that a guy with no fumbles in 16 carries is assigned an average fumble rate, and by the time 100 carries are registered, the penalty is not perceivable. Below 16 carries the assigned penalty is pretty stiff, but this trick lets us look at guys with few carries and not just dismiss them with the low sample size red card. Sure, 16 carries is still a low sample, but at least the rating self-corrects for the fact that fumbles take time to manifest.
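Here is a small sketch of the phantom fumble adjustment as described: take the curve, subtract it from 1, and add that to the fumble count before dividing by carries. The two logistic coefficients are hypothetical placeholders (the intercept is picked so the curve starts near the roughly 20% value mentioned above, and the slope is a guess), so the printed numbers will not match the actual ratings.

```python
import math

# Hypothetical logistic coefficients; NOT the fitted values behind the chart.
B0 = math.log(0.2 / 0.8)  # puts P(at least one fumble) at 20% for zero carries
B1 = 0.03                 # assumed slope per carry


def p_at_least_one_fumble(carries):
    return 1.0 / (1.0 + math.exp(-(B0 + B1 * carries)))


def adjusted_fumble_rate(carries, fumbles):
    """Fumble rate after adding the phantom fumble (1 minus the curve)."""
    phantom = 1.0 - p_at_least_one_fumble(carries)
    return (fumbles + phantom) / carries


for carries in (10, 16, 50, 100, 300, 789):
    phantom = 1.0 - p_at_least_one_fumble(carries)
    rate = adjusted_fumble_rate(carries, 0)
    print(f"{carries:>4} carries, 0 fumbles: phantom = {phantom:.3f}, adjusted rate = {rate:.4f}")
```

The shape is the point: the phantom is worth most of a fumble at a handful of carries and is basically gone by a few hundred, so heavy-workload guys are judged on what they actually did.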

Most importantly though, the protocol adequately acknowledges players who keep their fumble rates low even though they have a lot of carries. It's easier to have a 1% fumble rate after 100 carries than it is to have the same rate after 789 carries. That said, after a while the fumble rates should be allowed to speak for themselves. Quizz Rodgers and Mike Hart need their proper allocation of DAP; nothing more, nothing less. I think the Ghost Protocol concept accomplishes exactly that.

Touchdown Rate: This one is also directly analogous, but here again I've deployed the Ghost Protocol to credit guys with a low sample size with the expectation of an eventual TD. TDs come about much more freely than fumbles do, what with goal line attempts and the like, so this credit vanishes very quickly. But fair is fair: the protocol giveth and it taketh away.

Those are the components directly analogous to the ones used in passer rating, and these would be enough to go about the business at hand. However, whereas a passer's job is to get the ball into the hands of a playmaker, players that are given the ball, whether by pass or handoff, are called upon to be the playmaker. Certainly the scheme, play call RPS, and execution of the supporting cast all have major influence on the results of a play, but the ball carrier can do things that elevate the call from good to great. I wanted to be all formal-like and call this the Impact Run Rate but this [stuff] is s'posed to be fun, man. Hence:

Another Dimension: the Dilithium Quotient


The 20-yard threshold is the one usually referenced for registering a play as a big play. That would certainly qualify as a big play by any standard, but that threshold seems to have been established somewhat arbitrarily in my opinion. On average, a generic runner on a generic team in a generic game gains about 4 yards per attempt with a standard deviation of about 7.5. It's called the standard deviation for a reason, as a huge swath of observations (about 2/3rds) occur within 1 SD of the mean, or between -3 and +11 (remember: discrete data). The other 1/3 of observations get split evenly, with 1/6 below -3 yards and 1/6 at 12 or above. I've used objective criteria, you know, math, to define Impact Runs as those that register 12 yards or more. To register one of these the player's entire team has to execute the play correctly, then the carrier has to do something special (i.e. juke a dude, break a tackle, be fast). This is the real-life manifestation of the Madden Circle Button and it's informative. It's the difference between Barry Sanders and Emmitt Smith.
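The threshold arithmetic and the resulting rate can be sketched in a few lines. The mean and standard deviation are the ones quoted above; the function name and the sample carries are hypothetical.

```python
import math

MEAN_YPC = 4.0  # average gain per carry, from the text
SD_YPC = 7.5    # standard deviation of gain per carry, from the text

# Carries are whole yards, so the first gain beyond mean + 1 SD (11.5) is 12.
IMPACT_THRESHOLD = math.floor(MEAN_YPC + SD_YPC) + 1


def dilithium_quotient(carry_yards, threshold=IMPACT_THRESHOLD):
    """Share of carries that went for an impact run (threshold yards or more)."""
    impact_runs = sum(1 for yards in carry_yards if yards >= threshold)
    return impact_runs / len(carry_yards)


# Hypothetical carries, not real play-by-play.
sample_carries = [3, 0, -2, 7, 12, 1, 0, 4, 25, -1]
print(f"Impact threshold: {IMPACT_THRESHOLD} yards")
print(f"DQ%: {dilithium_quotient(sample_carries):.3f}")
```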

Denard Robinson was great at this, but it might be surprising to hear that he wasn't the best. Percy Harvin in the spread option was ridiculous in this category. Percy touched the ball a lot when he was a Gator and 27% of the time, he darted for an impact run. By contrast, Denard's DQ% was 'only' about 15%. Could you imagine Denard breaking loose almost twice as often? Of course, the scheme, the team's execution of the scheme, and the player's deployment within the scheme have a lot to do with this number. Florida circa Percy Harvin was galaxies away from Michigan circa Denard Robinson. Percy Harvin was the 3rd rushing option in Florida's spread and shred; Denard Robinson was options 1-10. Also, being the QB in the spread-option means you are concern #1 for defenses: the cornerstone. That was triply the case when facing Michigan with Denard in the captain's chair. Harvin was usually one-on-one with a guy 10 times slower than he was who was also probably pooping his pants.

Denard's DQ% was pretty stable around 15% (scheme be damned) but his utility rate (723 career carries) was second to none save minor conference QBs. His closest proxy, Pat White (684 career carries), broke loose at a 19% clip in RichRod's scheme. However, the Big EEEast sans Miami and Virginia Tech wasn't quite the Big TEEEN. Denard went up against stout defenses way more often than Pat White did, and did so without the benefit of Steve Slaton or Noel Devine or the benefit of a revolutionary offensive scheme. When Pat White lost RichRod, his DQ% dropped to under 12%; Denard didn't bat an eye. Everyone *knew* they had to stop Denard and only him on *every play* and they still had their hands full trying to actually do it. The fact that Michigan could never position itself for him to win the Heisman trophy will always be one of my sports fan laments. For ever and ever and ever. He better get a Legends Jersey or I'm qui'in'. I don't care if that's silly. You're silly. Where's my bourbon?

Blending It All Together

Passer Rating was developed such that an average QB would end up with a rating of 100 according to the data set that was used to develop it, which was gathered two, maybe three football eras ago when linemen couldn't really block and scholarship limits weren't so much. I'm not sure how they went about the process of pinning the rating to average==100 and I don't have the data to try to replicate the results…so, I kinda, sorta, you know, pulled something outta my [hat]. That is to say: I did what I think is correct or at least valid. I normalized each parameter by its par value, summed them together, then forced the resulting rating to equal 100 for a par player. Ultimately the 100 thing is completely arbitrary, but negative numbers are weird, I guess. All said, a rating of 100 means the player was a solid runner but not special; below that you wonder if he should be running at all.
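As a rough illustration of "normalize by par, sum, pin to 100", here is one way that recipe could be coded. Big caveats: the par values below are hypothetical placeholders, and treating fumble rate as a subtraction while everything else adds is my assumption about how a "lower is better" stat would enter the sum. The actual coefficients are not given in the text, so don't expect this to reproduce the ratings in this diary.

```python
# Hypothetical par values for illustration only (not the calibrated ones).
PAR = {
    "gain_pct": 0.80,
    "ypc": 4.0,
    "td_rate": 0.04,
    "fumble_rate": 0.012,
    "dq_rate": 0.08,
}

# Assumption: fumbles hurt, everything else helps.
SIGN = {"gain_pct": 1, "ypc": 1, "td_rate": 1, "fumble_rate": -1, "dq_rate": 1}


def rusher_rating(stats):
    """Sum of par-normalized components, scaled so a par player rates 100."""
    raw = sum(SIGN[key] * stats[key] / PAR[key] for key in PAR)
    par_raw = sum(SIGN.values())  # each normalized component equals 1 at par
    return 100.0 * raw / par_raw


par_player = dict(PAR)
better_player = dict(PAR, ypc=6.0)  # same as par except 6.0 yards per carry

print(f"par player:    {rusher_rating(par_player):.1f}")
print(f"better player: {rusher_rating(better_player):.1f}")
```

The design choice worth noticing is that dividing by par puts every component on the same footing, so a 50% bump in yards per carry moves the rating as much as a 50% bump in touchdown rate would.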

Where in the World is Carmen San Diego Mario Mendoza

Now that we have a calibrated formula it's time to get down to business: application. I calibrated the rating so that 100 was a normal guy, but figuring out what par should be is a little more complicated. I mentioned earlier that if you can't get to a rating of 100 I don't think you should be a primary running option, and I also think we should only look at primary running options to establish our benchmark. But being a primary running option means different things depending on where you're lining up.

When trying to crack a nut like this I often find that the data itself will help you figure out where to chop it. In the chart below I have plotted Average Rating vs. Number of Carries. Obviously, the better runner you are, the more carries you should see, but runners that are REALLY good are few and far between…this chart shows that dichotomy very nicely. I like to look for population gaps and/or inflection points in a performance curve. Those are usually good places to drop an anchor as far as I'm concerned. When they are near each other it's a dead giveaway. Based on the data itself I'm using 115 for RBs, 70 for QBs, and 120 for WRs as performance benchmarks.

[Charts: average rating vs. number of carries for RBs, QBs, and WRs]

Laugh Test

So, this is all well and good but the real test is whether or not things make sense. Here are the values for the B1G in 2013:

Team  Player        RB Rating  Att  Yds/Att  TD%    FMB%   Gain%  Dilithium%
OSU   C. Hyde       188.35     208  7.31     0.072  0.005  0.942  0.135
IND   T. Coleman    182.49     131  7.31     0.092  0.015  0.832  0.137
WISC  M. Gordon     172.71     206  7.81     0.058  0.015  0.888  0.150
WISC  J. White      169.86     221  6.53     0.059  0.000  0.810  0.122
IND   S. Houston    157.12     112  6.72     0.045  0.009  0.786  0.152
NW    T. Green      146.27     138  5.33     0.058  0.000  0.841  0.087
ILL   J. Ferguson   141.77     141  5.52     0.050  0.007  0.816  0.113
MSU   J. Langford   129.16     292  4.87     0.062  0.007  0.849  0.065
NEB   A. Abdullah   116.18     281  6.01     0.032  0.018  0.875  0.100
MICH  F. Toussaint  114.60     185  3.50     0.070  0.011  0.676  0.070
MINN  D. Cobb       112.55     237  5.07     0.030  0.008  0.827  0.084
PSU   Z. Zwinak     109.68     210  4.71     0.057  0.014  0.867  0.052
IOWA  M. Weisman    106.12     226  4.31     0.035  0.004  0.832  0.058
PSU   B. Belton      99.05     157  5.11     0.032  0.019  0.854  0.083

This generally looks pretty reasonable to me in terms of an overall ranking as well as a relative ranking. The players/teams you'd expect to be at the top and bottom of the list are where they are supposed to be. If anything I'd criticize the Mendoza line at 115 given how we all feel about Michigan's running game last year. Maybe 115 is just the threshold of suicide and 130 or better is what we fans really want from our teams. But even this jibes with what I think.
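If you want to see who lands on which side of the line, here is a tiny sketch that applies the 115 RB benchmark to the ratings in the 2013 table above. The ratings are copied from that table; the variable names are just for illustration.

```python
RB_MENDOZA_LINE = 115  # running back benchmark chosen earlier in this diary

# RB ratings copied from the 2013 B1G table above.
ratings_2013 = {
    "C. Hyde": 188.35,
    "T. Coleman": 182.49,
    "M. Gordon": 172.71,
    "J. White": 169.86,
    "S. Houston": 157.12,
    "T. Green": 146.27,
    "J. Ferguson": 141.77,
    "J. Langford": 129.16,
    "A. Abdullah": 116.18,
    "F. Toussaint": 114.60,
    "D. Cobb": 112.55,
    "Z. Zwinak": 109.68,
    "M. Weisman": 106.12,
    "B. Belton": 99.05,
}

below_line = [name for name, rating in ratings_2013.items() if rating < RB_MENDOZA_LINE]
print(f"{len(below_line)} of {len(ratings_2013)} backs fall below {RB_MENDOZA_LINE}:")
for name in below_line:
    print(f"  {name:<13} {ratings_2013[name]:.2f}")
```

Toussaint misses by less than half a point, which is about as "on the line" as it gets.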

As with passer rating, this rating depends on player skill, surrounding support, and offensive scheme. Toussaint's YPC and Gain%, components heavily influenced by surrounding support (i.e. the O-Line), are way under par. So is his Dilithium%, which is a skill/talent/speed thing, but the dude had a bum knee and he's not that far off of par there. Makes sense. So, he hit the Mendoza line even though he had bad support in front of him, sorta like Gardner. These numbers make sense to me.

Re: Smith Vs. Green

I mentioned in my last diary that it was interesting to hear grumblings about De'Veon Smith being ahead/competitive with Derrick Green because I think the numbers bear this out. Check this out:

Player        Att  TD  Fum  Gain%  Yds/Att  TD%    Fum%   DIL%   RB Rating
F. Toussaint  185  13    2  0.676  3.50     0.070  0.011  0.070  114.60
D. Green       83   2    0  0.723  3.25     0.025  0.004  0.048   83.42
D. Smith       26   0    0  0.769  4.50     0.015  0.023  0.077   73.05

These guys played with the same support and in the same system, so the differentiators on display here are essentially Skill and Opportunity. Neither Green nor Smith actually registered a fumble, but the Ghost Protocol affects Smith's rating more because he has far fewer carries. Indeed, Smith's rating is also bolstered by a phantom touchdown, but this effect dissipates faster because TDs occur more frequently. So the math is screwing Smith over here a bit. Meanwhile, Smith's Gain% and YPC (hitting the right hole at the right time) and DIL% (juking, speed, whatever) were the highest on the team last season. Yep, small samples, yadda yadda. Just sayin'.
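You can back the phantom credits out of the table yourself: multiply the adjusted rate by attempts and subtract the real count. A quick sketch using the Green and Smith rows above (the rates are rounded to three decimals in the table, so the results are approximate):

```python
# Rows copied from the table above.
rows = {
    "D. Green": {"att": 83, "td": 2, "fum": 0, "td_rate": 0.025, "fum_rate": 0.004},
    "D. Smith": {"att": 26, "td": 0, "fum": 0, "td_rate": 0.015, "fum_rate": 0.023},
}


def implied_phantom(attempts, actual_events, adjusted_rate):
    """Phantom events implied by an adjusted rate: rate * attempts - actual events."""
    return adjusted_rate * attempts - actual_events


for name, row in rows.items():
    phantom_fum = implied_phantom(row["att"], row["fum"], row["fum_rate"])
    phantom_td = implied_phantom(row["att"], row["td"], row["td_rate"])
    print(f"{name}: phantom fumbles ~ {phantom_fum:.2f}, phantom TDs ~ {phantom_td:.2f}")
```

Smith is carrying roughly 0.6 of a phantom fumble against roughly 0.4 of a phantom touchdown on 26 carries, while Green's credits on 83 carries are much smaller, which is the asymmetry described above.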

Anyway, that's a lot of words and I hope this was worth the read. Of course, I will be referring to this information in future diaries. Thanks for reading, and please provide any criticisms or comments you might have in, uh, the comments section.

11 Days.

Comments

MCalibur

August 19th, 2014 at 11:22 PM

"from over the years" is the basis of this analysis. I have taken all relevant data currently available on cfbstats (2005 and beyond) and plugged it in to do this analysis as described.

I can kick the spreadsheet to Brian/Seth for posting but I'd rather not for two reasons: 1) cfbstats.com is the isht and deserves much traffic and 2) I want to encourage replication to make sure I didn't screw something up. Failure is always an option...

MCalibur

August 19th, 2014 at 11:31 PM

I have something in the works to cover the B1G already and expect to post it soon.  This stuff plays into QB projections like whoa and I'm doing an overview of the QBs we're facing this year already. Look for it there.

As for the national ranks...word. I'll try to do a Doak Walker Watchlist by kickoff of week 2 if not sooner. It will be based only upon data available today regardless of when it comes out.

Thank you for your interest.

MCalibur

August 19th, 2014 at 11:44 PM

I'd love to have someone that is not me track "pickable opportunities" nationally because that would be valuable in evaluating a QB's decision making and accuracy. Unfortunately I haven't seen that available at a price point I'm willing to float the ask on.

Task #1 for the offense is to maintain possession until they score: either move the chains or live to try again. Preferably the score is a touchdown. Fumbling, whether you recover or not, is a bad idea.

BlueKoj

August 20th, 2014 at 10:26 AM

I agree, if an RB drops the ball...that should be counted against him irrespective of who fell on it.

EDIT: Interestingly, dropped passes have become a thing for WR stats...the same could be done for DBs (on a team level). You'd think it wouldn't be difficult and would be interesting enough to track. I would think teams do this already.

bleu

August 21st, 2014 at 12:01 AM

Right, a fumble's a fumble for this formula. I don't think having your own team recover your fumbles is a skill that improves your rating. I imagine there is a pretty steady recovery rate for fumbles that all fumblers would trend towards if given enough opportunities.

RandomWolverine

August 19th, 2014 at 7:55 PM

Would be really nice if there was a way to clearly quantify pass protection efficiency.  Clearly, the biggest hurdle that backs face transitioning from HS to college OR college to NFL is the effectiveness in QB protection. Keeps a lot of great runners riding the bench.

MCalibur

August 19th, 2014 at 11:59 PM

Unfortunately I think this can only be assessed via inductive/deductive reasoning coupled with subjective observation en masse (i.e. universal UFR). FOOTBALL OUTSIDERS are the folks I think can/will conquer this mountain. I'm outgunned.

BlueKoj

August 20th, 2014 at 10:29 AM

Thanks for the time and effort. It'll be interesting to see UM's stable of RBs this year.

EDIT: Is there a reliable measure for YAC? That would seem pertinent to a number of discussions on this board about UM RBs, and RBs in general.

Ali G Bomaye

August 20th, 2014 at 12:08 PM

I think one valuable concept that can be used to compare rushing success to passing success is the median gain on a play.  Pass plays almost always will have a higher yards per play than rushing plays, but the cost of that is that they're less consistent.  

Seth

August 22nd, 2014 at 9:13 AM

Thing 1: CFBStats is no more. Marty sold out. The new cfbstats will be....MGOBLOG. We're building the data now and will release it early this season as a part of the site. We've scrubbed a bit better, for example I went through all the "Superbacks" and labeled those who are tight ends (Northwestern) or slot RBs (Arizona) as such.

Thing 2: I figured out a really handy way to deal with fumbles: there's a 51% recovery rate for the offense on those. So we can conveniently assume that any fumble is "half a turnover" and treat it as such.

MCalibur

August 22nd, 2014 at 9:44 AM

I think that's an awesome investment.

If I might make a humble request, I find it to be an enormous pain in the tail to assemble and cross-reference all the information into one database so that I can then go about doing my thing. The price is right so whatever. However, if there were a convenient way to query what I want and export that to a .csv file or whatever so that I can cut to the analysis chase faster...that would be hella dope. Whatever though, can't wait.

Kudos to you guys.

[EDIT: oh and if y'all wouldn't mind sprinkling in the recruiting profile info into the database I might go into meltdown...]