Could One Estimate Offensive Yards Per Play? (An Admittedly Strange Meandering)

Submitted by LSAClassOf2000 on

I will start this diary by saying that, with the right data, you can get the actual mean yards per play on offense (if you really wanted to spend the time, it is right there in the box score). But since most people don't do more than skim when they look at box scores, it dawned on me that it might be possible to approximate yards per play from other information in the box score, and indeed, I believe it is.

So, what is the minimum information that you might need? Again, if you wanted to spend the time, you could glean everything you need right from the detailed stats, but what if you wanted a quick calculation?

I share this because it is something that I honestly had not thought of before when looking at relationships within the statistics, so if it is well-known (there are people that know far more about this than I), then I will apologize for the redundancy.

I did one of my standard data dumps for this one: I took 10 seasons of Division I offensive data by season and then began searching for a few things that correspond well to yards per play. I wanted to keep this simple, so I was hoping to find perhaps two other stats that might be good candidates. As it turns out, there are a couple: yards per carry and yards per pass attempt. Their R-values against yards per play are 0.720 and 0.835, respectively, over a fairly large sample (n=1,189).
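For anyone who wants to reproduce an R-value like the ones above, here is a minimal sketch of the Pearson correlation in plain Python. The function name `pearson_r` and the sample numbers are made up for illustration; they are not the diary's data set.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented team-season values, for illustration only.
yards_per_carry = [3.8, 4.2, 4.9, 5.3]
yards_per_play = [5.0, 5.4, 5.9, 6.3]
print(round(pearson_r(yards_per_carry, yards_per_play), 3))
```

With the real data you would pass in one value per team-season for each stat.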

The next challenge was simply charting these and seeing how well they did in fact trend with one another. Below are some examples from the Big Ten:

[Images: YPP charts for Penn State, Ohio State, Michigan, Minnesota, Nebraska and Illinois]

So, they track each other fairly well. Indeed, they track each other well enough that it dawned on me that yards per play could be approximated by the average of the two statistics mentioned above, so this became the next step. Could it be that offensive yards per play can be approximated by the average of yards per carry and yards per pass attempt? It would make total sense if this were true, of course, as these cover most offensive plays, right? It is important to note that the sum of yards per carry and yards per pass attempt correlates very well with yards per play: the R-value here is 0.930.

It seems you could get pretty close just based on those two numbers, at least typically. The average difference between estimated and actual yards per play on offense for all of the Division I data turned out to be only 0.12, with a mean error of just 1.94%. So even though the estimate discounts certain things which do happen in the course of a game, you can get a decent handle on how effectively the offense is advancing based on these numbers. It is important to note that the estimate tends to run over; it does not necessarily account for things such as plays for zero yards, plays for negative yards (TFLs, sacks, etc.), the marked imbalance of some teams when it comes to rush vs. pass plays, as well as some other things.
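A rough sketch of the estimate and the two error figures quoted above might look like the following. The diary does not say whether the differences were taken as absolute values, so the absolute versions here are an assumption, and the function names and sample rows are invented for illustration.

```python
def estimate_ypp(yards_per_carry, yards_per_attempt):
    """Quick estimate: the plain average of the two box-score rates."""
    return (yards_per_carry + yards_per_attempt) / 2


def error_summary(rows):
    """rows holds (yards_per_carry, yards_per_attempt, actual_ypp) tuples.

    Returns (mean absolute difference, mean absolute percent error).
    Using absolute values is an assumption, not something the diary states.
    """
    diffs = []
    pct_errors = []
    for ypc, ypa, actual in rows:
        diff = abs(estimate_ypp(ypc, ypa) - actual)
        diffs.append(diff)
        pct_errors.append(100 * diff / actual)
    return sum(diffs) / len(diffs), sum(pct_errors) / len(pct_errors)


# Hypothetical team-seasons, invented for illustration only.
sample = [(4.0, 7.0, 5.4), (4.5, 6.5, 5.5), (5.0, 8.0, 6.3)]
mean_diff, mean_pct = error_summary(sample)
print(f"mean difference: {mean_diff:.2f}, mean error: {mean_pct:.2f}%")
```

Dropping the `abs` call would instead show the direction of the miss, which is one way to check the tendency to run over.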

Some comparisons from the Big Ten data specifically:

[Images: YPP comparison charts for Illinois, Michigan, Minnesota, Nebraska, Ohio State and Wisconsin]

Again, this is something I never really thought of looking for within the stats, but it is interesting to see that you can get a reasonable approximation of offensive yards per play from two other numbers in the event you didn't know the exact number of plays run on offense for a team (or just weren't interested in a lot of math or a minimal amount of searching).

In any case, this data comes from historic season statistics, so the next step here is to test it at the game level, which I plan to do this year. It would also be interesting to see if a similar approximation could be constructed from yards allowed on defense, but I shall save that for perhaps the next diary.

Have I found anything particularly profound? Probably not. It's a way to possibly estimate a number that you could simply calculate if you carefully studied the box score, and it may help if you don't have all the information at your disposal. This is more about an interesting manner in which different statistics correlate and can be applied, but those relationships are always interesting to discover even if they are not necessarily profound or novel.

OBLIGATORY (in honor of a fate my own cats will suffer in a few short weeks):

Comments

ChiBlueBoy

August 28th, 2013 at 10:19 AM ^

Wouldn't offensive yards per play = yards per attempt x percentage of pass plays + yards per carry x percentage of plays that are runs? What other plays are there? Btw: why do we always use mean? Seems to me median yards per play/carry etc. can be more important and would correlate to first down percentage. No one seems to note it.

LSAClassOf2000

August 28th, 2013 at 10:42 AM ^

It's actually really close and in some cases even dead on, and the idea for this was to be able to estimate things without the precise information like the breakdown of pass/rush as well (this was more a test to see what the minimum amount of information to get an idea of Offensive YPP might be) - for example:

Wisconsin ran 926 plays on offense last year, 635 of which were rushing plays or 68.57%, so if you multiply that by average yards per carry, that's 3.57. It means 31.43% of the time, they passed, and they did so at a 7.55 YPA clip. The result of 7.55 x 0.3143 is 2.37. 3.57+2.37 = 5.94, which is nearly dead on for Wisconsin in this case. 
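The Wisconsin arithmetic above can be sketched as a small function and set beside the simple average. The yards-per-carry figure of about 5.21 is not quoted anywhere; it is back-computed here from the 3.57 rushing contribution divided by the 68.57% rush share, so treat it as an assumption. The function names are mine.

```python
def weighted_ypp(ypc, ypa, rush_plays, total_plays):
    """Yards per play weighted by the actual rush/pass split."""
    rush_share = rush_plays / total_plays
    pass_share = 1 - rush_share
    return ypc * rush_share + ypa * pass_share


def simple_average_ypp(ypc, ypa):
    """The quick estimate: treats rushes and passes as a 50/50 split."""
    return (ypc + ypa) / 2


# Play counts and yards per attempt come from the comment above.
# WISCONSIN_YPC is back-computed (3.57 / 0.6857), so it is an assumption.
WISCONSIN_YPC = 5.21
WISCONSIN_YPA = 7.55

weighted = weighted_ypp(WISCONSIN_YPC, WISCONSIN_YPA, rush_plays=635, total_plays=926)
simple = simple_average_ypp(WISCONSIN_YPC, WISCONSIN_YPA)
print(f"weighted: {weighted:.2f}, simple average: {simple:.2f}")
```

For a run-heavy team like this one, the simple average comes out noticeably above the weighted figure, because it gives the higher passing rate half the weight when passes were under a third of the plays.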

You're correct, of course - the in-season tracking was going to study the variance in estimated versus actual by game, so what you mentioned works really well for that purpose if you have the information. One of the other shortcomings of such an estimate - as you'll note here - is that it assumes equal weight of play type, which is rarely the case in a game (the Big Ten, for instance, averaged about 60/40 in favor of rushing plays on offense last year).

LSAClassOf2000

August 28th, 2013 at 11:09 AM ^

I think there's an interesting study here really - it's just a theory, but I bet median YPP by game (probably the best way to study it) would correspond fairly well to first down percentage and maybe even various aspects of red zone offense. For example, it seems reasonable to me that if you're getting something in the neighborhood of 6 or better YPP in half your plays, you're likely getting first downs and/or seeing the end zone quite a bit.
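As a starting point for that kind of study, here is a minimal sketch that compares mean and median yards per play for a single game and counts how often a play reaches a threshold. It assumes you have play-by-play yardage, which the season-level data in the diary does not include. The `play_summary` name and the yardage list are invented for illustration.

```python
from statistics import mean, median


def play_summary(yards_by_play, threshold=6):
    """Mean, median, and share of plays gaining at least `threshold` yards."""
    hits = sum(1 for yards in yards_by_play if yards >= threshold)
    return {
        "mean": mean(yards_by_play),
        "median": median(yards_by_play),
        "share_at_or_above": hits / len(yards_by_play),
    }


# Hypothetical game: one long gain pulls the mean well above the median.
game = [2, 3, -1, 4, 45, 0, 5, 3]
print(play_summary(game))
```

In this made-up game a single 45-yard play lifts the mean far above the median, which is the sort of gap the median is meant to expose.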