I will start this diary by saying that, with the right data, you can get the actual mean yards per play on offense (if you really want to spend the time, it is right in the box score). Most people do little more than skim a box score, though, so it dawned on me that it might be possible to approximate yards per play from other information in the box score, and indeed, I believe it is.
So, what is the minimum information that you might need? Again, if you wanted to spend the time, you could glean everything you need right from the detailed stats, but what if you wanted a quick calculation?
I share this because it is something that I honestly had not thought of before when looking at relationships within the statistics, so if it is well-known (there are people that know far more about this than I), then I will apologize for the redundancy.
I did one of my standard data dumps for this one – I took 10 seasons of Division I offensive data by season and then began searching for a few things that correspond well to yards per play. I wanted to try and keep this simple, so I was hoping to see perhaps two other stats that might be good candidates. As it turns out, there are a couple – yards per carry and yards per pass attempt. Respectively, the R-values for each are 0.720 and 0.835. This is over a fairly large sample (n=1,189).
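For anyone who wants to run the same kind of check on their own data dump, here is a minimal sketch of the correlation calculation. It assumes the R-values above are ordinary Pearson correlation coefficients, and the numbers in the lists are made up purely for illustration (they are not the 1,189 team-seasons used here). The function name `pearson_r` is my own.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical team-seasons (invented values, NOT the real data set)
yards_per_carry = [3.8, 4.5, 5.1, 4.2, 3.5]
yards_per_attempt = [6.9, 7.4, 8.3, 6.1, 7.0]
yards_per_play = [5.2, 5.9, 6.6, 5.1, 5.0]

print(round(pearson_r(yards_per_carry, yards_per_play), 3))
print(round(pearson_r(yards_per_attempt, yards_per_play), 3))
```

With a real data set you would simply swap the three lists for the corresponding columns, one entry per team-season.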
The next challenge was simply charting these and seeing how well they did in fact trend with one another. Below are some examples from the Big Ten:
So, they track each other fairly well. Indeed, they track each other well enough that it dawned on me that yards per play could be approximated by the average of the two statistics mentioned above, so this became the next step. Could it be that offensive yards per play can be approximated by the average of yards per carry and yards per pass attempt? It would make total sense if this were true, of course, as these cover most offensive plays, right? It is important to note that the sum of yards per carry and yards per pass attempt correlates very well with yards per play – the R-value here is 0.930.
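To make the idea concrete, here is a small sketch comparing the shortcut to the real thing. The exact figure is total yards over total plays, which works out to a play-weighted average of yards per carry and yards per pass attempt; the shortcut is the plain average, which is only exact when a team runs and throws the same number of times. The function names and the stat line are hypothetical, chosen just for the example.

```python
def exact_ypp(rush_yards, rush_att, pass_yards, pass_att):
    """Actual yards per play: total yards divided by total plays."""
    return (rush_yards + pass_yards) / (rush_att + pass_att)

def estimate_ypp(yards_per_carry, yards_per_attempt):
    """Shortcut: simple (unweighted) average of the two per-play rates."""
    return (yards_per_carry + yards_per_attempt) / 2

# Hypothetical run-heavy team (invented numbers):
# 40 carries for 180 yards, 30 pass attempts for 240 yards
rush_yards, rush_att = 180, 40
pass_yards, pass_att = 240, 30

ypc = rush_yards / rush_att   # 4.5
ypa = pass_yards / pass_att   # 8.0

print(f"{exact_ypp(rush_yards, rush_att, pass_yards, pass_att):.2f}")  # 6.00
print(f"{estimate_ypp(ypc, ypa):.2f}")                                 # 6.25
```

In this made-up case the shortcut runs 0.25 high, because the team runs more often than it throws while gaining more per pass attempt than per carry. The further a team gets from a 50/50 split, the further the simple average can drift from the true number.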
It seems you could get pretty close just based on those two numbers, at least typically. The average difference between estimated and actual yards per play on offense for all of the Division I data turned out to be only 0.12 yards per play, with a mean error of just 1.94%, so even though the estimate discounts certain things that do happen in the course of a game, you can get a decent handle on how effectively the offense is advancing based on these numbers. It is important to note that the estimate tends to run high, since it does not necessarily account for things such as plays for zero yards, plays for negative yards (TFLs, sacks, etc.), the marked imbalance of some teams when it comes to rush vs. pass plays, as well as some other things.
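If you want to score the estimate the same way on your own numbers, something like the sketch below would do it. It assumes the "average difference" is a mean absolute difference and the "mean error" is a mean absolute percentage error, which is one reasonable reading of the figures above; it also reports the signed mean difference, which shows whether the estimate leans high or low. The function name and the sample values are hypothetical.

```python
def error_summary(estimates, actuals):
    """Return (mean absolute difference, mean absolute % error, mean signed difference)."""
    if len(estimates) != len(actuals) or not estimates:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(estimates)
    diffs = [est - act for est, act in zip(estimates, actuals)]
    mean_abs_diff = sum(abs(d) for d in diffs) / n
    mean_pct_error = sum(abs(d) / act * 100 for d, act in zip(diffs, actuals)) / n
    mean_signed_diff = sum(diffs) / n  # positive means the estimate runs high
    return mean_abs_diff, mean_pct_error, mean_signed_diff

# Hypothetical estimated vs. actual yards per play (invented, NOT the real data)
estimated = [6.25, 5.40, 5.95, 4.80]
actual = [6.00, 5.35, 5.70, 4.85]

mad, mape, bias = error_summary(estimated, actual)
print(f"mean abs diff: {mad:.3f}  mean % error: {mape:.2f}%  bias: {bias:+.3f}")
```

A positive bias would line up with the observation that the estimate tends to come in over the actual value.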
Some comparisons from the Big Ten data specifically:
Again, this is something I never really thought of looking for within the stats, but it is nice to see that you can get a reasonable approximation of offensive yards per play from two other numbers in the event you don't know the exact number of plays run on offense for a team (or just aren't interested in a lot of math or a minimal amount of searching).
In any case, this data comes from historic season statistics, so the next step here is to test it at the game level, which I plan to do this year. It would also be interesting to see if a similar approximation could be constructed from yards allowed on defense, but I shall save that for perhaps the next diary.
Have I found anything particularly profound? Probably not. It's a way to possibly estimate a number that you could simply calculate if you were studious and carefully studied the box score, which may be handy if you don't have all the information at your disposal. This is more about an interesting manner in which different statistics correlate and can be applied, but those relationships are always interesting to discover even if they are not necessarily profound or novel.
OBLIGATORY (in honor of a fate my own cats will suffer in a few short weeks):