The real relationship between defensive size and performance.

Submitted by ebv on

A while ago I did some analysis to further Bud Elliot's look at defensive size and defensive performance, which I found interesting, but incomplete. Unfortunately, I didn't have enough mgopoints at the time to post it, but now I do!  So here you go, if anyone still cares.

I had two problems with his analysis: First, he has an extreme selection bias, focusing on the top 20 teams, along with the ACC and SEC. Second, he compares size to defensive rank, but we have no reason to expect rank to track size. For example, if a number of teams have exactly the same size, the BIG implies GOOD hypothesis predicts that those teams would perform about the same, yet they would still be assigned different ranks. In other words, we expect a measure of defensive performance, not rank, to correlate with size.

Based entirely on the one linear modeling class I took last fall, I have attempted to do better. I have barely addressed the first problem, because getting more data is hard. However, I have added Notre Dame and USC to the data on his blog. I've addressed the second problem by replacing rank with the Defensive S&P+ score from Football Outsiders. I don't know exactly what it means, but I gather that larger numbers are better.

Tada:

[Scatterplot: total weight of the front 7 starters vs. Defensive S&P+, with the fitted line]


So yeah, not a clear relationship, and definitely not "Staggering!". And, given that the data is strongly biased towards the top 20 teams, I'd be willing to bet that this relationship would diminish further given the full set of NCAA starting players' weights.

There is some correlation, but is it significant? Turns out yes, but only barely. The F-statistic gives us a p-value of 0.014. However, the adjusted R-squared value for the fit is very low, at 0.1374. Adjusted R-squared tells us roughly how much of the overall variation seen in the plot is due to variation in the independent variable (the weight of the front 7). In this case, less than 14% of the variation is explained by our variable. Conclusion: At least 86% of a team's defensive performance is explained by factors other than the weight of the front 7 starters (and it's probably much more than 86%).


The coefficient on the independent variable (front 7 weight) is 0.188, meaning that for every pound your front 7 puts on, you gain about 0.2 Defensive S&P+ units. More is better . . . but it isn't clear what units that statistic is in . . . so there you have it.
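To put the slope in concrete terms, here is a small sketch that plugs weights into the fitted line. It assumes the rounded coefficients quoted in this post (slope 0.188, intercept -222.9 from the Methods note); the function names are made up for illustration, and the outputs are only as meaningful as a fit that explains under 14% of the variation.

```python
# Rounded coefficients as quoted in the post; not re-estimated here.
SLOPE = 0.188         # Defensive S&P+ units per pound of front 7 weight
INTERCEPT = -222.9    # from the Methods note


def predicted_sp_plus(front7_weight_lb: float) -> float:
    """Defensive S&P+ predicted by the fitted line for a given total weight."""
    return INTERCEPT + SLOPE * front7_weight_lb


def predicted_gain(extra_pounds: float) -> float:
    """Change in predicted Defensive S&P+ from adding weight to the front 7."""
    return SLOPE * extra_pounds


if __name__ == "__main__":
    # Michigan's listed 1720 lb, and the 108-pound preseason gain
    # mentioned in the caveat further down.
    print(round(predicted_sp_plus(1720), 2))
    print(round(predicted_gain(108), 2))
```

Taken at face value, the line says the reported 108-pound gain is worth about 20 S&P+ units, which is smaller than the 25-unit spread in the middle half of the residuals.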

The residuals are huge.  Leaving out the largest errors and only considering the middle 50% still covers a range of 25 S&P+ units - roughly equal to the range  from (14-0) Alabama to (9-5) Clemson.  Just look at all the teams with a total size of around 1800!  They cover virtually the entire range of defensive performances!! Defensive size is just not a good predictor of success in games.

One more caveat. While searching the web for starters' size stats, it became clear that the preseason is filled with overestimates, so the 108-pound gain over last year's line should probably be taken with a grain of salt.

Go Blue!
ebv

 



Methods. I gathered most data from Size Matters on Defense Part 1 or Football Outsiders. Notre Dame info came from 2009 Notre Dame depth chart. USC info came from this USC roster.

I fit the simplest of linear models, which is shown as the line on the scatterplot above (intercept -222.9).

Raw data:

Team             Front 7 weight (lb)   Defensive S&P+
Michigan         1720                  94.5
Alabama          1911                  149.4
Florida          1883                  148.3
Maryland         1864                  91.3
UNC              1855                  127.6
Georgia          1846                  100.1
LSU              1836                  126.8
Clemson          1807                  125.2
Ga Tech          1804                  90.7
S. Carolina      1798                  122.8
Kentucky         1796                  93.5
Duke             1795                  86
Ole Miss         1790                  116.5
Miami            1787                  113.1
Wake Forest      1786                  97.7
Miss St          1785                  107.6
Auburn           1781                  104
NC State         1777                  78.1
Arkansas         1776                  103.4
Boston College   1768                  112.6
Virginia         1757                  110.1
Vanderbilt       1754                  95.2
Va Tech          1736                  122
FSU              1734                  90.3
Tennessee        1726                  111.7
Ohio State       1784                  134
TCU              1781                  145
Oklahoma         1791                  147.6
Boise            1808                  133.6
Iowa             1821                  138.7
Ok. State        1811                  116.4
Penn State       1818                  127.9
Nebraska         1795                  143.2
Texas            1807                  131.2
Utah             1837                  121.2
Notre Dame       1819                  98.7
USC              1750                  103.8
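For anyone who wants to check the numbers, the fit can be re-run from the raw data above. This is a minimal sketch in plain Python, assuming ordinary least squares of Defensive S&P+ on front 7 weight over all 37 rows; the post does not say what software was originally used, and the names DATA and fit_line are illustrative only.

```python
import statistics

# (team, total front 7 weight in lb, Defensive S&P+), copied from the raw data.
DATA = [
    ("Michigan", 1720, 94.5), ("Alabama", 1911, 149.4), ("Florida", 1883, 148.3),
    ("Maryland", 1864, 91.3), ("UNC", 1855, 127.6), ("Georgia", 1846, 100.1),
    ("LSU", 1836, 126.8), ("Clemson", 1807, 125.2), ("Ga Tech", 1804, 90.7),
    ("S. Carolina", 1798, 122.8), ("Kentucky", 1796, 93.5), ("Duke", 1795, 86),
    ("Ole Miss", 1790, 116.5), ("Miami", 1787, 113.1), ("Wake Forest", 1786, 97.7),
    ("Miss St", 1785, 107.6), ("Auburn", 1781, 104), ("NC State", 1777, 78.1),
    ("Arkansas", 1776, 103.4), ("Boston College", 1768, 112.6),
    ("Virginia", 1757, 110.1), ("Vanderbilt", 1754, 95.2), ("Va Tech", 1736, 122),
    ("FSU", 1734, 90.3), ("Tennessee", 1726, 111.7), ("Ohio State", 1784, 134),
    ("TCU", 1781, 145), ("Oklahoma", 1791, 147.6), ("Boise", 1808, 133.6),
    ("Iowa", 1821, 138.7), ("Ok. State", 1811, 116.4), ("Penn State", 1818, 127.9),
    ("Nebraska", 1795, 143.2), ("Texas", 1807, 131.2), ("Utah", 1837, 121.2),
    ("Notre Dame", 1819, 98.7), ("USC", 1750, 103.8),
]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


weights = [w for _, w, _ in DATA]
scores = [s for _, _, s in DATA]
n = len(DATA)

slope, intercept = fit_line(weights, scores)

# Residuals: observed S&P+ minus the value predicted by the line.
residuals = [s - (intercept + slope * w) for w, s in zip(weights, scores)]

# R-squared, then the adjustment for one predictor: (n - 1) / (n - 2).
mean_score = sum(scores) / n
ss_res = sum(r ** 2 for r in residuals)
ss_tot = sum((s - mean_score) ** 2 for s in scores)
r2 = 1 - ss_res / ss_tot
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)

# Spread of the middle 50% of residuals (quartile method is Python's default).
q1, _, q3 = statistics.quantiles(residuals, n=4)

if __name__ == "__main__":
    print(f"slope={slope:.3f} intercept={intercept:.1f}")
    print(f"R2={r2:.4f} adjusted R2={adj_r2:.4f}")
    print(f"middle 50% of residuals spans {q3 - q1:.1f} S&P+ units")
```

The slope, intercept and adjusted R-squared should land close to the values quoted in the post. The p-value is left out because it needs an F (or t) distribution, which the standard library lacks; a routine such as scipy.stats.linregress would report one.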

 

Comments

joegeo

September 16th, 2010 at 12:21 AM ^

It'd be nice to know exactly what that is.  That way the slope of that line would have a little more meaning.  And we could have a better sense of the variance between defenses of teams in the top 20.  Anyways, this is not a surprising finding.  I would expect that weight means something.  Holding all else equal... more weight means more size and strength, which is a plus for defensive capability. 

The other 84% comes from so many other things. There is technique, speed, field sense, quickness, tenacity, etc. Let's also not forget that there is a huge variable that unarguably is a huge factor in the S&P value, and that is the quality of the secondary. Let's throw the defensive coach and scheming into the mix as well. It'd be interesting to do a multivariate analysis on all of the variables and see what sort of variability we can explain.

OSUstudentUofMfan

September 16th, 2010 at 3:20 AM ^

I think an additional point about defensive size, other than the fact that it has little explanatory power regarding defensive performance, is that just because two variables are correlated doesn't mean that one CAUSES the other. Rather, it simply means that they covary, but that covariance could be due to some unobserved 3rd variable. Statistically this kind of relationship is known as a "spurious relationship". For example, there is a strong correlation between ice cream sales and drowning, but that doesn't mean one causes the other. Indeed, both are related to a third variable...outdoor temperature. So even though it may look like two things are related, they may not be.

It could also be that this is just a statistical anomaly. It's a 1-year study...this could just be a fluke.

My question is why would we expect this relationship? Why would having a heavier front 7 lead to having a better defense? The only argument I can think of is that the defense would be harder to push off the ball...but that applies primarily to the d-line and is really a testament to strength, not necessarily girth.

Now, regarding the correlation between size of defense and defensive performance, the way to check against spuriousness is by running a multivariate regression with control variables. I think controlling for defensive scheme (3-4; 4-3; other), offensive rank of opponents, team's offensive rank, weight of d-line only, opponent's % run/pass, and TO margin would be a good place to start.

I know such an analysis is probably not feasible, but I think that we shouldn't just assume causation here, however weak or strong the relationship.

jshclhn

September 16th, 2010 at 7:04 AM ^

I would expect returning starts and recruiting hype to be much better predictors of defensive performance.  Defensive size has lost much of its luster from the heydays of Bo, Woody, and Bear Bryant.  There are a fair number of teams who are placing an increased emphasis these days on speed.

One lingering question I have is in the analysis above, are we more or less comparing weight of the front 7 to defensive performance as a whole?  It would appear that this is not controlling for performance of the secondary - though I admit I have no bright ideas to accomplish such a task.

I appreciate the analysis - a worthwhile venture for sure.

Not a Blue Fan

September 16th, 2010 at 8:56 AM ^

 

I can. I can also imagine such pettiness occurring there, as well. Then again, I've never been the provincial, narrow minded type...

At any rate, I'm glad that the OP included both the p-value and r^2 value. People commonly confuse "weak effect" with "statistically insignificant effect"; there is a subtle difference. Good on you for the analysis. I think the next step is to find statistics that better differentiate performance gains from the front 7 and performance gains from the back 4. S&P+ is a whole-defense metric, which potentially skews that hypothesis as tested. Nonetheless, there's a good likelihood that using a different (I hesitate to say "better") metric won't substantially change the degree of significance but will instead influence the strength of the effect.

Meeechigan Dan

September 16th, 2010 at 10:58 AM ^

You are either uninformed or being disingenuous. I am at Ohio State weekly (yes, pity me) and experience the high-mindedness of your fans first hand, see their contributions on their favorite sites and otherwise have learned the hard way over 22 years that a Buckeye fan is about as cerebral as a mollusk.

Meeechigan Dan

September 16th, 2010 at 1:40 PM ^

Nothing like policing stupidity with stupidity. Nice work there.

Since you didn't get it, my initial post was a compliment to the OP with a mild, playful insult at our two most hated rivals - something that by the way happens about 300 times a day on this very site, half of them as threads on mgoboard all by themselves (i.e., Sparty At It Again!!!...OSU Hating on Denard) - that some twit from OSU got all huffy about. So please, officer, go after the murderers and embezzlers on this site and leave commonplace rivalry talk alone.

caup

September 16th, 2010 at 10:33 AM ^

Martin (299)

Van Bergen (283)

 Banks/Sagesse (avg 287)

 Roh (251)

Mouton (240)

Ezeh (250)

T.Gordon (205) or Herron (220)

Total = 1815 to 1830 pounds

Michigan is big enough this year.  They were too small last year.

TheOracle6

September 16th, 2010 at 3:09 PM ^

We need to get bigger and stronger up front in the future to continue to get better on the defensive side of the ball. In the Big Ten, offensive linemen are probably some of the biggest in all of the land. We were way too small last year, and in the future we need to bulk up a bit.

MGrad

September 17th, 2010 at 1:40 AM ^

Don't want to stargaze, but since some of the other factors noted such as speed, agility, tenacity, etc., are generally accounted for in the overall Rivals/Scout rankings, it might be interesting to see a correlation of the total number of stars in the front 7 or total defensive starting units versus their actual measured performance as a team. Totals for the full 11 starters range from 55 stars down to 11, but most elite programs will be in the low-mid 40s, and the others in the high 20s through the 30s.

docwhoblocked

September 17th, 2010 at 9:20 AM ^

I feel we are arguing about whether mice like cheese. Football is an athletic endeavor that requires characteristics that every coach who recruits looks for in players they want for their teams: size, speed, strength, brains, endurance, leadership, coachability, and heart. Any team with more players that combine these characteristics is going to be hard to beat. The more interesting question that comes up is to what extent each of these interact (that goes to the multiple regression question mentioned above). Good post in my book. I do think that this blog is more thoughtful than most (or maybe we're all just football nerds).