# Is Our O-Line Really That Young? Third Time is the Charm!!!

Submitted by Gameboy on November 7th, 2013 at 2:27 AM

Just when I thought I was out, they pull me back in!

I don't know why I am such a glutton for punishment, but I am finding this topic interesting (and not just in a football sense, but statistically as well). I want to contribute one last time.

Many people on the threads have pointed out that just counting the class experience (basically age) is not enough, you need to count the actual games started as well.

I agree, games started should be part of this analysis.

AmazinBlue pointed out that Phil Steele has published a convenient list of all the games started by the players on the roster before the season began (http://www.philsteele.com/Blogs/2013/JUN13/DBJune08.html). Since the data is so handy, I figured I would go ahead and combine both sets of data and make a handy dandy XY Scatter chart. X-axis is the total combined number of Class Experience (i.e. Frosh=1, rs Frosh = 1.5) and Y-axis is the total number of previous games started.

As you can see from above, Michigan is in a better place than at least four teams (Auburn, UCLA, LSU, and Texas Tech), and surprisingly not that far away from Alabama.

Statistically, Michigan is within one standard deviation of the mean on Total Games Previously Started and just .16 away from one standard deviation for Total Class Experience. That, by definition, says Michigan's o-line is not an outlier.
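The "how many standard deviations from the mean" check can be sketched as below. This is only an illustration: the numbers are invented, and it uses the sample standard deviation (`statistics.stdev`), which is an assumption since the post does not say which version was used.

```python
from statistics import mean, stdev


def z_score(value, sample):
    """Number of sample standard deviations `value` sits from the sample mean."""
    return (value - mean(sample)) / stdev(sample)


# Hypothetical Total Class Experience values for seven lines, NOT the post's data.
totals = [12.0, 14.5, 15.0, 16.0, 17.5, 18.0, 19.0]

z = z_score(12.0, totals)
print(round(z, 2))  # negative means below the mean
```

A team with `abs(z) < 1` is within one standard deviation of the mean under this calculation.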

Again, the data says Michigan o-line is young, but not "outlier" young. There are other teams in top 25 who are just as inexperienced and a few who are even in a worse position. Blaming all of our woes on o-line experience does not paint the entire picture.

What if we remove the tackles and just look at the interior line?  I think that's the real crux of the issue.

Once you add the tackles in Michigan looks a lot better because we have a lot of experience there.

But why stop with interior line? Why don't we just get rid of C and RG while we are at it and just compare the LG, which we know is the youngest in the country!

Of course, you are just manipulating the data to get the result that you want. If you want to "prove" that Michigan is an outlier, there are many different mental gymnastics we can go through to find the data that best fit our preconceived notion. That is not what data analysis is really about though.

And as I have said before, if interior line is of utmost importance, we have a former guard playing tackle that we can easily slide back to interior.

Explain to me how Taylor Lewan and Schofield blocking their guys well helps Glasgow block the NT. You seem to refuse to accept the fact that the OL is only as good as the weakest link because it hurts the result that you're looking for (that Michigan isn't that young at OL *on average*)

They're only younger if you use the definition that you used (average), which IMO is flawed. I posted a re-working of your source data looking at it from a collective percentage of any one person blowing up a play towards the bottom. Give that a read and let me know what you think.
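The re-worked numbers themselves are not reproduced here, but one way a "chance that any one person blows up the play" figure could be set up is sketched below. Everything in it is an assumption for illustration: the per-lineman bust rates are invented, and the linemen are treated as busting independently of each other.

```python
from math import prod


def p_any_bust(bust_rates):
    """Probability that at least one lineman busts on a play,
    assuming each lineman busts independently of the others."""
    return 1.0 - prod(1.0 - p for p in bust_rates)


# Hypothetical per-play bust rates (T, G, C, G, T), NOT measured values.
line = [0.02, 0.15, 0.15, 0.15, 0.02]
print(round(p_any_bust(line), 3))
```

Unlike an average, this figure goes up whenever any single position gets worse, which is the "weakest link" idea in numeric form.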

As an aside, just a helpful suggestion here, but I think if you took a slightly less argumentative, defensive tone, the discussions in your diaries will likely be much more civil and constructive. Either way, thanks for the effort for collecting the data.

I think he's getting argumentative and defensive because for the third straight day, people are attacking his diary for fairly ridiculous reasons that he's spent days trying to address.

Hell, I didn't even write the thing and I'm getting argumentative and defensive about it. At least one person accused him of manipulating data because he calculated an average. That's not true in a literal analytic sense or a colloquial sense, that's just ridiculous.

The point is that tons of people are trying to get him to understand that if he uses the average, his data don't address the actual issue. Stats are complicated that way. Even taking a simple average can completely obscure the information that's actually relevant. In fact, simple means are almost never used in statistical analysis of data (in my field, anyway). If your model doesn't account for intra-sample variation, your model is worthless.

"Stats are complicated..."

Not that you'd know this, but I have a Ph.D. in psychology from the U.Texas, specializing in behavioral analytics and am a professional research analyst. I am more than familiar with statistics.

For three days running, the tone on these diaries has been (not exclusively, but to a strangely high degree) shitting on Gameboy for putting in the effort to compile some basic information and then do a pretty simple summary of it. He presents all the raw data, and yet very few (basically, only one that I've seen) posters bother to use it to formulate a thoughtful response. It's all "what are you, some kind of idiot? It's clear that X," where "X" can be things like "number of freshman starting" or "the interior is the most important part" or "offensive lines aren't blenders" or "you don't understand statistics."

I respect your analysis. I similarly have a doctoral degree from Temple University in Pha. and am also occupied as a professional researcher and clinician. I also have experience working with statistical analysis. In reading GameBoy's work I am impressed with his effort but I think both of us would agree that, professionally speaking, application of the mean in analyzing line play is a very limiting metric and not one that would enable meaningful correlation. Certainly, you wouldn't expect to publish using that method? The problem is that GameBoy is utilizing that mean based data, while not accounting for confounding variables, to establish correlation. By doing this I cannot trust his correlation as there are too many factors that could otherwise inform the outcome.

I advised GameBoy yesterday to consider the total starts per line as the metric. He unnecessarily overcomplicates his analysis with continued application of the subjective "Total Class Experience" which takes away from essentially comparing apples to apples (on-field experience). Using "Total Class Experience" he assumes that development year 1 to 1.5 and 2 to 2.5 etc. is static and equal program to program which it is clearly not on both counts. I'd throw that data out and focus solely on the total starts which keeps the comparison to on-field starts.

Beyond this what is entirely missing is any inclusion of the TE position (an integral part of the line). I also suggested this yesterday. Given that he is essentially ranking classes GameBoy would be better to utilize percentile (imho) looking strictly at total starts and including the TE position for each school.

GameBoy, percentile rank is calculated as PR% = ((L + 0.5 x S) / N) x 100, where L equals the number of values below the one being ranked, S equals the number of values equal to it, and N equals the total number of values.
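As a quick sketch, the usual percentile-rank definition, ((L + 0.5 x S) / N) x 100, takes only a few lines of Python. The function name and the list of total starts are hypothetical, not the Phil Steele numbers.

```python
def percentile_rank(value, values):
    """Percentile rank of `value` within `values`: 100 * (L + 0.5 * S) / N."""
    below = sum(1 for v in values if v < value)   # L: values strictly below
    same = sum(1 for v in values if v == value)   # S: values equal to `value`
    return 100.0 * (below + 0.5 * same) / len(values)  # N: total count


# Hypothetical total prior starts for eight lines, NOT real data.
total_starts = [10, 25, 40, 40, 55, 70, 90, 120]
print(percentile_rank(40, total_starts))  # L = 2, S = 2, N = 8 -> 37.5
```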

I am open to stats geeks telling me I am completely wrong in this lunch hour analysis. Keep up the good work GameBoy, I'm enjoying.

There's a lot to consider. I will say this, my reading of this is that it is a very simple calculation meant to address a very narrow question. I think part of my frustration is posters wanting to take this and run way too far with it, then finding it to be lacking. Could this be done better, more elaborately, and/or more statistically rigorously? Absolutely. But people seem to be mad that this post isn't something different that it's not trying to be. That said, I'll respond a bit to some of your points:

"Certainly you wouldn't expect to publish..." No, I wouldn't expect this to be published in an academic journal, if that's what you're saying. That's a pretty insane bar for a user-generated post on a sports blog. There is something to be said for understanding context and audience.

"The problem is that GameBoy is utilizing that mean based data, while not accounting for confounding variables, to establish correlation. By doing this I cannot trust his correlation as there are too many factors that could otherwise inform the outcome." What correlation are you talking about? All he's doing is showing how Michigan's average falls in the distribution of the top 25's averages. There is no correlation to not trust. This kind of gets at why this has been such a weirdly contentious thing; he's not saying anything other than "look how this average compares to these other averages." People are extrapolating out way too much from that.

"He unnecessarily overcomplicates his analysis with continued application of the subjective "Total Class Experience" which takes away from essentially comparing apples to apples (on-field experience)" I agree that experience is probably a more useful metric than class. But nothing is perfect. A RS senior getting his first start isn't the same thing as a RS freshman getting his first start. Either way, he provides (or, in the case of starting experience, links to) the raw data for other people to look at.

"Beyond this what is entirely missing is any inclusion of the TE position (an integral part of the line)." Not every team uses TEs the same way, not every formation has TEs, and not every TE has the same responsibility. Introducing TEs introduces a huge amount of team-to-team, and even play-to-play variance. As far as I know, every team will use TGCGT.

As far as percentiles, I agree with that. I'd much rather have a "Michigan sits at the Xth percentile" than a kind of wonky scatterplot-like chart.

EDIT: One thing I want to make clear, I really appreciate thoughtful discourse and discussion, so thank you, Clarkie. I'm not trying to attack you or anything, I think you make a lot of good points, even if I don't agree with them. I appreciate the time and thought that went into your response.

DOUBLE EDIT: I almost went to Temple to get my Ph.D. It was down to UT and TU.

Appreciate the analysis: Go Owls although their football team causes me infinitely more pain than ever being a Wolverine fan and season ticket holder. Michigan fans have it so easy vs. TU football fans, believe me. At least with Michigan I've enjoyed more success than the New Mexico Bowl (though it was glorious).

I think we could debate back and forth the correlation statement/inference based on OP's concluding assertions from version 2.0 and his easing on those today. Mean based analysis provides a discussion point, as you indicate, but nothing more. OP's strong statements in version 2.0 are probably why the board was harsh on his methodology.

I still want to see TE in analysis but I understand your reasons for non-inclusion. Prima facie analysis tells us TE position as part of line (holding red cape and waving it about) is problematic to rushing success. Of course, prima facie analysis tells us our Senior tackles play well and our young interior is getting killed.

I hope if GameBoy runs percentile he uses an online calculator/excel. I agree that this would offer greater clarity versus the overall sample.

I don't subscribe to the interior line being of utmost importance or being only as strong as the weakest link. In team sports, you can, to some degree, help the weakest link. I've always thought that it is the second weakest link that determines your fate, because you can't effectively help 2 guys. When the 3 weakest links are bunched together, problems.

The argument about Schofield seems off base (that is, if interior were of utmost importance, the coaches would have put him there), because the time for making that move was way before the interior struggles manifested themselves. Knowing what they know now, the coaches may well have moved him back inside to either spread out the weaknesses or concentrate the strength. Right now, we've got our weaknesses concentrated and our strengths diversified to the point that we can't leverage them.

Ok, this is getting annoying. You are refusing to listen, and instead keep trying to find new ways to keep pushing the same argument.

I think it is worth thinking about whether or not the interior line is indeed of utmost importance. Based on Yeoman's experience data for the BIG and Pac12 provided in the posts below, I've plotted experience vs YPC for tackles and the interior line separately. Sample size is still woefully small, and I probably won't go digging up O-line experience for the rest of CFB, but early returns suggest that interior line experience is indeed more meaningful than tackle experience when it comes to running the football well.

"Experience" is coded using 1 = FR, 1.5 = RS FR, 2 = SO, etc. In the graph above you can actually see a small decrease in YPC the older the tackles are (see what I mean about small sample size?). Anyway, there isn't any obvious evidence from this year's BIG or Pac12 suggesting tackle experience is strongly correlated with a good running game (r2 = 0.04 here).

When we look at the interior line, we see that YPC tends to increase as the guards and center get more experienced. R2 = 0.17 here.

Again, the sample size is quite small here, only 24 teams. But this brief little analysis would suggest that interior line experience is a key factor in producing an effective run game.
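For readers who want to repeat this on their own numbers, here is a minimal sketch of how an R2 value for a straight-line fit of YPC against experience could be computed. The function name and the data points are hypothetical, not the Big Ten / Pac-12 figures used above.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation, which equals R^2 for a
    one-variable least-squares line fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical (average interior experience, yards per carry) pairs, NOT real data.
interior_exp = [1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ypc = [3.4, 4.6, 3.9, 4.2, 5.1, 4.4]
print(round(r_squared(interior_exp, ypc), 2))
```

An R2 near 0 means experience explains almost none of the team-to-team spread in YPC; a value near 1 means it explains nearly all of it.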

Woomba, suppose that instead we remove the interior line and look at the tackles? Michigan should be steamrolling everyone?

See how silly that is?

It's a complicated issue that involves more than youth (which certainly _is_ important).

I think there's a relevant point buried here.

Try to construct a game plan that focuses on your strength, your two tackles.

It's hard. Borges tried something clever to do it with the tackle-over, but it doesn't work. (It might have, if we had better blocking TEs, but we don't have much experience at TE either.)

You'd like to be able to line the two guys up side by side and run behind them, and I remember there was some thought that Schofield would move back to guard for just that reason. But none of the interior starters have tackle bodies and I guess the play-the-five-best principle took precedence.

Gameboy's analysis is not perfect, but his opponents are in far worse shape regarding logic. Some problems:

1. If underclassmen are a weak link, then any team with even one should be just as bad, since it only takes one weak link to break a chain. But this is not the case.
2. The coaches can respond to the weak links by running plays off the edges behind the strong upperclassman links, so the offense should not be hurt by a weak center.
3. If the coaches have more experienced players for the interior (and they do) and are not playing them, then it's the coaches' fault.
4. But if the coaches believe the freshmen are better than the older interior linemen on the bench, then the coaches do not agree with those who feel that experience matters most.
5. There should be no teams in the bottom 25 who have an offensive line made of upperclassmen (I doubt this).
6. Mattison should not have been able to transform RR's terrible defense in only 7 months, since experience is more important than coaching.

Many celebrated GM as a defensive genius when he came and made great changes. You cannot have one reality for what you want and another reality for what you don't want to face. It is very important to respect reality.

My own view is that defenses simply do not respect the QB and RB and are able to bring full pressure. The problem isn't the line; it's the lack of backfield or downfield threats.

The issue is, though, that everybody's manipulating the data.  Once you decide average among all the Oline is the key (whether in class ranking or starts or both) you're manipulating the data as well.  Others have pointed out the problem with focusing on average of all the players on the Oline.

Michigan is an outlier in this sense: not a single team in the top 25 has two freshmen starting (like Michigan does) on their Oline. It doesn't mean that coaching isn't also an issue, but the data you've presented certainly doesn't prove, like some in your previous two threads seem to believe, that coaching is the primary reason for Michigan's offensive woes this season.

Edit: I meant this as a reply to Gameboy's post at 3:22.

We only have one freshman.  A redshirt freshman is not a freshman.  They are a sophomore with freshman eligibility.  They are a year older and have a year longer in the system full of practices, strength and conditioning training, and working on putting on weight and strength.  So we only have one freshman and one redshirt freshman.  It irks when we lump these two into the same title.  And, correct me if I'm wrong, but our one freshman was an early enrollee so he's been around longer than most freshmen.

1. Of the 130 players in the database there are only three true freshmen starting. Michigan has one of them.
2. Of the 130 players in the database there are only nine players in their first year of eligibility (redshirt freshmen and true freshmen). Michigan has two of them, no other school has more than one.

Reader71's written about what happens during a lineman's redshirt year. Your weight and strength comment is spot on; the practices aren't so useful because you're on the scout team running other people's offense and there's very little individual coaching. The big leap is during that first year off the redshirt when you're getting reps in the offense and attention from the position coach.

But everyone on here redshirted, too. So maybe my experience was an outlier.

I'm using the OP's own metric (which I believe has merit) in distinguishing RS freshmen (assigning them a value of 1.5) from sophomores (assigning them a value of 2).  The difference is in game experience: presumably, the sophomore was not redshirted because he played in too many games to be redshirted.

Also, there are only two true freshmen starting out of the 125 offensive linemen among the top 25 teams; I don't know if those other freshmen were early enrollees or not, but starting true freshmen is quite rare for this year's current group of top 25 teams.  (It will be interesting to do this again at the end of the season with the final rankings.)

First of all, he has explicitly said multiple times that all he's doing is providing information related to whether or not Michigan's line is somehow crazily young or inexperienced. He has explicitly stated that he is not making the argument that youth and/or inexperience isn't the problem or that it's all on the coaching. He simply is addressing the back and forth on the boards regarding "other teams have lines just as young and aren't terrible" v. "no teams have lines this young!". He has shown that while at the low end, Michigan isn't wildly out of step.

Second, "once you decide average among all the Oline is the key...you're manipulating the data as well" is a crazy accusation. Calculating the average is the simplest, most common, and easiest to understand metric for describing the central tendency of a group. And in what way is he saying it's "key?" All he's doing is describing characteristics of groups for comparisons across groups. He's not making any kind of argument about the average somehow being a particularly special metric, just the most obvious one.

It's been perfectly clear what his position was since the first of these went up. Or you could just check his posting history... Unless I'm dramatically misled by not seeing the comment in context? He clearly blames the coaching and dismisses the line youth.

These diaries sure seem to have been put together with those preconceived notions.

Edit- Sorry if this is all a sloppy incoherent mess. I'm trying to do about half a dozen things right now. The post in question (from the "snowflake:coaching" thread)...

"4 days 18 hours ago
Look, many teams would kill to have two NFL linemen tackles and few red shirt freshmen. Hell, MSU would swap in a sec. Did you see us get much pressure on MSU O line? At certain point you have to admit that coaching bears strong responsibility for this failure."

These diaries weren't born from this position and dedicated to backing it up?

Yeah, people keep pointing out they don't like averages, but don't have anything better. And this combines average with total game experience, giving an even more comprehensive picture.

For the record, I applaud your efforts. Average works fine for me, at least for your purposes.

I think your most promising post was the first, although I think your scale should be changed. True freshman should be 0. Redshirt freshman should be 1. Sophomore should be 1.5. Redshirt sophomore should be 2. And so on. Basically, playing time should outweigh class. Playing time tells you who is better, period.

Teams are attacking the interior of our line, not the tackles. So that is where age matters. I don't see DE running around the tackle to make a play but instead a DT bull rushing to run up the middle.

Next year line will be better even without Lewan and Schofield cause the middle will have more experience and the tackles will probably be RS Soph (not making a prediction though)

You've said this multiple times now. Are you just ignoring all the reasons people have given you why this can't happen and why it wouldn't help?

First of all, the persistence here is admirable, and you provide some very interesting info. I still have to contest that average does not paint a full picture here. As someone pointed out in the other thread, with an offensive line you don't melt down the five linemen and distribute their experience evenly among five spots. It only takes one bust to blow up a play, and we have three guys who had zero starts at the beginning of the year on the interior line. Two of them were freshmen. Shifting Schofield over might make a strong side, but then you would have glaring weaknesses elsewhere.

Do I think this is a full excuse? No. We should be able to block the UConns of the world, but I just disagree that average is the best way to go. As far as answering the question of how to better quantify the info, this might be the best way quantitatively. If we take a qualitative look at all the variables, however, we might see that this is a complex problem.

I think you're assuming he's trying to do something he is pretty explicitly not doing. "However, we might see that this is a complex problem." Gameboy has explicitly stated that it's a complex problem and he's just trying to provide some basic info about the common refrain about whether or not other successful teams have lines as young or younger than Michigan's. He has now shown that by both age and starting experience that Michigan is at the low end, but not crazy low.

Again though, just like with the averages, Lewan and Schofield make it seem like we have more experience across the board because their numbers are so high. The big contention that many people have is that we have several very weak links, and that's something which this type of data doesn't acknowledge. I appreciate the data, and think it's very interesting. Furthermore, the chart from the last post gave me some hope for next year when looking at UCLA's personnel.

Again, he's not trying to make some kind of comprehensive argument about the line's performance. Just that the argument going on on the board for weeks was whether or not other programs had offensive line units that were as young/inexperienced as Michigan's. Obviously where the youth and experience is matters. It sounds like you're extrapolating what he's saying in order to say he's wrong.

The only conclusion you should make is that overall age and experience of this offensive line is within the normal range of Top 25 teams.

Each of these teams is currently a 2-loss team.

There are losses, and there are losses.

Better to use FEI, rather than losses. From an FEI perspective another way to put it is that with the exception of UCLA, all the less experienced teams have far superior offensive FEI ratings, and even UCLA and Michigan's OFEI is almost equivalent.

Well, sure, but that's because you've cherrypicked the data by comparing Michigan, which is not a top-25 team, to the very tiny subset of young/inexperienced lines that happen to be playing for teams in the top 25.

If you want to draw the conclusion you just implied you'd need to compare offensive FEI's among all the schools with inexperienced lines, not just the ones that have happened to be successful.

Michigan was a top-25 team a week ago.

It's not an unfair comparison and it is the right ballpark.

There are a lot of ways to argue with this data, mostly by disputing the core premise of the line as an averaged unit, but saying that what has been done here is "cherrypicked" is not on target.

Every team in this list has been more successful than Michigan. That's why they're in the top 25.

It's not that the data itself is cherrypicked, it's that it's silly to compare a team outside the data set to a data set comprised of teams that, by the definition of the data set, are better teams, and then draw conclusions from the fact that the less successful team has a lower FEI.

The variables you compare have to be independent of the variables you used to construct the comparative set.

Simple: the offense Al wants to run is changing, and with youth and scheme change comes failure. We are used to Denard's stare-down passes and him running all game. Pass protection isn't easy to learn, and it's worse with revolving linemen. Teams are twisting their D-lines and sending LBs around the interior. That forces our tackles to go out wide, so there is a big gap between the guards and tackles to run through; if the guard goes wide, they go at the center. It's obvious. The only solution is to run at it with HBs, hoping you catch them leaving a big hole, which I saw a couple of times vs. MSU, but Fitz hesitated going in. We need to simplify: no more play action, more runs up the middle (between the tackles at least), and honestly... try the other HBs.

I really believe that what is hurting the offense/line most is our quarterback's inability to throw the ball. Teams are absolutely teeing off through the interior of the line. Whereas a pass-oriented quarterback has the ability to punish a defense for such aggressiveness, ours has proven he can't (at least not without a pile of turnovers). This also is the reason we can't run. This was the same story with Denard. Our young line is at a greater disadvantage because opposing defenses need only gameplan for our one-dimensional quarterback. Our option? Throw in a true freshman quarterback... eeks. I actually find myself appreciative that Hoke et al don't use their press conferences to say "our quarterback sucks." Gardner is no longer in a system he was brought in to run.

...the fix is not for Borges to simply "gameplan according to his personnel." Sure, he could do that, but opposing defenses don't have to oblige him. They know we can't throw the ball, no matter which kind of offense we run. And if we try to run what suits Gardner, we never transition out of the RR era, which is what most of us wanted.