Back in April, I wrote a diary called Blue Moon in my Eye in which I developed a regression model that could be used to develop a projected win total assuming that reasonable estimates had been used as inputs. At the time I thought that the team would be capable of winning at least seven, probably eight, and maybe even nine out of thirteen games this season. Since then, things have, uh, how do you say … changed. With the loss of Woolfolk, how do those numbers change?
The New Blue Moon
Before I get to that, there’s a good reason to update the model. In April, I mentioned that turnover margin is a meaningful factor in regard to outcomes, but I lacked enough data to break it out specifically and therefore decided to leave it as a lumped parameter; turnovers were doomed to fade into the ether that is Intercept. No more: the NCAA has finally included turnover data in its database and now there is enough data to mix into the model. The new model has an improved R-squared value (0.752, up from 0.675) using just three end-of-year factors: offensive yards per game, defensive yards per game, and total turnover margin. Last time I didn’t include the model because it was mine, my own, my … preciousss. That was incredibly lame and nerdy (both withholding the coefficients and referencing LOTR) but we’re talking stats here so no one should be surprised. Another reason for divulging the goods is that, now that there are four dimensions, a chart would be useless. Behold, the Blue Moon Model coefficients:
- I left the P-Values in there for those who know what that is. For the rest of you, it suffices to say what I said last time: that ish be money, yo.
- The second column (Normalized Coefficients) is there to demonstrate the relative importance of each factor; in short, defense is a skosh more influential than offense and turnover margin is a little over half as important as either.
- Using the model (first column) is simple: start with the intercept, then multiply each of the other coefficients by its interrogation (input) value and add everything together. Use it to gamble at your own peril. Until such a time as you can accurately predict end-of-year stats for these categories, the model is only good as a platform to base sophisticated guesses on.
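For anyone who wants to plug numbers in, here is a minimal sketch of that recipe in Python. The coefficient values are placeholders invented so the example runs; they are NOT the Blue Moon Model coefficients, so swap in the real ones from the first column of the coefficient table. The class and field names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LinearModel:
    """Intercept plus one coefficient per end-of-year factor."""
    intercept: float
    off_yds_coef: float   # coefficient on OffYds/G
    def_yds_coef: float   # coefficient on DefYds/G
    to_margin_coef: float  # coefficient on total turnover margin

    def predict(self, off_yds_per_game, def_yds_per_game, turnover_margin):
        # Start with the intercept, multiply each coefficient by its
        # input value, and add everything together.
        return (self.intercept
                + self.off_yds_coef * off_yds_per_game
                + self.def_yds_coef * def_yds_per_game
                + self.to_margin_coef * turnover_margin)


# PLACEHOLDER coefficients, made up for illustration only.
placeholder = LinearModel(intercept=1.0, off_yds_coef=0.01,
                          def_yds_coef=-0.01, to_margin_coef=0.1)

# Hypothetical season inputs: 400 yds/g gained, 350 yds/g allowed, +5 turnovers.
print(placeholder.predict(400.0, 350.0, 5))
```

The output comes out in whatever units the real coefficients were fit in, so the placeholder result above means nothing on its own.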
Probable influential factors that are embedded in the 25% of the variation not explained by the model (1 – R_squared) are:
- Return Teams effectiveness. Good return teams will establish good field position thus reducing OffYds/G.
- Coverage Teams effectiveness. Bad units will allow the other team to establish good field position thereby reducing DefYds/G.
- Field Goal Kicking effectiveness. If you get into field goal position and miss, you’ll have a lot of yards but nothing to show for them.
- Penalties. Penalty yardage will increase/decrease your production depending on if they’re called on you or them but doesn’t necessarily change how effective each team is at controlling field position.
- In round terms, factor influence on winning percentage breaks down to 30% Offense, 30% Defense, 15% Turnover Margin, and 25% Other Things.
Shine Down on the Big Ten (and its self-absorbed neighbor)
Below is 2009 Big Ten Data and Blue Moon Model expectation (BMM Expect).
|Team||OffYds/G||DefYds/G||TrnOvrMgn_Tot||2009 Wins||BMM Expect.||Delta Wins|
I am not a man. I began as one, but now I am becoming more than a man, as you will witness.
– Francis Dolarhyde, Red Dragon
After the Iowa game last year, my nervous system instantaneously rushed to the precipice of meltdown every time Denard Robinson stepped onto the field. Mixing equal parts of anxiety and exhilaration yields a volatile cocktail. There were times when I couldn’t stand up because I was so nervous; only once or twice but, regardless of frequency, that ain’t right. Trembling calves, bated breath, dilated pupils, thumping heart. Then, a money Chewbacca impression; happy or sad, the reaction was the same. I can’t have been the only one.
There was good reason for such a strong Pavlovian response. It seemed as though the outcome of a play with Robinson under center was the random result of the flip of a coin: tails, utter disaster; heads, spectacular success; on edge, just another play. Denard threw interceptions at a nauseating 13% rate on 31 passes. However, he also scored touchdowns 7% of the time on 100 total touches. Forcier only produced TDs a little over 3% of the time. Think about that for a second: Forcier had 399 touches last year and scored 13 TDs; at that same workload Denard, theoretically, could’ve had 28. Those numbers are ridiculous to quote because Denard touched the ball so infrequently last year, but it isn’t fair to quote his turnovers without also quoting his TDs.
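The touchdown arithmetic is easy to check. This is just a sketch that scales Denard's quoted 7% TD rate up to Forcier's 399 touches; the variable names are made up, and it assumes (unrealistically) that the rate would hold over four times the workload.

```python
forcier_touches = 399
forcier_tds = 13
denard_td_rate = 0.07  # quoted above, from only 100 total touches

# Forcier's actual touchdown rate per touch.
forcier_td_rate = forcier_tds / forcier_touches

# Denard's touchdowns if his small-sample rate held over Forcier's workload.
denard_projected_tds = denard_td_rate * forcier_touches

print(f"Forcier TD rate: {forcier_td_rate:.1%}")               # 3.3%
print(f"Denard at 399 touches: {denard_projected_tds:.0f} TDs")  # 28 TDs
```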
Anyway, eight months later we are faced with another batch of the cocktail, this time with a twist. A full offseason and a spring practice session have apparently yielded a thrilling prospect, Denard can throw. Maybe we can actually stomach the elixir and keep it down. That prospect sparks at least two questions. The first, how much could he have realistically improved? I mean, there’s improvement, and then there’s being good; the latter is not guaranteed. The second question is, who do you play, Tate or Denard? In this diary I hope to rigorously estimate an answer to the first question and hopelessly flail at the second.
Author note: This got long. Real long. My bad.
In this diary I build off of the foundation laid out in the White Rainbow entry over the weekend to size up the QBs around the Big Ten as well as other QBs of particular interest to Michigan. This is a list of player expectations going into the season based on the investigations I’ve conducted previously. This list is presented in order of worst to best expected year-end pass efficiency for each category.
Before I begin, I wanted to share a technique for ranking QBs that came to me after I published the White Rainbow diary. In that diary I talk about how I think passer rating does a decent enough job at determining large differences between players but a poor job at distinguishing subtle differences between them. Well, after playing with the numbers a bit, it looks like taking the average ranking for all four categories yields a method for differentiating players with similar QB ratings but vastly different subjective quality.
The following table* shows an excerpt of the NCAA QB rankings for the 2009 season.
|Rank||Player||Pos||Class||Rating||Cmp %||YPA||TD %||INT %|
|1||Tim Tebow, Florida||QB||SR||164.17||67.83||9.22||6.69||1.59|
|2||Kellen Moore, Boise St.||QB||SO||161.65||64.27||8.2||9.05||0.7|
|3||Jimmy Clausen, Notre Dame||QB||JR||161.42||68||8.76||6.59||0.94|
|22||Scott Tolzien, Wisconsin||QB||JR||142.99||64.33||8.25||4.88||3.35|
|23||Daryll Clark, Penn St.||QB||SR||142.64||60.89||7.88||6.3||2.62|
|25||Kirk Cousins, Michigan St.||QB||SO||142.63||60.37||8.17||5.79||2.74|
|48||Richard Stanzi, Iowa||QB||JR||131.62||56.25||7.95||5.59||4.93|
|51||Joey Elliott, Purdue||QB||SR||131.13||61.66||6.99||5.08||3|
|57||Juice Williams, Illinois||QB||SR||129.38||57.71||7.19||5.29||3.08|
|58||Mike Kafka, Northwestern||QB||SR||129.25||64.84||6.97||3.25||2.44|
|59||Terrelle Pryor, Ohio St.||QB||SO||128.91||56.61||7.1||6.1||3.73|
|64||Tate Forcier, Michigan||QB||FR||128.15||58.72||7.3||4.63||3.56|
|67||Ben Chappell, Indiana||QB||JR||126.44||62.62||6.87||3.97||3.5|
|98||Adam Weber, Minnesota||QB||JR||114.66||52.04||7.04||3.54||4.09|
If you take that same data and rank each player for each of the four categories then average that ranking, you end up with what I’m calling the QB Prism Score yielding the following final ranking.
QB Prism Score
|Player||NCAA Rank||Prism Score||Prism Rank|
|Jimmy Clausen, Notre Dame||3||6.75||1|
|Kellen Moore, Boise St.||2||8.5||2|
|Tim Tebow, Florida||1||8.75||3|
|Daryll Clark, Penn St.||23||31.25||21|
|Kirk Cousins, Michigan St.||25||33.5||24|
|Scott Tolzien, Wisconsin||22||37.75||34|
|Joey Elliott, Purdue||51||50.25||44|
|Mike Kafka, Northwestern||58||52.75||48|
|Juice Williams, Illinois||57||58||57|
|Richard Stanzi, Iowa||48||60||62|
|Terrelle Pryor, Ohio St.||59||61.75||64|
|Ben Chappell, Indiana||67||62.5||67|
|Tate Forcier, Michigan||64||62.75||68|
|Adam Weber, Minnesota||98||85||100|
Notice how this technique improves differentiation between similar QBs using the exact same data and very simple math (rank and average). At the very top we see that all three guys had awesome numbers, but Tebow had the worst INT % of the three. The method distinguished Clausen as a higher overall performer than Moore and Tebow.
A similar thing occurs when looking at Big Ten QBs. The traditional passer rating ranks Daryll Clark, Kirk Cousins, and Scott Tolzien as virtually identical passers; Prism score separates them substantially.
Now look at the logjam between Ricky Stanzi and Ben Chappell; only 5 passer rating points separate 7 players. The Prism Score breaks these into two main groups: Elliott-Kafka and Williams-Stanzi-Pryor-Chappell-Forcier.
Some bunching still exists but the bunches are smaller.
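The rank-and-average step can be sketched in a few lines of Python. One loud caveat: the scores in the table above come from ranking each QB against the whole NCAA list, while this sketch only ranks within the 14-player excerpt, so its numbers will not match the table (Clausen and Tebow even tie here). The function and variable names are assumptions for illustration.

```python
# (Cmp %, YPA, TD %, INT %) copied from the 2009 excerpt above.
QBS = {
    "Tim Tebow": (67.83, 9.22, 6.69, 1.59),
    "Kellen Moore": (64.27, 8.20, 9.05, 0.70),
    "Jimmy Clausen": (68.00, 8.76, 6.59, 0.94),
    "Scott Tolzien": (64.33, 8.25, 4.88, 3.35),
    "Daryll Clark": (60.89, 7.88, 6.30, 2.62),
    "Kirk Cousins": (60.37, 8.17, 5.79, 2.74),
    "Richard Stanzi": (56.25, 7.95, 5.59, 4.93),
    "Joey Elliott": (61.66, 6.99, 5.08, 3.00),
    "Juice Williams": (57.71, 7.19, 5.29, 3.08),
    "Mike Kafka": (64.84, 6.97, 3.25, 2.44),
    "Terrelle Pryor": (56.61, 7.10, 6.10, 3.73),
    "Tate Forcier": (58.72, 7.30, 4.63, 3.56),
    "Ben Chappell": (62.62, 6.87, 3.97, 3.50),
    "Adam Weber": (52.04, 7.04, 3.54, 4.09),
}

# Higher is better for every category except INT %.
HIGHER_IS_BETTER = (True, True, True, False)


def prism_scores(qbs):
    """Rank every QB in each category (1 = best), then average the ranks."""
    rank_totals = {name: 0 for name in qbs}
    for col, higher in enumerate(HIGHER_IS_BETTER):
        ordered = sorted(qbs, key=lambda name: qbs[name][col], reverse=higher)
        for rank, name in enumerate(ordered, start=1):
            rank_totals[name] += rank
    return {name: total / len(HIGHER_IS_BETTER)
            for name, total in rank_totals.items()}


scores = prism_scores(QBS)
# Lower score = better, so print best to worst.
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{score:5.2f}  {name}")
```

This simple version gives no special treatment to ties within a category; none occur in this excerpt, but a full-season list would want a tie rule.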
Anyway, I thought that might be a useful and easy technique for people who are so inclined to apply. On with the purpose of this diary.
Adam was ranked 98 - of - 100 in NCAA passer rating for 2009. I would actually rank him 100 - of - 100. His completion percentage, yards per attempt, and interception rate were terrible, especially for a redshirt Junior in his third year as starter. His touchdown rate ... I'll be good and listen to Thumper's dad ("if you don't have anything nice to say..."). Eric Decker did get hurt, but that's not the reason Weber wasn't throwing TDs. Of his 13 TDs last year, 5 were to Decker, 5 were against Michigan State, and the other 3 were flukes. OK, so that last part was mean. Recall that MSU's secondary was worse than Michigan's in 2009.
Going into 2009, Minnesota had 10 returning offensive starters available to them; that was a mature squad. My previous QB-centric work has shown that by year 3 as starter, QBs are what they are. Adam Weber is bad at passing. Phil Steele thinks Weber will end his career on a high note, I don't. Alas, I think Thumper's dad is disappointed in me.
Side Note: Minnesota's D only has 2 players returning and Phil Steele points out that they play USC, Penn St., Ohio St., and Iowa at home so they'll have to go on the road in order to try and win. That's just mean. Short Minnesota; with leverage. When do we play them again? Oh ... sweet.
I make no bones about it; I think Stanzi is hugely overrated. He throws a worse ball than Brady Quinn and is inaccurate to boot. I tried to find a picture of him throwing a pick but couldn’t confirm the result of the pass shown. There’s a 5% chance that what you’re looking at is a pick (no joke) so I’m assuming that it is until proven otherwise. Relax, I’m mostly kidding.
Anyway, last year he actually had the highest rating of the next four QBs in this list, which were in a tight cluster, but his high YPA and solid TD rate obscured the fact that his completion percentage and INT rate were the worst of the bunch. The two categories he was good in, YPA and TD rate, are highly influenced by things outside of the QB himself (receiving corps, O-Line, run game, opposing defense). The two things he has direct influence over, Cmp % and INT rate, he was really bad in.
Coming into his third year as starter he should improve somewhat and has McNutt and DJK returning but loses Bulaga and Moeaki. Net, net, I think Iowa sees modest improvement in their passing efficiency. The problem is, they need more than that.
Now excuse me, I’m about to get kicked out of the country by the Americanzis.
Apparently, I hate senior quarterbacks with oodles of experience. That has nothing to do with the fact that Michigan doesn’t have any, I swear.
In all honesty, I think Chappell is a fine QB and a great find for Indiana. He has progressed nicely so far and should take another step forward this year if Indiana’s O-Line can absorb the losses of two 4-year starters. Maybe, maybe not.
As far as the quality of his passer rating he is the opposite of Stanzi; he did the things he could control (Cmp %, INT %) well, but didn’t do so well in the things he needed help in (YPA, TD %). The latter two categories should take a step forward as IU’s top 5 WRs return this year. Again, if the O-line holds up IU should be a pretty saucy passing team.
All in all, I expect there to be a significant gap between Stanzi and Chappell. He’s slotted here because the support he has around him isn’t as good as that of the others ahead of him.
Originally I was going to do a spotlight diary on Forcier similar to one I’m working on for Denard Robinson but, since Forcier is more of a known quantity in terms of style and actual production, evaluating his prospects is much more straightforward.
As most Michigan fans know, Forcier was a Godsend for Rich Rodriguez in 2009. From his pedigree to his tutelage by Marv Marinovich to his early enrollment, Forcier’s freshman performance didn’t exactly come without signal. Based on my previous work on QB maturation, Forcier’s freshman year was solidly that of an average true freshman 5 star recruit which is well above that of the typical first year starter. Considering that 5 star QB recruits almost always go to very good, if not elite, football programs and are therefore surrounded by elite and mostly mature talent (see Chad Henne), Forcier proved that all the fanfare that accompanied his arrival in Ann Arbor wasn’t just optimistic hype. And he sustained a meaningful injury to his throwing shoulder early in the season.
What’s more remarkable to me is that as polished as he was, he still showed room for tremendous growth. In terms of performance metrics Forcier was ahead of schedule in completion percentage and YPA, and he met expectations for TD rate and INT rate. The four picks he threw in the Ohio State game took his INT rate from 2.5% to 3.5%. If he had maintained the INT rate he had going into the game (resulting in 1 INT and 3 INC instead of 4 INTs), his final passer rating would have been 2 points higher with all else being equal.
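The two-point figure follows from how the standard NCAA passing efficiency formula treats picks: each interception costs 200 points divided by attempts, which works out to 2 rating points per percentage point of INT rate. Turning an interception into an incompletion leaves attempts, completions, yards, and TDs alone, so only that one term moves. A quick sketch, using the rounded rates quoted above (the function name is made up):

```python
# Rating points lost per percentage point of INT rate in the NCAA formula.
INT_WEIGHT = 2.0


def rating_change_from_int_rate(old_int_pct, new_int_pct):
    """Change in passer rating when only the INT rate changes."""
    return INT_WEIGHT * (old_int_pct - new_int_pct)


# Forcier's INT rate with the Ohio State picks (3.5%) vs. the rate he
# carried into that game (2.5%).
delta = rating_change_from_int_rate(old_int_pct=3.5, new_int_pct=2.5)
print(f"{delta:+.1f} rating points")  # +2.0 rating points
```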
Those keeping track will note that, true to his hype, Forcier was ahead of schedule in terms of the self-controlled parameters (Comp. %, INT %) and solid in the team help parameters (YPA, TD %). Michigan has a stable of high potential receivers with extensive starting experience and development time and also has the best offensive line it has had since 2007. With reasonable personal development and the supporting cast he has around him, Forcier has every opportunity to be all Big Ten this year. Michigan might not have a senior QB in the strict sense, but it has one in the practical sense.
As insurance against Michigan slappy-ism, I’m placing him further down the list than I think he’ll end up.
Terrelle Pryor, JR, Ohio State
Say what you want about TP, no one would think twice about him if he weren’t a baller. As a true freshman, he had a high passer rating that met the long-term quality thresholds established in the White Rainbow diary. Last year was a step back statistically for him as he regressed in completion percentage, YPA, and INT rate; his TD rate remained solid though. The regression makes some sense between the expansion of his responsibilities in OSU’s offense and the breaking in of new contributors at the skill positions.
Another difference between his freshman and sophomore years is that Pryor ran less often in 2009. This is a bad idea; if Pryor is allowed to flash his running ability explicitly, opposing defenses must respect the threat, which would leave easy opportunities in the passing game. Josh Nesbitt is the uber-example of this effect. Nesbitt rarely throws and is inaccurate (46.3%) when he does, but when he connects, the result is a big play. Nesbitt’s YPA in 2009 was a staggering 10.5 accompanied by a good TD rate (6.2%) and solid INT rate (3.1%). I’m pretty sure Nesbitt’s YPA was the highest (by over a yard!) in the FBS, and certainly the highest in BCS conferences. He’s not a great passer, but that doesn’t stop him from doing extensive damage when he throws.
This season Ohio State has everybody coming back except for the tight end. Pryor has already shown how much damage he can do when he puts it all together; if you don’t know, ask Oregon. This year he will be better and will have high quality support around him. The result is likely to be an emetic wave of OSU/Pryor hype. Chin up though, chances are that this is his last year on campus.
Cousins’ superiority over Keith Nichol in 2009 was apparent to everyone except Mark Dantonio. Maybe Dantonio had a problem with the fact that it looks like Cousins likes to rub his butt up against his lineman’s during a wind up. Personally, I don’t think there’s anything wrong with that. Even though it was his first year as starter, Cousins put up the numbers of a seasoned veteran. His YPA of 8.1 was above the threshold of good QB play, and his Comp. %, TD rate, and INT rate were just a hair’s width away from the thresholds; that’s close enough.
Coming into 2010, Cousins will lose his top receiver in Blair White along with a bunch of other knuckleheads who ain’t going out like dat, son. But, because of Dantonio’s preference for only disciplining players that are either expendable or not worth the heat of benevolence, Cousins has some good to very good WRs returning in B.J. Cunningham and Mark Dell, along with non-knucklehead Keshawn Martin.
The challenge for Cousins will be finding enough time to hook up with his receivers. MSU loses 3 starters from a so-so offensive line in 2009. I don’t expect that to slow Cousins down too much though.
Tolzien and Cousins were neck-and-neck for best returning passers and based on 2009 numbers alone, Cousins actually wins. Both had the same-ish YPA and Tolzien had a better completion percentage, but Cousins had a better TD rate and lower INT rate along with an adequate completion percentage. However, when you look at the team Tolzien has coming with him, his prospects for 2010 look higher and that gives him the nod in my book.
The Badgers have a monster offensive line coming back, all of whom are either returning starters or have substantial starting experience. Running back John Clay was the Offensive Player of the Year in 2009. Tolzien has his top wide receiver back, too. So yeah, there’s absolutely no reason why Wisconsin’s passing attack shouldn’t be very, very good this year.
I don’t have much else to say except that I think Bucky Badger is dumb, and that makes me feel better. What? That’s totally germane to the topic of this diary.
Scheelhaase was a rivals.com 4 star recruit who also received offers from Iowa, Nebraska, Oklahoma, and a bunch of other solid programs according to Rivals. That’s a solid endorsement, but his supporting cast is likely to be a drag. The Illini need to establish all new receivers this season and also need to replace two multi-year starters on the O-line. Beyond that, Illinois is bringing in a new offensive coordinator this year, so there are strong headwinds against Scheelhaase.
Despite grim sounding early returns, 4-star and one time Michigan recruit Kevin Newsome should be the guy for Penn State this year. He has been around for a year and therefore has time invested in a collegiate strength and conditioning program, play book study, and technique development. Penn State has to find a new combination at O-line, but there’s plenty of talent available to make that happen. They have their top 2 WRs returning as well as a really good running back in Evan Royster to take the heat off.
Dan Persa, RS-JR, Northwestern
Persa was a rivals.com 2-star recruit in 2007 whose biggest offer besides NW came from West Virginia. So, theoretically, Rich Rodriguez thought this guy had some skillz. Northwestern has a solid supporting cast around him with all 5 starters on the O-line returning this year, as well as two WRs who each caught more than 40 balls last year and the team’s leading RB (Kafka was the team’s rushing leader in 2009). I wouldn’t be surprised if he ended up having a better year than all of these guys.
Robert Marve, RS-JR, Purdue
Marve originally committed to Miami (FL) as a 4 star recruit in the class of 2007. He also received offers from Purdue, Michigan State, and Alabama. He started as a RS-SO at Miami but split time heavily with Jacory Harris. He was suspended for the first game in 2008 for disciplinary reasons related to his arrest for a misdemeanor mischief charge during his redshirt year. He was also suspended for the bowl game for missing class. Oops, I got carried away with the Google-stalking. What can I say, I was fascinated. Besides, I think it’s worth wondering how this kid handles adversity. This is basically the anti-Tom Brady story.
He transferred because of an apparent falling out with HC Randy Shannon, but Jacory Harris flat out beat him head to head that year. Marve was pretty inaccurate (54.5%) and had an extremely high INT rate (6.1%) and a low YPA (6.0). Marve is coming off a torn ACL suffered just before fall camp last year. His knee is probably fine now, but he had to have missed a lot of practice time rehabbing his knee rather than working on his accuracy and timing with the receivers.
If he’s grown up since his Miami days Marve has a shot at being the best new QB in the Big Ten save for a certain someone. Purdue returns the Big Ten’s most prolific WR of 2009, Keith Smith, but is pretty thin at OL. The running game took a big hit when Ralph Bolden tore his ACL this spring so a lot of pressure will be put on Marve to produce.
Other QBs of Interest
Kyle Havens, 5th Yr, UMass
I wasn’t going to write anything about UMass because they’re an FCS team that was sub-.500 last year and that lost its top rusher and receiver from that team, but then I saw this video from the spring and figured people might get a kick out of it. Madre. Same team, dude.
Anyway, I thought it’d be rude to link that video and not do a write up so here it goes. Havens was actually a rivals.com 3-star JUCO recruit in 2009. He played in and started 10 games last year but his prism stats were terrible: 55.3% Completion percentage, 7.2 YPA (OK, I guess), 3.4% TD Rate, 5.7% INT Rate. That’s against FCS competition.
BGSU had a crazy prolific passing attack last year. They’re all gone, only 4 offensive starters are back this year. As far as QB there are four guys vying for the gig: Matt Schilz, RS-FR, 3 star; Aaron Pankratz, RS-FR, 2 star; Kellen Pagel, RS-FR, not ranked; Caleb Watkins, FR, 2 star.
Pankratz is the only guy to have thrown the ball in a college game (13 attempts), but Schilz was purported to have the inside edge in the spring. Watkins had a bunch of offers from MAC teams, but, his Rivals profile also lists Michigan, Ohio State, Tennessee, Illinois, Indiana, and Cincinnati, for whatever that’s worth.
Frazer is another senior QB with a lot of experience that I’m turning my nose up at. It’s not me, it’s him. I promise. This will be his 3rd year with meaningful playing time as a starter. His passer rating for the last two years has been dreadful: 103 in 2008, 116 in 2009. Running those numbers through the prism shows that he was indeed a bad passer. Last year was an improvement over 2008 as he improved his completion percentage, YPA and TD rate by normal amounts while improving his INT rate by a large amount. Unfortunately, all of those numbers were bad save for TD rate, which was slightly below average.
This year Frazer has a solid to good RB and an experienced offensive line returning but loses his top 2 WRs from last year. Frazer should be able to improve his Cmp % and INT rate of his own accord, but he has a significant way to go in order to reach high quality veteran numbers and this is his last year to do it. As for YPA and TD rate, my opinion is that you need help from the rest of the offense to get good numbers there, and while the O-line and running game should be solid, having 2 new starting WRs to break in will cut into the progression there.
The Notre Dame offense has a lot going on this off season: a new head coach, a new offensive scheme, and a new starting QB. Sounds like a tough transition, huh? I actually don’t think so. I don’t see a Pro Style offense and a Passing Spread as being all that different. Both systems need guys who can pass block, throw, and catch; Notre Dame has all of that, in spades. Crist is a new starter but he was a 5 star recruit who has been on campus working out and improving his technique for two years. Sure, he has to learn a new playbook, but Brian Kelly’s system is notoriously simple, making it easier for inexperienced players to step right in and be effective. Crist will have a whole off season to learn the system. Sure, he won’t be flawless out of the gate, but I can’t imagine that he’ll be a liability either.
As for the team around him, Crist and Kelly inherit Michael Floyd and Kyle Rudolph who have already established themselves as elite players at their positions. The O-line has three 4-star recruits returning as starters and the other two spots are likely to be filled with 4-star recruits as well. The retained talent fits the new system like a tailor made glove lined with memory foam. Are we really to believe that Brian Kelly wouldn’t have recruited these players himself?
Some people look at Notre Dame in 2010 and see a situation similar to what Rich Rodriguez walked into at Michigan in 2008. They are wrong. Oh, so wrong.
And there it is. I’d love to hear other people’s thoughts on these players, how big a threat they pose to Michigan’s secondary, and anything I may have overlooked/understated in my assessments. Any feedback on the Prism Score would be helpful as well.
*Can someone please explain to me how to format tables so they show up with Maize and Blue row and column headers? I’ve tried many things, I’ve failed many times.
/desperate plea for assistance.
In a previous diary I used passer rating as a well known and objective grade for the relative value of a quarterback’s stat line in order to determine if there were any trends in player development, and if so, how strong those trends were. However, in the diary I noted that passer rating is not without its issues and pointed those interested in finding out toward other people’s work and went on with it.
Most Declarations of Grievance attack the adequacy of the formula used, saying that the scale is unintuitive, some of the components are not orthogonal (total yards, completion %), some components are irrelevant (touchdowns), and other components are omitted (rushing stats and sacks). These are valid arguments but the alternatives presented are unfamiliar, come with their own set of complexities, and are often difficult for fans to calculate on their own.
In this diary I don’t want to generate a new formula, that has been done. Rather, I want to accept the current formula for what it is and develop new benchmarks for what it shows us in modern context. The two problems I have with it are that it’s clearly outdated and that it obliterates information.
Problem 1: It’s Old and Busted
The current NCAA passing efficiency formula (shown below) was developed in 1979 and was generated using passing data since the beginning of the modern two platoon era which began in 1965. At the time, the rating was calibrated to yield a rating of 100 for the average passer. If a QB had average values for all 5 components (attempts, completions, total yards, touchdowns, interceptions) his passer rating would have been 100.
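For reference, the standard NCAA passing efficiency formula fits in a short function. The weights (8.4, 330, 100, 200) are the published NCAA ones; the function name and the stat line fed to it are hypothetical, made up just to show the arithmetic.

```python
def ncaa_passer_rating(attempts, completions, yards, touchdowns, interceptions):
    """NCAA passing efficiency from the five raw season totals."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return (8.4 * yards
            + 330 * touchdowns
            + 100 * completions
            - 200 * interceptions) / attempts


# HYPOTHETICAL stat line (not a real player): 180-of-300, 2250 yds, 15 TD, 9 INT.
example = ncaa_passer_rating(attempts=300, completions=180, yards=2250,
                             touchdowns=15, interceptions=9)
print(f"{example:.1f}")  # 133.5
```

Dividing everything through by attempts gives the per-attempt form: completion % + 8.4 × YPA + 3.3 × TD % − 2 × INT %.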
Here’s the rub: major rule changes have been implemented in favor of the passing game since two-platoon football started, and so the majority of the data set used to calibrate the formula was skewed toward weak passing numbers by today’s standards. The major rule changes are:
- 1976: Offensive blocking changed to permit half extension of arms to assist pass blocking.
- 1980: Retreat blocking added with full arm extension to assist pass blocking, and illegal use of hands reduced to 5 yd. penalty.
- 1985: Retreat block deleted and open hands and extended arms permitted anywhere on field.
And these aren’t even all of them. Behold, further evidence of Anthony Carter’s ridiculousness: he thrived in an era where the rules were stacked against the pass. Before these rules were implemented, offensive linemen could not really be aggressive in pass blocking. They were forced to be either turnstiles (before 1976) or turnstiles with their elbows sticking out. Before ’85, linemen could not have their palms facing the opponent. Back in the day illegal use of hands and holding penalties were 15 yards assessed from the spot of the foul. Cloud of dust football was so popular back then for a reason. For a modern taste of what this might have looked like check out Michigan v. Notre Dame 2007. The mismatch between Michigan’s D-Line and Notre Dame’s O-Line in that game was obscene. Despite that, Jimmy Clausen’s freshman-year performance at Notre Dame, on that terrible offense, was slightly above average by 1979 standards.
Due to the rules changes, passing stats have inflated but the formula has not adapted along with them. That is not to say that it has no value, just that our understanding of that value is outdated.
Problem 2: It’s A White Rainbow
Imagine if a rainbow were a brilliant white arc in the sky; still interesting, but less so than what we usually see. If the water droplets in the air cannot produce a prismatic effect, they just diffract the light and we can’t see the individual colors. BTW, white rainbows are real.
Getting back to football, the passer rating formula looks at yards per attempt, completion %, touchdown rate, and interception rate, then assigns weights to those values and blends them together to provide football fans a single number to use to compare QBs against themselves and each other. All in all that is a useful tool, but the blending process obliterates some very interesting information. Passer rating is a great coarse filter but it’s inadequate for picking up subtle differences. Not all 130’s are created equal.
In order to address the first problem, it is necessary to decompose the formula into its base components to see what the new definition of average is for each category. For college players, I think it is also useful to split the data by recruiting ranking (Rivals.com Star Rating) and Experience (Years as Starter) to really understand how well a kid is performing relative to history.
For this project I’ve taken only players who played on teams in BCS conferences and who were rated as Rivals.com 3-star recruits or higher. The data plotted is the average for all players within a given category (ex: all 3-star players in their 1st year as Starter is a group, and so on).
One thing I should note up front is that there are fewer and fewer players in each category as the number of years as starter increases; only about 10% of QB recruits in each group start for four years. For the 3-star and 4-star groups this isn’t a huge problem because they survive the attrition fairly well and still have 7 or 8 players to use for averaging purposes; not great by any means but workable. The 5 star group ends up with 2 players in my data set that have started 4 years (Chad Henne and Trent Edwards). A sample size of 2 is not workable and has therefore been omitted.
For completion percentage we see that the average QB gradually improves his accuracy and approaches 61% in the long term. The higher a player is rated coming out of high school, the sooner he is likely to achieve steady state.
With yards per attempt we see a more subtle upward trend and also more separation between rating groups. I think this separation makes some sense. For one, Rivals explicitly accounts for the player’s physical assets; it stands to reason that 5-star players are more likely to develop NFL-level arm strength and will therefore be able to push the ball up field without sacrificing accuracy significantly. Another potential factor is that a high level QB recruit is likely to attract high level WR recruits that help improve YPA significantly. I think the long term standard that should be applied for this category is 7.6 yards per attempt.
Touchdown rate is a factor that many people argue against including in the passer rating formula. The argument goes that a TD is as much a result of the WR’s ability as it is the QB’s. The Roundtree hawk down at Illinois is an example: Edwards, Manningham, Breaston, Odoms, and a bunch of other guys would have taken that ball to the house. I think this chart shows this effect pretty dramatically. The 5-star recruits tend to go to high level programs and are surrounded by high level offensive lines, running games, and receiving corps, thus making it easier for them to throw touchdowns. Oh yeah, and they’re more likely to have the skill to exploit their advantages. Long term target: 6.0%.
Interception rate is the only negative factor in the formula, so a lower number is better (duh). Again we see 3-star recruits significantly lagging the other two groups. I suspect that not only is there the experience issue, but 3-star recruits are likely to need more time to develop proper mechanics. By year 3, all groups are about as good as they’re going to get. Long term target: 2.7%.
The New Hotness
Cherry picking the long term values for these parameters allows us to assemble a passer rating that is a true indication of good passing efficiency in college, not just objectively on point but also subjectively; that value is 139.2. This is a stout target to hit and the player needs help from his teammates to get there, but it is achievable for all BCS level recruits by their 3rd year as starter. In 2009, 33 QBs put up this level of performance or better with another 10 or so within reasonable striking distance.
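For anyone who wants to check the arithmetic, here is a minimal sketch that plugs the four long term targets into the standard NCAA passing efficiency formula. The function name and argument names are mine, and percentages are entered as 61.0 rather than 0.61.

```python
def ncaa_passer_rating(comp_pct, yards_per_att, td_pct, int_pct):
    """NCAA passing efficiency from per-attempt rates.

    Equivalent to (8.4*yards + 330*TD + 100*completions - 200*INT) / attempts,
    with the three rates expressed as percentages (e.g. 61.0, not 0.61).
    """
    return comp_pct + 8.4 * yards_per_att + 3.3 * td_pct - 2.0 * int_pct


# Long term targets from the charts: 61% completions, 7.6 YPA, 6.0% TD, 2.7% INT
target = ncaa_passer_rating(61.0, 7.6, 6.0, 2.7)
print(round(target, 1))  # 139.2
```

The pieces work out to 61 + 63.84 + 19.8 - 5.4, which is where the 139.2 comes from.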
[Ed: MCalibur, apparently an economist, found himself collateral damage in today's shotgun blast at "X is stupid" sports economists. Maybe I should have come up with a label like "freakonomists" so as to not implicate people who are just interested in the numbers without the look-at-me pub. Anyway, here's an excellent diary on what your goals should be on second and third down. Implications for a second and medium are interesting.]
A while back The Mathlete sent out a Thundercat signal for some help shucking data for his database; at least that’s how I remember it. Any un-lame kid of the 80’s knows that when you see the Thundercat Crest you put on your spiked suspenders, pick up your laser shooting panther paw nunchucks, jump into the tank you built singlehandedly, and you roll; that’s all there is to it. I had no choice.
Anyway, we voltroned* our abilities together and came up with something pretty sweet. I have put together my own database, with Mathlete’s help, and can now do some of the same tricks he can. I’ve focused on BCS-BCS matchups, extending the thought of excluding mismatches; Michigan v. Eastern Michigan is still a significant mismatch.
*Oops, wrong cartoon but, then again, you simply cannot over-reference 80’s cartoons/shows. I pity the fool that disagrees. I feel bad for youngins that don’t know the glory of 80’s children’s programming. Also, am I the only one who thinks that Voltron and Zoltan might be related?
When I’m not eliciting unreasonable responses from otherwise reasonable people, I’m usually crunching numbers of some kind as if they were a motley band of mutants and aliens led by a grody and ancient mummy demon priest. Very often the numbers have something to do with football in general and, most often, Michigan football specifically. This time I wondered “how do we know if a play was successful or not?” This question has been asked and answered by some smart people before, but being the curious little twit that I am, I wanted to gauge it on my own.
One way to go about it is Mathlete Style: Expected Points, a good but abstract method. One potential problem with focusing on EP is that doing so can drive you to scoring points whereas the real goal is to win. It’s a subtle but important distinction. Depending on the situation, maximizing EP might not be the same as maximizing the probability that you will win. Maybe you would rather not score if doing so means giving Peyton Manning the ball back with 25 seconds left and a deficit of less than one score. Besides, The Mathlete has this beat covered.
Another method is to use 1st Down Probability, the likelihood that a team will convert a new set of downs given the current down and distance. I think this is more appropriate to the microcosm of a play because the goal of a play is not necessarily to score; it is to keep the ball and move it forward, in that order. Scoring is the goal of an entire drive. To calculate 1DP, you do the same thing you would to derive EP, except you keep track of first downs instead of points.
Whenever you have a mountain of data, you need a way to focus your attention on what matters while still maintaining the value of having so much data in the first place. For this study, I’ve filtered on the following criteria:
- Exclude plays involving a penalty of any kind.
- The game must be close. My arbitrary definition is: all plays in the first and second quarter, third quarter plays where the lead is less than 17, and fourth quarter plays where the lead is less than 10. These values are arbitrary, but there are so many plays available that the sample sizes are still large enough that any additional precision is of negligible value. Also, any unimportant plays are swamped by a large number of plays that are important, and the math deals with the noise.
- Results of the play are limited to between -10 and +25 yards. The logic here is twofold. On the negative side, the average sack is good for about 6 to 8 yards; anything bigger than that is a fluke play (a botched snap, for example). On the positive side, most plays aren’t designed to go for huge gains. However, there are instances when an OC calls a play like that in order to exploit an advantage and not necessarily as part of a base strategy. Though relatively infrequent, both types of plays happen with enough regularity that they significantly shift the averages even though they are vastly outnumbered by more typical gains. This filter only excludes about 0.5% of all plays to the negative side and about 5.3% to the positive side.
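The three filters above could be sketched roughly like this. The `Play` record and its field names are hypothetical (my database doesn't necessarily look like this), and I'm assuming the -10 and +25 yard limits are inclusive and that anything after the third quarter uses the fourth quarter rule.

```python
from dataclasses import dataclass


@dataclass
class Play:
    # Hypothetical record layout, for illustration only
    quarter: int    # 1 through 4
    lead: int       # absolute score margin at the snap
    yards: int      # yards gained (negative for a loss)
    penalty: bool   # True if any penalty was called on the play


def is_close_game(play):
    """First half always counts; Q3 needs a lead under 17; Q4 needs a lead under 10."""
    if play.quarter <= 2:
        return True
    if play.quarter == 3:
        return play.lead < 17
    return play.lead < 10


def keep_play(play):
    """Apply all three filters: no penalty, close game, result within -10..+25 yards."""
    return (not play.penalty) and is_close_game(play) and -10 <= play.yards <= 25


print(keep_play(Play(quarter=4, lead=7, yards=12, penalty=False)))   # True
print(keep_play(Play(quarter=3, lead=21, yards=12, penalty=False)))  # False
```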
Each play in the database has been assigned a 0 or 1 depending on whether or not it was part of a first down series; touchdowns are counted as first downs in this survey. Essentially, every play in a four down sequence is counted as being part of a 1st down unless a punt or turnover occurs before a new set of downs is achieved. Filtering the plays that made the cut (over 105k) by down results in the following scatter plot:
Every point on the chart above has at least 15 samples, most have several hundred, some have several thousand, and 1st and 10 has almost 42,000 samples. The trends are self evident and really, really, strong. A few comments on other decisions I’ve needed to make here:
- The small black dots represent 4th down plays. They are essentially overlaid with the 3rd down plays, which makes sense: the objective in both cases is the same, convert to a 1st Down. If you’re in a 4th down decision, use the 3rd down line.
- The curves for 1st and 2nd Down were both pegged to 100% probability of converting a new set of downs at zero yards to go; pretty obvious as to why, it’s the rules. On 3rd Down, however, I opted not to peg it to y3 = 1 at x = 0 because even though the R-squared value doesn’t suffer by much (0.005 lower), the resulting curve significantly overestimates 3rd down success inside of 3rd and 5. Also, I think the gap could be real; how much error is there in spotting the ball (especially on QB sneak type plays)? To me this data implies that the ball is mis-spotted to deny a 1st Down conversion approximately 9% of the time. The incremental error of spotting the ball doesn't matter until you end up at 4th and inches.
- For 1st down plays, I intervened on behalf of noise reduction by only including plays where the distance was a multiple of 5. The reason is that the rules say you start at 1st and 10, and the only ways you end up with 1st and something other than a multiple of 5 are A) you’re inside the opponent’s 10, or B) multiple penalties or 1st down repeats after spot fouls. Plays that were rejected are largely noise; the legitimate plays (e.g., 1st and X inside the opp. 10) act like 2nd down plays, so use that line in those cases.
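Here is a rough sketch of how the points on the scatter plot could be tallied from the 0/1 flags: average the flag for every down and distance, drop 1st down distances that aren't multiples of 5, and drop any point with fewer than 15 samples. The tuple layout and function name are my own shorthand, not the actual database schema, and the curve fitting itself isn't shown.

```python
from collections import defaultdict


def first_down_probability(plays, min_samples=15):
    """plays: iterable of (down, distance, converted) where converted is 0 or 1.

    Returns {(down, distance): share of plays that were part of a converted series}.
    """
    totals = defaultdict(lambda: [0, 0])  # (down, distance) -> [conversions, plays]
    for down, distance, converted in plays:
        if down == 1 and distance % 5 != 0:
            continue  # 1st down distances off the multiples of 5 are treated as noise
        bucket = totals[(down, distance)]
        bucket[0] += converted
        bucket[1] += 1
    return {
        key: conversions / count
        for key, (conversions, count) in totals.items()
        if count >= min_samples
    }


# Made-up sample: twenty 1st and 10 plays, 13 of which were part of a converted series
sample = [(1, 10, 1)] * 13 + [(1, 10, 0)] * 7 + [(1, 7, 1)] * 20 + [(3, 2, 1)] * 5
print(first_down_probability(sample))  # only the (1, 10) point survives the filters
```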
Generating Hard Targets
Now that we have a survey, we can use the information to answer the question I asked: “what makes a successful play?” The question has been tackled before in the seminal tome The Hidden Game of Football. The DVOA system developed by Football Outsiders is based on concepts discussed in Hidden Game. Hidden Game presents the following goal schedule:
On first down, a play is considered a success if it gains 45 percent of needed yards; on second down, a play needs to gain 60 percent of needed yards; on third or fourth down, only gaining a new first down is considered success.
So, the goal schedule by down should be 4-ish yards (4.5) on 1st and 10, 3.6 yards on 2nd and 6, and 3 yards on 3rd and 3. I haven’t read Hidden Game but this doesn’t look right, particularly in short yardage situations. For example, 2nd and 1 is a failure if you do not convert a new set of downs. Sure, the consequences of that failure are small because you are virtually guaranteed another chance to convert, but gaining zero yards (we only have whole yard resolution) is failure by definition.
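The Hidden Game rule quoted above is simple enough to write down directly. This is just my reading of the quoted sentence as code; the names are made up for the example.

```python
# Share of the needed yards a play must gain to count as a success, per the quoted rule
HIDDEN_GAME_SHARE = {1: 0.45, 2: 0.60, 3: 1.00, 4: 1.00}


def hidden_game_success(down, to_go, gained):
    """True if the gain meets the quoted share of the yards needed on this down."""
    return gained >= HIDDEN_GAME_SHARE[down] * to_go


print(hidden_game_success(1, 10, 5))  # True: 5 yards clears the 4.5 needed
print(hidden_game_success(2, 6, 3))   # False: 3 yards falls short of the 3.6 needed
print(hidden_game_success(3, 3, 3))   # True: converted
```

With whole yard resolution, the fractional targets effectively round up to the next yard.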
Brian Burke of Advanced NFL Stats fame has a better definition: a play is a success as long as your chances to convert a new set of downs are not hurt by the result of the play. The great thing about this definition is that it considers the opportunity cost of running a play. This simple idea probably explains why a lot of OCs call conservative plays on 1st and 10: if you don’t advance the ball by about 4 yards, you’re worse off than you started. Burke focuses his work on the NFL and has done this work for the League, but he stopped at the first chart, leaving the answer to the question abstract: don’t hurt your chances of getting a new set of downs. OK, but how do you avoid that?
Running an optimization routine on our curves gives us the concrete answer, a goal schedule by down and distance in chart form.
- 3rd down is obvious, you need to gain all of the yards remaining or you’ve failed. Fourth down decisions should be avoided.
- The 1st down requirement is virtually flat at a 37% yield, lower than what Hidden Game suggested.
- The 2nd down requirement is asymptotic to 65% yield but reaches a requirement of 80% yield by 5 yards to go. Essentially, you need at least 4 yards on 2nd and 5 to not have wasted the down.
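To make the idea behind the goal schedule concrete, here is a minimal sketch: for a given down and distance, find the smallest whole-yard gain that leaves you with at least the same 1DP on the next down. This is a simple search, not necessarily the optimization routine I actually ran, and the `toy_second_down` curve is invented purely for illustration; it is not the fitted curve from the chart.

```python
def required_gain(p_now, next_down_curve, to_go):
    """Smallest whole-yard gain that keeps 1DP on the next down at or above p_now.

    p_now: 1DP before the snap.
    next_down_curve: function mapping yards to go on the next down to its 1DP.
    """
    for gain in range(0, to_go):
        if next_down_curve(to_go - gain) >= p_now:
            return gain
    return to_go  # nothing short of a conversion keeps the odds intact


def toy_second_down(to_go):
    # Invented straight line, NOT the fitted 2nd down curve from the chart
    return max(0.0, 1.0 - 0.055 * to_go)


# A generic 1st and 10 converts about 64% of the time
print(required_gain(0.64, toy_second_down, 10))  # 4 with this toy curve
```

Swap in the real fitted curves and loop over every distance and you get the full schedule by down.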
First down is all business, you must move the ball 37% of the way or you’re screwing yourself. Third down is also all business, you need to convert or risk deciding which poison tastes the best. Second down, however, depending on the situation, is a down you can get jiggy with.
On a generic 1st and 10, there’s a 64% chance of converting a new set of downs. So, as long as you end up with about a 64% chance of converting on 3rd down, you can do whatever you want on second down as long as you don’t lose yards or give the ball away. That means, you need to end up at 3rd and 3 or better. On 2nd and 3 or better call in the B2s and Outkast, baby, ‘cause it’s time to drop bombs (over Baghdad).
In football the QB position is the lynchpin for the whole offense. They touch the ball on every play, read the defense, and choose the best course of action based on what they see in the moment. So, naturally, the outlook of an offense depends in large part on the outlook of the QB who will be flying the plane. The goal of this diary is to see if there are any reliable trends in how a generic QB progresses from one year to the next and to investigate if there are factors that can be identified and quantified that will aid or hinder his on field success. I'm actually very surprised about how clear the data is.
To do this I have accumulated information for 226 quarterbacks that have played in BCS conferences since 2003. The pool was restricted to BCS schools so that some level of control was applied to the level of talent surrounding and opposing the quarterback; the presumption being that players in BCS conferences will be playing with and against talent that is on par with their own.
If a player did not average at least 10 passing attempts per game he played in a given year, the data point was not considered because the number is highly unreliable (small sample size). This shuts out some interesting pieces of data (Tim Tebow 2006) but improves the overall conclusions significantly. In Tebow’s case, his second year as a regular player was his first year as a regular passer so his sophomore season was placed in the Year 1 group. There are a few other, more obscure anomalies that were given the same treatment. The large number of data points make the impact of those anomalies negligible.
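As a rough sketch of that cutoff, assuming each QB's seasons are stored as (year, attempts, games) tuples (a layout I made up for this example), the "Year 1 is the first year as a regular passer" rule looks like this. The numbers in the example are invented, not anyone's real stat line.

```python
def is_regular_passer(attempts, games, min_per_game=10):
    """True if the QB averaged at least min_per_game pass attempts per game played."""
    return games > 0 and attempts / games >= min_per_game


def assign_passing_years(seasons):
    """seasons: list of (year, attempts, games) in chronological order for one QB.

    Returns {calendar_year: year_number}, counting only qualifying seasons.
    """
    numbered = {}
    year_number = 0
    for year, attempts, games in seasons:
        if is_regular_passer(attempts, games):
            year_number += 1
            numbered[year] = year_number
    return numbered


# Invented example: a part-time role in the first season, full-time passer in the second
print(assign_passing_years([(2006, 30, 12), (2007, 340, 13)]))  # {2007: 1}
```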
The metric I used for this study is NCAA Passer Rating. Unfortunately, Passer Rating isn’t perfect when it comes to evaluating QBs; there are many disses available on that topic (Advanced NFL Stats, Football Outsiders, Fifth Down). I leave the detailed explanation to the articles I’ve linked. However, though it’s imperfect, passer rating is still a familiar number for most football fans and it does provide significant and reasonable insight into the relative performance of QBs. On with the show.
The following chart shows the average NCAA QB Rating by year of experience for all QBs included in this study. The chart includes the standard error of the averages for those that know what that means (or are good guessers). The chart shows a couple of interesting things: more experience is better, which…duh, and the average QB rating seems to improve by approximately equal amounts going into year 2 and into year 3 but then tails off a little going into year 4.
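For those who aren't sure what the error bars are: the standard error of an average is the sample standard deviation divided by the square root of the sample size. A minimal sketch, with made-up ratings standing in for one year group:

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_stderr(values):
    """Return (average, standard error of the average) for a list of ratings."""
    average = mean(values)
    if len(values) < 2:
        return average, float("nan")  # spread is undefined for a single point
    return average, stdev(values) / sqrt(len(values))


# Made-up ratings for one experience group, for illustration only
average, stderr = mean_and_stderr([120.0, 130.0, 140.0])
print(round(average, 1), round(stderr, 2))  # 130.0 5.77
```

The smaller the group, the wider the bar, which is why the year 4 points deserve less trust than the year 1 points.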
Now, the second point goes against conventional wisdom somewhat; QBs are supposed to improve a lot more after their first year than after subsequent years. The fly in the ointment is that, in order to track improvement, the data should be evaluated as matched pairs. This means that we should take each specific QB’s improvement over the preceding year and then average the deltas to understand the average improvement from one year to the next. Doing that yields this chart.
This chart shows what we expect to see: the change after year 1 is much bigger than the change after years 2 and 3. But now there’s the apparent negative improvement between years 3 and 4. What’s up with that?
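The matched pairs idea is easier to see in code. This sketch takes each QB's own year-over-year change and averages those deltas, instead of comparing group averages that contain different players. The nested dictionary layout and the ratings are invented for the example.

```python
from statistics import mean


def paired_improvements(ratings_by_qb):
    """ratings_by_qb: {qb: {year_as_starter: rating}}.

    Returns {(year, year + 1): [each QB's own change between those two years]}.
    """
    deltas = {}
    for seasons in ratings_by_qb.values():
        for year, rating in seasons.items():
            following = seasons.get(year + 1)
            if following is not None:
                deltas.setdefault((year, year + 1), []).append(following - rating)
    return deltas


# Invented ratings: QB "C" has no year 1, so he can't contribute a year 1 to 2 delta
ratings = {
    "A": {1: 110.0, 2: 125.0, 3: 130.0},
    "B": {1: 120.0, 2: 128.0},
    "C": {2: 140.0, 3: 135.0},
}
deltas = paired_improvements(ratings)
print(mean(deltas[(1, 2)]))  # 11.5
print(mean(deltas[(2, 3)]))  # 0.0
```

In this toy data the year 2 group average is 16 points above the year 1 group average, but the QBs who actually played both years improved by 11.5 on average. That is the kind of gap matched pairs are meant to expose.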
Need … more … charts …
What I did here is plot average improvement versus the previous year’s rating. To clear out the inherent noise in the data, I lumped QB Ratings near each other together (e.g., ratings from 115.0 to 124.9 are treated as 120, and so on). The trends are clear and strong, and they demonstrate that mean reversion is in full effect: the higher a QB’s rating is in a given year, the more likely he is to have a lower score in the next year, and vice versa. It’s very difficult to have 2 really good or really bad years in a row (unless the QB is awesome or terrible).
We know from the first chart in the series that ratings go up as years of experience go up; hence, by the fourth year as starter, the net expected change is negative. The guys above 130 are likely to fall back and the guys below 130 are likely to move up. This effect allows us to infer that there is an expected upper bound for a seasoned QB, probably in the 130 to 140 range. One possible explanation for this phenomenon is that a QB is unlikely to have the same group of players around him for all four years. The team around him might be out of phase with his development and that will have an effect on the numbers he puts up.
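The lumping step can be sketched like so: round the previous year's rating to the nearest 10, then average the year-over-year change within each bin. The function names and the sample pairs are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def rating_bin(rating):
    """Lump a rating into the nearest multiple of 10 (115.0 to 124.9 becomes 120)."""
    return int((rating + 5) // 10) * 10


def mean_change_by_bin(pairs):
    """pairs: iterable of (previous_rating, next_rating) for the same QB."""
    grouped = defaultdict(list)
    for previous, following in pairs:
        grouped[rating_bin(previous)].append(following - previous)
    return {bin_center: mean(changes) for bin_center, changes in sorted(grouped.items())}


# Invented pairs: two middling seasons that improved, one big season that fell back
pairs = [(118.0, 128.0), (122.0, 126.0), (151.0, 140.0)]
print(mean_change_by_bin(pairs))  # {120: 7.0, 150: -11.0}
```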
The familiar example around here is Chad Henne. Chad had Braylon Edwards and a veteran offensive line in his first year. So any improvement he may have developed in between 2004 and 2005 was partially offset by the loss of Edwards and other changes around him. However, as the team around him developed and he continued to develop, he saw a big jump in performance in his third year. Then, going into 2007, there were many losses on the offensive line in addition to Steve Breaston, and Henne’s numbers fell back to the 130-ish level. Overall it looks like Henne never really improved, but the reality is that his development made up for and was masked by the changes in the team around him in all likelihood. I think this is a more plausible explanation than “he was always sweet and he never got better.”
Finally, it’s worth taking a look at the dependency of first year performance vs. Seniority. The question being: is it better to have a redshirt junior making his first start instead of a true freshman?
Once again I’ve plotted the averages and their corresponding standard error and included sample size along the axis for reference. The responsible conclusion is that seniority is not a significant factor in first year success for Redshirt Sophomores or younger. Players older than that seem to perform better. However, you could just as easily conclude that since the averages overlap so much, especially in non-adjacent points, the trend is pretty weak and that no trend exists. It seems that other factors, such as supporting cast and the overall talent of the player, matter more than the age of the QB when he makes his first collegiate start. The team thing is difficult to assess but talent is easy; Rivals.com, be my guide.
Same thing as before, lumped averages with standard error and sample sizes shown. This time, I think the trend is real because: A) it makes sense and B) there is no overlap between 2-stars and 5-stars. Also, a 5-star QB is more likely to have a good team around him than a 2-star player is. All of these things support the trend despite the uncertainty in the data. There’s another reason, let’s zoom in on 5-stars; this time with a table.
|Player|School|Year|Rating|Notes|
|---|---|---|---|---|
|Reggie McNeal|Texas A&M|2003|124.5||
|Trent Edwards|Stanford|2003|79.5|4 new OL; 2 new WR; new RB|
|Kyle Wright|Miami (FL)|2005|137.2||
|Marcus Vick|Virginia Tech|2005|143.3||
|Anthony Morelli|Penn State|2006|111.9|4 new OL|
|Matthew Stafford|Georgia|2006|109|3 new OL; 2 new WR|
|Xavier Lee|Florida State|2006|123.5||
|Jimmy Clausen|Notre Dame|2007|103.9|3 new OL; 1 new WR|
|Tyrod Taylor|Virginia Tech|2007|119.7||
|Terrelle Pryor|Ohio State|2008|146.5||
When you strip out the four guys that had extenuating circumstances (Mallett stays in), the average is about 131. That’s approaching the theoretical upper limit right away, on average.
I’m currently working on a project that tries to use this information to see what we can expect out of the QBs on our upcoming schedule. I’ll also try to use the dataset to tease out what we can expect out of our guys based on QBs similar to themselves.