A quick guide to where my numbers come from and how they are calculated.
My sole source is the NCAA website, which hosts the play by play data for every year since 2003. 2004 and forward is nearly all there but 2003 is a bit hit and miss.
Thanks to MCalibur, I can pull each week’s games down directly from the site into Excel, where I translate the text into a variety of fields and calculations that ultimately end up in an Access database. My tools are somewhat crude, but they work and I can get what I need from them.
To date I have 992,624 plays in the database.
The database covers all games between two FBS teams. Any games against FCS teams don’t exist as far as I’m concerned.
Every play from these games is in the database, but not all plays go into the calculations. End-of-half drives are excluded, as are any drives in the second half where one team leads by 16 points or more. Only plays under those circumstances are excluded; all other plays from those games are included.
Sacks are counted as pass plays and all fumbles are excluded due to their random nature.
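The inclusion rules above could be expressed as a small filter. This is only a sketch: the `Play` record and its field names are assumptions made for illustration, not the actual Excel or Access layout.

```python
from dataclasses import dataclass

# Hypothetical play record; the field names are illustrative assumptions.
@dataclass
class Play:
    play_type: str           # e.g. "rush", "pass", "sack"
    half: int                # 1 or 2
    lead: int                # points separating the two teams during the drive
    end_of_half_drive: bool  # True if the drive was cut short by the half
    fumble: bool             # True if the play involved a fumble

BLOWOUT_LEAD = 16  # second-half drives with a lead this large are dropped

def counts_toward_calculations(play: Play) -> bool:
    """Return True if the play should feed the expected-point numbers."""
    if play.end_of_half_drive:
        return False
    if play.half == 2 and abs(play.lead) >= BLOWOUT_LEAD:
        return False
    if play.fumble:
        return False
    return True

def play_category(play: Play) -> str:
    """Sacks are grouped with pass plays; everything else keeps its type."""
    return "pass" if play.play_type == "sack" else play.play_type
```

Note that a large first-half lead does not exclude a play; only second-half drives are subject to the 16-point rule.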
Based on all of this historical data, each combination of down, distance and line of scrimmage is given an expected point value. For example:
1st and 10 from your own 20: 1.53 expected points
1st and goal from the 1: 6.48
Since each situation has a value, the value of any play is the change in value created. A 79-yard pass on 1st and 10 from the 20 to the other 1 is worth 4.95 points (6.48 points – 1.53 points). If the running back then punches it in from the 1, he is awarded .49 points (6.97 – 6.48). Touchdowns are worth 6.97 because they create the opportunity for the PAT which is successful 97% of the time. If the PAT is good, the values for the drive look like this:
QB/WR: 4.95 points
RB: .49 points
K: .03 points
Thus the 7 points the offense generated are accounted for between the initial 1.53 from field position and the remaining 5.47 from play.
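The drive arithmetic can be checked with a few lines of Python. The expected-point values are the ones quoted above; the situation labels and the lookup-table layout are just assumptions for the sake of the sketch.

```python
# Expected points for the situations mentioned in the example.
EXPECTED_POINTS = {
    "1st and 10, own 20": 1.53,
    "1st and goal, opp 1": 6.48,
    "touchdown": 6.97,   # 6 points plus a PAT that is good 97% of the time
    "PAT good": 7.00,
}

def play_value(before: str, after: str) -> float:
    """Value of a play = expected points after minus expected points before."""
    return round(EXPECTED_POINTS[after] - EXPECTED_POINTS[before], 2)

# (who gets credit, situation before, situation after)
drive = [
    ("QB/WR", "1st and 10, own 20", "1st and goal, opp 1"),
    ("RB", "1st and goal, opp 1", "touchdown"),
    ("K", "touchdown", "PAT good"),
]

credits = {who: play_value(before, after) for who, before, after in drive}
total_from_play = round(sum(credits.values()), 2)
field_position = EXPECTED_POINTS["1st and 10, own 20"]

print(credits)
print(total_from_play, round(field_position + total_from_play, 2))
```

The three play values plus the starting field position add back up to the 7 points on the scoreboard.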
Even plays that gain yards can yield negative expected point changes. A two-yard gain on 1st and 10 puts the offense in a worse spot than where it began, even though it was positive yardage. If a drive ends without a score, all of the initial field position points are “left on the field.”
Let’s say a team hands the ball to its running back three times from the 20 and gains 3 yards on each play. A punt on fourth and 1 means that the initial 1.53 expected points are now 0, so the running back has three plays for –1.53 on the books. Third-down plays are typically swing plays and can produce large deviations. Convert a lot of third downs and your value per play will be larger than your yards indicate. Fail on a lot of third downs and it quickly swings in the opposite direction.
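One way to see why the three runs must total –1.53: the play values are consecutive differences, so they always sum to the final value minus the starting value. In the sketch below the two intermediate expected-point values are placeholders chosen for illustration, not figures from the database; only 1.53 and 0 come from the example.

```python
def play_values(states):
    """Change in expected points between consecutive situations."""
    return [round(after - before, 2) for before, after in zip(states, states[1:])]

START = 1.53  # 1st and 10 from your own 20 (from the post)
ENDED = 0.0   # drive is over, field position points left on the field

# The two middle numbers are made-up placeholders for 2nd and 7 and
# 3rd and 4; the total does not depend on what they are.
states = [START, 1.10, 0.60, ENDED]

values = play_values(states)
total = round(sum(values), 2)
print(values, total)
```

Swapping in any other intermediate values changes how the loss is split across the three carries but not the –1.53 total.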
We are finally getting to PAN, Points Against Normal. All previous calculations are done independent of opponent. Once several games are on the books in a season, we start to get a picture of who is good and who is not so we can make calibrations to performances.
The baseline as calculated above is adjusted based on the strength of opponents' rush/pass offense/defense. Last year Michigan allowed 0.19 points/rush, which [Ed-M: moment of shock coming] is really bad. So even if the opponent averaged 0.15 points per rush initially, their final tally was negative at –0.04 per play since they performed below what the average team did versus Michigan. A team would have to have an initial average of at least 0.20 to come out positive on the final scoring.
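The Michigan example amounts to subtracting what the opponent typically allows from what was actually gained. A minimal sketch, assuming a straight per-play subtraction (the real adjustment may weight opponents differently); the function name is made up for illustration.

```python
def opponent_adjusted(raw_per_play: float, opponent_allowed_per_play: float) -> float:
    """Points per play relative to what the average team did against
    this opponent. Assumes a plain subtraction."""
    return round(raw_per_play - opponent_allowed_per_play, 2)

# Numbers from the example: Michigan allowed 0.19 points per rush.
print(opponent_adjusted(0.15, 0.19))  # below what the average team managed
print(opponent_adjusted(0.20, 0.19))  # just above it
```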
The final scoring is what I will refer to as PAN. It measures how many actual scoreboard points better than the average team you are. PAN can refer to a specific unit, such as passing offense, total defense or kick returns, or to a team in total. It is also a good metric for comparing quarterbacks and running backs. It is only somewhat effective for wide receivers since they rarely produce negative plays.
Zero PAN means you are completely average. For a BCS conference team like Michigan, this typically means the bottom third of the league. A three-point swing in PAN typically equates to an additional win or loss over the course of a season.
+7 will put you around the Top 25 on the season
+14 is typically Top Ten and potential BCS game
+21 is best in class and probably playing for a national championship
The top rated team I have is Florida 2008. They finished +13 on offense, +7 on defense and +3 in special teams. The top Big Ten team is Ohio 2005 at +19 (7/9/3). The top Michigan team is 2006 at +14 (4/6/4). They come in at 50th overall in the last 8 seasons.
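The rules of thumb above can be strung together as follows. This is a rough sketch: the cutoffs are described as "around" and "typically", so the hard thresholds, the tier labels and the function names here are simplifying assumptions.

```python
def team_pan(offense: float, defense: float, special_teams: float) -> float:
    """Team PAN as the sum of the three unit scores."""
    return offense + defense + special_teams

def pan_tier(pan: float) -> str:
    """Map a season PAN to the rough tiers listed above."""
    if pan >= 21:
        return "best in class"
    if pan >= 14:
        return "typically Top Ten"
    if pan >= 7:
        return "around the Top 25"
    return "below the Top 25 range"

def wins_vs_average(pan: float) -> float:
    """Roughly one win (or loss) per three points of PAN."""
    return pan / 3

for name, units in [("Florida 2008", (13, 7, 3)), ("Michigan 2006", (4, 6, 4))]:
    total = team_pan(*units)
    print(name, total, pan_tier(total), round(wins_vs_average(total), 1))
```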
I will try and add relevant updates if more questions come up in the comments.
I, for one, appreciate the effort.
Dude, I love the stats and am glad you take the time to analyze them. Always interesting to see what is out there.
Thanks for the post, I love your work.
One question, why are fumbles considered random? I understand that actually falling on the ball is probably pretty close to or completely random, but you can coach techniques to strip and hold onto the ball. If a team is good at both sets of techniques, it makes sense that they'd fumble less and cause more fumbles (relatively), which gives them fewer 50-50 chances to lose the ball and more 50-50 chances to get the ball back.
Every way I have tried to look at it shows no correlation from year to year in fumbles forced or recovered. I know there is a lot of teaching and coaching behind it, but I haven't been able to find any data to suggest it's possible to be consistently good or bad at it.
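For anyone who wants to run the same kind of check, here is one possible way to do it: pair each team's fumble rate in one season with its rate in the next and compute a Pearson correlation. This is a sketch and not necessarily the test used above; the function names and the `{(team, year): rate}` input layout are assumptions.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)

def year_to_year_r(rate_by_team_year):
    """Correlate each team-season's rate with the same team's next season.

    rate_by_team_year maps (team, year) -> rate, e.g. fumbles forced per play.
    """
    pairs = [
        (rate, rate_by_team_year[(team, year + 1)])
        for (team, year), rate in rate_by_team_year.items()
        if (team, year + 1) in rate_by_team_year
    ]
    this_year = [a for a, _ in pairs]
    next_year = [b for _, b in pairs]
    return pearson_r(this_year, next_year)
```

A result near zero across many team-seasons would match the "no year-to-year correlation" finding; a clearly positive one would point to a repeatable skill.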
Fair enough. I'm really interested in how advanced stats are created and the thought process behind them. I suspect Michigan wouldn't have fared so well in FEI, etc. last year if fumbles weren't considered random.
I know it's not easy to make a model like this in your free time, so thanks again.
Using the line on the game helps project the turnover margin. For every 23 points a team is favored by, expect a +1 turnover margin. The relationship is linear, and that scale is good enough to use in the NFL too. My boxscores go back to 1990.
I'm assuming that's cumulative over a season if it works in the NFL, right? There aren't too many pro games with 20+ point spreads.
I think part of the difficulty may be finding actual data on fumbles versus fumbles that result in turnovers. I'd guess fumbles are probably not random over the course of a season for a given team. I find it unlikely that Mike Hart randomly never lost a fumble for that many carries--there was probably some skill involved...
"There aren't too many pro games with 20+ point spreads."
I'm not sure what that has to do with my post. A 10 pt NFL fave will average about +0.4 in TO margin, as will a CFB fave of the same number. Just because there are fewer big NFL faves doesn't mean the relationship doesn't hold. It does.
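The rule of thumb in this exchange works out to a one-line formula. A minimal sketch, assuming the relationship is exactly linear through zero at 23 points per turnover; the function name is made up for illustration.

```python
POINTS_PER_TURNOVER = 23  # favored by 23 points -> expect +1 turnover margin

def expected_turnover_margin(points_favored: float) -> float:
    """Projected per-game turnover margin for a team favored by this many
    points (negative for an underdog)."""
    return points_favored / POINTS_PER_TURNOVER

print(round(expected_turnover_margin(10), 2))  # roughly the +0.4 quoted above
print(expected_turnover_margin(23))
```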
Thanks for the feedback.
#1 is absolutely an issue. I have done a double strength-of-opponent adjustment to try to adjust for this issue, and you are right: the SEC teams are the biggest gainers and the lower conferences are the biggest losers in the process. Unfortunately, due to time, resource and coding ability constraints, I haven't been able to do it in a manner that is fast and flexible enough to use in a productive way, at least not enough to justify what is still a minor tweak. The first pass of opponent adjustment is probably 95% accurate for 75% of teams and about 75% accurate for 95% of teams.
#2 Don't know if I completely understand your question but I'll give it a go. My ultimate goal is to be able to compare teams against a universal standard. To do so, whether as a replacement value or average value, requires making some broad assumptions. Those assumptions won't be valid across all situations for all teams but in the aggregate should be reasonable, and at the very least they are highly consistent.
The other challenge with a more team-specific approach is that the sample size shrinks quickly. Because college teams turn over faster than pro teams and there are fewer games in a season, getting an adequate sample on a specific situation is nearly impossible.
I look forward to your posts and insight.
Do you manually pull down each week's games, or did MCalibur find a way to automate the process?
I built a model last year that is based on a different premise, but is limited by the number of teams I feel like pulling down historical data for. As such, places where I ran regressions would be aided by having a significantly larger population.
I, too, am a big fan, and in all honesty the Mathlete diaries are my favorite part of the site. Keep up the good work!