A quick guide to where my numbers come from and how they are calculated.
Where Does The Data Come From?
My sole source is the NCAA website, which hosts play-by-play data for every year since 2003. 2004 onward is nearly all there, but 2003 is a bit hit and miss.
Thanks to MCaliber I can pull each week’s games down directly from the site into Excel, where I translate the text into a variety of fields and calculations that ultimately end up in an Access database. My tools are somewhat crude, but they work and I can get what I need from them.
To date I have 992,624 plays in the database.
The database covers all games between two FBS teams. Any games against FCS teams don’t exist as far as I’m concerned.
Every play from these games is in the database, but not all plays go into the calculations. End-of-half drives are excluded, as are any second-half drives where one team leads by 16 points or more. Only the plays on those drives are excluded; all other plays from those games are included.
Sacks are counted as pass plays and all fumbles are excluded due to their random nature.
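The inclusion rules above can be sketched as a simple filter. This is only an illustration of the stated rules, not my actual Excel/Access setup: the `Play` record and its field names are hypothetical, and it assumes the score margin is checked at the start of the drive.

```python
from dataclasses import dataclass


@dataclass
class Play:
    # All field names here are hypothetical, chosen for illustration.
    half: int                 # 1 or 2
    score_margin: int         # point difference when the drive began (assumption)
    end_of_half_drive: bool   # True if the drive ran out the half
    is_fumble: bool
    play_type: str            # "rush", "pass" or "sack"


def counts_toward_calculations(play: Play) -> bool:
    """Apply the exclusion rules described in the text."""
    if play.end_of_half_drive:
        return False
    if play.half == 2 and abs(play.score_margin) >= 16:
        return False
    if play.is_fumble:
        return False
    return True


def category(play: Play) -> str:
    """Sacks are counted as pass plays."""
    return "pass" if play.play_type == "sack" else play.play_type
```

A second-half rush with a 17-point margin would be dropped, while the same play in the first half would count.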
What’s The Baseline?
Based on all of this historical data, each combination of down, distance and line of scrimmage is given an expected value. For example:
1st and 10 from your own 20: 1.53 expected points
1st and goal from the 1: 6.48
Since each situation has a value, the value of any play is the change in value it creates. A 79-yard pass on 1st and 10 from your own 20 to the opponent’s 1 is worth 4.95 points (6.48 points – 1.53 points). If the running back then punches it in from the 1, he is awarded 0.49 points (6.97 – 6.48). Touchdowns are worth 6.97 because they create the opportunity for the PAT, which is successful 97% of the time. If the PAT is good, the values for the drive look like this:
QB/WR: 4.95 points
RB: 0.49 points
K: 0.03 points
Thus the 7 points the offense generated are accounted for between the initial 1.53 from field position and the remaining 5.47 from play.
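The arithmetic for this drive can be checked with a short sketch. The three expected-point values come from the examples above; the dictionary keys and the `play_value` helper are just names picked for illustration.

```python
# Expected points by situation, using only the values quoted in the text.
EXPECTED_POINTS = {
    "1st and 10, own 20": 1.53,
    "1st and goal, the 1": 6.48,
}
TOUCHDOWN_VALUE = 6.97   # 6 points plus a PAT that is good 97% of the time
MADE_PAT_VALUE = 7.00    # touchdown plus a successful PAT


def play_value(before: float, after: float) -> float:
    """Value of a play = change in expected points it creates."""
    return round(after - before, 2)


start = EXPECTED_POINTS["1st and 10, own 20"]
at_the_one = EXPECTED_POINTS["1st and goal, the 1"]

pass_value = play_value(start, at_the_one)               # credited to QB/WR
run_value = play_value(at_the_one, TOUCHDOWN_VALUE)      # credited to RB
kick_value = play_value(TOUCHDOWN_VALUE, MADE_PAT_VALUE) # credited to K

from_play = round(pass_value + run_value + kick_value, 2)
total = round(start + from_play, 2)

print(pass_value, run_value, kick_value)  # 4.95 0.49 0.03
print(from_play, total)                   # 5.47 7.0
```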
Even plays that gain yards can yield negative expected point changes. A two-yard gain on 1st and 10 puts the offense in a worse spot than where it began, even though it was positive yardage. If a drive ends, all of the initial field position points are “left on the field.”
Let’s say a team hands the ball to its running back three times from the 20 and gains 3 yards on each play. A punt on fourth and 1 means that the initial 1.53 expected points are now 0, so the running back has three plays for –1.53 on the books. Third-down plays are typically swing plays and can produce large deviations. Convert a lot of third downs and your value per play will be larger than your yards indicate. Fail on a lot of third downs and it quickly swings in the opposite direction.
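Here is a rough sketch of why the three runs add up to –1.53 no matter how the loss is split between them. The starting value (1.53) and the ending value (0) come from the example above; the two values in the middle are invented placeholders for 2nd and 7 and 3rd and 4, not numbers from the database.

```python
def play_values(states):
    """Change in expected points between each consecutive pair of situations."""
    return [round(after - before, 2) for before, after in zip(states, states[1:])]


# 1.53 = 1st and 10 from the 20; 0.0 = drive ends in a punt (both from the text).
# 1.20 and 0.80 are INVENTED placeholders for the situations in between.
states = [1.53, 1.20, 0.80, 0.0]

values = play_values(states)
total = round(sum(values), 2)

print(total)  # -1.53
```

Swap in any other middle values and the total stays the same, because the intermediate situations cancel out and only the start and end of the drive matter.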
What Adjustments Are Made?
We are finally getting to PAN, Points Against Normal. All previous calculations are done independent of opponent. Once several games are on the books in a season, we start to get a picture of who is good and who is not so we can make calibrations to performances.
The baseline as calculated above is adjusted based on the strength of opponents' rush/pass offense/defense. Last year Michigan allowed 0.19 points/rush, which [Ed-M: moment of shock coming] is really bad. So even if the opponent averaged 0.15 points per rush initially, their final tally was negative at –0.04 per play since they performed below what the average team did versus Michigan. A team would have to have an initial average of at least 0.20 to come out positive on the final scoring.
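The Michigan example works out as a simple subtraction, sketched below. The function name is made up for illustration, and the real adjustment may involve more than this single step; this only reproduces the numbers in the paragraph above.

```python
def adjusted_points_per_play(raw: float, opponent_allowed: float) -> float:
    """Raw points per play minus what the opponent typically allows.

    Hypothetical helper: assumes the adjustment is a plain difference,
    which matches the example in the text but may simplify the real method.
    """
    return round(raw - opponent_allowed, 2)


MICHIGAN_ALLOWED_PER_RUSH = 0.19  # from the text

print(adjusted_points_per_play(0.15, MICHIGAN_ALLOWED_PER_RUSH))  # -0.04
print(adjusted_points_per_play(0.20, MICHIGAN_ALLOWED_PER_RUSH))  # 0.01
```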
The final scoring is what I will refer to as PAN. It measures how many actual scoreboard points you are above the average team. PAN can refer to a specific unit, such as passing offense, total defense or kick returns, or to a team in total. It is also a good metric for comparing quarterbacks and running backs. It is only somewhat effective for wide receivers since they rarely produce negative plays.
What Does It All Mean?
Zero PAN means you are completely average. For a BCS conference team like Michigan this typically means bottom third of the league. A three-point swing in PAN typically equates to an additional win or loss over the course of a season.
+7 will put you around the Top 25 on the season
+14 is typically Top Ten and potential BCS game
+21 is best in class and probably playing for a national championship
The top rated team I have is Florida 2008. They finished +13 on offense, +7 on defense and +3 in special teams. The top Big Ten team is Ohio 2005 at +19 (7/9/3). The top Michigan team is 2006 at +14 (4/6/4). They come in at 50th overall in the last 8 seasons.
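As a rough illustration of how the unit numbers roll up, the sketch below sums offense, defense and special teams and maps the total onto the tiers listed above. The hard cutoffs and the function names are assumptions made for the example; the tiers in the text are only approximate ("around the Top 25").

```python
def team_pan(offense: float, defense: float, special_teams: float) -> float:
    """Team PAN as the sum of the three unit values."""
    return offense + defense + special_teams


def pan_tier(pan: float) -> str:
    """Map a team PAN to the rough tiers above (hard cutoffs are an assumption)."""
    if pan >= 21:
        return "best in class"
    if pan >= 14:
        return "Top Ten"
    if pan >= 7:
        return "Top 25"
    return "outside the Top 25"


def approx_win_swing(pan_change: float) -> float:
    """Roughly one win or loss per three points of PAN, per the rule of thumb."""
    return pan_change / 3


florida_2008 = team_pan(13, 7, 3)
ohio_2005 = team_pan(7, 9, 3)
michigan_2006 = team_pan(4, 6, 4)

print(florida_2008, pan_tier(florida_2008))    # 23 best in class
print(ohio_2005, pan_tier(ohio_2005))          # 19 Top Ten
print(michigan_2006, pan_tier(michigan_2006))  # 14 Top Ten
```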
I will try to add relevant updates if more questions come up in the comments.