By The Numbers - How it Works

Submitted by The Mathlete on October 6th, 2009 at 3:59 PM
Here is a more detailed look at how I calculate the numbers for By The Numbers.

For every down, distance and yardline I have a calculated expected value. The expected value equals the average points scored by an average team in that situation.
   *Example: 1st and 10 at your own 20. No situation has more data points than this one. Last year, this situation yielded an average of 1.57 points every time it occurred. Obviously, you can't score 1.57 points on any one possession, but if you had the ball in this situation 100 times, you would score about 157 points. That could be a TD every 4-5 possessions, a FG every other possession, or more likely some mix.
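To make the averaging concrete, here is a minimal Python sketch. The 16 TD / 15 FG / 69 empty split is a made-up mix chosen only so the total comes out to 157 points; it is not taken from the real play-by-play data, and the function name is hypothetical.

```python
def expected_points(outcomes):
    """Average points per possession over a list of observed outcomes."""
    return sum(outcomes) / len(outcomes)

# Hypothetical 100 possessions from 1st and 10 at your own 20:
# 16 touchdowns with PAT (7 pts), 15 field goals (3 pts), 69 scoreless drives.
sample = [7] * 16 + [3] * 15 + [0] * 69

total_points = sum(sample)          # 157 points over 100 possessions
average = expected_points(sample)   # 1.57 points per possession
print(total_points, average)
```

No single possession is ever worth 1.57 points; the number only describes the long-run average.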

Each play changes that expected value, and the change is then attributed to the player/players who were recorded on the play. Over the course of games and seasons these points add up, some positive, some negative, and we begin to see a clearer picture of what value was added by what players/units.

But adding value isn't the same for all opponents. A total of +10 is a very impressive number, but it's more impressive against a good team than a bad team. After all of the data is collected, every team's unit is rated on a per play basis. This value is then added to or subtracted from every play that occurs against it.
    *Example: a good rush defense averages -0.1 against it every time the opponent runs. It is playing a decent run offense that averages +0.04 every play. If the net result for the game is -5 on 40 carries, the adjusted results would be a -1 rating for the offense (-5 + 0.1*40 = -1) and a +6.6 rating for the defense (-[-5 - 0.04*40] = +6.6). In my write-ups, positive is always above average and negative is below average.
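The opponent adjustment in that example can be sketched in a few lines of Python. This only reproduces the arithmetic shown above; the function and parameter names are hypothetical, and the real spreadsheet formulas may differ.

```python
def adjust_for_opponent(raw_total, plays, offense_avg, defense_avg):
    """Return (adjusted_offense, adjusted_defense) for one game.

    raw_total   -- unadjusted points added by the offense in this game
    plays       -- number of plays (carries, in the example)
    offense_avg -- the offense's season per-play average
    defense_avg -- per-play average allowed by the defense (negative = good defense)
    """
    # Offense: subtract what an average offense would have done against this defense.
    adjusted_offense = raw_total - defense_avg * plays
    # Defense: subtract what this offense usually does, then flip the sign
    # so that positive means better than average.
    adjusted_defense = -(raw_total - offense_avg * plays)
    return adjusted_offense, adjusted_defense

# Numbers from the example: -5 on 40 carries, offense +0.04/play, defense -0.1/play.
offense_rating, defense_rating = adjust_for_opponent(-5.0, 40, 0.04, -0.1)
print(round(offense_rating, 2), round(defense_rating, 2))
```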

So the essence of the metric is this: how many scoreboard points did the player/unit contribute versus average, accounting for competition?

Exceptions and Notes

  • Plays with lost fumbles are removed from all numbers because fumbles are considered random and greatly skew ratings
  • QB sacks are included for team passing metrics but not for individual players
  • Garbage time is not included in stats.  If a team is up by 4 TDs in the 3rd quarter or 3 in the 4th it is considered garbage time and no plays are recorded.
  • Wide receivers have 2 ratings: a rating on balls caught (Value) and a rating on all balls targeted at them, caught or not (Value+). The two metrics tell two different things and I haven't figured out how to combine them.  WR values typically run higher because of the lack of negative plays assigned directly to a WR.
  • Performing on third down is huge. On third down you either make a first down and gain big points, or your drive is over and you lose any points expected for the drive (unless in FG range).  This is one of the big advantages of this system: it can reward/punish plays made on big downs appropriately.
  • Only games against 1A competition count.  Games against 1AA teams are basically scrimmages with nothing good or bad counting.
  • All data is pulled directly from play by play data hosted on the NCAA website.  I load all the data into a spreadsheet, run a bunch of fancy formulas and then dump it into a database where I can run queries till I pass out or the boss shows up.
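The exclusion rules in the list above could be expressed as a simple play filter. This is only a sketch in Python: it assumes "up by 4 TDs" means a lead of 28 points or more (7 points per TD), that the rule applies whichever team is ahead, and that each play is a dict with the hypothetical keys shown. None of those details are confirmed by the actual spreadsheet.

```python
POINTS_PER_TD = 7  # assumption: "up by 4 TDs" is read as a 28-point lead

def is_garbage_time(quarter, margin):
    """True if the lead is 4+ TDs in the 3rd quarter or 3+ TDs in the 4th."""
    lead = abs(margin)
    if quarter == 3:
        return lead >= 4 * POINTS_PER_TD
    if quarter == 4:
        return lead >= 3 * POINTS_PER_TD
    return False

def counts_toward_ratings(play):
    """Apply the exclusions: lost fumbles, non-1A opponents, garbage time."""
    if play["lost_fumble"]:
        return False
    if not play["opponent_is_1a"]:
        return False
    return not is_garbage_time(play["quarter"], play["margin"])

example_play = {"lost_fumble": False, "opponent_is_1a": True, "quarter": 4, "margin": 24}
print(counts_toward_ratings(example_play))
```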


It is scary to put this in writing, but here are my goals.

Monday - Game Review
Tuesday - Big 10 Player Rankings
Wednesday - Big 10 Team Rankings
Thursday - Flex/Catch up if I missed a deadline
Friday - Game Preview

During the offseason I am looking for ideas to pull from my DB of plays to validate or refute conventional wisdom. Items such as: Is momentum real on quick change plays? Does 4th down convention hold up? Again, I am looking for ideas.

Ideas going forward

I am very open to ideas anyone has on how to improve what I pull, how it's calculated or what I do with it.  Also, I am working on moving from expected points to a win percentage calculator so that there is no need for the garbage time gray area.  It won't happen this year, but hopefully next year I will have that added.



October 6th, 2009 at 4:12 PM

you should add units to your plus/minus so people know what the stat is denominated in. or maybe just refer to it as points above average rather than plus/minus? the former has a lot more concrete meaning and evident utility.


October 6th, 2009 at 5:06 PM

I think historical conference strength would be very useful. Maybe identify the 25th, 50th and 75th percentile teams from each conference against FBS average over however long your database goes. Particularly early in the season when team quality isn't evident, using those levels as proxies could be helpful in sorting out early season variance.

I've also been meaning to get around to investigating recruiting rankings and their effects on winning. The way Rivals is set up now makes it pretty easy to rip average stars per team conference by conference. A simple average or, better, a weighted average based on average stars by years in the program should prove to be a very useful indicator of future wins.

It would go some way to defeating the nefarious forces of Resume Rankers in the Blogpoll, which I think are ridiculous. If anything, it makes far more sense to assume that the history of the program's recent success on the field and in recruiting would be a far better predictor than a handful of games early in the season. And I don't see why a poll should not try to be predictive, considering the success of 'wisdom of crowds' and futures markets.

As a sidenote, I suspect the Blogpoll does a poor job of leveraging the potential analytic power of its bloggers. IME, bloggers should be instructed to create a poll that reflects how the blogger thinks the season will play out.

The Mathlete

October 6th, 2009 at 5:34 PM

The strength of opponent takes shape pretty quickly, generally by week 3 or 4 for college and a week sooner in the NFL because you don't have the huge talent and scheduling disparities. The weakness is not in accounting for opponent strength; the weakness is that you are compared to all the other teams a school has played. So if you do 0.1 points per play better than the competition, there is no accounting for whether the rest of the competition was really good or really bad. Still figuring this one out.


October 6th, 2009 at 6:24 PM

I'm pretty sure there's fairly little difference in talent between the lesser conferences. Using them as a presumed baseline to measure against would probably go a long way toward determining team quality since BCS teams tend to play cupcakes far more than real opponents prior to conference play.

The Mathlete

October 6th, 2009 at 6:51 PM

Good conference opponents and bad non-conference opponents vs bad conference opponents and good non-conference opponents will wash out some of the gap. I do a conference ranking right now that averages each team's total rating, and it shakes out about as conventional wisdom would expect. YTD the SEC is first at +6.4, meaning the average SEC team is 6.4 points per game better than the average team. The Big 10 is 5th at +1.1, and the BCS conferences occupy the top 6 spots and are a full 3 points better than the next best conference.


October 6th, 2009 at 5:08 PM

Mathlete, speaking just for my dumb self, I'm still kind of lost. I'm struggling with my understanding of how the numbers add up. So, I'll try to confirm just the result of 1 play:

A 1st/10 at your own 20 can be expected to yield 1.57 points. If the offense gains 10 yards (a good play), then they've got a 1st/10 at their own 30, and from there they can be expected to yield more than 1.57. For the sake of argument, I'll guess a 15% increase in expectation to 1.81. The difference, +.24, between play one and play two expectation is what gets "banked" ... to the credit (or debit in the case of bad plays) of the player and unit.

Is this correct?

The Mathlete

October 6th, 2009 at 5:31 PM

First of all, excellent assessment of a 1st and 10 at the 30; I have it at 1.82. So a player gains the 10 yards and expected points goes from 1.57 -> 1.82, a gain of .25. That play was worth .25 actual points that are banked to the player(s) and the unit(s). The defense, likewise, is deducted the .25. If that play was against a good defensive unit, the actual value may be increased to say .35, or decreased if it was against a bad defense.

Think of it this way: Michigan has 1st and 10 at the 20 and the following plays happen.
1st and 10 own 20, 1.57 exp value: Forcier to Matthews for 10 yards.
Forcier, Matthews and Michigan pass offense get credited .25 points. Opp pass defense gets credited -.25 points.
1st and 10 own 30, 1.82 exp value: Minor runs for gain of 2.
Minor and Michigan rush offense credited -.11. Opp rush defense credited +.11 pts.
2nd and 8 own 32, 1.71 exp value: Forcier to Odoms for TD. Olesnavage PAT good.
Forcier, Odoms and Michigan pass offense all credited with 5.26 pts. Opp pass defense credited with -5.26 and Olesnavage credited with +0.03 pts.

A TD is worth 6.97 points and the PAT is worth .03.
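As a rough illustration of the bookkeeping in that drive, here is a small Python sketch that banks each play's change in expected value to everyone recorded on it. The expected values are the ones quoted above; the ledger structure and variable names are hypothetical, and no opponent adjustment is applied.

```python
from collections import defaultdict

TD_VALUE = 6.97   # value of a touchdown before the PAT, as stated above
PAT_VALUE = 0.03  # value credited to the kicker for the made PAT

# (expected value before, expected value after, credited on offense, defensive unit)
plays = [
    (1.57, 1.82, ["Forcier", "Matthews", "Michigan pass offense"], "Opp pass defense"),
    (1.82, 1.71, ["Minor", "Michigan rush offense"], "Opp rush defense"),
    (1.71, TD_VALUE, ["Forcier", "Odoms", "Michigan pass offense"], "Opp pass defense"),
]

ledger = defaultdict(float)
for ep_before, ep_after, credited, defense in plays:
    value = ep_after - ep_before   # change in expected points on this play
    for name in credited:
        ledger[name] += value      # offense-side players and unit bank the change
    ledger[defense] -= value       # the defensive unit gets the mirror image
ledger["Olesnavage"] += PAT_VALUE

for name, points in ledger.items():
    print(f"{name}: {points:+.2f}")
```

Summing the three play values gives the offense's +5.40 for the drive.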

Hope this helps.


October 6th, 2009 at 6:25 PM

OK, so, sticking with the Offensive Unit on this drive, they get + .25 - .11 + 5.26 for a total of 5.40. That's the total EXPECTED points awarded from the drive while the total ACTUAL points awarded from the drive is obviously 7. So, in this way, Michigan has out-delivered expectation by 1.6 points.

Just so I'm clear, let me take it to the ridiculous extreme.

A game opens with Team A retrieving Team B's attempted onside kick at B45. They run exactly 13 plays, gaining exactly 3.4 yards per play, getting to the B1 where they fumble. Team B takes over and scores on a 99 yard play from B1.

Strangely enough, this sequence happens over and over again, with Team B winning 84-0, outgaining Team A 1,188 yds in 12 plays vs 528 yds in 156 plays. Team A therefore has the worst defense and worst offense ever measured by your system (or even in the system used on Zoltan's planet) and Team B, the best.

The interesting perspective comes in the fact that Team A's 528 yards would otherwise be great and Team B's defense would appear to be a sieve. But your system pays no regard to boxscore elements; in fact you completely ignore them, allowing the analysis to focus only on the units' abilities to overdeliver or underdeliver on statistical expectations.

Am I still on the right track?

The Mathlete

October 6th, 2009 at 6:45 PM

First on the fake Michigan drive.

Michigan actually outperformed expectations by 5.4 points. They were expected to score 1.57 points based on the fact that they started at their own 20. The field position "created" 1.57 points of value, the offense created 5.40 points of value and the kicker created .03 points of value, totaling the 7 points on the scoreboard. If, for example, Michigan had gone three and out, then they would have lost the 1.57 points in value that they were given to start the drive.

As for the Team A/Team B scenario you are right. Team A would have a value for the game of -35.16 (-2.93 for starting at the B45 x 12 possessions). This means that based on starting field position, Team A should have scored 35.16 points, they scored 0, so -35.16 for the game. Team B would only be expected to score 0.65 points per drive, meaning they (and the kicker) added 75.84 points of value to the game.

Let's take your scenario for Team A and change one thing: instead of failing on 4th down at the 1, they kick a field goal and miss every time. Now the responsibility for the lost points is on the field goal unit, not the offensive unit. A field goal attempt from the 1 yard line is worth an average of 2.78 points, 93% success. Therefore on each drive the offense started with 2.93 points and ended with 2.78; they lost 0.15 points in value each drive, 1.8 points for the game. The field goal unit lost the other 33.36 points. The team still lost 35.16 points for the game, but the distribution is different because instead of the offense ending the drives, the special teams ended them.
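The split between field position, offense and kicking in the scenarios above can be sketched like this in Python. The function name and the idea of a "handoff" expected value (what the drive is worth at the moment the kicking unit takes over, or 0 if the offense ends the drive itself) are my own framing for illustration; the numbers are the ones quoted in this thread.

```python
def split_drive(start_ep, handoff_ep, final_points):
    """Return (offense_value, kicking_value) for one drive.

    start_ep     -- expected points from starting field position
    handoff_ep   -- expected points when the offense hands off to the kicking unit
    final_points -- points actually scored on the drive
    """
    offense_value = handoff_ep - start_ep
    kicking_value = final_points - handoff_ep
    return offense_value, kicking_value

# Michigan TD drive: start at own 20 (1.57), TD worth 6.97, PAT makes it 7.
td_offense, td_kicker = split_drive(1.57, 6.97, 7)

# Team A with a missed field goal from the 1: start 2.93, attempt worth 2.78, 0 scored.
fg_offense, fg_unit = split_drive(2.93, 2.78, 0)
game_offense = 12 * fg_offense   # offense over 12 drives
game_fg_unit = 12 * fg_unit      # field goal unit over 12 drives

print(round(td_offense, 2), round(td_kicker, 2))
print(round(game_offense, 2), round(game_fg_unit, 2), round(game_offense + game_fg_unit, 2))
```

In every case the starting value plus the offense and kicking values adds back up to the points actually scored.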


October 7th, 2009 at 12:01 PM

So, if a team starts with 1st and 10 on the 20 yard line, and the first play is an 80 yard TD with a made PAT, there is a +5.4 to the offense? The QB and WR will each also bank +5.4?

The next drive starts at the 20 yard line again, only this time the first play is a rush for no gain, and the second play is an 80 yard TD. The offense again gets credited with a +5.4, only this time the QB and WR will bank a number higher than 5.4 because the expected point total on the play they scored was lower than 1.57, and the RB will have a negative?

The offense's point total is simply a reflection of actual points scored minus expected points based on starting points for all drives?


November 12th, 2009 at 9:07 PM

You guys are freaking sweet! Where do you get the data from? I like your system for sure, as it incorporates the other team's skill. The system I use completely excludes the other team and does a subjective single-team offensive analysis. If it doesn't make sense I'll clean it up, but see what you think.

I have been doing this calculation for a few years, and it has been fairly accurate on predicting the likelihood of winning or losing the given game.

For every third down play you engage in you are penalized, weighted by the number of yards to the opponent's endzone divided by 2 if you are less than halfway between where you took possession of the ball and the opponent's endzone. So if you get to third and 1 and you are 35 yards from the midpoint, you are charged 3.5 points.

So each third down should be penalized like this: (yards to endzone) * 0.05 if we are not across the midpoint to the endzone when third down occurs, and if we are across the midpoint, 0.5 for each third down, reduced by successful fourth down conversions.

So 4 points at the 20, 3.5 at the 30, 3 at the 40, 2.75 at the 45, and 2.5 at the 50. Thereafter, we only add 0.5 for each third down play where the offense either has to kick a field goal or punt. If we go for it on fourth down, the 0.5 penalty from third down is subtracted from the penalty, regardless of success.

The goal is to break even at 0 at the end of the game.

So if on your first drive you get the ball at the 20 and face third down prior to getting to the 30, that is 4 points. Imagine then that we convert on third down. The 4 points stays. Thereafter, we have a 3rd down on the 40, so we are charged 3 points. If we convert that third down, but do not make it past the 50 before the next third down, 2.5 points. So 9.5 total, even assuming we convert that third third down. Two more third downs inside the 50 would give us 1 point more, so a total of 10.5. If we score a touchdown, we offset it with our penalty points; if we don't score a touchdown, we reduce the penalty points by the number of converted third downs on the drive. So on this drive if we didn't score we would have 5.5 penalty points. If the next drive saw no third downs prior to the midway point and there were no turnovers and we scored, then we would be at -1.5.

Ultimately then at the end of the game we would want to be at or close to a penalty point number that was at least 7 less than our actual point total.

I think this way now because this, in my opinion, is how you approach the game when you have a team that mostly loses because they beat themselves. I would simply focus on execution and what we're doing on offense, without regard to what the other team is going to bring. When this offense comes of age next year, they are going to go undefeated. We got Lewan and Schofield, Molk back, and all sorts of team speed.