Bradley-Terry Statistical Rating (KRACH) for FBS Football

Submitted by quakk on November 10th, 2010 at 7:17 PM

What do the numbers say?

The Bradley-Terry method applied to college football.

A couple of notes regarding the calculations. I use the Bradley-Terry method for determining the ratings. This is an iterative, statistical rating that computes a hypothetical round-robin winning percentage if all teams played each other. Clearly, that's not the case in college football, and this method gives infinite results if teams are undefeated. This problem is 'solved' for the sake of comparison by adding a fictitious tie to each team's record.
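For the curious, the fictitious-tie trick can be sketched in a few lines. This is my own illustrative Python, not the author's perl script; the update rule is the standard fixed-point iteration for Bradley-Terry ratings, with each team given half a win and one extra game against a fixed-rating phantom opponent so that undefeated teams get a finite rating.

```python
from collections import defaultdict

def bradley_terry(results, fict_rating=1.0, iters=2000):
    """results: list of (winner, loser) game outcomes.
    Returns a dict of team -> unscaled rating."""
    wins = defaultdict(float)
    pair_games = defaultdict(float)  # unordered pair -> games played
    teams = set()
    for winner, loser in results:
        teams.update((winner, loser))
        wins[winner] += 1.0
        pair_games[frozenset((winner, loser))] += 1.0
    r = {t: 1.0 for t in teams}
    for _ in range(iters):
        new_r = {}
        for t in teams:
            # The fictitious tie: one extra game vs a fixed-rating opponent
            denom = 1.0 / (r[t] + fict_rating)
            for pair, n in pair_games.items():
                if t in pair:
                    (opp,) = pair - {t}
                    denom += n / (r[t] + r[opp])
            # ...and half a win from that tie in the numerator
            new_r[t] = (wins[t] + 0.5) / denom
        r = new_r
    return r
```

The raw numbers are only meaningful relative to each other; KRACH implementations typically rescale them (e.g. so the average is 100) before publishing.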

Further notes:

  • Game results are pulled from the NCAA site.
  • Blogpoll results are pulled from SBNation.
  • There are a lot of explanatory notes and links; I put those at the end of the post so people who don't care about them can skip them and get right to the results. There is also a link to all my results.
  • For brevity, I only listed the top 20 here, along with Michigan's position for those who are interested.
  • I release this after all the major polls come out to avoid 'influencing' anybody's vote.

To the numbers...

Through games of 2010.11.06
  Rank  Team            Rating  Previous  Delta  Blogpoll
     1  Auburn          85.581         1      0         2
     2  Oregon          38.314         4      2         1
     3  TCU             37.076         3      0         3
     4  Boise State     28.029         2     -2         4
     5  LSU             27.484         5      0         5
     6  Stanford        15.449        13      7         6
     7  Oklahoma State  15.175        12      5        10
     8  Nebraska        14.947         8      0         8
     9  Michigan State  12.443         9      0        11
    10  Wisconsin       12.340        11      1         7
    11  Utah             9.372         7     -4        15
    12  Virginia Tech    8.671        14      2        20
    13  Iowa             8.383        17      4        13
    14  Alabama          8.000        16      2        12
    15  Arkansas         7.989        20      5        14
    16  Miss State       7.656        19      3        18
    17  Ohio State       7.608        18      1         9
    18  Missouri         7.282         6    -12        17
    19  Arizona          6.162        15     -4        16
    20  Oklahoma         5.663        10    -10        19
    --  --                  --        --     --        --
    29  Michigan         2.621        41     12        NR

Auburn is pulling away as the number one team. Oregon has finally caught up with everybody's ranking thanks to strength of schedule, and Boise St. has begun to fall for the same reason. The bonus for being undefeated this far into the season is starting to balance out against perceived strength of schedule, as LSU is nipping at the Broncos' heels.

As a point of comparison, it lines up really well with the blogpoll.  Two outliers:  blogpollers really like Ohio State, and they really dislike Virginia Tech (see discussion of limitations, below).

There's a pretty significant drop-off from #5 LSU to #6 Stanford, and another notable drop-off from #10 Wisconsin to #11 Utah. I think this supports the notion that a 16-team tournament would be sufficient to include all the top teams. If you're in the muddle around 16, there really isn't all that much to complain about if you're left out.

The conference breakdown in the top 10 and top 16 (non-BCS conferences in parentheses):

Conference Top 10 Top 16
SEC: 2 5
Big Ten: 2 3
Big 12: 2 2
Pac 10: 2 2
(MWC): 1 2
ACC: 0 1
(WAC): 1 1
Big East: 0 0
(Conference USA): 0 0
(Independent): 0 0
(MAC): 0 0
(Sun Belt): 0 0

The top 10 has roughly equal representation from the BCS conferences. Looking at the top 16, a bias toward the SEC begins to emerge, with 5 teams, including those in spots 14-16. Not surprisingly, the Big East is absent, and the ACC is, well, underrepresented. TCU, Boise St. and Utah are ranked above all comers from these conferences, including Virginia Tech, who lost to Boise St. and, as we all know, James Madison. The nagging question remains: how do you compare the relative strengths of conferences when they don't play each other?

The next few weeks should be interesting.

Technical notes

I explained the mathematics of the calculations in much more detail last year. Wikipedia has a nice explanation of the foundation of the Bradley-Terry method.
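For readers who want the one-line version without digging into last year's post: the standard Bradley-Terry model (as described on Wikipedia) assigns each team $i$ a strength $r_i$ such that

```latex
P(i \text{ beats } j) = \frac{r_i}{r_i + r_j}
```

and the ratings are chosen so that each team's expected wins against its actual schedule equal its actual wins.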

Discussion of limitations

Many of the strengths of this method are addressed at U.S. College Hockey Online.

That said, there's always a bit of resistance when I post this rating. It's one additional data point. It's not even my opinion, and it doesn't mean your team is better or worse than you think it is. It attempts to look objectively at how teams would fare, should they play every other team. There are some limitations, namely the infinite ratings for, and incomparability of, undefeated teams. As with any statistical calculation, sample size is important; while there are only ~12 games per team, there are ~120 teams. One could argue the merits of using any sort of statistical calculation on said sample. Also, it should be pointed out that games against FCS teams are ignored. This is a double-edged sword: teams don't get credit for beating up on FCS teams, but Virginia Tech effectively gets a pass for losing to James Madison.

Brian pointed out another interesting anomaly (it's the double star at the very bottom) in last year's end-of-season college hockey KRACH. A similar effect can be seen in this rating, as discussed above. So, why does this happen? As in college hockey, there's little overlap between conferences in college football, so teams tend to get compartmentalised.

As with any tool, it's only as good as its user; we can't blindly take the results as fact. One possible solution is to take the top 30 teams at the end of the season and run a KRACH on only those teams. That said, for any hypothetical tournament, I would strongly support the inclusion of all conference champions.

What if I want to see the entire rating, and results for each week?

All the results are available, if you'd like to see the numbers yourself. As I said last year, John Whelan freely gave me the perl script in 1998 to calculate KRACH for ACHA club hockey teams, so I'm happy to share the script and input data if you don't want to write it yourself. And I am fallible. There's a lot of data to crunch, and I copied and pasted from the NCAA site; there may be errors. If you find one, please bring it to my attention and I'll make the fix posthaste.



November 11th, 2010 at 1:08 AM

plus a modified version which takes into account margin of victory and is geared more toward prediction. The basic version has some minor differences from yours (mostly because I count 1-AA games, with all 1-AA teams lumped into a single entity) but on the whole matches up well. (One notable difference: OSU jumps the SEC trio that's right ahead of them in your version because of Ole Miss's loss to a 1-AA, and the bottom falls out on Virginia Tech as well - they drop to #27, just ahead of Michigan and PSU.)

You can see them at ; I've also been posting projections of the Big Ten standings based on both versions at Off Tackle Empire.


November 11th, 2010 at 7:28 PM

with the margin of victory.

It would be interesting to see which one more accurately predicts the results of games:  strict won/loss, or margin of victory.  What are your thoughts on home/away?

As for the I-AA, it's interesting what you've done to lump them all together.  Have you considered adding a I-AA strength of schedule component?  IIRC, App St was pretty good the year they beat us.  And I was going to suggest the same of UMass, until I looked at their record; they're still #14, though.  (As an aside, they play Delaware this week; I would have to think they're the only team to play the winged helmet twice in one season.)

FWIW, I'm not a statistician, I just dabble with this because I can and I really hate the subjective human element (polls) used to determine a 'national champion.'

EDIT:  Doh!  Meant as a reply to SpartanDan.


November 12th, 2010 at 10:47 AM

For home/away, I think the right way to handle it (if at all) is probably as an adjustment to strength of schedule - count your opponent's rating as maybe 1.5x for a road game or 2/3x for a home game. Haven't tested it, might over the summer.
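The adjustment described here is easy to sketch: weight the opponent's rating in each game's denominator term by where the game was played, from each team's perspective. A hypothetical Python sketch (the function, data layout, and the fictitious-tie handling are my own; the 1.5x / 2/3x weights are the untested values suggested above):

```python
from collections import defaultdict

# Untested weights from the suggestion above: an opponent's rating counts
# 1.5x for a road game and 2/3x for a home game.
VENUE_WEIGHT = {"home": 2 / 3, "away": 1.5, "neutral": 1.0}

def venue_adjusted_ratings(games, fict_rating=1.0, iters=2000):
    """games: one row per team per game, (team, opponent, won, venue),
    where won is 1.0 or 0.0 and venue is from the team's perspective."""
    wins = defaultdict(float)
    sched = defaultdict(list)
    for team, opp, won, venue in games:
        wins[team] += won
        sched[team].append((opp, venue))
    r = {t: 1.0 for t in sched}
    for _ in range(iters):
        new_r = {}
        for t, opponents in sched.items():
            # Fictitious tie vs a fixed-rating opponent, as in the post
            denom = 1.0 / (r[t] + fict_rating)
            for opp, venue in opponents:
                denom += 1.0 / (r[t] + VENUE_WEIGHT[venue] * r[opp])
            new_r[t] = (wins[t] + 0.5) / denom
        r = new_r
    return r
```

The effect is that a road win inflates your effective strength of schedule while a home win deflates it, which is exactly the adjustment being proposed.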

My intuition is that the basic system does a great job of rating teams on what they've achieved over the season but the score-based one is a little bit better predictively and gives sane results earlier in the season. Best of both worlds might be with a different victory-point formula that's discontinuous near zero (so there's a big jump from winning a close game to losing a close game, but margin still matters beyond that).
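One way such a discontinuous victory-point formula might look (an illustrative sketch only; the `jump` and `scale` parameters are invented, not anything specified above):

```python
import math

def victory_points(margin, jump=0.7, scale=0.05):
    """Fractional win credit in (0, 1): a big jump across zero, with
    margin of victory mattering only mildly beyond that."""
    if margin == 0:
        return 0.5
    # tanh saturates, so running up the score has diminishing returns
    extra = (1 - jump) / 2 * math.tanh(scale * abs(margin))
    if margin > 0:
        return (1 + jump) / 2 + extra
    return (1 - jump) / 2 - extra
```

With these defaults, a one-point win is worth a bit over 0.85 of a win and a one-point loss a bit under 0.15, so the winner/loser jump dominates while blowouts still earn a little more credit. The two teams' credits always sum to 1, so the values can feed straight into the wins column of a Bradley-Terry calculation.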

For 1-AA, it's a little bit of a problem, but getting clean rankings of 1-AAs relative to 1-As is just about impossible with so little overlap and most of the little tricks I'd thought of to try such a thing would probably make the ratings fail to converge. For top-level teams I don't think it matters much; if it's a 99.5% chance of a win instead of a 99.99% you're really not going to see any difference (which is one area I like this _much_ better than RPI). It might have a little effect on mid-tier teams, though.
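The lumping approach amounts to a one-line preprocessing step before running the ratings. A sketch (the function name and the "I-AA" label are my own):

```python
def lump_non_fbs(results, fbs_teams, label="I-AA"):
    """Replace any opponent not in fbs_teams with a single combined entity,
    so all 1-AA games count as games against one team."""
    return [(w if w in fbs_teams else label,
             l if l in fbs_teams else label)
            for w, l in results]
```

Games between two non-FBS teams would collapse into the entity playing itself, so in practice those rows should be filtered out first.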


January 29th, 2014 at 5:02 AM

Not sure if you're still active on here, but I'd be interested to take a look at that KRACH perl script you mention at the end of this post.

I've built a KRACH implementation in Excel VBA (also for playing around with ACHA hockey rankings, oddly enough), but it's a bit cumbersome and may be at greater risk of breakage with software updates than a simple script would be.

And I don't suppose you know if Whelan has updated his work on KRACH since 1998, do you?