KRACH applied to Division I-A college football

Submitted by quakk on
EDIT:  Fixed a typo and added section headers.

Background


A few years ago I applied Ken's Rating for American College Hockey (KRACH), or Bradley-Terry statistics, to ACHA club hockey teams.  At the time, participants for the national tournament were determined by an opinion poll, and there wasn't enough interplay for that to be meaningful (sound familiar?).

In an earlier post here, I intimated that I'd like to see someone crunch the numbers as a mechanism for rating Division I-A college football teams.  It's something I've been thinking about for quite some time, just to see what would happen.  So tonight I threw something together.

Brief Summary:


"The KRACH rating system is an attempt to combine the performance of each team with the strength of the opposition against which that performance was achieved, and to summarize the result as one number, a "rating", for each team. The higher the rating, the better the team."

"Interpreting the ratings

The ratings are given on an "odds scale": that is, if team A is rated at 400 and team B at 200, team A is reckoned to have odds of 2 to 1 of defeating team B when they meet (since 400 is twice 200). Equivalently, team A is reckoned to have probability 2/3 of defeating team B (since 400/(400+200) is 2/3)."
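As a quick sanity check on the odds-scale interpretation, here is a small Python sketch. The function name `win_probability` is only an illustrative choice and is not taken from any KRACH reference implementation.

```python
def win_probability(rating_a: float, rating_b: float) -> float:
    """Chance that team A beats team B when ratings are on an odds scale."""
    return rating_a / (rating_a + rating_b)


# The example from the quote: 400 vs. 200 is 2-to-1 odds, i.e. a 2/3 chance.
print(win_probability(400, 200))
```

The same function can be applied to any two ratings in the table further down, as long as both are finite.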

"There are two things we need to check, to make sure that the rating system is sensible:
  1. If you win more against the same opposition as another team, your rating will be higher.
  2. If you have the same record as another team, but against tougher opposition, your rating will be higher."

Methods


So, I took the season results from the official NCAA page.  I excluded results against FCS competition, as a matter of principle.

One caveat here - I haven't yet worked out what to do exactly with undefeated and winless teams.  This will become meaningful at the end of the season if there are multiple undefeated teams (I'm not sure I really care about the winless teams).  While I sort that out, I've done the following:
  • Verified my calculated ratings by calculating the predicted number of wins;
  • Determined the percentage difference between the predicted and actual number of wins.
Teams with the same KRACH are ranked by percentage difference (lower is better).  Interestingly, of the undefeated teams, Florida and Alabama rate near the bottom.  Woo!  SEC!  This should be sorted out by the end of the season, when it IS important.
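For readers who want to try the calculation themselves, the following is a minimal sketch of a Bradley-Terry fit and the predicted-wins check described above. It is not the code used to produce the table below. The function names, the four made-up games, the iteration count, and the choice to scale the average rating to 100 are all assumptions made for illustration, and the sketch only behaves well when every team has at least one win and one loss.

```python
from collections import defaultdict


def tally(games):
    """Count wins and head-to-head games from (winner, loser) pairs."""
    wins = defaultdict(int)
    played = defaultdict(lambda: defaultdict(int))
    for winner, loser in games:
        wins[winner] += 1
        wins[loser] += 0  # make sure winless teams are registered too
        played[winner][loser] += 1
        played[loser][winner] += 1
    return wins, played


def fit_krach(wins, played, iterations=2000, mean_rating=100.0):
    """Iterate rating = wins / sum(games / (own rating + opponent rating))."""
    ratings = {team: 1.0 for team in wins}
    for _ in range(iterations):
        updated = {}
        for team, rating in ratings.items():
            denom = sum(n / (rating + ratings[opp])
                        for opp, n in played[team].items())
            updated[team] = wins[team] / denom
        # Only ratios matter, so rescale to a fixed average (assumed: 100).
        scale = mean_rating * len(updated) / sum(updated.values())
        ratings = {team: r * scale for team, r in updated.items()}
    return ratings


def predicted_wins(team, ratings, played):
    """Expected wins against the schedule actually played."""
    return sum(n * ratings[team] / (ratings[team] + ratings[opp])
               for opp, n in played[team].items())


def pct_difference(predicted, actual):
    """Percentage gap between predicted and actual wins (0.0 if winless)."""
    if actual == 0:
        return 0.0
    return abs(actual - predicted) / actual * 100


# Hypothetical mini-season: every team has at least one win and one loss.
games = [("A", "B"), ("B", "C"), ("C", "A"), ("A", "C")]
wins, played = tally(games)
ratings = fit_krach(wins, played)
for team in sorted(ratings, key=ratings.get, reverse=True):
    expected = predicted_wins(team, ratings, played)
    print(team, round(ratings[team], 3), round(expected, 3), wins[team],
          round(pct_difference(expected, wins[team]), 3))
```

When the fit has converged, predicted wins match actual wins for every team, which is what makes the comparison a useful check.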

Results


Without further ado, the first KRACH rating for Division I-A college football:
Rank  Team  KRACH  Predicted Wins  Actual Wins  % Difference
1 Texas 200.000 8.937 9 0.702
2 TCU 200.000 7.928 8 0.898
3 Boise State 200.000 7.848 8 1.906
4 Cincinnati 200.000 7.830 8 2.131
5 Florida 200.000 7.752 8 3.104
6 Alabama 200.000 8.716 9 3.160
7 LSU 39.863 6.999 7 0.011
8 Georgia Tech 35.246 7.998 8 0.024
9 Oregon 32.752 6.998 7 0.021
10 Iowa 28.690 7.998 8 0.026
11 USC 21.015 6.999 7 0.021
12 Pittsburgh 17.119 6.998 7 0.024
13 Arizona 14.574 4.999 5 0.025
14 Oregon State 14.533 4.999 5 0.021
15 Miami (FL) 13.562 5.999 6 0.022
16 Ohio State 13.418 7.998 8 0.026
17 South Florida 12.804 3.999 4 0.021
18 Penn State 12.186 6.998 7 0.025
19 Virginia Tech 10.888 5.999 6 0.020
20 Clemson 9.944 4.999 5 0.023
21 Wisconsin 9.061 5.999 6 0.023
22 Oklahoma State 8.788 5.999 6 0.020
23 Temple 7.966 6.998 7 0.026
24 Stanford 7.682 5.999 6 0.021
25 Houston 7.679 6.998 7 0.024
26 Utah 7.215 7.998 8 0.024
27 California 7.194 4.999 5 0.024
28 Auburn 5.098 5.999 6 0.021
29 Navy 4.686 6.998 7 0.026
30 Georgia 4.627 3.999 4 0.019
31 Rutgers 4.399 3.999 4 0.022
32 Boston College 4.292 4.999 5 0.025
33 West Virginia 3.877 5.999 6 0.020
34 Tennessee 3.789 4.999 5 0.018
35 Notre Dame 3.761 5.999 6 0.022
36 Arkansas 3.534 3.999 4 0.022
37 UCLA 3.515 3.999 4 0.016
38 Florida State 3.103 2.999 3 0.019
39 North Carolina 3.030 3.999 4 0.020
40 Brigham Young 2.787 6.998 7 0.025
41 Washington 2.689 3.000 3 0.013
42 Central Michigan 2.345 5.999 6 0.024
43 South Carolina 2.315 4.999 5 0.017
44 Minnesota 2.130 4.999 5 0.023
45 Mississippi State 2.117 3.000 3 0.016
46 Fresno State 2.057 4.999 5 0.025
47 Kentucky 1.951 3.999 4 0.021
48 Mississippi 1.905 3.999 4 0.020
49 Arizona State 1.582 2.999 3 0.024
50 Texas Tech 1.541 4.999 5 0.020
51 Northwestern 1.260 4.999 5 0.023
52 Michigan State 1.219 3.999 4 0.022
53 Oklahoma 1.196 3.999 4 0.017
54 Nebraska 1.172 5.999 6 0.022
55 Wake Forest 1.133 3.000 3 0.015
56 Connecticut 1.010 2.999 3 0.023
57 Duke 0.905 3.999 4 0.025
58 Purdue 0.865 3.999 4 0.018
59 Syracuse 0.836 2.000 2 0.017
60 Air Force 0.729 4.999 5 0.025
61 Virginia 0.719 3.000 3 0.017
62 Missouri 0.718 3.999 4 0.020
63 Troy 0.690 6.998 7 0.022
64 North Carolina State 0.561 2.000 2 0.017
65 Kansas State 0.555 3.999 4 0.020
66 Michigan 0.545 3.999 4 0.023
67 Texas A&M 0.536 4.999 5 0.020
68 Baylor 0.476 2.999 3 0.025
69 Idaho 0.466 6.998 7 0.024
70 Iowa State 0.437 3.999 4 0.023
71 Kansas 0.434 3.999 4 0.020
72 East Carolina 0.413 3.999 4 0.024
73 Middle Tennessee State 0.363 5.999 6 0.022
74 Illinois 0.358 1.999 2 0.026
75 Southern Methodist 0.333 3.999 4 0.022
76 Louisville 0.305 2.000 2 0.020
77 Northern Illinois 0.304 4.999 5 0.024
78 Indiana 0.289 2.999 3 0.023
79 Nevada 0.287 5.999 6 0.022
80 Marshall 0.259 3.999 4 0.021
81 Southern Miss 0.240 3.999 4 0.020
82 UCF 0.223 3.999 4 0.018
83 Bowling Green 0.206 3.999 4 0.024
84 Colorado 0.203 2.999 3 0.025
85 Ohio 0.181 4.999 5 0.020
86 Louisiana-Monroe 0.177 3.999 4 0.018
87 Maryland 0.149 1.000 1 0.021
88 San Diego State 0.147 2.999 3 0.019
89 Wyoming 0.140 2.999 3 0.017
90 UAB 0.134 3.999 4 0.015
91 Toledo 0.117 3.999 4 0.020
92 Colorado State 0.115 2.000 2 0.020
93 UNLV 0.103 2.999 3 0.019
94 Kent State 0.092 3.999 4 0.020
95 Western Michigan 0.088 3.000 3 0.013
96 Washington State 0.074 1.000 1 0.013
97 Buffalo 0.050 2.000 2 0.019
98 Louisiana-Lafayette 0.048 3.999 4 0.016
99 Florida Atlantic 0.036 2.000 2 0.021
100 Tulane 0.035 2.000 2 0.013
101 UTEP 0.033 3.000 3 0.016
102 Memphis 0.031 1.000 1 0.019
103 Miami (OH) 0.031 1.000 1 0.020
104 Tulsa 0.031 3.000 3 0.015
105 Akron 0.030 1.000 1 0.025
106 Hawaii 0.028 2.000 2 0.023
107 Louisiana Tech 0.021 2.000 2 0.019
108 Arkansas State 0.019 1.000 1 0.021
109 Florida International 0.018 2.000 2 0.018
110 Army 0.015 3.000 3 0.013
111 New Mexico State 0.009 2.000 2 0.017
112 Utah State 0.008 1.000 1 0.007
113 North Texas 0.006 2.000 2 0.012
114 Vanderbilt 0.004 1.000 1 0.005
115 Ball State 0.002 1.000 1 0.007
116 Eastern Michigan 0.001 0.442 0 0.000
117 New Mexico 0.001 0.162 0 0.000
118 Rice 0.001 0.259 0 0.000
119 San Jose State 0.001 0.006 0 0.000
120 Western Kentucky 0.001 0.216 0 0.000

Comments

erik_t

November 11th, 2009 at 4:36 PM ^

Outside of some previously admitted need of finessing of the undefeated team numbers, I like this quite a lot. The top 25 is surprisingly defensible. And it loves the Pac-10, which makes me feel warm and happy inside.

Rasmus

November 11th, 2009 at 4:48 PM ^

What I'd really like to see is college football bloggers get together and instigate a true open-source computer rankings project. It's crazy that the computer programs used for the BCS are not open-source.

joeyb

November 11th, 2009 at 5:38 PM ^

I think the reason that they aren't is that there would be too much criticism on weighting. They would tweak it year to year and it would cause more controversy because someone would go back to the previous year with the new formula and show that team #3 should have actually been playing for the championship. Some things need to be a black box.

Rasmus

November 12th, 2009 at 7:44 AM ^

project is you could have different versions, all such that people (meaning people who know math and can read a computer program) can understand how they work. Or, better yet, you make it so the user can set the parameters, the "weighting" as you put it. Sure, there would be a lot of arguments and disagreement about what to do, but that's the beauty of the open source movement -- anyone can splinter off and build a better mousetrap, as long as they keep it open so everyone can benefit. The concept of the BCS is flawed for a variety of reasons, but one of the most problematic is that it is not transparent. Do any of the computer programmers "tweak" their weighting from year to year, or even mid-season? Surely they do. But we know nothing about it. Some are more transparent than others in explaining their approach, but I don't think any of them have actually opened their code -- it's all proprietary.

Alton

November 12th, 2009 at 10:59 AM ^

Actually, one of the six BCS computer rating systems is open-source: Wes Colley's system. http://www.colleyrankings.com/matrate.pdf I used the iterative version of the ratings described in his paper to re-create his ratings one year, and they were an exact match. Colley's website also allows you to add and remove games to see the effect those changes would have on the ratings. As far as KRACH is concerned, there is one large flaw if you do not add in the imaginary team with 1 tie against each of the other teams--any undefeated team, no matter how easy their schedule is, will be rated ahead of any team with one loss. If you add in the imaginary tie, this problem goes away. Let's say Team A is 10-0 against the bottom 10 teams in the nation, and Team B is 9-1 against the top 10 teams in the nation. We all agree that, in the absence of any other information, Team B should be rated ahead of Team A, right? If you don't add in that imaginary tie game, then KRACH would put Team A ahead of Team B.

Togaroga

November 11th, 2009 at 4:56 PM ^

When given the ultra-reliable eyeball test, Alabama looks stronger than Florida or Texas. I know Texas has a great defense, but Alabama just seems to manhandle people without flashiness, but with all of the consistency I'd like to see out of a true #1.

joeyb

November 11th, 2009 at 5:49 PM ^

There is one problem with this. An undefeated team automatically has the highest possible score, so Hawaii ends up in the BCS Championship game. This also leads to easier and easier schedules instead of harder and harder schedules for elite teams (teams not expecting to go undefeated would actually benefit from playing tougher schedules as shown by Temple). To apply this to football, it should probably try to evaluate how you played in wins as well. Not to mention the top 6 teams all have the same score. How do you pick which is the best? This is not meant to knock the work put in by the OP, but more as a point of constructive criticism to improve the ranking.
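The undefeated-team problem can be seen in a few lines of Python. This is only a sketch under stated assumptions: the schedule is made up, the helper names are hypothetical, and the update rule is the plain Bradley-Terry iteration with no tie team and no cap on ratings.

```python
from collections import defaultdict


def krach_step(ratings, wins, played):
    """One simultaneous Bradley-Terry update of every rating."""
    return {
        team: wins[team] / sum(n / (rating + ratings[opp])
                               for opp, n in played[team].items())
        for team, rating in ratings.items()
    }


def rating_ratio_after(games, top, other, iterations):
    """Ratio of two teams' ratings after a given number of updates."""
    wins = defaultdict(int)
    played = defaultdict(lambda: defaultdict(int))
    for winner, loser in games:
        wins[winner] += 1
        wins[loser] += 0
        played[winner][loser] += 1
        played[loser][winner] += 1
    ratings = {team: 1.0 for team in wins}
    for _ in range(iterations):
        ratings = krach_step(ratings, wins, played)
    return ratings[top] / ratings[other]


# Hypothetical schedule: X is 2-0, while Y and Z split two games.
games = [("X", "Y"), ("X", "Z"), ("Y", "Z"), ("Z", "Y")]
for n in (10, 100, 1000):
    print(n, rating_ratio_after(games, "X", "Y", n))
```

The ratio of X's rating to Y's keeps growing with every iteration instead of settling, which is the "highest possible score" behavior: without some correction, the fit pushes an undefeated team's rating toward infinity regardless of schedule.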

quakk

November 12th, 2009 at 2:02 AM ^

While it looks so in this analysis, it's a direct result of my failure to properly handle undefeated teams and a lack of a wins and losses comparison between the top teams. Next week I'll either add the fictional tie team (which I'm loath to do from a purist perspective) or calculate the round-robin winning percentage (RRWP) and strength of schedule (SOS) to go along with the infinite rating for undefeated teams.

wile_e8

November 11th, 2009 at 5:51 PM ^

One caveat here - I haven't yet worked out what to do exactly with undefeated and winless teams. This will become meaningful at the end of the season if there are multiple undefeated teams (I'm not sure I really care about the winless teams).
I wrote a KRACH program a few years ago, and what I did for the undefeated/winless teams was create a fake "tie team", and then have every single team play that team and have a tie. I had built my program off the description at USCHO, so it already handled ties. This had a bit of a fudge factor, but you need to make sure everyone has greater than zero wins and losses unless you have figured out how to divide by zero. I also made just one "I-AA" team to use for whenever someone played against a team from the other division because I didn't want to have to keep track of all those teams too. It was nice in the middle of 2006 when I could use it to confirm that Michigan was better than their ranking in the polls, but in the following years keeping up the spreadsheet was too big a pain just to further my misery.
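A rough Python sketch of the "tie team" idea follows. It is not the program described above. It assumes a tie counts as half a win for each side, treats the fictional team as one more team in the fit, and adds a `tie_weight` parameter that is purely an assumption of this sketch; the team names and games are made up.

```python
from collections import defaultdict

TIE_TEAM = "__fictional_tie_team__"  # hypothetical placeholder name


def tally_with_tie_team(games, tie_weight=1.0):
    """Count results, then give every team a tie against a fictional team."""
    wins = defaultdict(float)
    played = defaultdict(lambda: defaultdict(float))
    for winner, loser in games:
        wins[winner] += 1.0
        wins[loser] += 0.0
        played[winner][loser] += 1.0
        played[loser][winner] += 1.0
    for team in list(wins):  # real teams only; the copy allows adding keys
        wins[team] += 0.5 * tie_weight       # a tie counts as half a win
        wins[TIE_TEAM] += 0.5 * tie_weight
        played[team][TIE_TEAM] += tie_weight
        played[TIE_TEAM][team] += tie_weight
    return wins, played


def fit_krach(wins, played, iterations=2000, mean_rating=100.0):
    """Iterate rating = wins / sum(games / (own rating + opponent rating))."""
    ratings = {team: 1.0 for team in wins}
    for _ in range(iterations):
        updated = {
            team: wins[team] / sum(n / (rating + ratings[opp])
                                   for opp, n in played[team].items())
            for team, rating in ratings.items()
        }
        scale = mean_rating * len(updated) / sum(updated.values())
        ratings = {team: r * scale for team, r in updated.items()}
    return ratings


# Hypothetical schedule: X is undefeated and Z is winless.
games = [("X", "Y"), ("X", "Z"), ("Y", "Z")]
wins, played = tally_with_tie_team(games)
ratings = fit_krach(wins, played)
for team in ("X", "Y", "Z"):
    print(team, round(ratings[team], 3))
```

With the fictional tie in place, every team has more than zero wins and more than zero losses, so the undefeated and winless teams both come out with finite, nonzero ratings.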

colin

November 11th, 2009 at 6:55 PM ^

neutral field ~21 point spread for us against OSU. Other estimators have us nearer to a 50% chance at 6 wins, where this method makes it more like 10%. I think it would work better if point margin were used instead of a win/loss binary. Also, the Krach Dude mentions in a link on his site what his solution for the perfect and perfectly futile teams is: average in a tie (.5 wins) against a fictional perfect team for every team.

SpartanDan

November 12th, 2009 at 1:02 AM ^

Your ranking for the unbeatens is actually backward - if you give them a fixed rating, the ones with the greatest discrepancies between predicted wins and actual are the ones that have played the toughest schedules. As others have mentioned, the "usual" fudge factor is to throw in a tie against an average team, which gets rid of all the connectedness issues. The other option is to calculate RRWP (round-robin win percentage) by using the KRACH method within groups of teams that are fully connected in both directions (that is, every team in the group has a win chain to every other team) and assigning win probabilities when connections exist in only one direction or in neither (if one team has a chain to the other but not vice versa, that team is assumed to have a win probability of 100%; if they are not connected in either direction, such as two unbeaten teams, they are arbitrarily assigned 50-50 win probabilities). This is messy because you have to go ahead and determine victory chain groups, though. The fictional tie against an average team is probably much easier. (It may even work to give that game a lower weight, so it has less impact on the standings while still normalizing everything to get rid of ugly zeros and infinities.)
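A simplified Python sketch of the RRWP idea is below. It does not work out victory-chain groups. Instead it assumes ratings have already been computed, represents an undefeated team by an infinite rating, and applies the 100% and 50-50 rules from the comment above only to that case. The function names and the example ratings are made up for illustration.

```python
import math


def win_probability(rating_a, rating_b):
    """Odds-scale win probability, with the special cases for infinities."""
    if math.isinf(rating_a) and math.isinf(rating_b):
        return 0.5   # two unbeaten teams: arbitrarily 50-50
    if math.isinf(rating_a):
        return 1.0   # unbeaten team vs. a team with a finite rating
    if math.isinf(rating_b):
        return 0.0
    return rating_a / (rating_a + rating_b)


def rrwp(team, ratings):
    """Average win probability against every other team in the ratings."""
    others = [r for name, r in ratings.items() if name != team]
    return sum(win_probability(ratings[team], r) for r in others) / len(others)


# Hypothetical ratings: two unbeaten teams plus two teams with finite ratings.
ratings = {"U1": math.inf, "U2": math.inf, "A": 400.0, "B": 200.0}
for team in ratings:
    print(team, round(rrwp(team, ratings), 3))
```

Note that under these rules the two unbeaten teams get identical RRWP values here, so a tiebreaker such as strength of schedule would still be needed to separate them.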

Seth9

November 12th, 2009 at 1:16 AM ^

College hockey has about three times as many games as college football and half as many teams. There simply isn't enough information in the common-opponent data set to produce meaningful numbers. It is interesting, to be sure, but the system just isn't designed for college football.

Seth9

November 12th, 2009 at 3:16 PM ^

KRACH is designed for a league with approximately the same number of games and teams as college hockey. It does not work for college football because there aren't enough non-conference games and there are too many teams. KRACH was designed with the assumption that being undefeated at the end of the season would be a miraculous event, and it relies on a larger sample size than exists in college football to assign probabilities.