well i'm impressed.
Graph Theory Division I Ranking
First of all, I should thank Coach Schiano for the idea for this ranking system. I've taken the concept from his diary, made a few tweaks, and applied it to all FBS and FCS teams. If you need an explanation of what is happening, I suggest reading there first.
So, the first thing I did was load all of the game results so far (excluding last night's and tonight's games) into a database. Then I wrote a script that found the lowest-value path between every pair of teams. For example, MSU beat Wisconsin and Wisconsin beat OSU. The first step is to create paths of length 1 for those two games. After that, I can calculate the value of the path from MSU to OSU through Wisconsin and create a direct path of length 2 between them. Iowa's path to OSU is 3, and so on.
Once I had calculated the minimum value for all possible paths, I averaged out the values and had a very rudimentary ranking. This is the exact system that was used in the original diary to rank the teams of the Big Ten, but now we have expanded the set of teams being compared.
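For anyone who wants to try this at home, here is a minimal sketch of the path-building step. It is not the actual script: it assumes an all-pairs shortest-path pass (Floyd-Warshall) over a directed graph where each win is an edge of length 1, and it assumes the average is taken only over teams that can be reached at all. The function names and the Iowa-over-MSU result are there for illustration only.

```python
from itertools import product

INF = float("inf")

def shortest_win_paths(games):
    """games: list of (winner, loser) pairs. Returns dist[a][b], the
    shortest chain of wins leading from team a to team b."""
    teams = sorted({team for game in games for team in game})
    dist = {a: {b: INF for b in teams} for a in teams}
    for winner, loser in games:
        dist[winner][loser] = 1.0  # a win is a path of length 1
    # Floyd-Warshall: product() varies its first element slowest, so the
    # intermediate team k is the outermost loop, as the algorithm requires.
    for k, a, b in product(teams, repeat=3):
        if a != b and dist[a][k] + dist[k][b] < dist[a][b]:
            dist[a][b] = dist[a][k] + dist[k][b]
    return dist

def average_path_length(dist, team):
    """Average over reachable teams only (an assumption, see above)."""
    reachable = [d for d in dist[team].values() if d != INF]
    return sum(reachable) / len(reachable) if reachable else INF

# Illustrative results; Iowa over MSU is assumed so that Iowa reaches OSU in 3.
games = [("Iowa", "MSU"), ("MSU", "Wisconsin"), ("Wisconsin", "OSU")]
dist = shortest_win_paths(games)
print(dist["MSU"]["OSU"])    # 2.0
print(dist["Iowa"]["OSU"])   # 3.0
print(average_path_length(dist, "Iowa"))  # 2.0
```

A lower average means a team reaches the rest of the field through shorter chains of wins. How teams with no path at all should be scored is a design choice this sketch does not settle.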
|Original, Unweighted, Without Record|
|North Carolina State|
As you can see, this isn't a very accurate reflection of the best teams in college football. It took me a few minutes to figure out what was going on, but it became clear when I realized that Michigan was on there. The problem is that this system strongly favors a diversified schedule. The more teams you beat that don't play each other, the better your chance of getting paths of length 2 or 3.
The first thing that I tried to do to fix this was to weight the games. Instead of automatically giving a team a path of length one for a win, I started dividing that by the number of scores (8 points) that a team won by. So, winning by 24 points, or 3 scores, would give you a path length of .333 over that team. This works out really well because it benefits teams that win by 2 or 3 scores, but it doesn't benefit teams too much for going beyond that.
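As a rough sketch of that weighting, assuming the number of scores is the margin divided by 8 and rounded up (this matches the numbers in this post: 24 points is 3 scores, 10 points is 2 scores). The function name `win_weight` is made up for the example.

```python
import math

POINTS_PER_SCORE = 8  # one "score" = touchdown plus two-point conversion

def win_weight(margin):
    """Path length for a win by `margin` points: 1 / number of scores."""
    if margin <= 0:
        raise ValueError("margin must be positive for a win")
    scores = math.ceil(margin / POINTS_PER_SCORE)  # rounding up is an assumption
    return 1 / scores

print(round(win_weight(24), 3))  # 0.333
print(win_weight(10))            # 0.5
print(win_weight(7))             # 1.0
```

Because the weight is 1 over the number of scores, going from 1 score to 2 cuts the path length in half, while going from 5 scores to 6 only moves it from .200 to .167. That is why blowouts beyond 2 or 3 scores add little.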
|Weighted, Without Record|
This seemed to get me a lot closer to where we want to be with a poll, but there are still some issues. How can Auburn be ranked 14 and Missouri be ranked 3? Well, now there is too much weight on winning strong games. But wait, if that's the case, then why isn't Wisconsin ranked in the top 10? That's because all of their blowouts came in the Big Ten. Wisconsin looked like a pretty bad team at the beginning of the year because they won some very close games against lesser opponents.
The only way that I could figure to solve the problems with overweighting is to add another weighting component, which is actually pretty obvious. I decided to add a winning percentage multiplier at the end. Originally, I figured that the concept of graph theory would account for winning. What it really does is account for beating the right teams, i.e. it is the strength of schedule calculation. The reason that I went with a winning percentage is because I don't want to give an advantage to teams playing Hawaii or in championship games, so raw wins was out of the question. I also need a way to penalize a team that loses a 13th game. The only way to do this is to do a winning percentage. 13-0 and 12-0 are now 1.000. 12-1 and 11-1 are only .006 apart. This gives a slight benefit to teams playing an extra game, but also makes sure to penalize properly for losses.
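The record arithmetic above can be checked with a few lines. This only computes the winning percentage itself; how the multiplier is combined with the path average is not shown here, and `win_pct` is just an illustrative name.

```python
def win_pct(wins, losses):
    """Winning percentage as a fraction between 0 and 1."""
    return wins / (wins + losses)

print(win_pct(13, 0), win_pct(12, 0))   # 1.0 1.0
print(round(win_pct(12, 1), 3))         # 0.923
print(round(win_pct(11, 1), 3))         # 0.917
print(round(win_pct(12, 1) - win_pct(11, 1), 3))  # 0.006
```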
|Weighted, With Record|
|29||Delaware (Highest rated FCS Team)|
|31||Jacksonville State (Beat Ole Miss)|
As you can see, this looks like a real ranking now and it won't automatically place an undefeated team ahead of a 1-loss team. I'm pretty excited about the outcome, because this is actually pretty comparable to the polls that are out there. How comparable?
|My Poll||R||BCS||R||D||AP||R||D||Coaches||R||D||Harris||R||D||Computer Average||R||D|
|Ohio State||5||Ohio State||6||1||Ohio State||6||1||Ohio State||6||1||Ohio State||6||1||Ohio State||9||4|
|Boise State||6||Boise State||11||5||Boise State||9||3||Boise State||10||4||Boise State||10||4||Boise State||14||8|
|Michigan State||11||Michigan State||8||3||Michigan State||7||4||Michigan State||7||4||Michigan State||7||4||Michigan State||11||0|
|Oklahoma State||13||Oklahoma State||14||1||Oklahoma State||16||3||Oklahoma State||15||2||Oklahoma State||16||3||Oklahoma State||12||1|
|South Carolina||15||South Carolina||19||4||South Carolina||18||3||South Carolina||16||1||South Carolina||17||2||South Carolina||18||3|
|Virginia Tech||16||Virginia Tech||15||1||Virginia Tech||12||4||Virginia Tech||11||5||Virginia Tech||12||4||Virginia Tech||20||4|
|Texas A&M||19||Texas A&M||18||1||Texas A&M||19||0||Texas A&M||18||1||Texas A&M||19||0||Texas A&M||16||3|
|Florida State||21||Florida State||21||0||Florida State||20||1||Florida State||20||1||Florida State||20||1||Florida State||22||1|
|Northern Illinois||22||Northern Illinois||25||3||Northern Illinois||24||2||Northern Illinois||23||1||Northern Illinois||24||2||Northern Illinois||25||3|
|West Virginia||24||West Virginia||24||0||West Virginia||23||1||West Virginia||24||0||West Virginia||23||1||West Virginia||24||0|
|Mississippi State||33||Mississippi State||22||11||Mississippi State||22||11||Mississippi State||22||11||Mississippi State||22||11||Mississippi State||21||12|
What you have here is my ranking along with some of the more common rankings you will see. I didn't think all of the computer rankings would fit in this chart, so I just took the average of them. The numbers to the right of each poll are the ranking of those teams in that poll and then the difference between that ranking and my ranking. The BCS poll does not show beyond the top 25, so I can't compare those teams to mine. At the bottom, you will see the average difference and the most common difference between the polls. The team that seemed to cause me the most trouble is Mississippi State. They alone account for a little less than half a rank in each of the averages.
So, I'm planning on doing another ranking next week after all the games have been played and another after all the bowls have been played. I'd also like to do conference rankings and go back to previous years and "resolve" controversies. By next week, I will have tweaked this a bit more to add in homefield advantage and hopefully perfected the formula.
I have one last ranking for you. This is the algorithm without the margin of victory used to weight the wins. This is essentially what would be submitted to the BCS because they don't allow points into the calculations. It's interesting that it places Auburn in first now just like the rest of the computers.
|No Weights, With Record|
Is it Indian? Seriously though this is very impressive.
Big playoff game today in my backyard (literally, I live across the street from the stadium). I'm not going though, b/c I don't really care about their team at all, but I thought it was interesting that they're ranked the highest out of FCS teams.
Also, very cool post. I look forward to reading about the rankings after the season and about past years.
Question - did you choose the weighting factors in order to get your ranking closer to "eyeball-worthy", or was there a mathematical reason why you chose the numerical weights you did (i.e. 24 points, 0.333, etc.)?
This isn't actually a criticism - it's perfectly valid to optimize the result by varying the inputs, but you would probably have to do this over multiple seasons and fine-tune the precise weights (similar to using a "survival" mechanism to find the best algorithms in a mathematical model).
A lot of work, I know. Thanks for doing this much already - it's interesting to see.
There are no handpicked weights. 24 points means you lead your opponent by 3 scores of 8 points. So, 1/3 = .333. MSU beat Wisconsin by 10 points, which is 2 scores, so the weight in that game is 1/2, or .500.
However, my next step is to account for homefield advantage by removing points for winning at home or adding points for winning on the road, which could be handpicked point values. I'm not sure whether I'll use the standard 3 points, use 4 points (half a score), or try to calculate the difference between the average margins in home and away games and use that. In any case, I would expect homefield advantage to take MSU's win down below 8 points, which would give them a 1 score lead and a weight of 1 for the game instead of .500.
So, yes, I did add parts of the algorithm to meet the eyeball test, but everything worked out as I expected it to, so I did not tweak things to get the values to be what I wanted them to be.
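To make the homefield idea concrete, here is a hedged sketch of how the adjustment might look. Nothing here is final: the 3-point default is just the "standard" value mentioned above, the rounding up to whole scores is assumed, and flooring the result at 1 score (so a narrow home win never gets a weight above 1) is my own assumption. The name `adjusted_weight` is hypothetical.

```python
import math

def adjusted_weight(margin, winner_at_home, home_edge=3, points_per_score=8):
    """Weight for a win after removing (home) or adding (road) home_edge points."""
    adjusted = margin - home_edge if winner_at_home else margin + home_edge
    # Assumed floor: a win always counts as at least 1 score.
    scores = max(1, math.ceil(adjusted / points_per_score))
    return 1 / scores

print(adjusted_weight(10, winner_at_home=True))               # 1.0 (10 - 3 = 7 points)
print(adjusted_weight(10, winner_at_home=True, home_edge=4))  # 1.0 (10 - 4 = 6 points)
print(adjusted_weight(10, winner_at_home=False))              # 0.5 (10 + 3 = 13 points)
```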
+1 for referring to Hegel in a blog about football.
Great work and thank you and Coach Schiano for sharing this idea with us.
Again with all respect to the amount of effort put in, it would be interesting to see the original path lengths and see how various modifiers would affect the results. I'm not in the math or stats field but it might be worth seeing how tweaking your modifier (the 2 or 3 score margin) affects the overall results. A general solution identifying the key inputs may be of value not just to college football but perhaps to other sports - I'm thinking specifically of college basketball where there is always a huge debate at selection time about whether one conference's 22-5 team is worse or better than another conference's 18-12 team with a stronger strength of schedule.
Finally, as you noted, this system works best in a closed system (e.g. league play) where the participants have played against the majority of the same opponents - it may be a superior form of tie breaker outside of direct head to head (e.g. this year with the 3 way jam with OSU, MSU and Wisconsin).
Again, great work. Any chance you might have an image of the graph and nodes? It would probably be a pretty neat image.
I was also thinking about doing it for basketball, but I'm not sure what numbers I would use to determine weighting. I'll probably attempt it after I get the algorithm nailed down. The graph would be pretty hard to do, but it is possible. I'll look into doing it in the future, though.
Maybe the NCAA will use this to determine seeding for the championship tournament we all want.