Graph Theory Division I Ranking

Submitted by joeyb on December 4th, 2010 at 12:15 AM

First of all, I should thank Coach Schiano for the idea for this ranking system. I've taken the concept from his diary, made a few tweaks, and applied it to all FBS and FCS teams. If you need an explanation of what is happening, I suggest reading there first.

First, I loaded all of the game results so far (excluding last night's and tonight's games) into a database. Then I wrote a script that finds the lowest-valued path between every pair of teams. For example, MSU beat Wisconsin and Wisconsin beat OSU. The first step is to create paths of length 1 for those two games. From those, I can calculate the value of the path from MSU to OSU and create a direct path of length 2 between them. Iowa's path to OSU is 3, and so on.
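The path-building step can be sketched as an all-pairs shortest-path computation. This is only a minimal sketch, not the actual script: it assumes the Floyd-Warshall algorithm, the function name `shortest_win_paths` is made up for illustration, and the Iowa-over-MSU edge is inferred from the statement that Iowa's path to OSU is 3.

```python
from itertools import product

INF = float("inf")

def shortest_win_paths(games):
    """games: iterable of (winner, loser) pairs.
    Returns dist[a][b], the minimum path length from a to b,
    where every win is a directed edge of length 1."""
    teams = sorted({team for game in games for team in game})
    dist = {a: {b: (0 if a == b else INF) for b in teams} for a in teams}
    for winner, loser in games:
        dist[winner][loser] = min(dist[winner][loser], 1)
    # Floyd-Warshall: product() varies the first item slowest,
    # so the intermediate team k is the outermost loop.
    for k, i, j in product(teams, repeat=3):
        if dist[i][k] + dist[k][j] < dist[i][j]:
            dist[i][j] = dist[i][k] + dist[k][j]
    return dist

# Assumed sample: only the three games needed for the example in the text.
games = [("Iowa", "MSU"), ("MSU", "Wisconsin"), ("Wisconsin", "OSU")]
paths = shortest_win_paths(games)
print(paths["MSU"]["OSU"])   # 2
print(paths["Iowa"]["OSU"])  # 3
```

A pair with no chain of wins connecting it stays at infinity here; how the real script treats those pairs is not stated in the post.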

Once I had calculated the minimum value for all possible paths, I averaged the values for each team and had a very rudimentary ranking. This is the exact system that was used in the original diary to rank the teams of the Big Ten, but now we have expanded the set of teams being compared.
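The averaging step might look something like the sketch below. The post does not say how teams with no path between them are handled, so the `penalty` value charged for an unreachable opponent is purely an assumption, as is the function name `rank_by_average_path`.

```python
INF = float("inf")

def rank_by_average_path(dist, penalty):
    """dist[a][b] is the minimum path length from a to b (INF if none).
    Unreachable opponents are charged `penalty` (an assumed value).
    Returns (teams ordered best to worst, average per team)."""
    averages = {}
    for team, row in dist.items():
        values = [penalty if d == INF else d
                  for opponent, d in row.items() if opponent != team]
        averages[team] = sum(values) / len(values)
    # A lower average path length means a better ranking.
    return sorted(averages, key=averages.get), averages

# Toy three-team input, not real season data.
dist = {
    "MSU":       {"MSU": 0,   "Wisconsin": 1,   "OSU": 2},
    "Wisconsin": {"MSU": INF, "Wisconsin": 0,   "OSU": 1},
    "OSU":       {"MSU": INF, "Wisconsin": INF, "OSU": 0},
}
order, averages = rank_by_average_path(dist, penalty=5)
print(order)     # ['MSU', 'Wisconsin', 'OSU']
print(averages)  # {'MSU': 1.5, 'Wisconsin': 3.0, 'OSU': 5.0}
```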

Original, Unweighted, Without Record
Alabama
Auburn
LSU
Missouri
Oklahoma
Arkansas
TCU
Texas A&M
South Carolina
Oregon
Navy
Nebraska
Notre Dame
Michigan State
Ohio State
Stanford
Wisconsin
Oklahoma State
Boise State
Florida State
Michigan
North Carolina State
North Carolina
Utah
Arizona

As you can see, this isn't a very accurate reflection of the best teams in college football. It took me a few minutes to figure out what was going on, but it became clear when I realized that Michigan was on there. The problem is that this system strongly favors a diversified schedule. The more teams you beat that don't play each other, the better your chance of getting paths of length 2 or 3.

The first thing that I tried to do to fix this was to weight the games. Instead of automatically giving a team a path of length one for a win, I divide that length by the number of scores (8 points each) that the team won by. So, winning by 24 points, or 3 scores, gives you a path length of .333 over that team. This works out really well because it rewards teams that win by 2 or 3 scores but gives little additional reward for going beyond that.
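As a small sketch of that weighting (the names `edge_weight` and `POINTS_PER_SCORE` are made up for illustration), the margin is converted to a number of 8-point scores, rounding up, and the path length is the reciprocal. Rounding up matches the 10-point example given in the comments, where a 10-point win counts as 2 scores.

```python
import math

POINTS_PER_SCORE = 8  # one "score" is a touchdown plus a two-point conversion

def edge_weight(margin):
    """Path length for a win by `margin` points: 1 / number of scores."""
    if margin <= 0:
        raise ValueError("margin must be positive for a win")
    scores = math.ceil(margin / POINTS_PER_SCORE)
    return 1 / scores

print(round(edge_weight(24), 3))  # 0.333
print(edge_weight(10))            # 0.5
print(edge_weight(7))             # 1.0
```

Because the weight is a reciprocal, the step from 1 score to 2 cuts the path length in half, while the step from 5 scores to 6 only moves it from .200 to about .167.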

Weighted, Without Record
Oregon
Stanford
Missouri
Oklahoma
Nebraska
South Carolina
TCU
Alabama
Texas A&M
Ohio State
Boise State
Wisconsin
Oklahoma State
Auburn
Florida State
California
Notre Dame
Iowa
Arkansas
Michigan State
Navy
Arizona
Virginia Tech
USC
Nevada

This seemed to get me a lot closer to where we want to be with a poll, but there are still some issues. How can Auburn be ranked 14 and Missouri be ranked 3? Well, now there is too much weight on winning strong games. But wait, if that's the case, then why isn't Wisconsin ranked in the top 10? That's because all of their blowouts came in the Big Ten. Wisconsin looked like a pretty bad team at the beginning of the year because they won some very close games against lesser opponents.

The only way that I could figure to solve the problems with overweighting was to add another weighting component, which is actually pretty obvious: a winning percentage multiplier at the end. Originally, I figured that the concept of graph theory would account for winning. What it really does is account for beating the right teams, i.e. it is the strength of schedule calculation. I went with winning percentage because I don't want to give an advantage to teams that play Hawaii or in championship games and so get an extra game, which ruled out raw wins. I also need a way to penalize a team that loses a 13th game, and winning percentage is the only way I could see to do that. 13-0 and 12-0 are both 1.000. 12-1 and 11-1 are only .006 apart. This gives a slight benefit to teams playing an extra game, but also makes sure to penalize properly for losses.
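The winning percentage arithmetic can be checked with a few lines. The `win_pct` helper reproduces the numbers above; `adjusted_score` is only a hypothetical way of applying the multiplier (dividing, so that a lower value stays better), since the post does not spell out the exact formula.

```python
def win_pct(wins, losses):
    """Fraction of games won; 0.0 for a team that has not played."""
    games = wins + losses
    return wins / games if games else 0.0

def adjusted_score(avg_path, wins, losses):
    """Hypothetical combination: lower is better, so divide the
    average path length by the winning percentage."""
    pct = win_pct(wins, losses)
    return avg_path / pct if pct > 0 else float("inf")

print(win_pct(13, 0), win_pct(12, 0))             # 1.0 1.0
print(round(win_pct(12, 1) - win_pct(11, 1), 3))  # 0.006
print(adjusted_score(2.0, 6, 6))                  # 4.0
```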

	Weighted, With Record
1	Oregon
2	TCU
3	Stanford
4	Auburn
5	Ohio State
6	Boise State
7	Wisconsin
8	Oklahoma
9	Missouri
10	Nebraska
11	Michigan State
12	Nevada
13	Oklahoma State
14	Arkansas
15	South Carolina
16	Virginia Tech
17	Alabama
18	LSU
19	Texas A&M
20	Utah
21	Florida State
22	Northern Illinois
23	Navy
24	West Virginia
25	Hawaii
29	Delaware (Highest rated FCS Team)
31	Jacksonville State (Beat Ole Miss)
36	Notre Dame
39	Appalachian State
44	Florida
45	Connecticut
52	Michigan
85	James Madison
110	Massachusetts

As you can see, this looks like a real ranking now and it won't automatically place an undefeated team ahead of a 1-loss team. I'm pretty excited about the outcome, because this is actually pretty comparable to the polls that are out there. How comparable?

My Poll	R	BCS	R	D	AP	R	D	Coaches	R	D	Harris	R	D	Computer Average	R	D
Oregon	1	Oregon	2	1	Oregon	1	0	Oregon	1	0	Oregon	1	0	Oregon	2	1
TCU	2	TCU	3	1	TCU	3	1	TCU	3	1	TCU	3	1	TCU	3	1
Stanford	3	Stanford	4	1	Stanford	5	2	Stanford	5	2	Stanford	5	2	Stanford	4	1
Auburn	4	Auburn	1	3	Auburn	2	2	Auburn	2	2	Auburn	2	2	Auburn	1	3
Ohio State	5	Ohio State	6	1	Ohio State	6	1	Ohio State	6	1	Ohio State	6	1	Ohio State	9	4
Boise State	6	Boise State	11	5	Boise State	9	3	Boise State	10	4	Boise State	10	4	Boise State	14	8
Wisconsin	7	Wisconsin	5	2	Wisconsin	4	3	Wisconsin	4	3	Wisconsin	4	3	Wisconsin	7	0
Oklahoma	8	Oklahoma	9	1	Oklahoma	10	2	Oklahoma	9	1	Oklahoma	9	1	Oklahoma	6	2
Missouri	9	Missouri	12	3	Missouri	15	6	Missouri	14	5	Missouri	14	5	Missouri	10	1
Nebraska	10	Nebraska	13	3	Nebraska	13	3	Nebraska	13	3	Nebraska	13	3	Nebraska	15	5
Michigan State	11	Michigan State	8	3	Michigan State	7	4	Michigan State	7	4	Michigan State	7	4	Michigan State	11	0
Nevada	12	Nevada	17	5	Nevada	14	2	Nevada	17	5	Nevada	15	3	Nevada	17	5
Oklahoma State	13	Oklahoma State	14	1	Oklahoma State	16	3	Oklahoma State	15	2	Oklahoma State	16	3	Oklahoma State	12	1
Arkansas	14	Arkansas	7	7	Arkansas	8	6	Arkansas	8	6	Arkansas	8	6	Arkansas	5	9
South Carolina	15	South Carolina	19	4	South Carolina	18	3	South Carolina	16	1	South Carolina	17	2	South Carolina	18	3
Virginia Tech	16	Virginia Tech	15	1	Virginia Tech	12	4	Virginia Tech	11	5	Virginia Tech	12	4	Virginia Tech	20	4
Alabama	17	Alabama	16	1	Alabama	17	0	Alabama	19	2	Alabama	18	1	Alabama	13	4
LSU	18	LSU	10	8	LSU	11	7	LSU	12	6	LSU	11	7	LSU	8	10
Texas A&M	19	Texas A&M	18	1	Texas A&M	19	0	Texas A&M	18	1	Texas A&M	19	0	Texas A&M	16	3
Utah	20	Utah	20	0	Utah	21	1	Utah	21	1	Utah	21	1	Utah	19	1
Florida State	21	Florida State	21	0	Florida State	20	1	Florida State	20	1	Florida State	20	1	Florida State	22	1
Northern Illinois	22	Northern Illinois	25	3	Northern Illinois	24	2	Northern Illinois	23	1	Northern Illinois	24	2	Northern Illinois	25	3
Navy	23	Navy			Navy	30	7	Navy	31	8	Navy	29	6	Navy
West Virginia	24	West Virginia	24	0	West Virginia	23	1	West Virginia	24	0	West Virginia	23	1	West Virginia	24	0
Hawaii	25	Hawaii			Hawaii	25	0	Hawaii	27	2	Hawaii	28	3	Hawaii
UCF	28	UCF			UCF	31	3	UCF	25	3	UCF	27	1	UCF
Arizona	30	Arizona	23	7	Arizona	26	4	Arizona	26	4	Arizona	25	5	Arizona	23	7
Mississippi State	33	Mississippi State	22	11	Mississippi State	22	11	Mississippi State	22	11	Mississippi State	22	11	Mississippi State	21	12
		Average		2.92			2.93			3.04			2.96			3.56
		Mode		1			3			1			1			1

What you have here is my ranking along with some of the more common rankings you will see. I didn't think all of the computer rankings would fit in this chart, so I just took the average of them. The numbers to the right of each poll are the team's ranking in that poll (R) and then the difference between that ranking and my ranking (D). The BCS poll does not show beyond the top 25, so I can't compare those teams to mine. At the bottom, you will see the average difference and the most common difference for each poll. The team that seemed to cause me the most trouble is Mississippi State. They alone account for a little less than half a rank in each of the averages.
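For anyone who wants to check the bottom rows, here is a short sketch that recomputes the BCS column from the chart. The `(mine, bcs)` pairs are copied from the table, skipping the teams with no BCS ranking; the variable names are just for illustration.

```python
from statistics import mean, mode

# (my rank, BCS rank) for every team that has a BCS ranking in the chart
ranks = {
    "Oregon": (1, 2), "TCU": (2, 3), "Stanford": (3, 4), "Auburn": (4, 1),
    "Ohio State": (5, 6), "Boise State": (6, 11), "Wisconsin": (7, 5),
    "Oklahoma": (8, 9), "Missouri": (9, 12), "Nebraska": (10, 13),
    "Michigan State": (11, 8), "Nevada": (12, 17), "Oklahoma State": (13, 14),
    "Arkansas": (14, 7), "South Carolina": (15, 19), "Virginia Tech": (16, 15),
    "Alabama": (17, 16), "LSU": (18, 10), "Texas A&M": (19, 18),
    "Utah": (20, 20), "Florida State": (21, 21), "Northern Illinois": (22, 25),
    "West Virginia": (24, 24), "Arizona": (30, 23), "Mississippi State": (33, 22),
}

diffs = {team: abs(mine - bcs) for team, (mine, bcs) in ranks.items()}

print(round(mean(diffs.values()), 2))  # 2.92
print(mode(diffs.values()))            # 1
# Mississippi State's share of the average difference
print(round(diffs["Mississippi State"] / len(diffs), 2))  # 0.44
```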

So, I'm planning on doing another ranking next week after all the games have been played and another after all the bowls have been played. I'd also like to do conference rankings and go back to previous years and "resolve" controversies. By next week, I will have tweaked this a bit more to add in homefield advantage and hopefully perfected the formula.

I have one last ranking for you. This is the algorithm without the margin of victory used to weight the wins. This is essentially what would be submitted to the BCS because they don't allow points into the calculations. It's interesting that it places Auburn in first now just like the rest of the computers.

No Weights, With Record
Auburn
TCU
Oregon
Michigan State
Ohio State
Stanford
Wisconsin
Boise State
LSU
Nevada
Missouri
Oklahoma
Arkansas
Nebraska
Oklahoma State
Utah
Virginia Tech
Alabama
Texas A&M
South Carolina
Northern Illinois
Florida State
Navy
Jacksonville State
Tulsa
(47) Michigan

Comments

suthee

December 4th, 2010 at 12:38 AM ^

well i'm impressed.

Joined: 10/26/2010

MGoPoints: 26

BraveWolverine730

December 4th, 2010 at 1:25 PM ^

Is it Indian? Seriously though this is very impressive.

Joined: 09/21/2009

MGoPoints: 4193

Mitch Cumstein

December 4th, 2010 at 8:42 AM ^

Big playoff game today in my backyard (literally, I live across the street from the stadium). I'm not going though, b/c I don't really care about their team at all, but I thought it was interesting that they're ranked the highest out of FCS teams.

Also, very cool post. I look forward to reading about the rankings after the season and about past years.

Joined: 10/02/2009

MGoPoints: 14072

SmithersJoe

December 4th, 2010 at 10:05 AM ^

Question - did you choose the weighting factors in order to get your ranking closer to "eyeball-worthy", or was there a mathematical reason why you chose the numerical weights you did (i.e. 24 points, 0.333, etc.)?

This isn't actually a criticism - it's perfectly valid to optimize the result by varying the inputs, but you would probably have to do this over multiple seasons and fine-tune the precise weights (similar to using a "survival" mechanism to find the best algorithms in a mathematical model).

A lot of work, I know. Thanks for doing this much already - it's interesting to see.

Joined: 11/28/2010

MGoPoints: 704

joeyb

December 4th, 2010 at 10:28 AM ^

There are no handpicked weights. 24 points means you lead your opponent by 3 scores of 8 points. So, 1/3 = .333. MSU beat Wisconsin by 10 points, which is 2 scores, so the weight in that game is 1/2, or .500.

However, my next step is to account for homefield advantage by removing points for winning at home or adding points for winning on the road, which could be handpicked point values. I'm not sure whether I'll use the standard 3 points, use 4 points (half a score), or calculate the difference between the average margins in home and away games and use that. In any case, I would expect homefield advantage to take MSU's win down below 8 points, which would give them a 1 score lead and a weight of 1 for the game instead of .500.

So, yes, I did add parts of the algorithm to meet the eyeball test, but everything worked out as I expected it to, so I did not tweak things to get the values to be what I wanted them to be.
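A rough sketch of the homefield adjustment described above, under stated assumptions: the 3-point default is the "standard" value mentioned as one option, the function name `adjusted_weight` is made up, and flooring the result at 1 score for very narrow home wins is my own guess at a reasonable guard, not something that has been decided.

```python
import math

def adjusted_weight(margin, winner_at_home, home_edge=3):
    """Path length after shifting the margin for homefield advantage.
    Home wins lose `home_edge` points; road wins gain them."""
    adjusted = margin - home_edge if winner_at_home else margin + home_edge
    # Assumed guard: a win always counts as at least 1 score.
    scores = max(1, math.ceil(adjusted / 8))
    return 1 / scores

print(adjusted_weight(10, winner_at_home=True))   # 1.0 (10 - 3 = 7 points, 1 score)
print(adjusted_weight(10, winner_at_home=False))  # 0.5 (10 + 3 = 13 points, 2 scores)
```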

Joined: 10/12/2008

MGoPoints: 16869

Waters Demos

December 4th, 2010 at 11:34 AM ^

Great work. Also, great question by smithers.

I also mean no criticism by this comment. I'm not sure how much value the comparison to the actual polls adds to your work. I mean it - I'm really not sure. It could add a lot, or very little, or perhaps even detract from it some. I just don't know.

What I mean is that you've, in a sense, run a statistical analysis that, intentionally or not, justifies common opinion on the topic. Is that a good thing? It reminds me of one prominent criticism leveled at the work of G.W.F. Hegel, i.e., that his work only went so far as to justify the Prussian state in which he lived. Is it reason discovering worthy grounds for common opinion after the fact? Or reason that is derivative of common opinion?

Perhaps this issue I raise is speculative because we only have so many inputs, and based on these alone, it's impossible to know, for example, whether MSU is a "better" team than the likes of Alabama or Nebraska (doubt it!) as the rankings (and your analysis) suggest.

I really don't know (and I'm not a statistician), but if you (or anyone else) have any views on that issue, I'd be interested.

Joined: 08/19/2010

MGoPoints: 3522

Communist Football

December 4th, 2010 at 12:04 PM ^

+1 for referring to Hegel in a blog about football.

Joined: 09/18/2010

MGoPoints: 18963

treetown

December 4th, 2010 at 1:47 PM ^

Great work and thank you and Coach Schiano for sharing this idea with us.

Again, with all respect to the amount of effort put in, it would be interesting to see the original path lengths and see how various modifiers would affect the results. I'm not in the math or stats field, but it might be worth seeing how tweaking your modifier (the 2 or 3 score margin) affects the overall results. A general solution identifying the key inputs may be of value not just to college football but perhaps to other sports - I'm thinking specifically of college basketball, where there is always a huge debate at selection time about whether one conference's 22-5 team is worse or better than another conference's 18-12 team with a stronger strength of schedule.

Finally, as you noted, this system works best in a closed system (e.g. league play) where the participants have played against the majority of the same opponents - it may be a superior form of tiebreaker outside of direct head to head (e.g. this year with the 3 way jam with OSU, MSU and Wisconsin).

Again, great work. Any chance you might have an image of the graph and nodes? It would probably be a pretty neat image.

Joined: 11/10/2010

MGoPoints: 6339

joeyb

December 4th, 2010 at 3:35 PM ^

I was also thinking about doing it for basketball, but I'm not sure what numbers I would use to determine weighting. I'll probably attempt it after I get the algorithm nailed down. The graph would be pretty hard to do, but it is possible. I'll look into doing it in the future, though.

Joined: 10/12/2008

MGoPoints: 16869

Waters Demos

December 4th, 2010 at 6:33 PM ^

Aw hell, I was reading your comparison graph wrong when I posted the comment above. I now see, e.g., that there is significantly more deviation in your analysis from the polls than I first saw (and that MSU is behind Neb; chalk it up to hangover? Idk).

In light of a more proper reading, and in retrospect, most of what I wrote before is unwarranted.

Apologies - again, great work.

Joined: 08/19/2010

MGoPoints: 3522

sammylittle

December 4th, 2010 at 1:22 PM ^

Maybe the NCAA will use this to determine seeding for the championship tournament we all want.

Joined: 09/29/2008

MGoPoints: 7824