Tracking KenPom

Submitted by stubob on March 3rd, 2010 at 1:56 PM
This post will examine the accuracy of the KenPom rankings and predictions, and try to evaluate the performance of Michigan's basketball season in comparison.

I've been following along with Brian/Tim's basketball previews and was wondering how accurate the KenPom predictions have been.  I'll graph the predictions versus the outcomes, and try to adjust the predictions based on the current rankings (versus rankings at the time).  I will also include a "baseline" program for analysis and comparison to our manic/depressive performance this season.

Numbers:
Michigan is currently ranked 85th offensively and 47th defensively according to KenPom.  Compare that to the competition:
team        current offense rank    current defense rank
minnesota   38                      53
osu         10                      19
ill         67                      40
psu         73                      113
iowa        141                     162
minnesota   38                      53
wisc        11                      15
nw          27                      163
iowa        141                     162
msu         28                      31
purdue      33                      5
wisc        11                      15
uconn       66                      28

And prediction/results for those games:
team        kenpom prediction    actual difference    kenpom - actual
minnesota   2                    -16                  18
osu         12                   11                   1
ill         1                    6                    -5
psu         -8                   4                    -12
iowa        -3                   -2                   -1
minnesota   9                    -7                   16
wisc        5                    8                    -3
nw          3                    13                   -10
iowa        -11                  -14                  3
msu         3                    1                    2
purdue      11                   10                   1
wisc        -9                   16                   -25
uconn       -1                   -5                   4


The simple numerical average of (kenpom - actual) is -0.85 points, which shows pretty good prediction value.
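For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes that average from the table. The variable names are made up for illustration, and the sign convention (a negative margin means a Michigan win) is my reading of the table rather than anything KenPom defines.

```python
# (opponent, kenpom prediction, actual margin), copied from the table above.
# Assumption: negative numbers mean Michigan wins by that margin.
games = [
    ("minnesota", 2, -16),
    ("osu", 12, 11),
    ("ill", 1, 6),
    ("psu", -8, 4),
    ("iowa", -3, -2),
    ("minnesota", 9, -7),
    ("wisc", 5, 8),
    ("nw", 3, 13),
    ("iowa", -11, -14),
    ("msu", 3, 1),
    ("purdue", 11, 10),
    ("wisc", -9, 16),
    ("uconn", -1, -5),
]

# Per-game miss: prediction minus what actually happened.
errors = [pred - actual for _, pred, actual in games]
mean_error = sum(errors) / len(errors)

print(f"average of (kenpom - actual): {mean_error:.2f}")
```

Because the misses carry a sign, big misses in opposite directions cancel each other out in this number.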

Showing the results graphically:

The orange line shows how close the kenpom prediction was at the time.

Now, we will look at the current rankings to try to get a better feel for the prediction value.  Assuming that a better team will beat a worse team, we will estimate margin of victory based on relative ranking.
team        rank average    michigan rank - team rank    ranking difference prediction
minnesota   45.5            20.5                         2.05
osu         14.5            51.5                         5.15
ill         53.5            12.5                         1.25
psu         93              -27                          -2.7
iowa        151.5           -85.5                        -8.55
minnesota   45.5            20.5                         2.05
wisc        13              53                           5.3
nw          95              -29                          -2.9
iowa        151.5           -85.5                        -8.55
msu         29.5            36.5                         3.65
purdue      19              47                           4.7
wisc        13              53                           5.3
uconn       47              19                           1.9

The last column is the expected margin of victory if the teams played today.  Graphing the ranking difference prediction (RDP) versus the actual result gives this:

The games with big gaps would be upsets, but overall the prediction percentage is .61 (8/13). That is the percentage of games that the current rankings would predict correctly, win or lose.
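The ranking difference prediction can be sketched in a few lines of Python. This assumes what the table values imply: the rank average is the mean of the offense and defense ranks, and ten spots of rank difference are worth one point of margin. The function name and the "same sign means a correct pick" rule are my own shorthand for the win/loss comparison described above.

```python
MICHIGAN_RANK_AVG = (85 + 47) / 2  # 66.0, from Michigan's offense/defense ranks

# (opponent, opponent rank average, actual margin), copied from the tables above.
games = [
    ("minnesota", 45.5, -16),
    ("osu", 14.5, 11),
    ("ill", 53.5, 6),
    ("psu", 93, 4),
    ("iowa", 151.5, -2),
    ("minnesota", 45.5, -7),
    ("wisc", 13, 8),
    ("nw", 95, 13),
    ("iowa", 151.5, -14),
    ("msu", 29.5, 1),
    ("purdue", 19, 10),
    ("wisc", 13, 16),
    ("uconn", 47, -5),
]


def ranking_difference_prediction(team_rank, opp_rank):
    """Expected margin: one point per ten spots of rank difference."""
    return (team_rank - opp_rank) / 10


correct = 0
for opponent, opp_rank, actual in games:
    rdp = ranking_difference_prediction(MICHIGAN_RANK_AVG, opp_rank)
    # A pick counts as correct when predicted and actual margins share a sign.
    if rdp * actual > 0:
        correct += 1

print(f"correct picks: {correct}/{len(games)}")
```

Only the sign of the prediction matters for this count, so the divide-by-ten scaling affects the chart but not the win/loss percentage.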

Now let's compare that chart to a control, Michigan State. MSU's rank is 28/31.  The data in question:
team      actual difference    ranking difference prediction
osu       7                    1.5
ind       -14                  -14.65
psu       -12                  -6.35
purdue    8                    1.05
ill       5                    -2.4
wisc      14                   1.65
nw        -9                   -6.55
mich      -1                   -3.65
minn      -2                   -1.6
iowa      -7                   -12.2
ill       -10                  -2.4
minn      -13                  -1.6
iowa      -18                  -12.2

and chart:
Now the prediction rate is .92 (12/13).
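
The same win/loss check can be run on the MSU table. This is a rough Python sketch with illustrative names; it treats a pick as correct when the prediction and the actual margin have the same sign, and it also lists the games that miss.

```python
# (opponent, actual margin, ranking difference prediction), from the MSU table.
msu_games = [
    ("osu", 7, 1.5),
    ("ind", -14, -14.65),
    ("psu", -12, -6.35),
    ("purdue", 8, 1.05),
    ("ill", 5, -2.4),
    ("wisc", 14, 1.65),
    ("nw", -9, -6.55),
    ("mich", -1, -3.65),
    ("minn", -2, -1.6),
    ("iowa", -7, -12.2),
    ("ill", -10, -2.4),
    ("minn", -13, -1.6),
    ("iowa", -18, -12.2),
]

# Games where the predicted and actual margins point in opposite directions.
misses = [team for team, actual, rdp in msu_games if actual * rdp <= 0]
hits = len(msu_games) - len(misses)

print(f"correct picks: {hits}/{len(msu_games)}")
print("missed:", misses)
```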

So what does all this show?  I think it shows the value of KenPom's system when used on a good team.  Or, conversely, the inconsistency of Michigan this season: beating teams they shouldn't beat, losing to teams they should beat.  I'm not a gambler, so I didn't take into account the value of covering against the spread; I'm simply looking at this as a fan and judging based on wins/losses.  As far as wins and losses go, this system seems very accurate.  I may look into tweaking the ranking calculation to better match the results, but I think the basic idea is pretty solid.

Comments

ntclark

March 3rd, 2010 at 2:04 PM

Small nitpick: why is iowa listed twice in the charts?

EDIT: nice analysis, though. I always wondered how statistically significant KenPom predictions were. Thanks!

Kilgore Trout

March 3rd, 2010 at 3:15 PM

Couple of points.

1, I think in the first graph, you can't get to your average by using positive and negative numbers. A basic principle of signal averaging is that random noise averages out to zero. That's more of what I see in those kenpom - actual numbers for Michigan. If you just did the absolute value of the kenpom - actual you would get how far "off" he was for each game, regardless of whether it was an upset or not. That would put his average prediction for Michigan at 7.76. Meaning that his predictions were, on average, 7.76 points off. I don't think that's very good. In fact, I'd be willing to bet that most people who follow basketball with a decent amount of effort could do as well or better.

2, I don't think you can look back at games a month or two ago using current rankings. There is so much fluidity to the game, I just don't think that will end up being representative of much.
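
To make point 1 concrete, here is a quick Python sketch using the (kenpom - actual) column from the post. The names are just for illustration. It prints the signed average next to the average of the absolute values, which should land close to the 7.76 figure above.

```python
# (kenpom - actual) for Michigan's 13 games, copied from the post's table.
errors = [18, 1, -5, -12, -1, 16, -3, -10, 3, 2, 1, -25, 4]

# Signed average: misses in opposite directions cancel out.
signed_mean = sum(errors) / len(errors)

# Mean absolute error: how far off each prediction was, ignoring direction.
mean_abs_error = sum(abs(e) for e in errors) / len(errors)

print(f"signed average:      {signed_mean:.2f}")
print(f"mean absolute error: {mean_abs_error:.2f}")
```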

stubob

March 3rd, 2010 at 3:49 PM

I agree that the average difference is kind of a useless number. I think the right/wrong percentage is the data of value from this exercise, and it's easier to see in a bar graph than a line graph. Now, I'm not sure I've proven anything other than "Good teams beat bad teams, most of the time."

I figured 2. would be a question. What I was trying to show was that the current positions would represent likely outcome, not taking into account outside interference, like OSU or Purdue losing good players. It was intended to reflect what we know now about the game, rather than what we knew then. If a team has fallen apart, then an earlier "upset" wouldn't be as big a deal, since now we know/expect them to lose.

By the way, for potential diary-makers, I did the whole thing in Google Docs and just pasted the result in here, worked like a charm.

hockebob

March 3rd, 2010 at 7:38 PM

Agreed, although the variance of the data is probably even more important to consider. In other words, is KenPom consistently bad or inconsistently less bad at predicting Michigan games? Looking at the data, and given Michigan's play this year, I imagine it's the latter.
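
One rough way to put a number on that spread is the sample standard deviation of the misses, sketched here in Python with the standard library's `statistics` module. The data is the (kenpom - actual) column from the post; the variable names are illustrative. With these numbers the spread comes out to roughly 11 points.

```python
import statistics

# (kenpom - actual) for Michigan's 13 games, copied from the post's table.
errors = [18, 1, -5, -12, -1, 16, -3, -10, 3, 2, 1, -25, 4]

# Sample standard deviation: how widely the misses scatter around their mean.
spread = statistics.stdev(errors)

# The single largest miss in either direction.
worst_miss = max(errors, key=abs)

print(f"standard deviation of misses: {spread:.1f} points")
print(f"largest single miss: {worst_miss} points")
```

A small average miss combined with a large spread is what "inconsistently less bad" would look like in the data.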

mi93

March 3rd, 2010 at 9:11 PM

that this is the type of debate we have. Statistical significance and other stuff that relates to a class that handed me my lowest grade at UofM. I dig that about this crowd.

Hey Brian, how do we stack up against other blogs?

chitownblue2

March 4th, 2010 at 7:39 AM

The accuracy of KenPom's predictions is something I've always privately thought was sort of poor, but I'm glad someone actually took the initiative (boo me) to put some analysis behind it.

As a previous poster noted, I think "averaging" the +'s and -'s of his performance gives a somewhat jumbled number, as he could have been wrong in our opposition's favor by 25 points in one game and wrong in our favor by 25 the next game, and, in that system, come up with a "perfect" prediction record. More useful, I think, would be to average the size of the miss in each game, i.e., he was off by 6 points one game, 20 the next, etc.

Also, I think you may be giving the system a slight free pass when just saying "it predicts winners 92% of the time in the case of MSU", because oftentimes, predicting winners and losers isn't that difficult. For instance, if you started with the premise that "The Home team will win", you'd have something like a 70% success rate. If you got more complex and started picking out road games where, say, MSU or Purdue were playing at Indiana, Iowa, or Penn State, your prediction rate would climb higher.

So, I guess we need a comparison against another prediction system, as you allude to. For instance, is KenPom more accurate than the spread Vegas puts out?