kenpom predictive accuracy

Submitted by steve sharik on

Admittedly, kenpom is the leader in college basketball metrics.  It is not, however, the end-all-be-all.

On kenpom.com: "The first thing you should know about this system is that it is designed to be purely predictive. If you’re looking for a system that rates teams on how 'good' their season has been, you’ve come to the wrong place."

I recently tried to express interest in how well kenpom does with its ratings, especially how well it takes strength-of-schedule into account.  Some of you disapproved.

So, let's take a look at how well kenpom has done with Michigan this season.  Comparing the predictions that mgoblog has published with the actual results, kenpom has accurately predicted the winning team in Michigan's games 65.2% of the time, with an average error of 8.4 points in those contests.  During B1G play, the kenpom accuracy in Michigan games has dropped to 60% with an error of 9.8 points.
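For anyone who wants to check numbers like these, here is a minimal sketch of the two calculations. It assumes "average error" means the mean absolute difference between predicted and actual margin, which the post does not spell out. The game data below is made up for illustration and is not Michigan's real season.

```python
# Hypothetical (predicted_margin, actual_margin) pairs from one team's point of view.
# Positive margin = that team wins. These numbers are invented, NOT real results.
games = [(5, 12), (-3, 4), (8, -2), (2, 1), (-6, -15), (10, 3)]

def pick_accuracy(games):
    """Fraction of games where the predicted winner matched the actual winner."""
    correct = sum(1 for pred, actual in games if (pred > 0) == (actual > 0))
    return correct / len(games)

def mean_abs_error(games):
    """Average absolute gap, in points, between predicted and actual margin."""
    return sum(abs(pred - actual) for pred, actual in games) / len(games)

print(f"straight-up accuracy: {pick_accuracy(games):.1%}")
print(f"mean absolute error:  {mean_abs_error(games):.1f} points")
```

Swapping in the real predicted and actual margins would reproduce the kind of figures quoted above, provided the same definition of error is used.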

deemarsah

March 2nd, 2014 at 1:07 PM ^

While your point is true, the more obvious point to make is that the average past error does not mean that future games will necessarily be determined by the average past error in either direction, but could also be larger or smaller in either direction.  In other words, a past error of 9.8 points doesn't mean we either lose by 0.8 or win by 18.8.  It could be anything.

But I think you knew that.

More generally, a lot of people learn statistics and regression models and then wade around in minutiae over the third decimal point of the standard error, when there may be something much larger going on that obscures the esoteric point entirely.

Darker Blue

March 1st, 2014 at 4:01 PM ^

I was disappointed that you took your other post down after you had obviously put a lot of hard work into it. I hope you leave this one up. It's okay to try to do something and fail, as long as you learn from that failure.

With that being said, I'd be curious to see how other teams have fared compared to what KenPom predicted.

Gulogulo37

March 1st, 2014 at 6:42 PM ^

Jesus, dude. Lighten up. You seriously asked the mods to delete the other thread because some people were being smartasses? And now it just looks like you were upset about that and made another post looking like it's trying to be critical of Kenpom but it's hard to tell because you don't explicitly say that. Is 60% for Kenpom good? It sounds like you're trying to say no but to be vague to avoid criticism.

swan flu

March 1st, 2014 at 4:31 PM ^

When you are talking about predictive modeling on things with such high variability and conditionalities, 70% accuracy is pretty damn good.

 

The big problem is that all predictive models are essentially extrapolators. They look at previous data, create a "function of best fit" and then apply the function to data outside the boundaries that were considered in the creation of the function. Extrapolation is always tricky, especially when you apply it to exceedingly variable things: like sports.  Especially sports played by teenagers.

freejs

March 1st, 2014 at 8:09 PM ^

I know it's stating the obvious, but there's a reason Vegas is as good as it gets at setting the opening lines. They have so much money and access to every last bit of data and the best quant guys in the business (you know those guys get paid) - and they would go totally broke if they sucked at it. 

Nitro

March 1st, 2014 at 9:18 PM ^

Wait, so you're telling me there are PEOPLE who can perform analysis and predict outcomes of events determined by humans better than computers crunching numbers that provide an abstract, discrete, incomplete representation of a sampling of what occurred in prior events determined by humans?  I just don't believe it!  I'll have to check with a computer analysis to see if that's possible.  Maybe the computers won't grow a collective awareness and take us all over one day.

turd ferguson

March 1st, 2014 at 11:12 PM ^

Something tells me that the PEOPLE doing this well are using COMPUTERS along with their judgment.  Both extremes on this argument are wrong.  People can see things that computers can't see, and computers can see things that people can't see.

Alton

March 1st, 2014 at 4:41 PM ^

Good premise here--testing KenPom--but I think we need more data.

The question this post brings up for me is whether predicting games with an average error of 8.4 points is very good, pretty good, so-so, pretty bad or very bad.  It seems to me that the OP is trying to imply that it is not good--correct me if I am wrong--but the number given here doesn't tell us anything one way or the other.

There are a lot of rating systems that predict the outcomes of games--Sagarin is the most famous, but there are many others.  There are also many people who predict games for money; their opinions combined make up the Las Vegas lines that you see everywhere.

So my questions are:  how accurate is Kenpom compared to Sagarin or Massey?  How accurate is Kenpom compared to the people betting in Las Vegas?  8.4 doesn't mean much all by itself.

michelin

March 1st, 2014 at 4:46 PM ^

For instance, Michigan so far has won 74% of its games.  Suppose that figure reflects the true quality of the team.  Then, suppose we had  a monkey who saw UM was winning but did not know why.  If he picked UM every time in a series of comparably difficult future games, he would on average be accurate 74% of the time.
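The monkey baseline can be sketched in a few lines. This is only an illustration with invented margins, not real results; the point is that a model's straight-up accuracy needs to be compared against what "always pick the same team" would have scored on the same games.

```python
def always_pick_accuracy(actual_margins):
    """Accuracy of picking the same team to win every game.

    Margins are from that team's point of view; positive means it won.
    """
    wins = sum(1 for margin in actual_margins if margin > 0)
    return wins / len(actual_margins)

# Hypothetical final margins for one team (made up for illustration).
margins = [12, 4, -2, 1, -15, 3, 7, -5]

baseline = always_pick_accuracy(margins)
print(f"always-pick baseline: {baseline:.1%}")
```

If a model's accuracy on those same games is no better than this baseline, the winner picks alone are not telling you much, and the margin error becomes the more interesting number.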

Nitro

March 1st, 2014 at 10:26 PM ^

Actually, I think if we played a series of comparably difficult future games, we'd win more than 74% the way we're playing now.  Unless your definition of comparably difficult is relative.  But then who knows.  A basketball team isn't a set of dice or coin flips -- it doesn't converge to an average over time.

2timeloozer

March 1st, 2014 at 5:06 PM ^

Most times the obvious favorite wins. 70% against the spread would be impressive. 51% against the spread would be impressive. 70% straight-up, not so much.

Yeoman

March 1st, 2014 at 5:10 PM ^

...a bit older, he stopped running it in 2006. The winner for best predictive method each year, using a mean-square-error method?

2006 Winner: The Vegas Line
2005 Winner: The Vegas Line
2004 Winner: The Vegas Line
2003 Winner: The Vegas Line
2002 Winner: The Vegas Line
2001 Winner: The Vegas Line
2000 Winner: The Vegas Line
1999 Winner: The Vegas Line

That's monotonous and I see why he stopped. There were 56 systems tested, not one ever beat Vegas in 8 seasons.

There was some thinking that this might be because computer methods are pretty worthless in the early part of the season, so he did the same for just the second halves. Vegas won seven years out of 8.

http://tbeck.freeshell.org/fb/awards2006.html
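For reference, a mean-square-error comparison of this kind can be sketched as below. The system names and all the numbers are hypothetical placeholders, not the data from the linked page, and the exact procedure used there is not described in this thread.

```python
def mean_squared_error(predicted, actual):
    """Average of squared differences between predicted and actual margins."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical actual margins and two made-up prediction systems.
actual = [7, -3, 14, 2]
systems = {
    "system_a": [3, 1, 10, 6],
    "system_b": [6, -1, 9, 0],
}

scores = {name: mean_squared_error(preds, actual) for name, preds in systems.items()}
best = min(scores, key=scores.get)  # lowest error wins

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: MSE = {score:.2f}")
print(f"best system: {best}")
```

Squaring the errors punishes big misses more heavily than an average-points-off figure does, so the two measures can rank systems differently.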

turd ferguson

March 1st, 2014 at 5:25 PM ^

Thanks for posting this.  I had this on my mind, figuring that Vegas is probably a couple steps ahead of everyone.

If one of these algorithms started beating Vegas odds with any regularity, the Vegas oddsmakers would just incorporate information from those algorithms into how they set their lines.  So the creator of one of these widely known algorithms (or of a widely known site with an algorithm) might get ahead early, but I'd be shocked to see him stay ahead.

Nitro

March 1st, 2014 at 9:43 PM ^

I wouldn't call it being cynical to raise a reasonable possibility.  But I don't think being ethical would prevent Vegas from beating computers.  My guess is they start with a computer prediction, then tweak it based on some human analysis, and then make a final adjustment based on how much a team's fanbase tends to wager regardless of the spread (or expected wagering based on present popular sentiment).  Like, if Notre Dame is playing Purdue, and their analysis determines ND -10, they'll put the spread at ND -14 since more people will be wagering on ND just because.  Remember, they're in the business of making money -- they're not even trying to beat the computers.  But in that regard, it's BIG money, so they probably sprinkle some shadiness on top (maybe indirectly by influencing popular sentiment via some media control).

docwhoblocked

March 1st, 2014 at 5:14 PM ^

There are marketplaces for predictions of elections and other future events.  I am no expert, but doesn't the Vegas line just depend on who is betting, and they just try to balance the money on both sides?  Is that not somewhat like crowdsourcing/prediction markets?

I found this article (admittedly written by Ohio economists) that suggests that the betting line is not very predictive for NCAA B ball but is better for NFL and NCAA football.  They say overall betting markets like the Vegas line are not all that predictive.  Hmmmm

http://www.econ.ohio-state.edu/trevon/pdf/BettingMarketPaper_01-11-12.p…

 

Yeoman

March 1st, 2014 at 5:32 PM ^

Not sure about the second link--I can't find any description of how he determined "the Vegas line" and the website hasn't been touched in years so I doubt I'm going to find out.

But the first link specifically used the opening line, before crowdsourcing kicked in.

Combining this with your link, I think what we've got is some evidence that a candidate for a Vegas-beating system is to fade moves in the Vegas line. The paid professionals are better than the crowd.

Nitro

March 1st, 2014 at 10:21 PM ^

The fact that the opening line performs better than the game time line confirms this.  But it's a marginal difference that would only pay off over a large number of bets.  Personally, I think you'd be better off getting, say, NBA league pass, watching a lot and gaining a good understanding of the current ebb-and-flow, and betting NBA games based on your intuition (if you start seeing it developing into something you could trust).

Like, if a team plays a fundamentally solid game and moves the ball well on offense, it's a good bet they're gonna hammer the Pistons by more than the spread says the rest of this season.

Or, if you watched the Pistons play the past 2 seasons (or the NJ Nets in the seasons immediately following their run as a contender), you'd know that Lawrence Frank is a basketball doofus.  So when you see Jason Kidd cast him aside because of disagreements, you'd understand this to mean Kidd wants to do something right, so you'd bet on the Nets outperforming the spread for a stretch.

Or, take the Spartans.  You could see Izzo making all these excuses, understand this will make his team feel like they have a reasonable excuse to fall back on if they lose, see the malaise setting in as a result, and bet they'll struggle against the spread.  Not sure how that will work going forward, since their latest loss could snap some urgency into them, but I can see that urgency still taking a game or two to set in, especially given that they're not playing for a title anymore and the team will probably be thinking they can rely on a boost from getting Dawson back before they realize it's really not that much of a boost after all.

There are many specific circumstances computers and patterns can't account for on an individual level.

singler makes …

March 1st, 2014 at 5:58 PM ^

Is there a point to this thread besides your passive aggressiveness? 

 

EDIT: that came off as mean-spirited, so let me add this: it is great that you have taken an interest in predictive models. They are fun to examine and even more fun to build. I'd suggest that rather than posting random threads where you multiply two numbers and post a list, or other threads which seem to say "HAHA Kenpom sucks" (although not in those words), you take the time to read about the various models and how they work.

 

In your previous thread it looked to me that people were more attacking your concept than you. Take it and learn.

 

SECOND EDIT: Screw it, I’ll enjoy Bolivia.

 

This series of threads represents pretty much everything wrong with the internet. Anyone can spew whatever they want, and the result is society fails and people like AJ Mass are employed journalists. Sure, here it is just sports, but the same stuff has terrible outcomes on science (won’t get into it  cause of the no politics stuff). Just because you can do basic math doesn’t mean it has any value whatsoever.

 

Kenpom has spent years trying to perfect a predictive model that most of us wouldn’t be able to understand. He has essentially dedicated his life to the issue. But hey guys, look at this, I can take his values and then multiply by another number and get a different ranking! Oh wait, you don’t like this thread! Delete it so I can make another one where I give an N of 1 with absolutely no context.

 

Please just stop for a second and try to learn what others have done before presenting your latest “findings”.

 

I’m done.

straight-gangs…

March 1st, 2014 at 5:31 PM ^

I just think people were questioning the rationale, when Kenpom already takes the SOS into consideration, and therefore I think there was an issue with what your rankings truly represented. Absolutely appreciate the effort and enjoy digging into the numbers, Vegas lines, etc. I'm assuming you're simply trying to highlight the fact that Kenpom isn't perfect when predicting. Not surprised......I would always lean toward Vegas for true odds.

B-Nut-GoBlue

March 1st, 2014 at 5:39 PM ^

Really naive and ignorant question here:  What is "Vegas"?  Is there a secret corporation that hires a relatively small collection of really smart people that also know sports (not that the two are always mutual) and they just sit around and study to make the odds for the casinos?  Or...?