kenpom predictive accuracy

Submitted by steve sharik on March 1st, 2014 at 3:50 PM

Admittedly, kenpom is the leader in college basketball metrics. It is not, however, the end-all-be-all.

On kenpom.com: "The first thing you should know about this system is that it is designed to be purely predictive. If you’re looking for a system that rates teams on how 'good' their season has been, you’ve come to the wrong place."

I recently tried to express interest in how well kenpom does with its ratings, especially how well it takes strength-of-schedule into account. Some of you disapproved.

So, let's take a look at how well kenpom has done with Michigan this season. Comparing the predictions that mgoblog has published to actual results, kenpom has correctly predicted the winning team in Michigan's games 65.2% of the time, with an average error of 8.4 points in those contests. During B1G play, kenpom's accuracy in Michigan games has dropped to 60%, with an average error of 9.8 points.
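
For anyone who wants to reproduce this kind of check, here is a minimal sketch of the two numbers quoted above: straight-up hit rate and average absolute error in the margin. The game data is made up for illustration (it is not Michigan's actual schedule or kenpom's actual predictions), and the function name `score_predictions` is hypothetical.

```python
from typing import List, Tuple


def score_predictions(games: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (straight-up hit rate, mean absolute margin error).

    Each game is (predicted_margin, actual_margin) from one team's point of
    view: positive means that team wins, negative means it loses.
    """
    if not games:
        raise ValueError("need at least one game")
    # A pick is correct when predicted and actual margins have the same sign.
    hits = sum(1 for pred, actual in games if pred * actual > 0)
    # Error is how many points the predicted margin missed by.
    total_error = sum(abs(pred - actual) for pred, actual in games)
    return hits / len(games), total_error / len(games)


# Hypothetical margins, NOT real kenpom predictions or Michigan results.
sample_games = [(5.0, 12.0), (-2.0, 3.0), (8.0, -1.0), (3.0, 4.0), (-4.0, -10.0)]
hit_rate, mean_abs_error = score_predictions(sample_games)
print(f"straight-up accuracy: {hit_rate:.1%}, average error: {mean_abs_error:.1f} points")
```

Note that the two numbers measure different things: a pick can name the right winner and still miss the margin badly, and the reverse.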

champswest

March 1st, 2014 at 3:55 PM ^

Well, that's just great.

Joined: 10/04/2009

MGoPoints: 25916

Come On Down

March 1st, 2014 at 3:59 PM ^

So we will either lose by 0.8 points or win by 18.8 points...Am I doing that math correctly?

Joined: 05/15/2013

MGoPoints: 1636

MosherJordan

March 1st, 2014 at 4:16 PM ^

Depends on whether Kenpom uses a probit or logit regression, or Cox proportional hazards, to estimate his model. Interpretation of the standard error varies depending on the model.

Joined: 01/10/2011

MGoPoints: 529

MBAgoblue

March 1st, 2014 at 4:30 PM ^

What is this math stuff? You forgot to take into account "grit" and "wanting it more."

Joined: 07/09/2008

MGoPoints: 2623

Gobgoblue

March 1st, 2014 at 4:33 PM ^

The team that wants it more wins. It's 100% half mental.

Joined: 07/07/2012

MGoPoints: 21043

deemarsah

March 2nd, 2014 at 1:07 PM ^

While your point is true, the more obvious point to make is that the average past error does not mean that future games will necessarily be determined by the average past error in either direction, but could also be larger or smaller in either direction. In other words, a past error of 9.8 points doesn't mean we either lose by 0.8 or win by 18.8. It could be anything.
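
To put a number on that: an average miss of 9.8 points is a mean of absolute misses, so individual games scatter around the prediction rather than landing exactly 9.8 points away. A small sketch with invented misses (not real data) that happen to average 9.8:

```python
from statistics import mean

# Hypothetical signed misses (actual margin minus predicted margin), in points.
# Invented so that the average absolute miss comes out to 9.8.
signed_misses = [-21.0, -3.0, 1.0, 6.0, 18.0]

mean_abs_miss = mean(abs(m) for m in signed_misses)
smallest = min(abs(m) for m in signed_misses)
largest = max(abs(m) for m in signed_misses)
print(f"average miss: {mean_abs_miss:.1f}, smallest: {smallest:.1f}, largest: {largest:.1f}")
```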

But I think you knew that.

More generally, a lot of people learn statistics and regression models and then wade around in minutiae over the third decimal point of the standard error, when there may be something much larger going on that obscures the esoteric point entirely.

Joined: 11/18/2013

MGoPoints: 217

TruBluMich

March 1st, 2014 at 3:57 PM ^

If I knew with 100% certainty that I was going to pick 60% - 70% of all games correctly, I'd quit my job.

Joined: 08/03/2011

MGoPoints: 21868

wildbackdunesman

March 1st, 2014 at 4:08 PM ^

Isn't there a big difference between picking against a spread and picking straight up?

Joined: 07/16/2008

MGoPoints: 30244

steve sharik

March 1st, 2014 at 4:18 PM ^

Why don't you quit your job and bet based off kenpom?

Joined: 08/08/2009

MGoPoints: 13860

TruBluMich

March 1st, 2014 at 4:30 PM ^

I think I'll pass. I was broke once, and it wasn't much fun.

Joined: 08/03/2011

MGoPoints: 21868

Gobgoblue

March 1st, 2014 at 4:35 PM ^

college has its benefits.

/checks bank account

/revises resume for second job

Joined: 07/07/2012

MGoPoints: 21043

Darker Blue

March 1st, 2014 at 4:01 PM ^

I was disappointed that you took your other post down after you had obviously put a lot of hard work into it. I hope you leave this one up. It's okay to try to do something and fail, as long as you learn from that failure.

With that being said, I'd be curious to see how other teams have fared compared to what KenPom predicted.

Joined: 10/30/2011

MGoPoints: 80450

LexArborWolvCats

March 1st, 2014 at 4:11 PM ^

Remember when BiSB pissed off KenPom...

Joined: 04/14/2012

MGoPoints: 134

BiSB

March 1st, 2014 at 8:41 PM ^

I do not recall this event. No sir, never happened.

Joined: 08/15/2009

MGoPoints: 46071

Simps

March 1st, 2014 at 4:14 PM ^

This guy again. Wheres that really accurate chart?

Joined: 08/11/2012

MGoPoints: 2071

Darker Blue

March 1st, 2014 at 4:16 PM ^

Don't be a dick.

Joined: 10/30/2011

MGoPoints: 80450

tigers17fan

March 1st, 2014 at 4:20 PM ^

There's a reason simps has negative mgopoints

Joined: 06/07/2012

MGoPoints: 167

steve sharik

March 1st, 2014 at 4:20 PM ^

First, I never said it was accurate. In fact, I said that at the outset.

Second, the real reason I asked the mods to delete it was b/c people like you were behaving like you just did. How's Bolivia this time of year, by the way?

Joined: 08/08/2009

MGoPoints: 13860

Gulogulo37

March 1st, 2014 at 6:42 PM ^

Jesus, dude. Lighten up. You seriously asked the mods to delete the other thread because some people were being smartasses? And now it just looks like you were upset about that and made another post looking like it's trying to be critical of Kenpom but it's hard to tell because you don't explicitly say that. Is 60% for Kenpom good? It sounds like you're trying to say no but to be vague to avoid criticism.

Joined: 03/16/2010

MGoPoints: 28316

BiSB

March 1st, 2014 at 8:42 PM ^

It's summer in Bolivia.

Joined: 08/15/2009

MGoPoints: 46071

PizzaHaus

March 1st, 2014 at 9:08 PM ^

You deleted it because it was total nonsense. You had no idea about the math you were doing and it was embarrassing for you.

Joined: 01/09/2014

MGoPoints: 1124

Simps

March 1st, 2014 at 10:40 PM ^

Bolivia rocks. You seem like a high school student with that math you had in the chart on the other post. I was honestly just embarrassed for you.

Joined: 08/11/2012

MGoPoints: 2071

B-Nut-GoBlue

March 1st, 2014 at 5:33 PM ^

Where's your apostrophe?

Joined: 09/30/2011

MGoPoints: 43665

swan flu

March 1st, 2014 at 4:31 PM ^

When you are talking about predictive modeling on things with such high variability and conditionalities, 70% accuracy is pretty damn good.

The big problem is that all predictive models are essentially extrapolators. They look at previous data, create a "function of best fit" and then apply the function to data outside the boundaries that were considered in the creation of the function. Extrapolation is always tricky, especially when you apply it to exceedingly variable things: like sports. Especially sports played by teenagers.

Joined: 08/16/2010

MGoPoints: 9374

turd ferguson

March 1st, 2014 at 5:44 PM ^

Totally agree, and just because it's worth stating, this is a big part of what makes following sports fun. If outcomes were predictable enough that these models could predict 99% of the winners, the games really wouldn't be much fun to watch.

Joined: 12/09/2009

MGoPoints: 26493

Yeoman

March 1st, 2014 at 4:35 PM ^

http://www.thepredictiontracker.com/bbresults.php

Unfortunately he doesn't include Massey or KenPom, but no system he tracks has a better record, straight up, than the opening Vegas line. That's been typical through the years, both basketball and football. No computer consistently beats the spread; no computer does better than Vegas straight up.

Joined: 06/08/2011

MGoPoints: 21390

freejs

March 1st, 2014 at 8:09 PM ^

I know it's stating the obvious, but there's a reason Vegas is as good as it gets at setting the opening lines. They have so much money and access to every last bit of data and the best quant guys in the business (you know those guys get paid) - and they would go totally broke if they sucked at it.

Joined: 04/07/2010

MGoPoints: 9002

Nitro

March 1st, 2014 at 9:18 PM ^

Wait, so you're telling me there are PEOPLE who can perform analysis and predict outcomes of events determined by humans better than computers crunching numbers that provide an abstract, discrete, incomplete representation of a sampling of what occurred in prior events determined by humans? I just don't believe it! I'll have to check with a computer analysis to see if that's possible. Maybe the computers won't grow a collective awareness and take us all over one day.

Joined: 04/27/2013

MGoPoints: 17041

Nitro

March 1st, 2014 at 10:32 PM ^

Never mind. Since no one's on this thread anymore except for me, I think my computer just negged me...run for the hills!!!

Joined: 04/27/2013

MGoPoints: 17041

turd ferguson

March 1st, 2014 at 11:12 PM ^

Something tells me that the PEOPLE doing this well are using COMPUTERS along with their judgment. Both extremes on this argument are wrong. People can see things that computers can't see, and computers can see things that people can't see.

Joined: 12/09/2009

MGoPoints: 26493

Alton

March 1st, 2014 at 4:41 PM ^

Good premise here--testing KenPom--but I think we need more data.

The question this post brings up for me is whether predicting games with an average error of 8.4 points is very good, pretty good, so-so, pretty bad or very bad. It seems to me that the OP is trying to imply that it is not good--correct me if I am wrong--but the number given here doesn't tell us anything one way or the other.

There are a lot of rating systems that predict the outcomes of games--Sagarin is the most famous, but there are many others. There are also many people who predict games for money; their combined opinions make up the Las Vegas lines that you see everywhere.

So my questions are: how accurate is Kenpom compared to Sagarin or Massey? How accurate is Kenpom compared to the people betting in Las Vegas? 8.4 doesn't mean much all by itself.

Joined: 07/05/2008

MGoPoints: 13273

michelin

March 1st, 2014 at 4:46 PM ^

For instance, Michigan so far has won 74% of its games. Suppose that figure reflects the true quality of the team. Then, suppose we had a monkey who saw UM was winning but did not know why. If he picked UM every time in a series of comparably difficult future games, he would on average be accurate 74% of the time.
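
A rough sketch of that baseline, for comparison with the 65.2% quoted in the original post. The 17-6 record is hypothetical, chosen only because it lands near 74%, and `always_pick_accuracy` is an illustrative name:

```python
from typing import Sequence


def always_pick_accuracy(team_won: Sequence[bool]) -> float:
    """Hit rate of a naive picker who predicts this team to win every game."""
    if not team_won:
        raise ValueError("need at least one game")
    return sum(team_won) / len(team_won)


# Hypothetical record chosen to land near 74%: 17 wins, 6 losses.
record = [True] * 17 + [False] * 6
baseline = always_pick_accuracy(record)

# 65.2% is the straight-up figure quoted in the original post.
quoted_model_accuracy = 0.652
print(f"always-pick baseline: {baseline:.1%}")
print(f"model beats baseline: {quoted_model_accuracy > baseline}")
```

The comparison only means something if both numbers cover the same set of games, which is an assumption here.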

Joined: 09/22/2009

MGoPoints: 2541

Gulogulo37

March 1st, 2014 at 6:51 PM ^

OK, what's your point? Kenpom isn't as good as an observant monkey at predicting results?

Joined: 03/16/2010

MGoPoints: 28316

Nitro

March 1st, 2014 at 10:26 PM ^

Actually, I think if we played a series of comparably difficult future games, we'd win more than 74% the way we're playing now. Unless your definition of comparably difficult is relative. But then who knows. A basketball team isn't a set of dice or coin flips -- it doesn't converge to an average over time.

Joined: 04/27/2013

MGoPoints: 17041

2timeloozer

March 1st, 2014 at 5:06 PM ^

Most times the obvious favorite wins. 70% against the spread would be impressive. 51% against the spread would be impressive. 70% straight-up, not so much.

Joined: 04/29/2012

MGoPoints: 1151

deemarsah

March 2nd, 2014 at 1:08 PM ^

Why did it take so many posts for somebody to make this obvious point? And why have you not been posbanged to 10000?

Joined: 11/18/2013

MGoPoints: 217

Yeoman

March 1st, 2014 at 5:10 PM ^

...a bit older, he stopped running it in 2006. The winner for best predictive method each year, using a mean-square-error method?

2006 Winner: The Vegas Line

2005 Winner: The Vegas Line

2004 Winner: The Vegas Line

2003 Winner: The Vegas Line

2002 Winner: The Vegas Line

2001 Winner: The Vegas Line

2000 Winner: The Vegas Line

1999 Winner: The Vegas Line

That's monotonous, and I see why he stopped. There were 56 systems tested, and not one ever beat Vegas in 8 seasons.

There was some thinking that this might be because computer methods are pretty worthless in the early part of the season, so he did the same for just the second halves. Vegas won seven years out of 8.

http://tbeck.freeshell.org/fb/awards2006.html
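
For reference, a mean-square-error comparison along these lines can be sketched as below. This is just the textbook definition applied to invented margins; the linked page's exact procedure is not described here, and the system names and numbers are hypothetical.

```python
from typing import Dict, List


def mean_squared_error(predicted: List[float], actual: List[float]) -> float:
    """Average of squared differences between predicted and actual margins."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical final margins and two hypothetical systems' predicted margins.
actual_margins = [7.0, -3.0, 14.0, 1.0]
systems: Dict[str, List[float]] = {
    "system_a": [3.0, 2.0, 10.0, -4.0],
    "system_b": [6.0, -1.0, 9.0, 3.0],
}

scores = {name: mean_squared_error(preds, actual_margins) for name, preds in systems.items()}
best = min(scores, key=scores.get)  # lowest error wins
print(scores, "best:", best)
```

Squaring the misses punishes one big whiff more than several small ones, which is why this ranking can differ from a straight-up win percentage.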

Joined: 06/08/2011

MGoPoints: 21390

turd ferguson

March 1st, 2014 at 5:25 PM ^

Thanks for posting this. I had this on my mind, figuring that Vegas is probably a couple steps ahead of everyone.

If one of these algorithms started beating Vegas odds with any regularity, the Vegas oddsmakers would just incorporate information from those algorithms into how they set their lines. So the creator of one of these widely known algorithms (or a widely known site with an algorithm) might get ahead early, but I'd be shocked to see him stay ahead.

Joined: 12/09/2009

MGoPoints: 26493

swan flu

March 1st, 2014 at 5:42 PM ^

Am I the only cynical ass-hole who sees this and thinks Vegas is doing something unethical to obtain these results?

Joined: 08/16/2010

MGoPoints: 9374

Nitro

March 1st, 2014 at 9:43 PM ^

I wouldn't call it being cynical to raise a reasonable possibility. But I don't think being ethical would prevent Vegas from beating computers. My guess is they start with a computer prediction, then tweak it based on some human analysis, and then make a final adjustment based on how much a team's fanbase tends to wager regardless of the spread (or expected wagering based on present popular sentiment). Like, if Notre Dame is playing Purdue, and their analysis determines ND -10, they'll put the spread at ND -14 since more people will be wagering on ND just because. Remember, they're in the business of making money -- they're not even trying to beat the computers. But in that regard, it's BIG money, so they probably sprinkle some shadiness on top (maybe indirectly by influencing popular sentiment via some media control).

Joined: 04/27/2013

MGoPoints: 17041

BiSB

March 1st, 2014 at 8:47 PM ^

Because Vegas can react to stuff a passive algorithm wouldn't pick up, like injuries, suspensions, etc.

Joined: 08/15/2009

MGoPoints: 46071

docwhoblocked

March 1st, 2014 at 5:14 PM ^

There are marketplaces for predictions of elections and other future events. I am no expert, but doesn't the Vegas line just depend on who is betting, with the books just trying to balance the money on both sides? Is that not somewhat like crowdsourcing/prediction markets?

I found this article (admittedly written by Ohio economists) that suggests that the betting line is not very predictive for NCAA B ball but is better for NFL and NCAA football. They say overall betting markets like the Vegas line are not all that predictive. Hmmmm

http://www.econ.ohio-state.edu/trevon/pdf/BettingMarketPaper_01-11-12.p…

Joined: 09/05/2010

MGoPoints: 632

deemarsah

March 1st, 2014 at 5:17 PM ^

Actually has a Michigan connection and is a pretty good guy.

Joined: 11/18/2013

MGoPoints: 217

turd ferguson

March 1st, 2014 at 5:28 PM ^

My understanding is that no, Vegas isn't just trying to balance the money on each side of a line. They're likely smart enough to take advantage of inefficiencies in these public betting markets.

Joined: 12/09/2009

MGoPoints: 26493

Yeoman

March 1st, 2014 at 5:32 PM ^

Not sure about the second link--I can't find any description of how he determined "the Vegas line" and the website hasn't been touched in years so I doubt I'm going to find out.

But the first link specifically used the opening line, before crowdsourcing kicked in.

Combining this with your link, I think what we've got is some evidence that a candidate for a Vegas-beating system is to fade moves in the Vegas line. The paid professionals are better than the crowd.

Joined: 06/08/2011

MGoPoints: 21390

Nitro

March 1st, 2014 at 10:21 PM ^

The fact that the opening line performs better than the game time line confirms this. But it's a marginal difference that would only pay off over a large number of bets. Personally, I think you'd be better off getting, say, NBA league pass, watching a lot and gaining a good understanding of the current ebb-and-flow, and betting NBA games based on your intuition (if you start seeing it developing into something you could trust).

Like, if a team plays a fundamentally solid game and moves the ball well on offense, it's a good bet they're gonna hammer the Pistons by more than the spread says the rest of this season.

Or, if you watched the Pistons play the past 2 seasons (or the NJ Nets in the seasons immediately following their run as a contender), you'd know that Lawrence Frank is a basketball doofus. So when you see Jason Kidd cast him aside because of disagreements, you'd understand this to mean Kidd wants to do something right, so you'd bet on the Nets outperforming the spread for a stretch.

Or, take the Spartans. You could see Izzo making all these excuses, understand this will make his team feel like they have reasonable excuse to fall back on if they lose, see the malaise setting in as a result, and bet they'll struggle against the spread. Not sure how that will work going forward, since their latest loss could snap some urgency into them, but I can see that urgency still taking a game or two to set in, especially given that they're not playing for a title anymore and the team will probably be thinking they can rely on a boost from getting Dawson back before they realize it's really not that much of a boost after all.

There are many specific circumstances computers and patterns can't account for on an individual level.

Joined: 04/27/2013

MGoPoints: 17041

singler makes …

March 1st, 2014 at 5:58 PM ^

Is there a point to this thread besides your passive aggressiveness?

EDIT: that came off mean-spirited, so let me add this: it is great that you have taken an interest in predictive models. They are fun to examine and even more fun to build. I'd suggest that rather than posting random threads where you multiply two numbers and post a list, or other threads which seem to say "HAHA Kenpom sucks" (although not in those words) that you take the time to read about the various models and how they work.

In your previous thread it looked to me that people were more attacking your concept than you. Take it and learn.

SECOND EDIT: Screw it, I’ll enjoy Bolivia.

This series of threads represents pretty much everything wrong with the internet. Anyone can spew whatever they want, and the result is society fails and people like AJ Mass are employed journalists. Sure, here it is just sports, but the same stuff has terrible outcomes on science (won’t get into it cause of the no politics stuff). Just because you can do basic math doesn’t mean it has any value whatsoever.

Kenpom has spent years trying to perfect a predictive model that most of us wouldn’t be able to understand. He has essentially dedicated his life to the issue. But hey guys, look at this, I can take his values and then multiply by another number and get a different ranking! Oh wait, you don’t like this thread! Delete it so I can make another one where I give an N of 1 with absolutely no context.

Please just stop for a second and try to learn what others have done before presenting your latest “findings”.

I’m done.

Joined: 10/08/2011

MGoPoints: 204

BJNavarre

March 1st, 2014 at 8:45 PM ^

Nail on the head

Joined: 07/02/2008

MGoPoints: 4434

straight-gangs…

March 1st, 2014 at 5:31 PM ^

I just think people were questioning the rationale when Kenpom already takes SOS into consideration, and therefore I think there was an issue with what your rankings truly represented. Absolutely appreciate the effort and enjoy digging into the numbers, Vegas lines, etc. I'm assuming you're simply trying to highlight the fact that Kenpom isn't perfect when predicting. Not surprised......I would always lean toward Vegas for true odds.

Joined: 06/03/2011

MGoPoints: 529

B-Nut-GoBlue

March 1st, 2014 at 5:39 PM ^

Really naive and ignorant question here: What is "Vegas"? Is there a secret corporation that hires a relatively small collection of really smart people that also know sports (not that the two are always mutual) and they just sit around and study to make the odds for the casinos? Or...?

Joined: 09/30/2011

MGoPoints: 43665