Researcher's Model Predicts Where a Recruit Will Commit with 70% Accuracy

Submitted by Caesar on

A University of Iowa PhD candidate came up with a predictive model (link). Where it was wrong, the article says, the school the recruit chose was second on the model's list of likely destinations.

Her data is based on analyzing tweets from recruits. Here's the meat and potatoes of it:

She mined data from 573 athletes in 2016 from the 247Sports recruiting database who had at least two Division I scholarship offers and public Twitter accounts. Then she pulled their tweets, followers and accounts they followed each month and distilled the data into a model that makes it all easy to understand.

She found that if a recruit tweeted a hashtag about a school, his likelihood of committing there jumped 300 percent. For every coach the athlete followed from a given school, his likelihood of committing went up 47 percent. When a coach follows an athlete, likelihood increases 40 percent.

“The most significant actions online are the actions the athlete is doing,” Bigsby said. “Who is he following? What is he tweeting? What hashtags is he using?”

Her model crunched those numbers along with other data sets — i.e.: a college’s location relative to the recruit’s home town, a college’s academic ranking, a college’s recent football performance, and more — and spit out a list of universities a recruit was likely to attend, along with each school’s odds.
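For anyone who wants to see how those percentages could stack up, here is a rough Python sketch. It is not her model. It assumes the quoted increases act as independent multipliers on a school's relative weight (so "+300 percent" becomes a factor of 4), and the school names and signals are made up for illustration.

```python
# Multipliers read off the quoted percentages. Treating them as
# independent multiplicative factors is an assumption of this sketch.
HASHTAG_FACTOR = 4.00         # "+300 percent" when the recruit tweets a school hashtag
COACH_FOLLOWED_FACTOR = 1.47  # "+47 percent" per coach the recruit follows
COACH_FOLLOWS_FACTOR = 1.40   # "+40 percent" when a coach follows the recruit


def relative_weight(tweeted_hashtag, coaches_followed, coach_follows_athlete):
    """Return an unnormalized weight for one school, starting from 1.0."""
    weight = 1.0
    if tweeted_hashtag:
        weight *= HASHTAG_FACTOR
    weight *= COACH_FOLLOWED_FACTOR ** coaches_followed
    if coach_follows_athlete:
        weight *= COACH_FOLLOWS_FACTOR
    return weight


def rank_schools(signals):
    """Normalize the weights so they sum to 1, then sort highest first."""
    weights = {school: relative_weight(*s) for school, s in signals.items()}
    total = sum(weights.values())
    shares = {school: w / total for school, w in weights.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


# Hypothetical recruit: (tweeted hashtag?, coaches followed, coach follows him?)
signals = {
    "School A": (True, 2, True),
    "School B": (False, 1, False),
    "School C": (False, 0, False),
}
ranking = rank_schools(signals)
for school, share in ranking:
    print(f"{school}: {share:.1%}")
```

The real model also folds in distance, academics and on-field results, which this sketch leaves out entirely.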

Lancer

January 18th, 2018 at 7:00 PM ^

I get you. I also concede that a model's ability to predict an outcome is not always the be-all and end-all. Understanding the relationship between the variables in question and their contribution to the outcome is also important.

mgobaran

January 19th, 2018 at 9:51 AM ^

Is that because every crystal ball analyst is able to change their mind at the last minute based on leaked information? Take Tyler Friday, for example. Everyone and their brother switched CBs from Michigan to OSU in one day about a week and a half ago. On January 9th the crystal ball predicted 85% Michigan. On 1/10/18 it switched to 91% OSU. If he chooses OSU, they are all "right" after being wrong for months and months. If her model showed Michigan 70%, OSU 30% for the past 6+ months, and he chooses OSU, isn't she more right by being less wrong?
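One way to put a number on "less wrong" is a Brier-style score: the squared gap between the probability a forecast gave the eventual outcome and 1. The sketch below uses the figures from this comment and assumes, purely for illustration, that he picks OSU and that the 15% the crystal ball did not give Michigan on January 9th all belonged to OSU. The 70/30 model split is the hypothetical from above, not a published number.

```python
def brier(prob_on_actual_outcome):
    """Squared error between the forecast and what happened (0 is perfect)."""
    return (1.0 - prob_on_actual_outcome) ** 2


# Probability each forecast gave OSU, the assumed eventual choice.
forecasts_for_osu = {
    "crystal ball, Jan 9": 0.15,   # 85% Michigan, remainder assumed to be OSU
    "crystal ball, Jan 10": 0.91,  # after the one-day flip
    "hypothetical model": 0.30,    # the 70/30 split imagined above
}

scores = {name: brier(p) for name, p in forecasts_for_osu.items()}
for name, score in scores.items():
    print(f"{name}: {score:.4f}")
```

Lower is better, so under these assumptions the steady 70/30 forecast beats the pre-flip crystal ball, and the post-flip crystal ball beats both.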

 

bronxblue

January 18th, 2018 at 6:51 PM ^

It's a cool idea, but I wonder how good it is with higher-rated recruits. That's sort of where it matters; predicting that a 3* kid goes to the local college isn't particularly hard.

Swang on These

January 18th, 2018 at 6:54 PM ^

Overfitting, nice. I don't need a model to tell you a recruit from Austin is 300x more likely to commit to Texas than Hawaii.

My guess would be that each recruit really only tweets about and follows coaches from the 2-3 schools he is seriously considering. So by the time you have that feature to include, each of those schools is instantly 300x more likely than Colorado School of Mines.

If the model could on a cold start predict 70% of where kids are going to college in 2018 that would be more impressive.

Swang on These

January 18th, 2018 at 8:12 PM ^

This isn't a logistic regression done in a terrarium. It's a bounded choice model that updates probabilities as certain events happen.

The major point is that when a recruit mentions a school the probability goes up. So it can determine P(School|Tweet) very well. 

If you put the model in the context of making a general prediction about where a recruit will go, it will perform very poorly because it is 'overfit' to that occurrence.

tl;dr: by the time one team gets up by 20, it's likely to win. Now use that model to predict who's going to win at kickoff.
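To make the P(School|Tweet) point concrete, here is a toy Bayes update in Python. Every number in it is invented: a flat prior over four schools stands in for the "cold start", and the likelihoods are a guess at how often a recruit tweets a Texas hashtag given where he ends up. It is a sketch of the idea, not the model from the article.

```python
def posterior(prior, likelihood):
    """Bayes' rule: P(school | tweet) is proportional to P(tweet | school) * P(school)."""
    unnormalized = {school: prior[school] * likelihood[school] for school in prior}
    total = sum(unnormalized.values())
    return {school: value / total for school, value in unnormalized.items()}


# Cold start: nothing known yet, so every school gets the same prior (assumed).
prior = {"Texas": 0.25, "Oklahoma": 0.25, "LSU": 0.25, "Hawaii": 0.25}

# Hypothetical chance of seeing a Texas hashtag given the eventual school.
likelihood = {"Texas": 0.60, "Oklahoma": 0.05, "LSU": 0.05, "Hawaii": 0.05}

post = posterior(prior, likelihood)
for school, prob in post.items():
    print(f"{school}: {prior[school]:.0%} -> {prob:.0%}")
```

Before the tweet the model has nothing to separate the schools; after it, nearly all the probability sits on one of them. That is the kickoff-versus-up-by-20 difference.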

Seth

January 18th, 2018 at 6:57 PM ^

Northwestern was using a similar model, and we use these same things unscientifically, plus visits and whose media they talk to. The Northwestern article noted (but didn't make the connection) that when the models are wrong it's often because the recruits are going to the schools most notorious for buying players: Bama, Georgia, FSU, Clemson, previously Ole Miss.

FauxMo

January 18th, 2018 at 7:00 PM ^

FYI, for those interested, this is called "web scraping," and algorithms have been written to scour social media sites (mostly) for consumer sentiment about companies, brands and products. Those results have then been used to go long/short on companies, and been pretty successful. There is even an ETF (BUZ) that trades on these principles... 

StephenRKass

January 18th, 2018 at 7:44 PM ^

Very interesting. It is scary how much people can use stuff like that to predict and find what someone will do, or has done.

I have a former roommate and dear friend from Michigan who has done an awful lot in statistics and data mining. It turns out that if you have someone's 5-digit zip code, birthdate, and gender, and they've registered to vote, you can often get a positive and total identification. Insurance companies figured this out. They were able to combine medical studies that listed the zip, DOB, and gender with voter registration lists to come up with positive identifications. However, it became apparent that unscrupulous insurance companies could use this data to deny coverage to identified individuals in a high-risk pool. Hence the HIPAA regulation, to help protect personal privacy.
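The matching itself is nothing fancy. Here is a minimal Python sketch of that kind of join, with completely made-up records and field names: index the voter list by (zip, DOB, gender), then look up each "anonymous" study row and keep only the cases where exactly one voter fits.

```python
from collections import defaultdict


def reidentify(study_rows, voter_rows):
    """Map study record IDs to a name when exactly one voter shares the key."""
    voters_by_key = defaultdict(list)
    for voter in voter_rows:
        key = (voter["zip"], voter["dob"], voter["gender"])
        voters_by_key[key].append(voter["name"])

    matches = {}
    for row in study_rows:
        key = (row["zip"], row["dob"], row["gender"])
        names = voters_by_key.get(key, [])
        if len(names) == 1:  # a unique hit is a positive identification
            matches[row["record_id"]] = names[0]
    return matches


# Fabricated example data, for illustration only.
study_rows = [
    {"record_id": "S1", "zip": "48104", "dob": "1970-03-14", "gender": "M"},
    {"record_id": "S2", "zip": "48105", "dob": "1982-11-02", "gender": "F"},
]
voter_rows = [
    {"name": "Voter One", "zip": "48104", "dob": "1970-03-14", "gender": "M"},
    {"name": "Voter Two", "zip": "48105", "dob": "1982-11-02", "gender": "F"},
    {"name": "Voter Three", "zip": "48105", "dob": "1982-11-02", "gender": "F"},
]

matches = reidentify(study_rows, voter_rows)
print(matches)
```

In this toy data S1 matches one voter and is identified, while S2 matches two voters and stays ambiguous.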

Here's a link for any of you who are nerds.

LINK:  https://www.forbes.com/sites/adamtanner/2013/04/25/harvard-professor-re-identifies-anonymous-volunteers-in-dna-study/#7a9b55dc92c9

What I'll say is that if you don't want your actions predicted, or information about you known by others, you want to be careful about how much you share. There are apparently data mining bots out there (and companies behind them) who know much more about you than you might want. This isn't my world, and I don't have a lot to hide, but it is fascinating.

East German Judge

January 18th, 2018 at 9:56 PM ^

While I get the pursuit-of-knowledge bit, and while I do not have a PhD (but I did stay in a Holiday Inn Express last night), the university you are at also provides a fair amount of financial support for candidates, and a committee needs to approve your choice of thesis, amirite? If so, our corn friends think this is a worthwhile academic pursuit - WTF!

copacetic

January 18th, 2018 at 8:11 PM ^

Couldn't find it in the article, but did the data only account for recruits not currently committed to a school? 

 

A verbal commit somewhere is probably a lot more likely to use that school's hashtags and follow their coaches, and then of course they're a lot more likely to end up with that school come signing period. 


So if a Michigan commit tweets #GoBlue 20 times during the season, and eventually signs with us, wouldn't that make it seem more likely that recruits who tweet #GoBlue are more likely to sign with Michigan? Seems kind of obvious, but I feel like that could skew the data.
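If the data does include committed recruits, one way to guard against that skew would be to count only what a recruit tweeted before his verbal commitment. The article doesn't say whether she did this, so the Python below is just a sketch of the filter, with a hypothetical tweet format and invented dates.

```python
from datetime import date


def pre_commit_tweets(tweets, commit_date):
    """Keep only tweets posted before the verbal commitment, if there was one."""
    if commit_date is None:  # still uncommitted: everything counts
        return list(tweets)
    return [tweet for tweet in tweets if tweet["date"] < commit_date]


# Invented example: one hashtag before the commitment, two after.
tweets = [
    {"date": date(2017, 6, 1), "text": "Great visit! #GoBlue"},
    {"date": date(2017, 9, 15), "text": "Game day #GoBlue"},
    {"date": date(2017, 11, 20), "text": "Beat them #GoBlue"},
]
commit_date = date(2017, 8, 1)

usable = pre_commit_tweets(tweets, commit_date)
print(len(usable), "of", len(tweets), "tweets predate the commitment")
```

Only the June tweet survives the filter here, so the two post-commitment #GoBlue tweets can't inflate the hashtag effect.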

WestQuad

January 19th, 2018 at 9:18 AM ^

I just looked through Nicholas Petit-Frere's Twitter and have no idea where he might be leaning. He has official visits set up for Alabama, Florida and Ohio State in the next couple of weeks and is also fielding visits from their coaches. Anyone have any info on how having our bowl practices at his high school went? Are we still in it? This site says he's going to Florida, but I've never heard of the site. This fall Sam Webb said the kid was pretty tight-lipped (smart) about his recruiting/favorites. Any wild speculation?