things go poorly
Update at bottom
Update 2 at bottom
Note: This is a long and complex read. I know that. I'm looking for assistance with a project I'm working on that I know everyone will be interested in. If you wish to skip all of the reading, I have summarized everything in bullet points at the bottom.
I had hoped to keep this my little secret until I was completely done and I could unveil everything at once, but I no longer believe that I could do this project as efficiently without some other input. As an engineer, I require myself to do everything with as high efficiency as possible so I must petition the MGoBlog community for help.
As many (more likely all, since you're on a site like this) of you are aware, more and more threads are being posted that essentially go like this:
Poster 1: "We're going after slot-dot X and he's only 3 stars! Argh! Doesn't RichRod understand he's not at WVU anymore and he needs to get MICHIGAN quality recruits. RichRod=Fail."
Poster 2: "Stars don't matter, obviously RichRod thinks that he's good enough and that's good enough for me."
Poster 3: "Rankings are early, they'll change, just settle down for now"
Poster 2: "He's only 3 stars but look at his offer sheet, I'd take someone that's 3 stars with offers from USC, OSU, UF, 'Bama, etc. over a 5 star with offers from us and the MAC."
Poster 1: "Stars do matter, you need talent!"
Poster 2: "Mike Hart, Braylon Edwards... nuff said"
And so on and so on.
So, I started thinking about rankings and their usefulness at predicting future college and pro success. To that end, I'm going to undertake what I believe will be the largest statistical analysis of recruiting rankings to date. But I need some help.
Let me describe what I'm planning on doing, what I've already done to accomplish that goal, and what I still need to do. Then I'll finally be able to show everyone what I need help with. You'll also be informed enough to offer criticisms, advice, and ask questions if necessary.
1- What I plan on doing
I'm going to take all recruiting data from Scout and Rivals from 2002-2009. As of right now, that includes: name, positional rank, number of stars, HT/WT/40, position, hometown, and home state. I'm then going to also compile data on how many starts each player had in each year of his career, if he redshirted, if he left early for the draft (manifested as number of years of eligibility remaining), the number of All-Conference honors received, and the number of All-American honors received. I will also take information on if they were drafted, what round they were drafted in, what overall number they were drafted as, what position they were drafted for, and what team they went to.
Once I have all of that data, I will first do a top-level analysis to see, independent of everything else, how star rankings alone are at predicting collegiate and pro success as defined by the stats that I will have collected above.
From then on, I will keep trying to dig further to get more and more relevant models and conclusions. This will include but will not be limited to how the average rankings of the other players around a given player (independent of that player's rankings) affect collegiate/pro success, the number of blue-chip recruits that completely fail, the number of blue-chip recruits that leave their home state, the average team ranking, success of rankings at predicting success at each individual position, the effect of positional ranking on future success, etc.
I'm going to try to come up with as many ways as possible to analyze the data that either decouples the data or gives conclusions that are independent of coupling. Figuring out how to do that will be difficult but fun.
As a side note, this will also let me eventually compare Scout and Rivals to say with some authority, whose [final] rankings are more accurate.
Of course, I will also apply standard statistical analysis procedures to determine whether my conclusions can be deemed statistically significant (I don't know at what confidence level yet, so don't ask).
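A minimal sketch of what the top-level analysis could look like, assuming each player is reduced to a (stars, was_drafted) pair. The records below are invented toy data, not real recruits, and the function names are hypothetical. It computes the draft rate for each star level and the Pearson chi-square statistic for whether stars and draft status look independent.

```python
from collections import defaultdict

# Toy records invented for illustration only: (stars, was_drafted).
records = [
    (5, True), (5, True), (5, False),
    (4, True), (4, False), (4, False),
    (3, False), (3, True), (3, False), (3, False),
    (2, False), (2, False),
]

def draft_rate_by_stars(rows):
    """Return {stars: fraction of those players who were drafted}."""
    counts = defaultdict(lambda: [0, 0])  # stars -> [drafted, total]
    for stars, drafted in rows:
        counts[stars][0] += int(drafted)
        counts[stars][1] += 1
    return {s: d / t for s, (d, t) in counts.items()}

def chi_square(rows):
    """Pearson chi-square statistic for a stars-by-drafted table."""
    counts = defaultdict(lambda: [0, 0])  # stars -> [not drafted, drafted]
    for stars, drafted in rows:
        counts[stars][int(drafted)] += 1
    n = len(rows)
    col_totals = [sum(c[j] for c in counts.values()) for j in (0, 1)]
    stat = 0.0
    for c in counts.values():
        row_total = sum(c)
        for j in (0, 1):
            # Expected count if stars and draft status were independent.
            expected = row_total * col_totals[j] / n
            stat += (c[j] - expected) ** 2 / expected
    return stat

print(draft_rate_by_stars(records))
print(chi_square(records))
```

The statistic would then be compared against a chi-square distribution with (star levels - 1) x (outcomes - 1) degrees of freedom, 3 for this toy table, at whatever confidence level ends up being chosen. A sample this small is far too thin for the test to mean anything; it only shows the mechanics.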
2- What I have done
It's all well and good to have thought all of this out (I'd be willing to wager that at least one other person currently reading this has thought about it too), but thinking alone won't get any of us anywhere. So, I've started on a lot of the grunt work as a sign of my commitment, so that people will understand that I'm dedicated enough to make helping me worth their time.
I have already collected all of the information from Rivals for every class and every player.
So, for the classes from 2002-2009, I have every name, positional rank, Rivals Rating (RR), star rating, position (as Rivals breaks it down), and what school they committed to.
I have also created an Excel spreadsheet template that will allow me (once I get all of that data) to merely copy and paste a few things from Rivals and all of the data that I have on every player will be retrieved. With that, I will be able to create a spreadsheet for every BCS team (as Rivals only has complete listings for BCS teams) which will have every class and all of the data for each kid in every class all in one spot. Then I'll be able to do my analyses more easily.
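The lookup that the spreadsheet template performs can be sketched outside Excel too. This is only an illustration under assumptions: the field names, the sample rows, and the helper functions are all hypothetical, and keying on a normalized name plus class year is one possible choice, not necessarily how the template works.

```python
def player_key(name, class_year):
    """Normalize a name so 'A.J. Example' and 'aj example' share one key."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum())
    return (cleaned, class_year)

def build_index(rows):
    """Map each (normalized name, class year) key to its full record."""
    return {player_key(r["name"], r["year"]): r for r in rows}

def lookup(index, name, class_year):
    """Return the stored record for a player, or None if there is none."""
    return index.get(player_key(name, class_year))

# Placeholder rows, not real recruits.
rivals_rows = [
    {"name": "A.J. Example", "year": 2008, "stars": 4, "school": "Michigan"},
    {"name": "Sample Player", "year": 2009, "stars": 3, "school": "Ohio State"},
]

index = build_index(rivals_rows)
print(lookup(index, "aj example", 2008))
print(lookup(index, "Nobody Here", 2008))
```

Including the class year in the key keeps two different players who share a name from overwriting each other, although same-name players in the same class would still collide and would need an extra field such as position or hometown.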
3- What I still need to do
Obviously I'm still not done with the collecting data/grunt work as I still have to take all of Scout's data. It's taking a little while because of the way that they format their data compared to Rivals. Fortunately, I have solved the problem and can now do the usual copy and paste (followed by several other things to make it all work).
I'm considering also grabbing data from ESPN but I'm really not sure if it's even worth it. They only have data from 2007-2009 (I believe), so that doesn't even include a class that has been drafted yet.
More importantly, I need to find a source for the other data that I'm trying to collect. I need to find some place(s) that list all of the following:
- If a player redshirted
- Number of starts each year
- Every All-Conference team (not just first team) for all BCS conferences starting from 2002
- Every All-American team starting from 2002
- Every transfer since 2002
- What position each player was drafted for
- Individual player positional statistics (e.g. completion percentage, interceptions, tackles for loss, etc.)
There is also some other data that I’m going to try and collect but I already have sources for that so it need not be listed here.
4- What I need help with
I need help finding the data that I list above. Pieces of it are available everywhere but I haven’t found a single site that has a repository of all the information implied in even just one of those points above.
Additionally, getting individual statistics is extremely hard, even though it would allow more comparisons than possibly anything else. There are literally tens of thousands of players; there were over 1000 wide receivers in 2009 alone! There are simply too many players to go to each player’s individual profile page and collect the data by hand. I, unfortunately, require lists, unless there is some tool or way to automate the data collection process. I know of no such way, and that is one of the reasons I’m asking the MGoBlog community for help: I don’t necessarily know everything that I could do to make this project as easy as possible (at least on the data collection front).
I’d also like to find a way to collect data on all of the schools that have officially offered a kid a scholarship, to see whether stars or scholarship offers are, statistically speaking, the better measure of a kid’s future ability. Again, I can’t go to every Rivals profile page to collect that data. This is one area where I feel that, since the pages are so similar, it might be possible to write some sort of script to do the work for me. Unfortunately, I’m a ChemE and MSE person, not a CSE person (for those of you outside engineering, that’s Chemical Engineering, Materials Science and Engineering, and Computer Science and Engineering respectively), so I don’t know what tool or utility I would use to accomplish that. I am in Tech. Services, so I’m sure that if someone pointed me to the appropriate tool and maybe some documentation on how to use it, I wouldn’t have any problems.
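One way such a script might look, as a rough sketch only. It uses Python's standard `html.parser` module and assumes, purely for illustration, that each profile page lists offers as `<li class="offer">` elements. The real Rivals markup is not known here, so the tag and class names, the sample page, and the `extract_offers` helper are all hypothetical.

```python
from html.parser import HTMLParser

class OfferParser(HTMLParser):
    """Collect the text of <li class="offer"> elements (assumed markup)."""

    def __init__(self):
        super().__init__()
        self.offers = []
        self._in_offer = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "li" and ("class", "offer") in attrs:
            self._in_offer = True

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_offer = False

    def handle_data(self, data):
        if self._in_offer and data.strip():
            self.offers.append(data.strip())

def extract_offers(html):
    """Return the list of offering schools found in one page's HTML."""
    parser = OfferParser()
    parser.feed(html)
    return parser.offers

# Made-up page standing in for a downloaded profile.
sample_page = """
<h1>Example Recruit</h1>
<ul>
  <li class="offer">Michigan</li>
  <li class="offer">Ohio State</li>
  <li class="visit">Michigan</li>
</ul>
"""

print(extract_offers(sample_page))
```

Because the pages are so similar, the same function could be run in a loop over every saved or downloaded profile page (the standard library's `urllib.request.urlopen` can fetch a page by URL), with only the tag test adjusted to match whatever the real pages use.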
I know that what I wrote above was long so here’s the summary (whether you read everything preceding this or not).
I’m going to perform a statistical analysis on Scout and Rivals to determine how good their final star ratings and positional rankings are at predicting future success both in college and the pros. To do so, I have already collected the data from Rivals and am currently working on collecting data from Scout. I will probably not take data from ESPN although that is not a certainty.
To determine collegiate success I will take data that includes but is not limited to All-Conference honors, All-American honors, and the number of starts. To determine pro success I will take into consideration where a player was drafted and for what position.
I know where to acquire some of the information that I need but I still need help finding useful places to take large amounts of data on:
- All-Conference teams
- What position each player was drafted for
- Number of starts by each player
I would also like to find a way to automate data collection, specifically with an eye towards collecting data on what schools offered each kid a scholarship. Since there are tens of thousands of kids, this cannot be done individually but must somehow be automated. I do not know how to do that and am thus asking for help. The same situation applies to collecting individual, position-specific statistics on each kid.
If anyone would like to help me out with what I have asked, then I would greatly appreciate it. Any criticisms will be well-received (or at least as well-received as I can) and taken into account. Any comments or other thoughts are also welcome and appreciated.
For more information, read the sections above.
Since so many people have responded with helpful ideas, if you wish to contact me with anything that you either don't want to post in the comments, is too long and complicated for the comments, or that you wish to have a more private dialogue about then email me at:
That's not my main email so I won't check it as often (i.e. not every 20 minutes) but I'll try to check it at least once a day. If you want to send me anything, links or other work that you've done that might help me, then send it there.
Thanks for all the great ideas and please keep them coming. I'm still thinking about ways to handicap teams that have a lot or a little talent relative to the average (for reasons that are too long to fully explain in this update, although there are some interesting thoughts on why and how in the comments below). I'm also looking for ways to automate the data collection process. There are a few suggestions below but I'm going to be looking for more so please tell me.
Again, I prefer using the comments if possible but if not then email me.
Update 2: 3-27-09
Well, it's been pointed out in the comments and confirmed by me that the email address listed above doesn't work. That's because I had a small typo. Of course, small typos in email addresses are big typos.
Anyways, the correct email address is: Bleedin9Blue@gmail.com
If you tried emailing me earlier with the previous email address then please try again. I appreciate your patience.
I've always admired people who are able to argue forcefully about a topic without getting personal, attacking, or losing their temper. It seems a rare quality these days, especially on the web, where the cloak of anonymity seems to lend itself to comments that, were they made in person, would likely get someone's ass kicked.
One of my best friends at M was politically the direct opposite of me. At the time we were both on our way to law school, and very much into philosophical and political debates, which sometimes degenerated into yelling. Yet it never got personal, and it was forgotten by the time we entered Rick's. And while we don't keep in touch much, I consider him one of my best friends.
About a year ago, I saw a profile of Justice Scalia on 60 minutes, and what struck me most about it was they said that the two best friends on the Supreme Court were Scalia and Ruth Bader Ginsburg, his direct opposite politically. They asked him about it and he said "I don't argue with people, I argue with ideas." I thought that was pretty rare.
While I haven't always lived up to that standard myself (especially when I read some troll on the Freep website), I try to. I've thrown around "you're an idiot" too much, and turning to this site, I see that thrown out or worse quite a bit in the majority of threads. I wonder if you guys think the overall tone of this site and the comments section is: A-about right, B-too accusatory and personal, or C-not personal and derogatory enough asshole!
Not that "you're an idiot" is not ever deserved. There was a thread the other day about poole1dan, who defined the word idiot and worse in his comments. In fact I went to youtube, where he has a channel, to tell him he was ignorant and giving the rest of us a bad name. More (perhaps) deserved scorn might go to that guy who ignores actual facts in favor of his argument (e.g. the recruiting rankings don't matter guy)... debatable anyway. Not debatable is that guy who says that Brian is an idiot (poole1dan again), or calls people he doesn't agree with idiots simply because they have the opposite opinion. You may or may not agree with Obama, but clearly he is an intelligent guy who deserves respect.
So, this may be too much of a Rodney King thread, but I'd be interested in the replies. I love a good argument, but like going to get a beer afterwards a lot more.
Now that the initial Scout 300 and Rivals 250 have both been released I thought it would be interesting to compare the two sets of early rankings. Below I have recorded both rankings for recruits of interest (if I missed any, help me out). I have also averaged the two to reach a composite ranking for each player, which I'll call the player's "preliminary consensus" ranking.
Disclaimer: I think rankings are imperfect. I know they evolve. I do not think they guarantee success or damn a player to mediocrity; but neither do I think they are worthless or arbitrary. I hope this post invites comments about the rankings themselves, but not another tired back and forth over their general validity, e.g. "rankings don't mean anything" -- "yes they do" -- "what about successful 2/3 star player X?" -- "but look at this NFL draft data" -- etc.
2010 "Preliminary Consensus"
[1] Seantrel Henderson (S:1; R:1)
[3] Marcus Lattimore (S:2; R:4)
[4.5] Lache Seastrunk (S:7; R:2)
[6] Jackson Jeffcoat (S:6; R:6)
[9.5] Jeff Luc (S:9; R:10)
[14] Kyle Prater (S:10; R:18)
[14] Jordan Hicks (S:12; R:16)
[32.5] Robert Crisp (S:36; R:29)
[45.5] Christian Green (S:54; R:37)
[50.5] William Gholston (S:57; R:44)
[54.5] Mack Brown (S:63; R:46)
[77] Chris Dunkley (S:93; R:61)
[91] Tai-ler Jones (S:81; R:101)
[107.5] Dietrich Riley (S:70; R:145)
[118.5] Corey Brown (S:168; R:69)
[119] Ricardo Miller (S:115; R:123)
[123.5] Marvin Robinson (S:148; R:99)
[127] Devin Gardner (S:77; R:177)
[133.5] Brennan Clay (S:155; R:112)
[135.5] Dior Mathis (S:137; R:134)
[139.5] Latwan Anderson (S:213; R:66)
[141] Chaz Green (S:127; R:155)
[158.5] Robert Bolden (S:83; R:234)
[162] DeJoshua Johnson (S:210; R:114)
[187] A.J. Cann (S:212; R:162)
[195] Nickell Robey (S:265; R:125)
[197.5] Jerald Robinson (S:157; R:238)
[199.5] Jeffrey Godfrey (S:215; R:184)
[202.5] Josh Furman (S:105; R:X)
[210.5] Austin White (S:121; R:X)
[220] Jay Guy (S:140; R:X)
[227] Brandon Ifill (S:154; R:X)
[240.5] Torrian Wilson (S:X; R:131)
[251.5] C.J. Olaniyan (S:203; R:X)
[257] Caleb Lavey (S:214; R:X)
[277] Cullen Christian (S:254; R:X)
[281] Nick Hill (S:262; R:X)
[288.5] Austin Gray (S:277; R:X)
[291.5] Kenny Shaw (S:283; R:X)
[296.5] Scott McVey (S:293; R:X)
Unranked of interest: Jeremy Jackson, D.J. Williamson, Lo Wood
Note on methodology: For players not ranked in the Rivals 250, I computed the Rivals half of their composite ranking with a value of 300. For players missing the Scout 300 I used 350. This is admittedly inexact. It may punish a player too much for not making the list (or not enough).
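The averaging described in the methodology note can be sketched in a few lines. The placeholder values 300 and 350 come straight from the note; the function name and the use of `None` to stand for an "X" (unranked) entry are assumptions made for this example.

```python
# Placeholder values from the methodology note above.
RIVALS_UNRANKED = 300  # used when a player is missing from the Rivals 250
SCOUT_UNRANKED = 350   # used when a player is missing from the Scout 300

def composite(scout, rivals):
    """Average the two rankings; None means the player missed that list."""
    s = SCOUT_UNRANKED if scout is None else scout
    r = RIVALS_UNRANKED if rivals is None else rivals
    return (s + r) / 2

# (Scout rank, Rivals rank) for three players from the list above.
players = {
    "Josh Furman": (105, None),
    "Torrian Wilson": (None, 131),
    "Lache Seastrunk": (7, 2),
}

# Sort by composite so the output matches the ordering used in the list.
for name in sorted(players, key=lambda n: composite(*players[n])):
    print(f"[{composite(*players[name])}] {name}")
```

Changing the two constants is an easy way to see how sensitive the ordering is to the penalty for missing a list.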
So, we've had some great posts recapping a wonderful basketball season. I've heard a lot of talk about next year, unsurprisingly, and there's only been little spatterings of our incoming talent on the interwebs that I've seen. So, here's my attempt to give a deeper look into what this roster will look like next year. Feel free to criticize away!
Darius Morris - PG 6'3" 175 ****
Matt Vogrich - SG 6'4" 180 ***
Jordan Morgan - PF 6'8" 245 ***
Blake McLimans - C 6'9" 210
Eso Akunne (pref. walk on)
This year's walk-ons were Corey Person and Eric Puls. With four scholarship players coming in and preferred walk-on Eso Akunne entering the fold, I find it unlikely (without knowing exact roster limitations) that either will be back.
First, the addition of Darius Morris will be a huge step for this program - he's a legit college PG with size. Something we haven't had since, well since before I got to Michigan, that's for sure. Morris has an above average handle and a good head on him. He's a guy that you look at and say "he's the future." It'll be fun to see what he can do with JB teaching him.
Matt Vogrich put up sensational numbers throughout his senior year. He's a pure shooter and a great fit for the Beilein system. It was nothing out of the ordinary to see headlines of "Vogrich puts up 35" all winter. The kid is a scorer.
Morgan is an interesting case. Nothing has been overly impressive about his senior season. He has very good size for a PF and could potentially be something we're missing - a big body that can rebound - yet there's been a lot of talk about a redshirt.
McLimans is certainly a project. He transferred to a top academy where he put up decent to meh numbers. I fully expect a redshirt.
Everyone likes to throw around projected starting lineups, myself included. But at this point - none of these guys have set foot on campus since their recruitment - that's kinda pointless. But looking at how the roster fills out is still interesting and provides some big questions:
Point Guard-y type players: Grady, LLP, Morris, Douglass
Guards: Harris, Douglass, Novak, Vogrich, LLP
Wings: Harris, Novak, Wright, Sims
Bigs: Sims, Gibson, Cronin, Morgan, McLimans
So, I went with PG-y type players because I feel the PG position is a real crap shoot as of now. I think one thing that can be universally agreed upon is that our best bet is for Kelvin Grady to pick it up defensively, work out his mental lapses on the offensive end, have a great off-season, and emphatically win the job. In a perfect world, we'd have that. Darius Morris would be able to learn his way through his freshman season, gaining some valuable seasoning from the bench while still having an experienced PG in control of the team. Now, if Kelvin can't make that leap, then we have issues. Does LLP split time with Grady and Morris early on? Does Douglass, who has improved tremendously and shown he's an extremely good passer, get called on to handle the responsibilities?
I'm personally ok with Douglass at point in certain instances. He's a very heady player, he's brought the ball up the court at several points in the season, he's an excellent passer, and he defends well. Obviously, he would be playing out of position though, and would be giving up foot speed to almost any B10 PG. Early in the season if there is flux and uncertainty at point, however, I can see JB looking to Douglass to handle the PG role on the offensive end with perhaps LLP also on the floor to handle the defensive side of the ball.
The other option is to throw the job at Morris, the heir apparent, and live with freshman mistakes. I'm not a huge fan. Maybe by B10 season he'll be ready to take the reins, but this is a complex system. I'd be much more in favor of the above Stu/LLP situation.
Looking to the other end, it will be very interesting to see how far Cronin can come along. IMO, ideally, Cronin will have an excellent off-season and be able to find a role as a 10-minute guy off the bench. I think asking for anything more is pushing it. From all accounts he still plays soft and small. He's been away from basketball for several months now. It would take a very good off-season to get to the point where JB can use him as the first big off the bench. That said, I wouldn't be surprised if we see more of the small lineups that we had this year (at least to start the season). That means Novak will be our everything guy, yet again. Perhaps Ant can help fill in that role as well, but judging by this year, who knows.
Whether or not Jordan Morgan can come in and contribute as a big off the bench is a big question. If he and Cronin are able to step in and provide quality minutes, we could get the Sims/Gibson starting lineup that many of us crave. That's an ideal situation, however. I would expect us to still play small with either Cronin or Morgan providing spot minutes and at times allowing Sims and Gibson to play together when necessary (for those UConn, OU, Illinois type games).
What happens with the bigs will heavily affect how much playing time Matt Vogrich gets, IMO. With LLP, Douglass, and Novak providing the 2/3 outside shooter roles, and Manny being Manny, Vogrich is going to have a tough time finding PT. If we go small then I think it's likely that we see Vogrich get some time. However, if our bigs play out in an ideal fashion, Vogrich may be the odd man out. It's truly amazing the depth we have at 2/3. There are so many possible lineup combinations that it hurts to even consider it. But in the end, Vogrich is competing with Douglass, Novak, LLP, Manny, and Ant for PT. Despite the fact that the guy is supposed to be lights out from the arc, I find it hard to see where he'll find the floor with all these other guys already with 1 or 2 years of experience in Beilein's system.
So, all in all, a lot depends on the off-season development of Kelvin Grady and Ben Cronin. Those are the two guys who, if they work extra hard in the off-season and make a deep commitment to truly getting better, can really change the face of this team. Otherwise, we could be looking to freshmen (Morris at point, Morgan as reserve big) to be taking over important roles. The talent for this team is at a level that can be compared to the senior season of Dion Harris, Courtney Sims, et al. It'll be interesting to see if they can take advantage of that talent better than Amaker's final team could.
When Manny Harris fouled out yesterday, my first thought was I hate these refs. My second thought was oh my god if we somehow pull this out John Beilein's benching of Manny Harris during the Iowa overtime will go down in Michigan basketball lore.
At the time of the benching, most fans were pissed because they thought his coaching arrogance had cost us a trip to the tourney. JB claimed he did it because it was "best for the team in the long run". Yesterday showed us what JB meant by the long run. The Iowa overtime gave the team experience playing important minutes without Manny. Remember, the first time Manny missed important minutes (after the ejection), the team totally imploded and quit. JB wanted to teach the team that they are capable of competing against good teams without Manny. Although they got smoked in the overtime against Iowa, they never quit during the overtime, just as they never quit yesterday (while also going against shit refs in both games). JB's most controversial coaching decision almost resulted in the team going to the sweet 16.
In fact, I think this, more than anything else, proves why John Beilein is such a great coach. John Beilein's most controversial decisions, such as benching Manny Harris during the Iowa overtime, recruiting two low rated Hoosiers over Detroit leftovers, and inserting CJ Lee into the starting lineup, end up being his most applauded decisions in the long run.
It is REALLY early to start looking ahead to next season for the basketball team, but after such a successful turnaround this year it is hard not to get excited for the future. I thought I'd take a first look at how the minutes are going to break down for next season.
First, let's make some assumptions:
-- Manny and Deshawn do not go to the NBA.
-- We'll ignore any possibility of injury for this exercise and assume that Cronin is healthy.
-- The incoming freshmen are all eligible and remain committed.
Now let's talk minutes:
-- Manny and Deshawn will play the same number of minutes as this past season.
-- Douglass has shown he can play D, hit some shots, and will probably get some minutes even at PG next year. Let's assume his minutes stay the same.
-- Gibson showed some spunk in the tournament, but given the influx of height next year I'll assume his minutes stay about the same. They might go up slightly, but that's about it.
-- Despite his great game against OK, Wright will struggle to get off the bench.
-- Merritt, Lee, and Shepherd give up all their minutes.
(I can't figure out how to insert a graphic, so sorry in advance for the crappy formatting)
NAME:_______2009 MIN___2010 MIN?
I made the assumption that Novak would get fewer minutes since there is more height coming in and he'll spend less time at the four.
I made the assumption that LLP will improve, play a little point, and boost his minutes a little.
But what do you do with Grady? I think he'll start the season as the primary PG until Morris is ready. But given how much he was glued to the bench lately I suspect he'll be used sparingly. I put him at 15.
If you add all of that up, you're looking at 44mpg that are currently unused for next year. Where do they go? Here's my theory:
Morris = 18 (Eventually he'll start and get minutes)
Cronin = 10 (Year of experience, size, D, rebounding)
Vogrich = 10 (Lot of SG candidates, so he'll only get a small dose)
That leaves about six minutes left. I expect that to go to either McLimans or Morgan. I can't see either playing more than some garbage time next year and it makes sense to redshirt one of them. I think they'll duel it out in camp to see who gets the redshirt.
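The minutes budget above is easy to sanity-check. A minimal sketch using only the numbers given in the post (44 unused minutes per game and the three projections); the variable names are made up for the example.

```python
UNUSED_MINUTES = 44  # per-game minutes left unassigned, per the post

# Projected minutes per game for the three players listed above.
projected = {"Morris": 18, "Cronin": 10, "Vogrich": 10}

assigned = sum(projected.values())
remaining = UNUSED_MINUTES - assigned

print(f"assigned: {assigned}, remaining: {remaining}")
```

Those remaining minutes are the ones the post expects to go to either McLimans or Morgan.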
What does everyone think?