I am MGoData, a senior undergraduate at Michigan majoring in various technical fields. I am very interested in data and its ability to shed light on human behavior. Before embarking on a 5-6 year tour of PhD studies, I plan to take some time this summer to relax and do some "fun" projects. Given the emphasis this community places on actual facts and numbers over other more subjective metrics, I thought this would be a good place to look for a project or two.
Jumping right into it then...
Given the success (luck?) Michigan State has had over the last few seasons, there has been a reasonable amount of discussion over Little Brother's "obsession" with everything Michigan. So I ask the question:
Is Michigan State obsessed with Michigan and if so, how can we quantify it?
To answer this question we need some data. We want this data to reasonably capture behaviors associated with constantly thinking about or wanting information on Michigan, and we want there to be some way to identify this behavior as being directed from Michigan State fans/students towards Michigan fans/students.
Google is a great place to start. People search Google for millions of things, millions of times per day, and they search for things they are interested in. Google has had recent success predicting flu outbreaks 2 weeks ahead of the CDC, just by looking for places where people are searching for things like "flu symptoms".
Conveniently, Google has built a tool called Google Trends with which you can compare the popularity of search terms, including where most searches are coming from. So what happens when we compare people searching for the "University of Michigan" to the people searching for "Michigan State University"? Let's look at the data.
The most immediate thing to notice is the obvious fact that more people are searching for The University of Michigan than Michigan State University. Sadly, it appears that "University of Michigan" is becoming less and less popular over the last few years.
This isn't surprising given the national (and world) recognition that Michigan receives. Google scales this data as follows:
In relative mode, the data is scaled to the average search traffic for your term (represented as 1.0) during the time period you’ve selected. For example, if you entered the term dogs, the graph you’d see would be scaled to the average of all search traffic for dogs from January 2004 to present. But if you chose a specific time frame – say 2006 – the data would then appear relative to the average of all search traffic for dogs in 2006. Then, let’s suppose that you notice a spike in the graph to 3.5; this spike means that traffic is 3.5 times the average for 2006.
Now let's look at the other information Google gives us. Conveniently, we also get information on geographic regions that are searching for these two topics. The top cities break down as follows:
Again we look to Google's FAQ to understand how these results are calculated and interpreted.
To rank the top regions, cities, or languages, Google Trends first looks at a sample of all Google searches to determine the areas or languages from which we received the most searches for your first term. Then, for those top cities, Google Trends calculates the ratio of searches for your term coming from each city divided by total Google searches coming from the same city.
It's possible that Google uses the IP address of the searcher to tailor results to the geographic location of the user, but others can test this by performing the same Trends search and seeing what they get.
Interpreting these results is a little bit tricky so bear with me. First thing to notice is that everything is scaled to the relative search popularity of "University of Michigan" in Ann Arbor. Essentially they use the percentage of all Ann Arbor Google searches that are for "University of Michigan" as a baseline and compare everything else to this number. Again though, the search popularity is scaled by the total searches coming from each city, so the population of other cities does not skew results. If they didn't do it this way, places like New York City would always be on top just because there are so many people Googling for everything.
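To make the scaling concrete, here is a minimal Python sketch of the normalization as I read Google's FAQ. The search counts are made up purely for illustration, and `search_share` and `scale_to_baseline` are hypothetical helper names, not anything Google publishes.

```python
def search_share(term_searches, total_searches):
    """Fraction of a city's total searches that are for the term."""
    return term_searches / total_searches


def scale_to_baseline(shares, baseline_city):
    """Express every city's share relative to the baseline city's share."""
    baseline = shares[baseline_city]
    return {city: share / baseline for city, share in shares.items()}


# Hypothetical (term searches, total searches) per city. NOT real Google data.
counts = {
    "Ann Arbor": (2_000, 100_000),
    "New York": (20_000, 10_000_000),
}

shares = {city: search_share(term, total) for city, (term, total) in counts.items()}
scaled = scale_to_baseline(shares, "Ann Arbor")

for city, value in scaled.items():
    print(f"{city}: {value:.3f}")
```

In this made-up example New York sends ten times as many raw searches for the term as Ann Arbor, yet it lands at about a tenth of Ann Arbor's score, because the score measures the share of each city's searches and not the raw count.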
What are some qualitative things we can learn just from the bar graph? Well first of all, people in Ann Arbor are searching for "University of Michigan" way more than people everywhere else, and people in East Lansing are searching for "Michigan State University" more than any place else. Furthermore, it seems that people in East Lansing are searching for "University of Michigan" more than people in Ann Arbor are searching for "Michigan State University". We can see this by comparing the Blue Bar next to East Lansing to the Red Bar next to Ann Arbor.
Even better, though, is the fact that Google will actually let you download text files with actual numbers. The following table shows Google's measure of search popularity coming out of these top cities. Again notice that Ann Arbor searching for "University of Michigan" is the baseline at 1.000.
|City||university of michigan||university of michigan (std error)||michigan state university||michigan state university (std error)|
|Ann Arbor (USA)||1||0%||0.1||3%|
|East Lansing (USA)||0.27||2%||0.52||2%|
|Bay City (USA)||0.175||2%||0.09||3%|
From these numbers we can see that "Michigan State University" is about one-tenth as popular as "University of Michigan" in Ann Arbor (not surprising), whereas East Lansing is searching for "University of Michigan" a little more than a quarter as much as Ann Arbor is. Comparing the search popularity of opposing universities in each town, we see that East Lansing is searching for "University of Michigan" a whole 2.7X as much as Ann Arbor is searching for "Michigan State University". Furthermore, search popularity of "Michigan State University" in East Lansing is about half that of "University of Michigan" in Ann Arbor, so maybe they just don't like searching for themselves (it's no fun reading more "SPARTY NOOO!" articles).
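For anyone who wants to check the arithmetic, here is a short Python sketch that recomputes those ratios from the table values. The dictionary layout and variable names are my own; only the numbers come from Google's download.

```python
# Search popularity from the table above, scaled so that
# "University of Michigan" searches in Ann Arbor equal 1.0.
popularity = {
    "Ann Arbor": {"um": 1.0, "msu": 0.1},
    "East Lansing": {"um": 0.27, "msu": 0.52},
    "Bay City": {"um": 0.175, "msu": 0.09},
}

# East Lansing looking up Michigan vs. Ann Arbor looking up Michigan State.
rival_ratio = popularity["East Lansing"]["um"] / popularity["Ann Arbor"]["msu"]

# Each town searching for its own school.
home_ratio = popularity["East Lansing"]["msu"] / popularity["Ann Arbor"]["um"]

# Within Ann Arbor: Michigan searches vs. Michigan State searches.
ann_arbor_ratio = popularity["Ann Arbor"]["um"] / popularity["Ann Arbor"]["msu"]

print(f"Rival interest, East Lansing vs Ann Arbor: {rival_ratio:.1f}x")
print(f"Home-school interest, East Lansing vs Ann Arbor: {home_ratio:.2f}x")
print(f"UM vs MSU searches within Ann Arbor: {ann_arbor_ratio:.0f}x")
```

These are just ratios of already-normalized values, so they inherit whatever sampling error Google reports in the std error columns.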
Finally, some disclaimers about this analysis. The entire argument hinges on the assumption that people's search behavior reflects something about things they are interested in, and particularly something they are obsessing over. This is probably a long-shot conclusion. Even if we accept that people might Google things they are obsessed with, there is no guarantee that the search trends for people obsessing won't be washed out by the everyday searches of people who just need information. The fact is we have no idea WHY these people are searching for things, just that they are. Finally, there are probably tons of demographic differences between East Lansing and Ann Arbor that make these numbers really difficult to compare. If, for example, students make up a larger portion of the population in one city, it will skew the data because students use Google in a much different way than other demographics.
However, if you buy that people's search behavior is a reasonable proxy for things they are thinking about a lot, and that the demographic breakdown of two communities, largely driven by students and tech-savvy individuals, is similar, then these results are kind of neat. We can basically see that people in East Lansing are trying to get info on The University of Michigan nearly 3 times as much as Ann Arbor is trying to find out what's going on in East Lansing. I assume Brian keeps info on who is visiting his site and from where, so it might be possible to look at just people trying to get information on Michigan sports.
This is the first of many small projects I'd be interested in doing over the summer. If anyone has questions, comments, or ideas for the future, leave them in the comments.
I should probably have posted more about this in anticipation of the argument that doing this research is proof of obsession or whatever. My answer to this is mainly in two parts. First of all, I am an out-of-state student so I don't have the complicated relationship with State that most people do. I don't have friends or family there and I don't plan on getting a job in Michigan, so I won't have to deal with co-workers.
My main point in this is that there are interesting ways to try to test some of the very subjective debates that go on in these parts. Plus I just plain find it interesting that you can attempt to make sociological claims from behavioral data being generated and made available these days.
|Friday 3:05pm, Ray Fisher Stadium|
|Alan Oaks (2-3, 2.67 ERA)||vs||TBA|
|Notes: Michigan is 5-0 all time against the Mastodons, including a 3-game sweep to open the home schedule in 2009|
|Saturday 1:05pm, Ray Fisher Stadium|
|Bobby Brosnahan (1-2, 5.14 ERA)||vs||TBA|
|Saturday 30min after Game 2, Ray Fisher Stadium|
|Notes: My guess is Miller starts. Game time change!|
Ah yes, home opening day. The true signal that spring has come to Ann Arbor is here at last. No longer shall the Michigan team have to travel from Lubbock, TX to St. Petersburg, FL to Chapel Hill, NC to Conway, SC to Port St. Lucie. This weekend, the baseball team returns to its friendly confines of Ray Fisher Stadium at the Wilpon Complex. They return home, where they haven't lost a home opener since 2000 (30-3 since 1975).
Preview and such after jump…
The implicit assumption is that Warren would have been better off had he returned for his senior season. But is that true?
Returning for a senior season carries a lot of risk, most notably the possibility of a major injury. It also carries a significant opportunity cost: a missed season of potential earnings in the NFL. Furthermore, the NFL may be implementing salary restrictions on rookies beginning in 2011 that would further limit his potential earnings.
The potential gains for returning for one more year are:
1. Another year of college life / experience
2. A college degree
3. Potential improved draft stock
Reason 2 is often cited for guys that turn pro early, but it's not like Warren is forever locked out of getting a college degree. If he wants a diploma he can still get one. Assuming he’s approaching graduation, he’d be able to go back to school easily after football (especially considering the money he’ll be making as a pro football player for at least the next few years. The signing bonus alone will cover it). In short, Plan B is still Plan B if football doesn’t work out. It’s just a few months further off. And he’ll probably get more out of school when not spending 40 hours a week on football.
Reason 3 is where the debate really begins. Some players do have a lot to gain by performing well in their senior seasons, particularly if they can make it into Round 1. However, Donovan has already proven to be a very good college player. Coming back for another season wouldn't make a major change in perceptions about him in that regard. The reason he is (reportedly) falling in the eyes of evaluators is his 40 time. No matter how much magic you want to attribute to Barwis, there isn’t much he can do to change speed, and considering Warren has been around for 2 years of Barwis, he’s probably progressed as far as he can.
As a result, Warren’s draft stock won’t be significantly different in 2011. You can argue that improved play would have bumped up his stock, but it might have done just as much to hinder it if he didn’t play any better or played worse. It’s a gamble, especially when you consider he’d be a year older and therefore viewed as having diminished “upside”.
The bottom line is that Warren considers himself an NFL player (and most talent evaluators agree); he'll be drafted and have a chance to prove himself as a professional. If he's not good enough, he'll be cut whether he's a 2nd round pick or a 5th round pick. It really doesn't make a huge difference. The pay differential from his rookie contract is offset by an additional season of playing football professionally (look at the differences between lower 2nd round picks to 3rd and 4th round guys - http://www.macsfootballblog.com/2009/05/2009-nfl-rookie-signing-status-team-by.html - they generally aren’t that huge).
Warren didn't make a mistake. Most guys who are drafted are not making a mistake when they turn pro. For most, the decision is the correct one when all the costs and benefits are factored in. The decision is only a mistake if you’re immediately cut and never earn a penny as a pro football player, or if you are really enjoying life as a collegiate athlete and will miss it more than you’ll appreciate the money you’ll earn as a pro.
While we as fans may have desperately wanted Warren to return for the sake of Michigan football, he made the right decision for himself. He's going to make a lot of money compared to most of us in a very short time frame. I wish him nothing but the best. Go Blue!
EDIT: Fixed spelling error and link
To follow up on the previous KenPom charts and graphs, I decided to pick my NCAA tourney bracket based on Ken's predictions and see how accurate he is. The way I used the data is as follows: I assumed that M = (AdjOffense - AdjDefense) * (AdjTempo)/100 gives a team's average margin of victory for the dataset. Then, M1 - M2 = the margin of victory difference between the two teams playing. Applying this to the Michigan/Ohio State game gives:
Michigan = (107.0 - 92.7) * (62.7/100) = 8.99
Ohio State = (118.9 - 89.8) * (65.8 / 100) = 19.1
Which predicts a 10.1 point margin of victory for OSU, pretty close to the actual KenPom prediction.
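The same calculation as a small Python sketch, in case anyone wants to plug in other teams. The function names are mine, and the formula is just my assumption from above, not anything KenPom publishes. With the rounded ratings quoted here the two ratings come out near 8.97 and 19.15, a hair off the figures in the text, presumably because the original numbers used unrounded ratings.

```python
def margin_rating(adj_offense, adj_defense, adj_tempo):
    """Assumed average margin of victory: efficiency gap scaled by tempo."""
    return (adj_offense - adj_defense) * adj_tempo / 100


def predicted_margin(team_a, team_b):
    """Positive means team_a is favored; each team is (offense, defense, tempo)."""
    return margin_rating(*team_a) - margin_rating(*team_b)


# (AdjOffense, AdjDefense, AdjTempo), rounded as quoted in the post.
michigan = (107.0, 92.7, 62.7)
ohio_state = (118.9, 89.8, 65.8)

print(f"Michigan rating:   {margin_rating(*michigan):.2f}")
print(f"Ohio State rating: {margin_rating(*ohio_state):.2f}")
print(f"OSU predicted margin: {predicted_margin(ohio_state, michigan):.1f}")
```

The pick for any game is simply whichever team has the higher rating, and the difference is the predicted point spread.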
I'll save you all the eye chart of the data table. If you're interested, it's here:
The relevant data:
|team||adjusted tempo||adjusted offense||adjusted defense||difference||rd1|
|Texas El Paso||69.45||107.47143||88.3081||13.30893269|
|Nevada Las Vegas||67.3486||109.46974||90.69505||12.64449087|
|San Diego St.||64.6273||110.97383||92.09351||12.20184105|
Calculating the probable winners in this fashion gave a win/loss of 24/8. And, four of those that are wrong were predicted to be 3 point games, and ended up +/- 3. Here's the corresponding chart.
I calculated the total average margin of error (absolute value) for all games at 7.44, and margin of error in games correct at 7.07, and margin of error in games wrong at 10.6.
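For reference, here is a rough Python sketch of one way to compute those three averages. I am assuming the error for each game is the absolute difference between the predicted and actual margin, and that a pick counts as correct when both margins have the same sign. The four games below are invented for illustration only and are not the tournament data.

```python
from statistics import mean


def error_summary(games):
    """games: list of (predicted_margin, actual_margin), both from the same
    team's point of view. Returns mean absolute error overall, for correct
    picks, and for wrong picks (None when a group is empty)."""
    all_errors, correct_errors, wrong_errors = [], [], []
    for predicted, actual in games:
        error = abs(predicted - actual)
        all_errors.append(error)
        # A pick is correct when the predicted winner actually won.
        if predicted * actual > 0:
            correct_errors.append(error)
        else:
            wrong_errors.append(error)

    def average(values):
        return mean(values) if values else None

    return {
        "all": average(all_errors),
        "correct": average(correct_errors),
        "wrong": average(wrong_errors),
    }


# Hypothetical games, NOT real results: (predicted margin, actual margin).
games = [(5.0, 8.0), (3.0, -2.0), (10.0, 1.0), (2.0, 12.0)]
summary = error_summary(games)
print(summary)
```

Note that a correct pick can still carry a large error (a predicted 2-point win that turns into a 12-point blowout), which is why the error in correct games is not much smaller than the overall figure.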
I next calculated the distribution of error. Since I used absolute value in the previous calculation, I ended up with half a bell-curve distribution. Data:
What's interesting is that this is a better prediction than just using KenPom as a relative rating. By picking solely based on the higher ranked team, the record is 23/9.
If you can draw any conclusion from all this, it is that Ken is pretty accurate, except when he's not. I didn't expect to be 100%, because I don't think any system out there will predict Georgetown, or Kansas or Villanova to lose, based on the numbers. But, by this point in the season, the system is remarkably accurate in predicting probable outcomes. It has some margin for error in predicted close games, but I don't think there's any system that would be able to predict close games, either. They just come down to the luck of the draw.
So I was planning on interviewing Michigan baseball coach Rich Maloney tomorrow, but this morning's Michigan Insider with Sam Webb had Maloney on and asked pretty much all of my questions. So I'll summarize Maloney's comments here (full audio):
- The rotation is still up in the air. Oaks and Brosnahan appear to have solidified their spot in the weekend. Burgoon is going back to the closer role for now, but if Miller continues to struggle, Burgoon will be the third guy. If for some reason we don't have to use Burgoon in a Friday and Saturday, we could see him on Sunday as well.
- Ryan LaMarre is due to have the pins out of his thumb tomorrow (Wednesday). It should be at least a week of rehab to build up strength. Maloney expects LaMarre to battle to get in quicker against Indiana, "but that might be a bit of a reach."
- Coley Crank has been a surprise to Maloney. They knew he'd one day be an offensive force, but his explosion this early was surprising. He's also greatly improved his defense.
- "Chris Berset is playing at an unbelievable level right now. He's truly one of the best catchers in the country right now."
- "Dufek hasn't been hitting them out of the park, but he's been starting to come alive with the bat." We'll have some more on this later in the week here at mgoblog.
- We're playing at Stanford next year, at LSU in 2012.
- Rain delays in Atlanta pushed back getting into Ann Arbor from 10pm Sunday via airplane to 2pm Monday (with a 12 hour bus trip).
- Ten games will be broadcast this season between WTKA and WLBY
So it sounds like LaMarre is still on schedule. I wouldn't be surprised to see Ryan get a few at bats against Indiana, at least in a DH or pinch hitter role, most likely the latter.
The Wolverines faced some of the greatest adversity they've experienced all season, but managed to emerge from their game against Oregon with their 46-game win streak intact. The Ducks didn't go down without a fight, however, pushing the game to overtime before a Trevor Yealy goal gave Michigan a 5-4 victory.
The scene was Dallas for the 2010 Patriot Cup, benefitting the Wounded Warrior Project. The MCLA matchup between #1 Michigan and #7 Oregon was the headlining event of the weeklong, 18-game Patriot Cup. The weather, however, didn't get the memo, and the teams had to play through cold temperatures, snow, and high winds. It's understandable, considering the conditions, how low-scoring the game turned out to be.
The game never turned strongly in favor of either team, as the Ducks got on board first, but senior Jamie Goldberg scored for Michigan to tie the game up at the end of the first quarter. Svet Tintchev added a tally of his own shortly before halftime, and Michigan went into the locker room with a slim 2-1 lead. Michigan got third-quarter goals from junior Trevor Yealy and senior Josh Ein, and Oregon notched one of their own to make the margin 4-3 in favor of the Wolverines heading to the fourth. The quarter nearly passed scoreless, but Oregon managed to force the extra frame with just seconds left. In overtime, however, Michigan forced a turnover and Trevor Yealy buried the game-winner for the 5-4 victory.
Mark Stone got the start in net, and finished with nine saves. Faceoff specialist David Reinhard won 7 of his 12 draws, a somewhat subpar result given his exceptional play thus far in the season. Yealy was the only multi-goal scorer, but Jamie Goldberg and Anthony Hrusovsky each tallied two points.
Based on reports (and the @UmichLacrosse Twitter feed), both teams played sloppily, which is understandable given the conditions. However, the season will come to a point where the sloppy play (which happened in much better conditions on the Arizona/BYU road trip) has to be attributed to the team not being quite as dominant as they were last year. Who knows if that will eventually cost them a game.
Michigan will have its second consecutive one-game weekend, as this time they'll travel to Milwaukee to take on Minnesota-Duluth on Saturday. The Bulldogs were ranked #6 in last week's Prodigy Poll, and are the owners of a 6-1 record. Their one defeat came at Arizona State, where Michigan managed to pull out an 11-10 victory. Other than that, the strength of schedule is questionable, as they've defeated horrible Saint Cloud State (pounded by Eastern Michigan!), unranked MSU-Mankato, Central Michigan, and Wisconsin Milwaukee, and fringe top-20 teams in Lindenwood and Cal. They've lost to the only competition they've played that's even close to Michigan's level.
Offensively, Duluth is led by sophomore attack Alex McNamara, who averages 3.6 goals and 1.6 assists per game, and senior middie Daniel Pitzl (a 2009 2nd-Team All-American), who adds 2.6 goals and 1.4 assists of his own. Sophomore Drake Peterson and redshirt sophomore Kevin Gaydos also average more than three points in each contest. Benjamin Shandley and Andrew Madsen are both very good faceoff specialists, winning over 70% on the year, so there should be some entertaining battles with Michigan's David Reinhard. Redshirt sophomore Joey Slattery is the #1 goalie, allowing 3.2 goals per game, and saving just over 70% of shots faced.
The game is being played at a neutral site (Marquette University in Milwaukee, WI), and weather shouldn't play a huge factor, as it's projected to be in the mid-40s and sunny on Saturday. I encourage all Michigan fans (and especially lacrosse fans) in the Milwaukee, Chicago, and Madison areas to make the trek out to see a championship-caliber Wolverine squad take on a ranked opponent.
Beating The Varsity Drum
It's no secret that the Michigan Lacrosse team very much wants to earn varsity status from the Athletic Department (and I've made it no secret that I support such a move). I asked a former Michigan Lacrosse player a few questions about why the team deserves to be a varsity sport, and how this should happen. Greatest hits:
On why the Michigan Lacrosse team deserves varsity status:
If you want to know what makes them worthy, the short answer is that they have already gotten most of the way there without varsity status, and now want a chance to be the best of the best. What more can you ask for in sports besides a shot at being the best?
What makes this situation unique is the fact that the organizational structure currently in place is a highly capable group, and has been operating as a “virtual varsity” program for many years now. When varsity teams are added to any university, a typical process would be to hire the staff a year or two ahead of the target inaugural season for planning and recruiting purposes. This team is good to go tomorrow given the chance.
...Which provides a perfect segue into how long it would take for the team to be competitive on the D-1 level, should they be promoted:
There are several reasons that the transition to varsity competition will be much smoother for Michigan than some might think. First of all, the current club team has beaten several Division I programs in exhibition play. The current talent on the team consists of more than a handful of players that either turned down the chance to play for Division I teams to come to Michigan, or would be significant contributors in a Division I program. Another reason is the number of students from the east coast that come to Michigan. Even though the sport is growing across the country, the bulk of blue chip recruits still come from Atlantic states like New York and Maryland. The Michigan brand already draws a relatively large percentage of the student body from these areas, which would enable the rapid expansion of the team’s recruiting footprint into traditional hotbeds. Finally, the planned facilities featured in an earlier post would be bar none the best in the country.
He also points out that the infallible Transitive Rule of Sports Victory shows Michigan has beaten opponents in the past several years that have gone on to make the NCAA tournament in subsequent years, or themselves beaten an opponent that ultimately made the tournament. Assuming a boost from moving up to the D-1 level, they could step in and be competitive right away.
The final question I asked regarded why now is the right time for the team to earn varsity status, when it hasn't been for the past several years:
[T]he program has reached critical mass at the club level. While there are still many good teams in the MCLA, Michigan has asserted its dominance over the past several years and now only plays a few competitive games each season. In order to find good competition, the team has to travel to places like California, Utah, and Arizona, which is not easy to pull off on a club budget. The truth is that the current level of operations is only made possible by the level of private support that the team receives, which is not sustainable in the long run nor worthwhile if the team is never elevated to varsity status.
The change in athletic director played a role in spawning the effort by providing a timeline for taking a proposal to the university. When Bill Martin announced his retirement, Coach Paul initiated several conversations about what we would need to do to ensure that lacrosse was on the new athletic director’s agenda from day one. Unfortunately, our departing AD was not a supporter of elevating lacrosse to NCAA Division I during his tenure.
It seems as though positive steps are being taken toward varsity status, something that certainly sounds long overdue. The change in AD seems to be a major benefit, if only because there's at least the chance he's listening. David Brandon can do no wrong, man. So far, he's been a home run pick.
In the name of keeping this post from running too long, I'll cut off the varsity talk there this week, but make no mistake, I'll be hitting this topic in nearly every update.
The men aren't Michigan's only club lacrosse team, as the women's club team has the coveted "Varsity Club" distinction as well. While they haven't reached the same level of success as the men (but who has, really? Penn State club hockey?), they are a strong program in their own right. If a varsity push for the men's lacrosse team happens, it's likely that the women's lacrosse squad would be promoted at the same time, to satisfy Title IX requirements.
The women earned a 3-game sweep over the weekend, beating Michigan State 12-2, Pitt 13-10, and Miami (Not That Miami Unless You're Talking About Hockey, In Which Case Yes That Miami) 18-10. Thanks to their media contact Stuart Zaas for keeping me updated, and if you're interested in learning more about the women's lacrosse team, check out their website.