I thought that myself when I read that article that talked about a Data Scientist(tm)
The last straw for Run of Play proprietor, Slate contributor, and Dirty Tackle blogger Brian Phillips was a pair of articles on consecutive days citing Franklin Foer's assertion that dictatorships lead to good soccer. Many of the nations that have been super good at soccer over the years have been run by dictators if you lump Vichy France in with them and think Hitler and Mussolini have anything to do with anything in the 21st century. The first problem with this piece of intellectual noodling is that the percentage of teams who have won the World Cup during or after a period of dictatorship (86%) is almost equivalent to the percentage of countries that have undergone periods of dictatorship since 1930. Twenty-five of the 32 teams in this year's edition have done so, 78%.
The second is that the statement means nothing. Phillips on the Kuper/Szymanski book Soccernomics, which endeavors to be a Freakonomics for the beautiful game:
You want to say that money is the secret behind soccer success, so you break down international games by GDP and find that, yeah, it matches up fairly well. But it doesn’t work as a theory, because China is terrible at soccer and the US is only okay at it. So you invent a variable called “tradition” and add it into the formula, which helps (now Brazil’s looking really strong), but you’re still left struggling to explain why, say, England doesn’t do better. So you add in population size, and on and on and on. Eventually, you have a delicately balanced curl of math that correctly reproduces the results of most recent matches (even if it accidentally predicts that Serbia will reach the current World Cup final). So you go to a publisher, but no one wants to buy a book about how GDP is covariable with national-team success 40% of the time, or whatever; they want a book that claims to have Uncovered the Secrets of Soccer© Using Funky Mathematical Techniques™. And so you’re led into making grand claims for the predictive power of research that really only demonstrates correlation. And there’s enough data swirling around a complex event like the World Cup that you could get the same results by collating fishing exports, number of historic churches, and percentage of authors whose names include a tilde.
You have no mechanism. Your correlation is extraordinarily weak. You have just wasted everyone's time.
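Phillips's point about fishing exports and tildes is easy to check with a toy simulation. The sketch below is not anyone's actual model: the "results" and the 200 junk predictors are pure random noise, and the function names and sizes are made up for illustration. It just reports the best r-squared any junk predictor manages against 32 random "teams."

```python
import random

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def best_junk_r_squared(n_teams=32, n_predictors=200, seed=0):
    """Best r-squared among predictors that are unrelated to the results by construction."""
    rng = random.Random(seed)
    # Hypothetical "soccer results": random noise, one value per team.
    results = [rng.gauss(0, 1) for _ in range(n_teams)]
    best = 0.0
    for _ in range(n_predictors):
        # Hypothetical junk variable (fishing exports, churches, tildes...).
        junk = [rng.gauss(0, 1) for _ in range(n_teams)]
        best = max(best, pearson_r(junk, results) ** 2)
    return best

print(best_junk_r_squared(n_predictors=1))    # one junk variable
print(best_junk_r_squared(n_predictors=200))  # best of 200 junk variables
```

Try enough variables on a 32-team sample and something will always "explain" a respectable chunk of the variance, mechanism or no mechanism.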
The very same day, Slate (et tu!) published an article by a guy who studies a particular brain parasite claiming a correlation between soccer performance and infection rates of Toxoplasma gondii, a single-celled parasite whose raison d'être is to get in a cat's stomach so it can make babies. An R-squared was not mentioned, but it was gestured to. Regression rules everything around me. This is why most published research results are false.
Soccer is not the only sport suffering from pseudoscience obsessed with elevating correlation above all else, mechanism be damned, and elegant curls of math that prove little other than the academic's talent for obfuscation in the name of publishing. Kuper and Szymanski actually got to the party late. Princeton economist and Malcolm Gladwell fave-rave David Berri's been here for years, and he's packing the platonic ideal of delicately balanced curls of math that end up ludicrous on further inspection. Behold the best (and sixth-best) players of the 1999 NBA season:
the emperor's clothes are fine indeed.
Berri made a splash in the sports world when he released a transparently silly book that purported to show that Dennis Rodman was responsible for more wins than teammate Michael Jordan. This drew the ire of the basketball statistics community and anyone with a damn lick of sense. People set about showing that Berri was peddling snake-oil. I even had a go at it in one of the erratic Pistons posts that showed up around here a couple years ago, noting that after Ben Wallace left the Pistons' rebounding changed not one percent on either end of the floor. Ben Wallace got his rebounds from his teammates. (It turned out that Wallace's major skill was an ability to keep opponents off the free throw line.)
This did not take, unfortunately, and Berri has been permitted to say silly things about all sports that apparently intelligent people take seriously because he has "Princeton" next to his name. He moved on from basketball to "show" that NFL teams don't care how well their quarterbacks perform, only how high they're drafted…
Aggregate performance and draft position are statistically related. But as Rob and I argue, this is because in the NFL (like we see in the NBA) draft position is linked to playing time. And this link is independent of performance.
…that NHL goalies are indistinguishable from each other…
... there simply is little difference in the performance of most NHL goalies.
…and has returned to basketball to state that coaches don't understand who their best players are:
"... the allocation of minutes suggests the age profile in basketball is not well understood by NBA coaches."
Berri's at least had the common sense to stay away from baseball, where a horde of men with razor-sharp protractors wait for him to make a false move. (We will see later that collaborator JC Bradbury has not.) The statistical communities in football, basketball, and hockey are considerably more unsure of what the hell is going on in their chosen sport and are thus vulnerable to suggestion from an economist, even if it's one who seems to have never watched a sport of any variety.
The problem with all of Berri's outlandish theories is that they are wrong. Not because of old guys who peer into the soul of Andre Ethier and see a ballplayer, but because of other, more careful numbers from people who are looking for things that are true instead of things that are impressive to Malcolm Gladwell.
Berri's study actually shows that amongst quarterbacks who play a lot, draft position is not a strong factor in their performance. This is his magnificent leap:
For us to study the link between draft position and performance, we can only consider players who actually performed. It’s possible that those quarterbacks who never performed were really bad quarterbacks. But since they never played, we don’t know that (and Pinker also doesn’t know this).
Low draft picks who don't play only find the bench because of bias. A coach's decision to start one player over the other is a worthless signal. Coaches are dumb.
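To see why "we can only consider players who actually performed" is a problem, here is a toy simulation. Everything in it is an assumption made for illustration: talent stands in for performance, scouts and coaches each see talent plus independent noise, and only the top 20% by the coaches' view get on the field. None of the numbers come from Berri's data.

```python
import random

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def draft_vs_talent(n=5000, seed=1):
    """Correlation of draft grade with talent: all players vs. only those who play."""
    rng = random.Random(seed)
    players = []
    for _ in range(n):
        talent = rng.gauss(0, 1)
        draft_grade = talent + rng.gauss(0, 1)   # scouts' noisy view (higher = earlier pick)
        practice = talent + rng.gauss(0, 0.5)    # coaches' noisy view in practice
        players.append((draft_grade, talent, practice))
    everyone_r = pearson_r([p[0] for p in players], [p[1] for p in players])
    # Hypothetical rule: only the top 20% by the coaches' view ever get on the field.
    cutoff = sorted(p[2] for p in players)[int(0.8 * n)]
    starters = [p for p in players if p[2] >= cutoff]
    starters_r = pearson_r([p[0] for p in starters], [p[1] for p in starters])
    return everyone_r, starters_r

everyone_r, starters_r = draft_vs_talent()
print("all players:", round(everyone_r, 2))
print("starters only:", round(starters_r, 2))
```

In this made-up world draft grade is a perfectly sensible signal, yet it looks much weaker once you throw away everyone the coaches benched. The benching decision already used most of the information.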
When you restrict your regressions to the top 20 goalies in terms of minutes, about half of the variation in save percentage appears repeatable. A standard deviation of talent is worth around ten goals. These days, a unit of five skaters who finished +50 at the end of the season would be heroes on the league's best team. Berri's undisclosed approach to the data set apparently takes goalies with far fewer than starter's minutes. A quick correlation run by Phil Birnbaum shows radically different r-squared values than those Berri finds just by upping the sample size. Maybe Birnbaum's numbers aren't dead-on—he doesn't use even strength save percentage, for instance—but he's not the one claiming a massive inefficiency. He's just showing that throwing a small r-squared out doesn't actually mean anything:
I don't know how the authors got .06 when my analysis shows .14 ... maybe their cutoff was lower than 1,000 minutes. Maybe there's some selection bias in my sample of top goalies only. Maybe my four seasons just happened to be not quite representative. Regardless, the fact that the r-squared varies so much with your selection criterion shows that you can't take it at face value without doing a bit of work to interpret it.
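Birnbaum's point about the cutoff can be reproduced with invented goalies. This is a sketch under loud assumptions: true save-percentage talent is drawn from a normal distribution (mean .905, SD .008, both made up), workload is random and unrelated to talent, and season-to-season luck uses a normal approximation to binomial noise. It is not his data or his method.

```python
import random

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def repeat_r_squared(min_shots, n_goalies=4000, seed=2):
    """Season-to-season r-squared of save percentage for goalies above a shot cutoff."""
    rng = random.Random(seed)
    season1, season2 = [], []
    for _ in range(n_goalies):
        talent = rng.gauss(0.905, 0.008)      # hypothetical true save percentage
        shots = rng.randint(200, 1800)        # hypothetical workload, same both seasons
        noise_sd = (talent * (1 - talent) / shots) ** 0.5  # normal approx. to binomial luck
        sv1 = talent + rng.gauss(0, noise_sd)
        sv2 = talent + rng.gauss(0, noise_sd)
        if shots >= min_shots:
            season1.append(sv1)
            season2.append(sv2)
    return pearson_r(season1, season2) ** 2

print("all goalies:", round(repeat_r_squared(200), 2))
print("1000+ shots:", round(repeat_r_squared(1000), 2))
```

Same goalies, same talent, same luck model. The only thing that changes is who gets let into the sample, and the r-squared moves a lot.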
Age in the NBA
In the NBA, 23- and 24-year-old players net more minutes than any other age bracket, and while the average age of an NBA minute is 26.6 this year, there's a blindingly obvious explanation for this:
Berri and Schmidt think that NBA minutes peak later than 24 because coaches don't understand how players age. It seems obvious that there's a more plausible explanation -- that it's because players like Shaquille O'Neal are able to play NBA basketball at age 37, but not at age 9.
In sum: wrong, wrong, wrong, and wrong.
When you've got a hammer, everything looks like a nail. Berri's hammer is regression analysis, and he goes about hitting everything he can find with it until he finds something that seems vaguely nail-like from a certain angle. Then he proclaims a group of extremely well-paid subject matter experts dumb. When challenged about this, he says things like "regressions are nice, but not always understood by everyone." He calls the protestors dumb.
This is more than a logical fallacy: it's a worldview. In a post on a cricket study by another set of authors, Birnbaum points out the assumption built into a lot of economics studies. It, like most of Berri's work, runs a regression on some data and reports back that something fails to be statistically significant:
The authors chose the null hypothesis that the managers' adjustment of HFA [home field advantage] is zero. They then fail to reject the hypothesis.
But, what if they chose a contradictory null hypothesis -- that managers' HFA *irrationality* was zero? That is, what if the null hypothesis was that managers fully understood what HFA meant and adjusted their expectations accordingly? The authors would have included a "managers are dumb" dummy variable. The equations would have still come up with 4% for a road player and 10% for a home player -- and it would turn out that the "managers are dumb" variable would not be significant. Two different and contradictory null hypotheses, neither of which would be rejected by the data. The authors chose to test one, but not the other.
Basically, the test the authors chose is not powerful enough to distinguish the two hypotheses (manager dumb, manager not dumb) with statistical significance.
But if you look at the actual equation, which shows that home players are twice as likely to be dropped as road players for equal levels of underperformance -- it certainly looks like "not dumb" is a lot more likely than "dumb".
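The "two contradictory nulls, neither rejected" situation is just what a wide standard error looks like. The sketch below swaps in a plain two-sided z-test for the cricket study's regression, and the estimate, standard errors, and null values are all hypothetical numbers chosen for illustration.

```python
def rejects(estimate, std_error, null_value, z_crit=1.96):
    """Two-sided z-test: True if the null value is rejected at roughly the 5% level."""
    z = (estimate - null_value) / std_error
    return abs(z) > z_crit

# Hypothetical scale: 0.0 = managers ignore home field advantage ("dumb"),
# 1.0 = managers adjust for it fully ("not dumb"). Suppose the estimate is 0.9.
estimate = 0.9

# Noisy study: the standard error is large relative to the gap between the nulls.
print(rejects(estimate, 0.6, 0.0))  # False: cannot reject "dumb"
print(rejects(estimate, 0.6, 1.0))  # False: cannot reject "not dumb" either

# Same estimate from a more precise study.
print(rejects(estimate, 0.2, 0.0))  # True: "dumb" is now rejected
print(rejects(estimate, 0.2, 1.0))  # False: "not dumb" survives
```

Failing to reject the null you happened to pick says very little when the test could not have rejected its opposite either. The point estimate is where the information is.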
The goalie example is the most illuminating here: by adjusting the parameters of your study you can arrive at radically different conclusions. I'm not sure if Berri is intentionally skewing his results to get shiny Moneyball answers, but given how dumb his justifications are for the NFL study that's the kinder interpretation. Running around saying that we don't know that the average sixth rounder isn't John Elway waiting to happen because they can't get on the field is obtuseness that almost has to be intentional. On the other hand, he does blithely state he's "not sure there is much to clarify" about his assertion that NFL general managers are on par with stock-picking monkeys when it comes to identifying quarterbacks, so he may be that genuinely clueless. (The Lions tried a stock-picking monkey. It didn't work out.)
There's often a kernel of truth in a Berri study. When the Oilers were casting about for a goalie, smart Oilers bloggers noted the glut of basically average goalies available, then jumped off a cliff when the team signed a mediocre 36-year-old to a four-year, $15 million deal instead of signing two guys for something around the league minimum and expecting about the same performance. That's something close to the criticism Berri levels, with the volume turned way down. Hockey and football and basketball are not baseball. It is incredibly difficult to encapsulate performance in any of these sports in statistics. So when Berri makes a proclamation that NHL goalies are basically the same based on plain old save percentage (which isn't even the best metric available), he ascribes more power to a stat than it deserves and simultaneously ignores a raging debate about one of the most difficult questions in sports statistics to get a handle on.
At the very least, the questions Berri attempts to tackle with really complicated regressions are murky things best delivered with a dose of humility. Instead Berri and colleagues say there is "simply" no difference, that his research is "not understood by everyone," that a formula declaring Jeff Francoeur worth $12 million a year is justifiable when any minor league player making the minimum could replace his production, that protestors are making "consistent basic errors in logic, economics and statistics," and that David Berri went to Princeton. If he bothers to respond to what's admittedly a pretty shrill criticism, he will undoubtedly state that if only I had managed to understand his papers, the many ludicrous conclusions easily disproved by competing studies (QBs, save percentage), simple facts that blow up the idea being presented (NBA minutes), or common sense (Rodman, Francoeur) would have come to me in an epiphany.
These things are all ridiculously complicated and it's obvious with every response to another Berri study that declares someone dumb that different views on the data produce different results. Berri's overarching thesis is that subject matter experts make huge errors because they refuse to look at data from all possible angles. Stuck in their ruts, they robotically bang out decisions like their forefathers. Statistician, heal thyself.
There may be some social utility in distracting economists from theorizing about the economy, but there's no utility in the domain they're actually tackling.
I know, I know, but let me say in the Mathlete's defense, he does not wander outside the white lines of the field of play in his analyses, and his assertions, to me anyway, are more open-ended and leave greater room for interpretation. Sometimes I think the Mathlete is on to something, and sometimes not so much, but he makes me think about how I watch the game, and that has value.
I think Mr. Cook and Mr. Phillips are criticizing those who seek to connect outside factors (like dictatorships, GDP, etc) to what happens in the sporting world. And there probably is a place for that, to a degree. But theses like "dictatorships = good soccer" are over-thinking things. Or not thinking enough. 50/50
Good soccer nations have been pretty easy for me to identify:
Nations with developed youth programs and good coaches generally develop good senior soccer teams.
The Dutch and Spanish are pretty good examples of this.
Should have paid more attention in Stats 350.
This post reminds me of why I skipped that class a lot
Phil Jackson did say that Rodman was the best athlete to ever play for him. Not that that means anything to the particular argument that he was more responsible for wins than MJ, but at least he has something going for him.
That Canto III is posted.
And September is just around the corner.
This is a problem a lot of people (including myself at times) have. Taking a position contrary to the general consensus with only a small piece of terrible evidence provides an opportunity to look brilliant if you're correct.
It's also pretty easy to weasel your way out of your position if it becomes obvious that you're wrong.
What sucks is that these guys get paid to spew bullshit. What's nice is that it's easily ignored.
If you're the "prophet in the wilderness," screaming that everyone's most trusted assumptions have been wrong and oh look I have maths to prove it, you look brilliant--and, what's more, become a made man for the rest of your career. Being that right, once, when everyone else has been wrong for forever? It gives you HUGE latitude to be as wrong as you want, all the way to the bank, to the tune of two or three book deals.
All this post needed was some matrix multiplication and linear algebra and I would have been in heaven.
Although really Brian, it's easy to see what David Berri's line of reasoning through stats
Of course, in this view, rejecting H0 does not mean that Ha is necessarily true. Even if he were wrong, it emphatically does not mean that you are right!
You have no mechanism. Your correlation is extraordinarily weak. You have just wasted everyone's time
is right on as far as I'm concerned. If you can't identify a mechanism, then you can't provide a causal account of whatever you're looking at. And without causation, you're not going to have an informative (in the sense of being explicable) correlation.
Every time I read something like this it reminds me of how the V2 bombing patterns in Gravity's Rainbow are isomorphic with a map of Slothrop's sexual dalliances.
But you do need to have some idea of a plausible mechanism for it to be meaningful. Some important discoveries have started with "hey, there's this correlation here that we didn't expect; is there a plausible causal relationship?". If the latter answer is "yes", you may be on to something (at least worth further investigation). Between dictatorships in the distant past and soccer prowess now? Uh, that would be no. (Present dictatorships and soccer prowess ... you might be able to posit one if the dictator is obsessed with soccer, but even then the correlation should be weak.)
I agree with this. I understand the desire to identify causation but to imply that correlation is useless/meaningless is going too far. Recruiting rankings are a good example; they have no causation with performance and pretty weak correlation with it in my opinion. But it's hard to argue with them; the higher a kid is ranked, the better his chances of being a great player are.
Finding meaningful correlation at least allows us to assess how far away we are from where we'd like to be, identify how we might improve those things, and gauge how long the journey might be. That is what I think the value of the Blue Moon Diary is.
I guess what I was trying to convey is that you don't really have an explanation of a correlation if you don't have a causal mechanism to support it. Recruiting rankings may be useful as a predictor because of the reliability of past statistical correlation, but I would argue that unless there is a causal account of the relation between rankings and performance, you don't have an explanation. You can't speak to why the rankings and performance bear the relation that they do.
But, I think there actually is a plausible explanation for the reliability of rankings as a predictor of performance. It's that basically rankings track some combination of ability and past performance. Ability does play a causally relevant role in future performance, and you could argue that past performance (in the form of experience or "football IQ") also contributes to future performance. Since rankings and performance can be said to be effects of the same causal nexus, the connection is more than mere correlation. But since there are more contributing factors to performance than just the ones that rankings measure, the correlation is not perfect.
But, I'm a philosophy guy. I don't know a lot about statistics, and maybe I'm being too loose or colloquial with the usage of terms like "correlation".
I once spent a few weeks running regressions on demand planning for our factories (20 or 30 of them) to attempt better forecasting, and sent it to my boss as a "look at what I have been spending my extra time on!" thing. Come to find out regressions can be done with an Excel formula (while I had actually written the formulas into the cells myself). Moral of the story: regressions aren't that complicated.
Single variable regressions are not that complicated. Y = some factor of X.
Multi-variable regressions start getting more and more complex. Excel is great to quickly understand what's going on, but you make lots of assumptions in Excel (e.g., normality) and can't do a lot of the back-checking to make sure your results are valid.
Regressions are just matrix algebra. The trick is knowing what variables to include, and doing what you can to ensure that causation indeed flows from the explanatory variables to the dependent variable, and not the other way around.
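For anyone who wants the matrix algebra spelled out, here is a minimal sketch of ordinary least squares via the normal equations, solving (X'X)b = X'y with a small hand-rolled Gauss-Jordan solver. The helper names and the six-row data set are invented for illustration, and a real analysis would lean on a numerical library rather than this.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest entry in this column for stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] from the normal equations."""
    X = [[1.0] + list(r) for r in rows]   # prepend a column of ones for the intercept
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(x * yi for x, yi in zip(col, y)) for col in Xt]
    return solve(XtX, Xty)

# Made-up data generated exactly from y = 1 + 2*x1 + 3*x2, so the fit should recover it.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 3)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
print([round(c, 6) for c in ols(rows, y)])
```

The arithmetic really is the easy part. Nothing in those thirty lines knows whether x1 causes y, y causes x1, or both are riding along on something that never made it into the spreadsheet.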
The problem with all of Berri's outlandish theories is that they are wrong. Not because of old guys who peer into the soul of Andre Ethier and see a ballplayer, but because of other, more careful numbers from people who are looking for things that are true instead of things that are impressive to Malcolm Gladwell.
In fairness to Gladwell, Berri's use of igon values is pretty impressive.
Oooooh Boy, my self esteem just went down a little. Nice, BC. Real nice. /uncomfortable humor
Obvs, this is a topic near and dear to my heart. My beef with Berri is his tact, not his technique. As much as hardcore stats try to create a sterile environment in which to conduct the tests, there is one flaw that is a devil to get rid of: the tester. You cannot care what the results are, one way or another. The problem is that it takes so much effort to do something like this that it's hard not to care. In order to test something you must have a hypothesis--your own, precious hypothesis--and that act creates bias in your work that is very easy to lose sight of.
Berri's problem is that he says, basically, "they are wrong AND I, me and me only, am right." A simple dose of humility would go a long way.
Now, I must go find some humility. Just what I need, another boondoggle...
I have 3 comments.
The dictatorship/repressive gov't means better soccer argument is just the same one that was resolved in the 80s with the end of the Cold War and the USSR. The resolution was, if the gov't is so repressive that it doesn't stick around it hardly matters if it produces better athletes.
The toxo study is a reminder to stick to scientific articles published in peer reviewed journals or at least based on such (and to be skeptical even of those.) Slate ranks about even with the Onion for learning about science.
There was no need to bring Frenchy Francoeur into this discussion. He's suffered enough.
I largely agree with Brian's comments, though with a caveat. And yes, while I did take some statistics courses at UM, I did not enjoy them and would not consider my knowledge any more than adequate, so feel free to point out my idiocy.
Having recently completed Soccernomics, the takeaway I got from the book had less to do with absolute mathematical models for the beautiful game and more with some interesting correlations (and some causations) behind why certain regions and teams succeed while others struggle.
With respect to China and the US, the authors note that while both those countries have vast populations and significant GDP, there are still a myriad of factors that are slowing their progression to the top of the soccer world. With China, the issue stems from the fact that soccer is relatively new and that their explosive growth in wealth has been over a pretty short time frame (compared to other established nations), with exploitation of those resources lagging behind. Furthermore, the massive size of the country makes it difficult to centralize these resources and identify the best athletes. The same goes for India - the book notes that the best soccer countries are those in which a free flow of information occurs, where networks with other soccer-playing countries are strong and the internal communications promote new ideas. With China and India, they are geographically isolated countries that are just now learning how to play soccer on a national level.
As for the US, the authors note that the US's approach to developing soccer players (focus on team winning over individual development, lots of practice, reluctance to let young kids turn it into a career and bypass school a la baseball) has stunted development, as has the fact that our best athletes do not play soccer (which amongst world powers is quite rare). American sports like football and basketball are still huge draws, though perhaps that will change as more players are exposed to the sport at younger ages.
Ultimately, the book is more about explaining that England does reasonably well given their size and GDP (but will struggle to be a world power) and identifying the next generation of global powers. There was some noise in their findings, but saying that the US, China, Germany, and (surprisingly) Iraq will likely become stronger in soccer while older countries like England and Italy will struggle doesn't strike me as particularly foolish, and the data certainly provides some worthwhile context.
As for Berri and statistical models in general, I agree that they are overused and dangerous in the wrong hands, but at the same time there is some truth to his findings. You look at that list of 1999 win share players (must be based on 1998 stats, since Jordan was out of the league in 1999), and your eyes bug out because Rodman is #1 and Jordan is #6, but 7-8 of those players should probably be on that list. That Nets team was surprisingly good (43-39) with Jayson Williams averaging a 12.9-13.6, and the next few years they struggled to remain relevant. Sure, Jordan should have been #1, but I view this more as noise than some fundamental flaw with the model.
I don't know much about Berri, and some of his findings are suspicious, but as much as people criticize certain individuals for torturing the numbers to come out like Moneyball, some of the counter-examples seem skewed in their own ways. As they say, there are lies, damn lies, and statistics, and I guess I am willing to give a bit more credence to the spirit of the studies even if the execution is wonky.
There was some noise in their findings, but saying that the US, China, Germany, and (surprisingly) Iraq will likely become stronger in soccer while older countries like England and Italy will struggle doesn't strike me as particularly foolish
Germany? They've made the World Cup finals seven times and won it three times. They see them improving upon that? I'm not sure how that's possible. And I'm surprised they wouldn't predict Germany to decline, given that its very low birthrate is resulting in a smaller number of children than some of its European neighbors.
Actually, I meant to write in Turkey for Germany (I had a bunch of thoughts in my head), but the book also addresses Germany quite extensively.
They acknowledged that Germany is a perennial power, but the point they made was that unlike other nations in Europe that are struggling economically (like Greece, Italy, and Spain to various extents), the German economy remains strong. Germany is also significantly larger population-wise than most other power countries, and doesn't seem wed to a particular "style" as compared to other traditional powers. They didn't really get into the birthrate issue (though this is a problem for most older European countries), but with 81+ million in population, the lower rate probably won't affect Germany as much as other countries. Also, Germany's central location makes increases due to immigration easier than in places like Spain and England that are more isolated.
But immigrants still have to want to come there. In recent years, Germany has not been a very popular immigrant destination. It's true that many other European countries face the issue of low birthrate as well (including Spain and Italy), but France and the UK are doing better in that area. Despite having around 20 million fewer total people, they actually have more children under the age of 10 than Germany does.
That is true, but if Germany's economy remains strong while those around it suffer, immigrants will start to enter in far greater numbers. I am not aware of the exact birthrate numbers, but Germany has probably been the most consistent European nation in recent memory when it comes to international soccer, and they seem poised to keep that crown at least for the next decade or two (unless Turkey makes a major leap).
Granted, I haven't seen those younger ages in about a third of a century, but my impression is that plenty of good athletes are exposed to soccer when they're young now. I don't think that's much of the problem at all.
I think that part of the problem (I agree on lack of "developmental soccer" as a career and would add the lack of a world-class league hinders US development) is simply that there are too many highly lucrative alternatives. Even if soccer eventually draws closer to football and baseball in popularity, top athletes will still be able to make big money playing the latter sports, or basketball, or hockey, or tennis, or golf, or auto racing... you get the idea. (I don't mean to argue that a NASCAR driver is or is not an athlete, simply to point out that the career exists here and can be extremely lucrative.) How many other countries divide their attention among so many sports?
Is it an "excuse" for not performing well on an international stage? Well, no, not really. The fact that we split that talent in more directions doesn't mean we shouldn't be able to find 25-30 guys who can play soccer really well. (Obviously it works for the women's game, shallowness of international talent pool be damned. How many sports provide lucrative careers for women? What stops a top US female soccer player from sticking with soccer?) Well, okay, we should find 25-30 and mold them into a top national team. And then get them to perform to that level internationally ....
... and that's the rub. There are so few opportunities to compete internationally, so few matches involved (well, maybe not in South America), and so little continuity that it's difficult to determine whether or not there's even a problem. Maybe we really do have good teams and we just have bad luck. Maybe they're good but just not good enough.
I love this rebuttal at http://gladwell.typepad.com/gladwellcom/2009/11/pinker-on-what-the-dog-s...
One of his arguments against a person's evidence is that the person supposedly argued that black people are intellectually inferior to white people. It's a classic ad hominem.
As I understand the comment, Gladwell was perhaps taking a shot at Sailer's credibility, then pointing out that his conclusion was based on some weak statistical model - http://isteve.blogspot.com/2009/11/pinker-v-gladwell-on-nfl-quarterbacks.html. While I won't disagree that the Gladwell comment might be incendiary, reading some of Sailer's posts it is hard not to get a sense that race is a major component of his thinking and "schtick." Gladwell pointing this out isn't necessarily the same as an ad hominem attack.
Regardless of whether he goes on to legitimately discredit the argument, that irrelevant point about race is clearly designed to prejudice the reader against Sailer's argument. It's the same as if I added in some disparaging comment about the Bronx being associated with arson in my reply, regardless of whether I subsequently add in legitimate reasons or not.
I read it more as Gladwell pointing out that Sailer is perhaps best known for the weak argument concerning intellectual capacities of various races. It might be a poor shot (I agree that it is inflammatory, even if it is true), but at the same time it does provide some context for the reader to gauge the relative merit of Mr. Sailer's argument. I will agree with you that it was in bad taste for Gladwell to preface his statement with it, but it sounds like Mr. Sailer is not above dragging race into his comments when it is convenient (and in my opinion, when you do that consistently, you have to expect that some people will generalize a bit).
Gladwell arguing with Steve Sailer without pointing out his racial crankery, which Sailer has made himself well known for among people who read leftwing political blogs, would kinda be disadvantaging those reading Gladwell who don't also read Matthew Yglesias.
It doesn't make Sailer wrong about this and I guess it fits the dictionary definition of ad hominem, but I don't see a problem with it unless Gladwell actually hadn't even bothered to refute the argument. And it isn't the same as your example, Tha Stunna, because it's an attack on Sailer's prior published thinking and analysis, not an attack on him because of some random characteristic that no serious person would believe to be an impediment to clear thinking.
I once got into an extensive debate with an economist about the "environmentalist dogma" taught in schools--such as, "if we want to have trees, we should not cut them down." He explained that if the demand for trees is high, the supply will rise to meet it--so if we want lots of trees around, we should seek to consume as much fresh-logged timber as possible.
I wish I was kidding.
I've been doing quite a bit of investigation into the correlation between pass rush and pass defense, and running regressions a-plenty. With no formal training, I've had to continually step back and make sure I'm following the data, and not making the data follow me. I think, though, that two factors have been crucial in applying these statistical mechanisms to sports statistics:
1) Having a thorough understanding of what's actually happening on the field, both to filter out meaningless correlations, and to dig deeper when "common sense" relationships aren't found, and
2) Starting with a specific hypothesis that I didn't come up with. Testing someone else's hypothesis is much easier to do dispassionately than poking sacred cows with numbers and seeing if any pop.
"He explained that if the demand for trees is high, the supply will rise to meet it--so if we want lots of trees around, we should seek to consume as much fresh-logged timber as possible.
I wish I was kidding."
1) I don't understand what this has to do with the topic - is it that sometimes economists come to odd conclusions?
2) And if that is your point, perhaps you dismiss your economist friend's conclusion too hastily.
Sure, it's not from a disinterested party. But it has a lot of numbers that should withstand argumentum ad hominem. Plus, it has a quote by a Harvard professor, telling us to use more paper. Harvard may not be Princeton, but Harvard is used more in movies when they want us to think the protagonist is smart, so it must be worth something.
I completely support you on the "peace" thing, though.
"If we want bigger commercial forests, then we should consume more paper, not less."
Most of our argument circled around his refusal to accept that tree farms run by paper mills are not fungible with old growth forests. Turning all of America into a parking lot, but consoling ourselves by knowing there are truly impressive stands of quick-growth pine somewhere in Florida, wouldn't be a Brave New World that puts the lie to "ecologist dogma," it'd just be destructive stupidity.
To me it's an issue of presentation. If you say "maybe on-field experience for a qb is as important as talent", you deserve more respect than "coaches know nothing about ability".
Even if they're dead wrong, proposing alternative hypotheses is at least interesting. Interesting enough to warrant further analysis in some cases. e.g. Maybe teams who aren't going to make the playoffs should be more inclined to play young QBs even if they aren't top picks.
I wouldn't dismiss the entirety of economic analysis in sports just because of a few over-ambitious dickheads. Doing so comes off as overdoing it (not unlike what Berri is accused of).
You touched on the problem here where statisticians view statistics in a vacuum and ignore what is actually happening in the game. I've heard countless arguments about why JD Drew is the best player in baseball and is underpaid based on his OPS. This ignores the fact that he has never had more than 70 RBI and is a terrible situational hitter. Another stat baseball statheads have been known to rag on is hitting with RISP. They all claim that it is a statistically insignificant stat and is just "luck." This ignores factors such as the pitcher having to change from a full windup to a stretch windup, distraction to the pitcher by the baserunners, and the heightened intensity of the hitter/pitcher in a critical situation.
Stats are great tools but can never capture the full story of sports.
Drew has a better OPS with RISP (905) than with the bases empty (894). (279 vs 285 avg is nearly indistinguishable.)
It's hard to make an argument for RBI being relevant if you're already considering hitting with RISP. RBI is largely a function of opportunity that is out of the hitter's control.
Stat heads can go too far, but so can judgements based on anecdotal evidence or media perception.
soooooo.... I thought we weren't talking about soccer anymore...
I kid I kid
I'm pretty sure Michael Jordan did not play during the 1998-1999 season. His last championship with the bulls was during the 1997-1998 season. He retired (again, sort of) afterward.
good sports economists have the ability to understand that the level of uncertainty in sports can lead to statistical oddities.
rock on Brian, rock on
You are right that he is a hack. You may have even understated the case.
You are wrong that he has some affiliation with Princeton, or any other above-average institution of learning.
Berri gets cited because he has any school attached to his name. I am aware of only one professor from a leading school who has addressed Berri's work and it was Steve Pinker in this rebuttal to Malcolm Gladwell's pathetic counter-attack after Pinker beautifully trashed Gladwell's book in this gem. Pinker correctly concluded that Berri's work was pretty conclusively disproved by . . . bloggers.
sports is a topic in which any academic must answer to an army of statistics-savvy amateurs, and in this instance, I judged, the bloggers were correct
he really is the best. Phil's posts are inevitably very charitable to the intentions of the research to be vivisected, which makes them all the better.
...I've been listening to that book on audio in the car (tables in audiobooks are TEDIOUS) and agree that he is very ad-hoc. Left by his ad-hoc model to conclude that "Norwegians are the biggest soccer fans on earth", why does he not consider the alternative hypothesis "my model is ad-hoc and stupid"? My chief problem is that he is using the Terrible Hammer of Regression Analysis on variables he picked up in the supermarket. There is no theoretical reason to include the stuff that he does, he just includes the variables BECAUSE HE HAS THE DATA.
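For anyone who wants to see the "because he has the data" problem in numbers, here is a rough sketch (not anything from the book). It assumes a made-up "soccer success" score for 32 teams and a pile of predictors that are pure noise; the function names, the 32-team sample size, and the choice of Gaussian noise are all mine, picked only for illustration. The more unrelated variables you try, the better your best-looking correlation gets.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def best_spurious_fit(n_teams=32, n_vars=200, seed=0):
    """Return (index, r) for the noise predictor that best 'explains' the outcome.

    Both the outcome and every predictor are independent random draws,
    so any correlation found here is spurious by construction.
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_teams)]  # hypothetical success score
    best_index, best_r = -1, 0.0
    for i in range(n_vars):
        predictor = [rng.gauss(0, 1) for _ in range(n_teams)]  # e.g. "fishing exports"
        r = pearson(predictor, outcome)
        if abs(r) > abs(best_r):
            best_index, best_r = i, r
    return best_index, best_r


if __name__ == "__main__":
    # Same seed each time, so the larger runs include the smaller runs' predictors.
    for n_vars in (1, 10, 200):
        index, r = best_spurious_fit(n_vars=n_vars)
        print(f"{n_vars:>3} noise variables tried: best |r| = {abs(r):.2f} (variable {index})")
```

Run it and watch the best |r| climb as the number of candidate variables goes up, even though nothing in there has anything to do with anything. That is the case for having a theoretical reason for each variable before it goes in the model.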
He seems to have read up on regression on-line, with the same sort of success as that Times Square bomber dude who read up on bombs on the interwebs and ended up only melting his upholstery.
But "never right about anything?" I like your gusto but not your rhetoric there. If you want to play at cutting down their stats you have to do it using their language. Saying you are 100% sure about something is like saying you think Obama was born on Mars. It points you out as Not One of the Clan. Why not beat them at their own game, rather than make a statement that could be refuted with a single observation of any statistician ever being correct about anything?
I like Berri's book, but not because I agree with stuff in it - instead because it makes you think about the problem a different way. It's certainly sounder theoretically and mechanically than the ad-hoc Kuper approach. In the end, though, if the conclusion of the model is whack we need to go out and show why our model is better. That's the Scientific American Way.
it's posts like these that make me worry - worry that one BC will become a writer on the national stage at some point, leaving mgoblog behind for a bigger venue (if there is one :)
what an incredible piece of writing.
the real thing that should worry you: economists are likely doing this in *every* thing they study, not just sports. alas.
When I'm not writing sports posts on a blog that very few people read, my real job is as an economist. And while I take umbrage with some of the broad generalizations, Brian nails the issues with almost any study I've ever done.
There's an old joke that says somebody wanted to know what 2+2 equalled...a mathematician knows it's between 3.9999999 and 4.000001 while an economist will ask, "What do you want the answer to be?"
In 1806 Pfuel had been one of those responsible for the plan of campaign that ended
in Jena and Auerstadt, but he did not see the least proof of the fallibility of his theory
in the disasters of that war. On the contrary, the deviations made from his theory
were, in his opinion, the sole cause of the whole disaster, and with characteristically
gleeful sarcasm he would remark, ‘There, I said the whole affair would go to the
devil!’ Pfuel was one of those theoreticians who so love their theory that they lose
sight of the theory’s object—its practical application. His love of theory made him
hate everything practical, and he would not listen to it. He was even pleased by
failures, for failures resulting from deviations in practice from the theory only proved
to him the accuracy of his theory.