
# math

## Reasons for MSU hopelessness: from Saban to Dantonio (vs LC to RR)

In a previous thread titled “Reasons for Hope” (for UM), I looked at the trends in average victories from LC to RR (based on an average of four consecutive years). The conclusion was that RR, after a significant hemorrhage during his first year of surgery on the program, is close to stopping the slow bleeding that actually began after LC's first three years.

One critic objected heatedly to the methods I used. A few posters rebutted the critic, pointing out that his tunnel vision, fixed on only the worst possible portion of UM’s recent record, ignored the bigger picture. I will not speculate on the motivations for this tunnel vision. However, one supportive poster, whom I thank, suggested looking at the record of MSU compared to UM. So, taking this excellent suggestion, I used similar methods to look at the trends in average wins at MSU under various head coaches.

I found that under Nick Saban, the trend in average victories was positive (with an increase of .06 victories per year, much as occurred during LC’s initial years). But after that, the trends were consistently negative (a decrease of .17 victories per year under Williams and Smith and a decrease of .25 victories per year under Mark Dantonio).* So, MSU declined at a pretty steady rate.

The only way that Dantonio can stop the bleeding and just stay even with the average victory record of his esteemed predecessor, John Smith, is to win 2 of the next 3 games. So this analysis does not support the often-voiced idea (some will call it wishful thinking) that MSU has turned the program around under MD.

To stop the bleeding (decline in average), RR also needs to become bowl eligible (winning 2 out of the next 4 games including a bowl). To be fair, however, his task is much more formidable. UM’s current average, which is at a low point for UM during this period (7.5 victories per season) is still 3 victories per season more than MSU’s average (4.5).

Methods of Analysis (repeated)

I looked at the trends since Saban took over in 1995 (based on a moving average involving each four year period).

Data

Total wins and average wins for each run of four successive seasons, beginning in 1995 and running to the present.

| First season | Wins in each season | Avg | Coach and trend |
| --- | --- | --- | --- |
| 1995 | 6.5, 6, 7, 6 | 6.25 | Nick Saban, +.06 per year |
| 1996 | 6, 7, 6, 10 | 7.25 | |
| 1997 | 7, 6, 10, 5 | 7.0 | |
| 1998 | 6, 10, 5, 7 | 7.0 | |
| 1999 | 10, 5, 7, 4 | 6.5 | |
| 2000 | 5, 7, 4, 8 | 6.0 | Bobby Williams, -.17 per year |
| 2001 | 7, 4, 8, 5 | 6.25 | |
| 2002 | 4, 8, 5, 5 | 5.5 | |
| 2003 | 8, 5, 5, 4 | 5.5 | John Smith, -.17 per year |
| 2004 | 5, 5, 4, 7 | 5.5 | |
| 2005 | 5, 4, 7, 5 | 5.5 | |
| 2006 | 4, 7, 5, 4 | 5.0 | |
| 2007-8 | 5, 4 | 4.5 | Mark Dantonio (two seasons only), -.25 per year (not including this year) |
| | 5, 4, 6 | 5.0 | -0.0 per year (assuming two more victories = 6 total this year) |

*Considering only his complete seasons. Only if we assume he gets two more victories this year does he stay even with John Smith’s average when Smith left.
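For anyone who wants to check the arithmetic, here is a minimal Python sketch of the method. The win totals are transcribed from the Data windows above, and the function names are mine. The per-year trend is computed as the change in the four-season average divided by the number of years elapsed. That is my reading of how the trend figures were obtained, and it does reproduce the listed +.06, -.17 and -.25 when fed the listed averages. Note that a few averages come out slightly different when recomputed from the wins (the first window, for example, averages 6.375 rather than 6.25).

```python
def moving_average(wins, window=4):
    """Average of every run of `window` consecutive seasons."""
    return [sum(wins[i:i + window]) / window
            for i in range(len(wins) - window + 1)]


def trend_per_year(first_avg, last_avg, years_elapsed):
    """Assumed trend formula: change in the average per year elapsed."""
    return (last_avg - first_avg) / years_elapsed


# Season win totals as they appear in the Data windows, starting in 1995.
wins = [6.5, 6, 7, 6, 10, 5, 7, 4, 8, 5, 5, 4, 7, 5, 4]

averages = moving_average(wins)
for first_season, avg in zip(range(1995, 1995 + len(averages)), averages):
    print(first_season, avg)

# Trends computed from the averages as listed in the Data section.
print("Saban:", round(trend_per_year(6.25, 6.5, 4), 2))
print("Williams and Smith:", round(trend_per_year(6.0, 5.0, 6), 2))
print("Dantonio:", round(trend_per_year(5.0, 4.5, 2), 2))
```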

## One more thought on going for two

*Sorry for being a little late to the party, but here are a couple of other thoughts on the going-for-two question. If you're sick of the whole bloody mess, you should probably stop reading about now.*

First off, I largely agree with ikestoys's diary (http://mgoblog.com/diaries/down-14-and-going-2). I have often thought that football is a game that rewards aggressive play calling, like going for two and on fourth down more often, and fake punts from your own 20... Eh...

Anyway, I disagree with a couple of points ikestoys made, both explicitly and implicitly, and I thought I'd chuck 'em out here.

**Trials are not independent**

This point was made by a commenter in the original diary, but the basic idea of treating the different sorts of trials (going for 2, going for 1, overtime) as independent events (and therefore as amenable to the application of the mathematics of garden-variety probability theory) is flawed.

In football the outcome of one trial affects the probability of another trial even occurring, and not in predictable ways. Let's say UM had made the first two-point conversion. Would State have played their next drive differently than they did? Maybe, maybe not. Perhaps they would have come out throwing and scored a field goal to go up by nine. We have no way of knowing how things would have unfolded in that alternative universe.

**Relative frequencies are not probabilities**

Second, and this is another point made by a commenter, ikestoys treats relative frequencies (the proportion of successful two-point conversions) as the same thing as probabilities of success. They are not. That's like saying that because 1% of adults die of lung cancer, you have a 1 in 100 chance of dying of lung cancer. Do you smoke? If so, then your probability is surely higher. If not, it's lower. The point here is that the probability of success of a two-point conversion depends on many factors, as various people have noted.

Nevertheless...


Because relative frequencies =/= probabilities, I thought it would be interesting to see how the probabilities of winning fared if you didn't assume the probability of a successful two-point conversion was 0.44. So, two graphs for your viewing pleasure. The y-axis is the probability of winning the game after all events have unfolded (post-touchdown try after TD 1, TD 2, and possibly overtime). The x-axis is the probability of success of the two-point conversion (I limited the range of this probability to between 20 and 80%).

*Graph the first*

In the first graph, I have plotted the cumulative probabilities of winning for two strategies: going for 2 after scoring a TD to be down by 8 (ikestoys's strategy, the black line), and going for 1 (RichRod's strategy, the blue line). The only thing I have allowed to vary is the probability of success of a two-point conversion (on the x-axis).

Couple things:

- Note that I have reproduced ikestoys's result: the dashed red line intersects the black curve at about 57% when Pr(success) for a two-point conversion is 44%.
- Note also that despite ikestoys's implicit claim that going for two is always the better move, if the probability of success falls below 35.5%, it is better to go for 1, as RichRod did. I'm not suggesting that this is what the probability would have been, though people's comments about a dog-tired Tate, a driving rain, etc., make this idea not too farfetched.

*Graph the second*

There are two other variables in the process: the probability of a successful PAT (which I held constant at 0.95), and the probability of winning in OT. The latter probability doesn't change the black curve below much, so I left it at 50/50, as did ikestoys.

In the graph below, the three non-black curves represent three different probabilities of winning in overtime: 40% (orange), 50% (blue), and 60% (green).

The only thing to take away here is that if you believe your probability of winning in overtime is high (based on your style of play, being at home, etc.) and if you believe your probability of a successful 2-point conversion is less than 44%(ish), then you should adopt RichRod's strategy. If you believe that your chances of winning in OT are 50/50, and you believe your chances of scoring on a two-pointer are > 35%, then you should follow ikestoys's strategy.
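For the curious, here is a rough Python sketch of the kind of model behind these curves. The branch structure is my reconstruction, not necessarily exactly what was plotted: I assume a team that falls behind by 8 after a failed try must go for two after the second TD, and that a team that is down 1 after the second TD kicks to tie. Function and parameter names are illustrative. With the PAT at 0.95 and overtime at 50/50, the break-even two-point probability under these assumptions lands near 35.5%, in line with the figure quoted above.

```python
def win_prob_two_first(p2, pat=0.95, ot=0.5):
    """Go for two after the first TD."""
    # Made it (down 6): a made PAT after TD 2 wins, a missed PAT means overtime.
    made_first = p2 * (pat + (1 - pat) * ot)
    # Missed it (down 8): must convert a two-pointer after TD 2 to reach overtime.
    missed_first = (1 - p2) * p2 * ot
    return made_first + missed_first


def win_prob_kick_first(p2, pat=0.95, ot=0.5):
    """Kick after the first TD."""
    # Made it (down 7): a second made PAT forces overtime, a miss loses.
    made_first = pat * pat * ot
    # Missed it (down 8): must convert a two-pointer after TD 2 to reach overtime.
    missed_first = (1 - pat) * p2 * ot
    return made_first + missed_first


def break_even(pat=0.95, ot=0.5):
    """Bisection for the two-point probability where the strategies tie."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if win_prob_two_first(mid, pat, ot) > win_prob_kick_first(mid, pat, ot):
            hi = mid  # going for two already wins out, so the tie point is lower
        else:
            lo = mid
    return (lo + hi) / 2


for ot in (0.4, 0.5, 0.6):
    print(f"OT win prob {ot:.0%}: break-even two-point prob {break_even(ot=ot):.3f}")
```

The better your overtime chances, the higher the two-point success rate has to be before going for two first pays off, which is the pattern the second graph shows.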

**In conclusion (I know, finally)**

Of course, coaches don't think this way in the heat of a game. Again, I basically agree with ikestoys, but the story is a bit more complex.

## Down 14 And Going for 2

**Disclaimer:** This is not an attack on RichRod or any specific coach. This is more of a criticism of the 'common knowledge' of upper-level football.

The situation: You are down 14 and probably only have 2 possessions left. Obviously, it will take two touchdowns to get back into the game. My question for you is, what combination of 2 point and 1 point conversions should you take to maximize your chance of winning the game?

Let's start off with a few assumptions. According to this rivals article, the average 2pt conversion rate in the NFL is 44%. I'll assume that it's about the same for CFB and that our team's conversion rate will be about the same in whatever specific situation we're in. We'll treat a PAT kick as a sure thing. We'll also assume that we have a 50-50 chance of winning in OT.

So working with these assumptions, what is the optimal combination of 1pt/2pt tries?

**Kicking 1pt tries only**

This one is easy. Assuming we get 2 TDs to come back, taking 1pt each time will give us a 50-50 chance to win.

**Going FTW!**

In this situation, we get the first TD and take the 1pt. On the second TD, we 'man up' and go FTW BABY! Our chances of winning are equal to the chance of converting obviously, so 44%.

**Best Answer**

In this situation, we'll go for 2 after the first TD. If we convert, then we'll kick a 1pt try. If we do not convert, then we'll go for 2 again.

This is a slightly more complicated calculation, but here we go:

1.) 44% of the time we make the first 2pt conversion and go on to win the game.

2.) (.56)*(.56) = 31% of the time we miss both 2pt tries and lose despite making two TDs

3.) (.56)*(.44) = 25% of the time we miss the first but make the second 2pt. This ties the game and we go to overtime.

So what is our final equity? It is:

.44*1 + .31*0 + .25*(.5) = .57 or 57%

A quick explanation of this equation. We basically multiply the probability of an event by the outcome of the event. So 44% of the time we win (1), 31% of the time we lose (0) and 25% of the time we go to OT with a 50-50 shot (.5).
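The same equity calculation can be written as a tiny Python sketch, using only the assumptions stated above (44% on two-point tries, automatic PATs, 50-50 in OT). The `equity` helper and variable names are just for illustration. Without the intermediate rounding, the best-answer strategy comes out at about 56%, against 50% for kicking twice and 44% for going FTW at the end.

```python
def equity(outcomes):
    """Sum of probability * value over (probability, value) pairs."""
    return sum(prob * value for prob, value in outcomes)


p = 0.44        # assumed two-point conversion rate
miss = 1 - p    # chance a two-point try fails

# Go for 2 first: win outright, lose after two misses, or reach OT.
best_answer = equity([(p, 1.0), (miss * miss, 0.0), (miss * p, 0.5)])

# Kick both PATs (treated as sure things) and play a 50-50 overtime.
kick_only = equity([(1.0, 0.5)])

# Kick the first, go for 2 and the win after the second TD.
going_ftw = equity([(p, 1.0), (miss, 0.0)])

print(f"best answer: {best_answer:.3f}")
print(f"kick only:   {kick_only:.3f}")
print(f"going FTW:   {going_ftw:.3f}")
```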

Now why isn't this done in the real world? Well, part of it is that some of the numbers behind our assumptions aren't known. Mostly, however, it is coaches covering their asses. No one gets criticized for taking the safe route to force OT, only to lose. If you go for 2 twice and don't make it, you'll be torn apart in the press. Not to mention that football coaches don't spend much of their time on equity calculations.

The common belief of kicking 1pt to tie or going FTW! at the end with a 2pt conversion is clearly wrong, even if it is most commonly done.

## Whistling Dixie

While I am a relative neophyte when it comes to understanding how recruiting works, the one aspect that has really interested me is how the concentration of D-1 prospects breaks down among the states. Anecdotally, states like Florida, California, and Texas always seemed to produce top-notch prospects, but that kind of made sense - those are three of the four most populous states in America. I always presumed, erroneously as it turns out, that fast, strong kids exist everywhere, and that the percentage of the population with these desirable characteristics was pretty constant across the board. Thus, the reason the Big 3 fielded more D-1 football recruits than, say, Utah was more the result of population and "math" than something in the drinking water or the focus certain states place on football. Of course, there also seemed to be two glaring holes in this logic - the fact that many states in the Southeast (Alabama, Georgia, Mississippi, Louisiana, etc.) produce an inordinate number of recruits compared to their populations, and the fact that relatively populous states in the Northeast (New York and Massachusetts) produce far fewer recruits than their populations predict. But was this really true, or did these two anomalies exist more as a figment of recruiting services and media hype than reality?

Now, I was going to do all of this research myself, but then I luckily stumbled upon this page that broke down each state by number of recruits, population, and ratio of people to recruits for 2004-2008. I then wondered how this translated to the NFL - in other words, were the states that produced a large number of D-1 prospects also sending kids to the NFL? So after some more scouring of the interwebs, I came upon this page, which provided a really awesome user-friendly chart. After some more finagling and Excel-assisted sorting, I came up with this chart:

**Big Chart of recruits/NFL players home states 2004-2008**

| State | College Recruits | State Pop. | Citizens per Recruit | NFL Players | Citizens per NFL Player |
| --- | --- | --- | --- | --- | --- |
| Florida | 981 | 18,328,340 | 18,683 | 126 | 145,463 |
| Alabama | 245 | 4,661,900 | 19,028 | 40 | 116,548 |
| Mississippi | 149 | 2,938,618 | 19,722 | 22 | 133,574 |
| Georgia | 481 | 9,685,744 | 20,137 | 64 | 151,340 |
| District of Columbia | 27 | 591,833 | 21,920 | 3 | 197,278 |
| Louisiana | 184 | 4,410,796 | 23,972 | 54 | 81,681 |
| Texas | 974 | 24,326,974 | 24,976 | 135 | 180,200 |
| Hawaii | 49 | 1,288,198 | 26,290 | 6 | 214,700 |
| South Carolina | 169 | 4,479,800 | 26,508 | 39 | 114,867 |
| Oklahoma | 117 | 3,642,361 | 31,131 | 23 | 158,364 |
| Ohio | 362 | 11,485,910 | 31,729 | 65 | 176,706 |
| Arkansas | 87 | 2,855,390 | 32,821 | 17 | 167,964 |
| Kansas | 77 | 2,802,134 | 36,391 | 6 | 467,022 |
| Virginia | 209 | 7,769,089 | 37,173 | 48 | 161,856 |
| New Jersey | 232 | 8,682,661 | 37,425 | 24 | 361,778 |
| Maryland | 145 | 5,633,597 | 38,852 | 18 | 312,978 |
| North Carolina | 229 | 9,222,414 | 40,273 | 33 | 279,467 |
| Nebraska | 43 | 1,783,432 | 41,475 | 8 | 222,929 |
| Tennessee | 149 | 6,214,888 | 41,711 | 20 | 310,744 |
| Pennsylvania | 281 | 12,448,279 | 44,300 | 37 | 336,440 |
| California | 826 | 36,756,666 | 44,500 | 172 | 213,702 |
| Kentucky | 92 | 4,269,245 | 46,405 | 10 | 426,925 |
| Iowa | 61 | 3,002,555 | 49,222 | 12 | 250,213 |
| Missouri | 118 | 5,911,605 | 50,098 | 9 | 656,845 |
| Washington | 117 | 6,549,224 | 55,976 | 11 | 595,384 |
| Arizona | 103 | 6,500,180 | 63,109 | 16 | 406,261 |
| Illinois | 194 | 12,901,563 | 66,503 | 31 | 416,179 |
| Michigan | 150 | 10,003,422 | 66,689 | 41 | 243,986 |
| Connecticut | 50 | 3,501,252 | 70,025 | 9 | 389,028 |
| Colorado | 69 | 4,939,456 | 71,586 | 14 | 352,818 |
| Indiana | 86 | 6,376,792 | 74,149 | 21 | 303,657 |
| Minnesota | 66 | 5,220,393 | 79,097 | 11 | 474,581 |
| Delaware | 11 | 873,092 | 79,372 | 2 | 436,546 |
| Oregon | 44 | 3,790,060 | 86,138 | 15 | 252,671 |
| Wisconsin | 55 | 5,627,967 | 102,327 | 12 | 468,997 |
| West Virginia | 15 | 1,814,468 | 120,965 | 2 | 907,234 |
| Nevada | 21 | 2,600,167 | 123,817 | 6 | 433,361 |
| Idaho | 12 | 1,523,816 | 126,985 | 5 | 304,763 |
| Utah | 20 | 2,736,424 | 136,821 | 13 | 210,494 |
| Massachusetts | 46 | 6,497,967 | 141,260 | 8 | 812,246 |
| New York | 112 | 19,490,297 | 174,021 | 25 | 779,612 |
| South Dakota | 4 | 804,194 | 201,049 | 2 | 402,097 |
| Montana | 4 | 967,440 | 241,860 | 4 | 241,860 |
| New Mexico | 7 | 1,984,356 | 283,479 | 3 | 661,452 |
| North Dakota | 2 | 641,481 | 320,741 | 2 | 320,741 |
| New Hampshire | 4 | 1,315,809 | 328,952 | 0 | 0 |
| Alaska | 2 | 686,293 | 343,147 | 4 | 171,573 |
| Rhode Island | 2 | 1,050,788 | 525,394 | 3 | 350,263 |
| Vermont | 1 | 621,270 | 621,270 | 0 | 0 |
| Total | 7,484 | 302,210,600 | | 1,251 | |
| Average | | | 40,380.89 | | 241,575 |

So that really wasn't that surprising. Presuming that the distribution of football players was constant across the population (i.e. for every x people, y recruits exist), the ratio should be 1:40,380 - in other words, the population at large holds about 1 D-1 recruit per 40,000 people. Similarly, of those kids who went to the pros, the number was truly astronomical - 1:241,575, an astounding number considering that some of those positions are held by international players that were not listed on my chart. And yes, this statistic is not perfect, since the actual number of high school boys every year who could become D-1 athletes, and thus future NFL players, is far less than the population at large, people move in and out of states, etc. But for illustrative purposes I think it still supports my points, and I don't have the time or inclination to peruse government population numbers for a more true number. Plus, I doubt the ratios would be so greatly skewed as to dramatically alter the clear trends present.

So these results alone somewhat shocked me, but it has more to do with the illogical hopes so many kids have of even becoming D-1 college recruits, let alone professional football players. To put this into perspective, there are about 3 people sitting in the stands during a Michigan home game, on average, who have been or will become D-1 recruits in their lifetimes. Put another way, my hometown of Royal Oak has a little over 60,000 people in it, or about 1.5 D-1 football recruits over a five-year span if the model holds true. As for those who go on to play in the NFL, the entire state of Vermont, if my model held true, would produce about 3 NFL-quality players over the 2004-2008 span - and in reality it produced 0.
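The back-of-the-envelope numbers here can be reproduced with a short Python sketch. It assumes the chart's totals (7,484 recruits and 1,251 NFL players against a population of 302,210,600, all for 2004-2008), rounds Royal Oak to an even 60,000, and uses a function name, `expected_count`, that is mine. Because the counts cover the whole 2004-2008 window, the results are expected totals for that span.

```python
TOTAL_POPULATION = 302_210_600
TOTAL_RECRUITS = 7_484       # D-1 recruits, 2004-2008
TOTAL_NFL_PLAYERS = 1_251    # NFL players, 2004-2008


def expected_count(population, total_count, total_population=TOTAL_POPULATION):
    """Expected number of players if talent were spread evenly by population."""
    return population * total_count / total_population


# Royal Oak, rounded to 60,000 residents (an approximation).
royal_oak_recruits = expected_count(60_000, TOTAL_RECRUITS)

# Vermont, using the population listed in the chart.
vermont_nfl_players = expected_count(621_270, TOTAL_NFL_PLAYERS)

print(f"Royal Oak expected recruits:  {royal_oak_recruits:.2f}")
print(f"Vermont expected NFL players: {vermont_nfl_players:.2f}")
```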

But clearly, football talent is not evenly distributed across the country. While some more populated states come pretty close to the proposed distribution, such as California, Pennsylvania, and North Carolina, outliers exist in the expected regions of plenty (Southeast) and scarcity (NY, MA). Both Michigan and Illinois also seem to produce far fewer recruits than their populations suggest, while places like Hawaii and D.C. seem more fertile than expected, though not to the extreme degree that you see with some other states. And in Hawaii's case, a large percentage of those recruits are taken by the University of Hawaii, so that situation is clearly atypical.

**So what does this mean? - college**

For one thing, some traditional "hotbeds" of talent may actually "under"perform their expected ratio of recruits given a linear distribution - I'm looking at you, Pennsylvania and California. At the same time, maybe some people are underselling certain areas, such as Virginia and Oklahoma/Kansas, which have decent-to-great in-state programs that recruit nationally but also seem to have pretty fertile backyards to pick from. But the real focus must fall on the Southeast, where states like Mississippi, Alabama, and Georgia continually churn out top-notch kids at a far greater rate than their populations suggest.

Despite what some Freep "columnists" opine as RR's apparent idiocy in not recruiting in-state talent at MSU's rate, it clearly makes sense to focus more of the staff's efforts on Florida and the Southeast compared to other regions in America. Sure, California and Texas are hotbeds that should be scoured, but the Southeast is where the money tends to be. Michigan produces a decent number of recruits, but it is clear that outside of Ohio, the rustbelt just isn't a fount of top-notch talent the way some envision it. I'm sure there are a million reasons why this may be, and I'll leave it to people in the comments to hash them out. My guess is that high school/college football has always been a more communal activity in areas of the South compared to the North, especially considering how few professional teams used to be located below the Mason-Dixon line compared to the population. Simply put, people "care" more about football down there, and that fervor translates to the youngest of children. They see football as a way to make a living, as a way to succeed and be a "god" in the community, and their environments seem geared around making this dream a reality.

I don't think it has that much to do with the weather - sure, it helps to be able to play and practice outside more than in the north, but receivers can still catch balls, RBs can still squat and run wind sprints, and linemen can still work on their techniques indoors just as easily as outdoors. Plus, warm-weather states like New Mexico and Arizona produce recruits at a lower rate than expected, while some cold-weather states are relative factories. To put it bluntly, I think kids in the Southeast "care" more about football than kids in the North. Now, that doesn't mean high school boys in Michigan and New York don't work hard or lack a will to win, but by and large I don't think the community rewards kids in the North as much for the success they experience on the football field as it does in places like Mississippi and Florida. I'm sure there are some socio-economic undertones to it, and some will say that kids in the Southeast see football as a way to escape the communities they are "trapped" in - see the Pahokee (?) pipeline as an example of crushing poverty pushing kids toward sports. But irrespective of the cause, it is clear that if you want the biggest payoff for your recruiting efforts, learning to whistle Dixie might as well become a requirement for major college recruiters. Now, that might not seem like a revelation to some, but it is interesting to see that anecdote play out in the numbers. I'm interested, though, to see how others feel.

**So what does this mean? - NFL**

As I mentioned above, I think a big reason more D-1 recruits emerge from the Southeast and Texas has to do with the relative importance the community places on football as a means to succeed. For better or for worse, a ticket to a D-1 school is viewed as a stepping-stone to playing in the NFL, and all the millions of dollars and notoriety that entails. So it shouldn't come as any surprise that the states which produce the most D-1 recruits per person also generate the most NFL players per person. Louisiana leads the way, with approximately every 82,000 residents producing an NFL player - a rate about 3X the expected one! The same held true for most of the Southeast, with those states sending far more players to the pros than they have any business sending. By comparison, Michigan is pretty average - it may be a little low on the D-1 recruits, but those who do emerge have a pretty average shot of making it to the NFL. So kudos to the Wolverine State.

By comparison, a pair of Ks - Kentucky and Kansas - seem to be the biggest "frauds" of the group in terms of overvaluing their D-1 recruits - both have a pretty average or above-average number of D-1 recruits per population, but about half as many of those recruits wind up making it to the NFL as expected. So once again, Kentucky and Kansas underwhelm. As for New York and Massachusetts, they might as well focus on baseball - they just don't know how to create top-notch football talent.
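One way to put numbers on "over" and "under" producing is to divide the national average ratio by each state's ratio, so that 1.0 means a state produces exactly what its population predicts. The Python sketch below does that for four states, using figures copied from the chart. The `production_index` name and the index itself are my own framing, not something from the source pages.

```python
AVG_CITIZENS_PER_RECRUIT = 40_380.89
AVG_CITIZENS_PER_NFL_PLAYER = 241_575

# (citizens per recruit, citizens per NFL player), copied from the chart.
states = {
    "Louisiana": (23_972, 81_681),
    "Michigan": (66_689, 243_986),
    "Kentucky": (46_405, 426_925),
    "Kansas": (36_391, 467_022),
}


def production_index(average_ratio, state_ratio):
    """Above 1.0: more players than population predicts. Below 1.0: fewer."""
    return average_ratio / state_ratio


for name, (per_recruit, per_pro) in states.items():
    recruit_index = production_index(AVG_CITIZENS_PER_RECRUIT, per_recruit)
    nfl_index = production_index(AVG_CITIZENS_PER_NFL_PLAYER, per_pro)
    print(f"{name:10s} recruits x{recruit_index:.2f}  NFL x{nfl_index:.2f}")
```

On this measure Louisiana's NFL index comes out just under 3, Michigan's is almost exactly 1, and Kentucky's and Kansas's are a bit above one half.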

But overall, this analysis proved what I expected - the Southeast produces a disproportionate number of D-1 recruits, and an inordinate number of these recruits are high-caliber enough to break into the NFL. Again, I have no scientific proof for the cause of this inequity, but I have stated my guesses. I am intrigued to see what other people believe is the cause, and I welcome anyone with more statistical knowledge than my one 400-level probability and statistics course to prove me wrong/drill down deeper.

What I'd like to do in the future:

* Breakdown for each state by high-school-aged boys, not the state population as a whole.