How likely are we to revert to the mean?

Submitted by Bo Glue on April 2nd, 2018 at 12:08 PM

Been crunching some numbers to see just how bad our luck has been on 3 point shooting this postseason. It's been a bit of a shock to me because most of the looks were open; these are not heavily contested attempts or bad flow for the most part. For context, here's our shooting in the regular season versus the postseason:

3 point shooting
                  3PM  3PA  3P%
Regular Season    289  778  37.1%
Big Ten Tourney    31   90  34.4%
March Madness      38  120  31.7%

It's been awful. So it seems we are due for some regression to the mean. That begs the question: how likely are we to return all the way to the mean? TL;DR - it's a long shot.

Just to simplify, in the below I assume each attempt is an independent trial. I also assume each shot has a 37.1% chance of going in (to way more decimal points). Furthermore, I assume we can just look at our odds of hitting the right number of attempts out of 11, 12, etc. and summing them.

I admit I had to google around for the math and used the Binomial Probability model outlined here. Specifically, to calculate the odds of having at least X Makes for a given Y Attempts, I summed for X through Y the value of:

(attempts choose makes) * (odds to make ^ makes) * (odds to miss ^ misses)
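For anyone who wants to check the arithmetic, here is a minimal Python sketch of that "at least X makes in Y attempts" sum. It assumes independent attempts at the regular-season rate of 289/778; the function name `prob_at_least` is just illustrative.

```python
from math import comb

P_MAKE = 289 / 778  # regular-season 3P% (about 37.1%)

def prob_at_least(makes, attempts, p=P_MAKE):
    """Chance of hitting at least `makes` threes in `attempts` independent tries."""
    return sum(
        comb(attempts, k) * p**k * (1 - p) ** (attempts - k)
        for k in range(makes, attempts + 1)
    )

# Going 11-for-11, and going at least 12-for-13
print(f"{prob_at_least(11, 11):.5%}")
print(f"{prob_at_least(12, 13):.5%}")
```

Summing from `makes` up to `attempts` is what turns "exactly X" into "at least X".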

Raw numbers are here. Caveats aside, this analysis is meant to provide some perspective. I computed the likelihood we get our March Madness shooting percentage back up to the regular season average as about 1 in 237.

Game 3PA  Game 3PM  Game 3P%  P(at least that many makes)  P(that many attempts)  Product
9 N/A N/A 0.00000% 0.07% 0.00000%
10 N/A N/A 0.00000% 0.12% 0.00000%
11 11 100.00% 0.00186% 0.22% 0.00000%
12 12 100.00% 0.00069% 0.37% 0.00000%
13 12 92.31% 0.00590% 0.60% 0.00004%
14 12 85.71% 0.02717% 0.94% 0.00026%
15 13 86.67% 0.01157% 1.41% 0.00016%
16 13 81.25% 0.04054% 2.03% 0.00082%
17 13 76.47% 0.11337% 2.80% 0.00317%
18 14 77.78% 0.05351% 3.70% 0.00198%
19 14 73.68% 0.13356% 4.69% 0.00626%
20 15 75.00% 0.06528% 5.69% 0.00372%
21 15 71.43% 0.14983% 6.63% 0.00994%
22 15 68.18% 0.30926% 7.41% 0.02292%
23 16 69.57% 0.16229% 7.94% 0.01289%
24 16 66.67% 0.31925% 8.16% 0.02605%
25 16 64.00% 0.58234% 8.04% 0.04683%
26 17 65.38% 0.32397% 7.60% 0.02462%
27 17 62.96% 0.57351% 6.89% 0.03950%
28 17 60.71% 0.95849% 5.99% 0.05737%
29 18 62.07% 0.55994% 4.99% 0.02793%
30 18 60.00% 0.91772% 3.99% 0.03659%
31 19 61.29% 0.54293% 3.06% 0.01659%
32 19 59.38% 0.87492% 2.25% 0.01965%
33 19 57.58% 1.35187% 1.58% 0.02140%
34 20 58.82% 0.83120% 1.07% 0.00890%
35 20 57.14% 1.26960% 0.69% 0.00881%
36 20 55.56% 1.87237% 0.43% 0.00807%
37 21 56.76% 1.19042% 0.26% 0.00306%
38 21 55.26% 8.74323% 0.15% 0.01285%
39 22 56.41% 1.11469% 0.08% 0.00090%
40 22 55.00% 1.61939% 0.04% 0.00069%

So it's a long shot, but better than 1 in 250. Now, if we include all of the postseason and just look at what our odds are to even finish with that shooting percentage for the tourney (assuming the up to date 36.2% is correct), it's about three times more likely: 1 in 78.

Game 3PA  Game 3PM  Game 3P%  P(at least that many makes)  P(that many attempts)  Product
9 9 100.00% 0.01077% 0.07% 0.00001%
10 10 100.00% 0.00390% 0.12% 0.00000%
11 10 90.91% 0.02878% 0.22% 0.00006%
12 10 83.33% 0.11604% 0.37% 0.00043%
13 11 84.62% 0.04927% 0.60% 0.00030%
14 11 78.57% 0.15410% 0.94% 0.00145%
15 11 73.33% 0.38807% 1.41% 0.00549%
16 12 75.00% 0.18463% 2.03% 0.00375%
17 12 70.59% 0.42052% 2.80% 0.01177%
18 13 72.22% 0.20743% 3.70% 0.00767%
19 13 68.42% 0.43907% 4.69% 0.02057%
20 13 65.00% 0.83999% 5.69% 0.04782%
21 14 66.67% 0.44669% 6.63% 0.02963%
22 14 63.64% 0.82078% 7.41% 0.06083%
23 14 60.87% 1.40388% 7.94% 0.11149%
24 15 62.50% 0.79304% 8.16% 0.06472%
25 15 60.00% 1.32425% 8.04% 0.10650%
26 15 57.69% 2.09407% 7.60% 0.15915%
27 16 59.26% 1.24321% 6.89% 0.08563%
28 16 57.14% 1.93690% 5.99% 0.11594%
29 16 55.17% 2.88962% 4.99% 0.14416%
30 17 56.67% 1.78852% 3.99% 0.07131%
31 17 54.84% 2.64348% 3.06% 0.08079%
32 18 56.25% 1.64936% 2.25% 0.03705%
33 18 54.55% 2.41783% 1.58% 0.03828%
34 18 52.94% 3.42848% 1.07% 0.03669%
35 19 54.29% 2.21119% 0.69% 0.01534%
36 19 52.78% 3.11931% 0.43% 0.01345%
37 19 51.35% 4.27743% 0.26% 0.01100%
38 20 52.63% 2.83934% 0.15% 0.00417%
39 20 51.28% 3.88152% 0.08% 0.00313%
40 20 50.00% 5.17738% 0.04% 0.00219%

Thanks very much to J. for catching some errors in my initial analysis.



April 2nd, 2018 at 12:22 PM ^

I appreciate what you've laid out here. I hope things go well tonight.

Sidebar: Could/should regression to the mean in a positive direction (like higher 3pt percentage) be considered progression toward the mean? Positive regression seems odd to me.


April 2nd, 2018 at 12:38 PM ^

One flaw in the logic is in the regular season we got to face teams like Iowa, Chaminade, Minnesota, and Alabama A&M. It is just like how the softball players are hitting .400 going into their tournament, only to hit 1 for 3 on a good day. Tougher competition is going to bring the averages down.


April 2nd, 2018 at 12:46 PM ^

typically refers to shooting at the season mean in that particular game. I don't think anyone expects us to shoot so well that we boost our overall tourney 3-point shooting percentage back up to the mean. And, as someone else already pointed out, the average defense we're facing is much better than the average defense we faced during the season. In other words, what's done is already done. All that being said, I'm hoping like heck for that 1/700whatever chance we catch absolute fire.


April 2nd, 2018 at 1:25 PM ^

This is correct.  Properly used, "regression to the mean" indicates that a subpar performance is likely to be followed with a better performance, and an excellent performance is likely to be followed by a worse performance.  That's because a poor performance is worse than the mean, an excellent performance is better than the mean, and for normal distributions (bell curves), the mean and the mode (most frequent / most likely) are the same.

Improperly used, "regression to the mean" is often taken to mean that a subpar performance should be followed by an excellent performance in order to get the overall totals to reach the original mean value.  That is not true -- random events have no memory; the most likely thing going forward is whatever the mean is, not whatever it takes to get back to the mean.
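To put numbers on the "no memory" point, here is a small sketch. It assumes every future three goes in at the regular-season rate of 289/778 regardless of what came before; the attempt counts 25, 100 and 300 are arbitrary illustrations.

```python
P_MAKE = 289 / 778      # regular-season 3P%
MADE, TAKEN = 38, 120   # NCAA tournament threes so far

def expected_final_pct(extra_attempts, p=P_MAKE):
    """Expected tournament 3P% after `extra_attempts` more shots at rate p."""
    return (MADE + p * extra_attempts) / (TAKEN + extra_attempts)

for extra in (25, 100, 300):
    print(extra, f"{expected_final_pct(extra):.1%}")
```

The expected figure creeps toward 37.1% as attempts pile up but never reaches it: the cold start gets diluted, not repaid.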

Happily, OP doesn't seem to have made the second mistake; it may just be a matter of phrasing.  The correct question is "How likely is Michigan to get its tournament shooting up to pre-tournament levels (or better)?"

Here's a calculator you can use to try the math:

OP:  Your intuition is correct; you made a mistake somewhere in your math.  A couple of things.  First of all, if you actually want "at least X Makes for a given Y Attempts," you'd need to include not only the values for X (which is what your formula is attempting to calculate) but also X+1, X+2 ... Y.  Your formula is correct -- (Y choose X) * p^X * (1-p)^(Y-X).  However, I can't replicate your results.  For example, on the 22/40 calculation, the correct answer is that there is just shy of a 1% chance to make exactly 22 of 40 threes.*+(289%2F778)%5E22…

Looking at your spreadsheet: 

That's incorrect.  The last term should be POW(1-E$2,B7-A7).

Anyway, the second problem you have is that you can't simply add all of the probabilities.  If you did that, you'd eventually reach a > 100% chance (extending the table to, say, 1000 attempts).  Instead, you need to normalize the results by calculating an approximate percentage that they will attempt that number of times.  You can probably do that by applying a normal distribution to the number of 3s Michigan has attempted in a game.  You'd need to calculate the mean (looks like about 25) and the standard deviation of the number of attempts and plug that into a normal distribution.  Then, you need to multiply the probability for each possible number of attempts by the chance that number will occur, and you're good.

BTW -- because I think we'd all consider it a success if Michigan finished with more than 37% shooting on the year, you should use the cumulative probability function to include not just the minimum number of makes for each number of attempts, but each case where they have more as well.  That would be the last line on the calculator that I sent.

Hope this helps :)

Bo Glue

April 2nd, 2018 at 1:59 PM ^

I'm not saying statistics owe us one, although karma might if it exists. Derp, thanks for finding where the error was, I was using probability to make in both spots! I'll do some updates, not sure I'd be able to make all those changes by 9PM.

Not sure I follow your penultimate paragraph entirely, need to reread a few more times.


April 2nd, 2018 at 2:18 PM ^

No problem. :)

I'll re-word.  Basically, in order to get the probability of the event that you're interested in -- the chance that Michigan finishes the tournament shooting 37.1% or better -- you need to multiply the success rate for each case (# of attempts) with the probability of that case occurring.  So, if S(X) represents the chance of a successful trial utilizing X shot attempts, and P(X) is the chance of Michigan getting exactly X shot attempts in the game, you want the sum of P(X)*S(X) for X=0..∞, with an understanding that S(X) is 0 for X < 11 and P(X) is essentially zero for X > ~40.

You can compute P(X) by assuming that the number of 3-point attempts Michigan will take is a normal distribution centered about the mean.  You'd need to calculate the standard deviation and mean by taking the number of three-point attempts in each game this season (ideally, it'd be tempo-adjusted, but that adds a ton of complexity and this is really a mind exercise anyway).  Based on your numbers (778 attempts in 31 regular season games) the mean is about 25.

You can compute S(X) using the calculator that I provided, or if Google has a cumulative probability function for the binomial distribution -- which it might, I haven't looked.

It'll make more sense when you see the corrected numbers for S(X), as the numbers for 30+ shot attempts are going to be several percentage points each time.  Think of it this way -- if Michigan were to shoot 300 3s tonight, there's a pretty good chance that they'd wipe out the blip caused by the first few tournament games.  They won't, of course, but that's an example of why you can't just sum up the S(X) values from 11+ without adjusting them. :)
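The sum of P(X)*S(X) described above can be sketched roughly as follows. The mean of 25 attempts comes from 778 attempts over 31 games; the standard deviation of 5.0 is a placeholder guess, not a figure from the season data, and the function names are made up for illustration. The normal density is used as a stand-in for the chance of exactly X attempts.

```python
from math import comb, exp, pi, sqrt

MAKES_SEASON, ATTEMPTS_SEASON = 289, 778   # regular season, from the post
MADE, TAKEN = 38, 120                      # NCAA tournament so far

def success_given(attempts):
    """S(X): chance of finishing at or above the season rate given X attempts."""
    # Smallest m with (MADE + m) / (TAKEN + attempts) >= 289/778,
    # found with integer ceiling division to avoid rounding trouble.
    shortfall = MAKES_SEASON * (TAKEN + attempts) - MADE * ATTEMPTS_SEASON
    needed = max(0, -(-shortfall // ATTEMPTS_SEASON))
    if needed > attempts:
        return 0.0
    p = MAKES_SEASON / ATTEMPTS_SEASON
    return sum(comb(attempts, k) * p**k * (1 - p) ** (attempts - k)
               for k in range(needed, attempts + 1))

def attempts_weight(attempts, mean, sd):
    """P(X): normal density at X, an approximate chance of exactly X attempts."""
    return exp(-0.5 * ((attempts - mean) / sd) ** 2) / (sd * sqrt(2 * pi))

def chance_to_recover(mean=25.0, sd=5.0, max_attempts=60):
    # sd=5.0 is an assumed placeholder; swap in the real per-game figure.
    return sum(attempts_weight(x, mean, sd) * success_given(x)
               for x in range(max_attempts + 1))

print(f"{chance_to_recover():.3%}")
```

Because each S(X) is weighted by how likely that attempt count is, the total stays a proper probability no matter how far the table is extended.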

Hope this helps.  Go Blue!

Bo Glue

April 2nd, 2018 at 3:12 PM ^

There's a 99.89% chance (according to normal distribution) we take between 9 and 40 attempts. I set up a weighted average, and it shows a much more favorable outlook than what I initially had (mostly because I was using the wrong probability to miss). Updating the tables and analysis now.


April 2nd, 2018 at 5:32 PM ^

If your "success rate" is 37.1% (regular season 3-pt %) and you have 120 "trials" (NCAA 3-point attempts), you'll make 38 or fewer 12.7% of the time.

That percentage isn't THAT low.  Especially given that the "natural success rate" is inherently going to be lower given the level of competition in the NCAA.

One could argue U-M has been "unlucky" on 3s in the postseason - but that argument wouldn't be completely bullet-proof either.
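That figure is a lower-tail binomial sum, which can be checked with a few lines of Python. This sketch assumes independent attempts at exactly 37.1%; `prob_at_most` is an illustrative name.

```python
from math import comb

def prob_at_most(makes, attempts, p):
    """Chance of making `makes` or fewer out of `attempts` independent tries."""
    return sum(
        comb(attempts, k) * p**k * (1 - p) ** (attempts - k)
        for k in range(makes + 1)
    )

# 38 or fewer makes in 120 attempts at a 37.1% clip
print(f"{prob_at_most(38, 120, 0.371):.1%}")
```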


April 2nd, 2018 at 3:19 PM ^

When I was a kid my parents made trail mix for our backpacking trips.  Part of it was M&M's.  Me and my siblings would eat most of the peanuts, raisins, and other stuff so that we had mostly M&M's at the end.  Yes we would eat a few M&M's.  But we would have a great treat at the end as we would gobble down one M&M after another.

If only stats were that way.  Just like flipping a coin, rolling dice, or pulling the lever on a slot machine, the past has no impact on the future.  The odds of a great shooting output or another flameout are exactly what they were in the Loyola game, adjusting for competition.

So the question to me is how many threes in how many attempts do we think we need to beat Loyola?  Presume we get off 25 attempts, how many do we feel this team has to make?  If we get 10 or more we have a shot.  Now figure out the odds of that.  May have to account for the fact our opposition is a bit better at stopping treys.
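A quick sketch of those odds, assuming 25 independent attempts. The 37.1% rate is the regular-season figure; the 35% and 33% rates are made-up stand-ins for a tougher defense, not measured numbers.

```python
from math import comb

def prob_at_least(makes, attempts, p):
    """Chance of at least `makes` threes in `attempts` independent tries."""
    return sum(
        comb(attempts, k) * p**k * (1 - p) ** (attempts - k)
        for k in range(makes, attempts + 1)
    )

# 0.371 is the season rate; 0.35 and 0.33 are hypothetical defense-adjusted rates
for p in (0.371, 0.35, 0.33):
    print(f"p={p:.3f}: {prob_at_least(10, 25, p):.1%}")
```

Even a couple of points shaved off the per-shot rate moves the chance of 10 or more makes noticeably.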


April 2nd, 2018 at 3:47 PM ^

The statistics at the top here say we did not shoot particularly well in the BTT (34%), i.e. not much better than the crappy shooting in this tournament (31%, although the NCAA tournament stats were worse in games other than Texas A&M, so perhaps that is the answer to the question).  So how did we win every game but Iowa in dominating fashion?  We took out MSU and Purdue (full strength), a 2 and 3 seed, and those games were not close.  With only average 3 point shooting.  What does that mean?


April 3rd, 2018 at 2:55 AM ^

We were a brutal 3-23. Your analysis blows.


But seriously 3/23 after shooting 37% for the season, what the actual fuck. If Michigan shot 38% (9/23) they'd have won.


April 3rd, 2018 at 10:48 AM ^

Did it seem to anyone else that players had problems with the basket to the right of the TV camera?  In the Loyola game we were on offense on that side of the court first, and cold as ice in the first half.  Then we switched, and Loyola had a much harder time hitting shots in the second half.  In the Villanova game, they started on offense on that side, and missed a lot of shots in the first ten minutes.  Obviously, they got better.  Our three point drought started in the first half when we were shooting at the good basket, but it kept right on in the second half when it seemed to me we were shooting more threes out of necessity (and we missed every one).  I didn't watch much of the Villanova-Kansas game, and obviously NOTHING bothered Villanova's shooting in that game.

I'm not saying that is why Villanova won.  Clearly they adapted better than anyone to the same conditions for all.  But, I wonder if there was something about that basket that made the optics/depth perception more challenging, or a slight cross breeze at just the wrong angle.


April 3rd, 2018 at 10:53 PM ^

I sort of relate this to Brian's note about 3PP in the last six games on neutral or away courts vs. the games in the NCAA tournament.  Those last six games were against opponents we were seeing for the second or third time.  I wonder if we just do a better job of exploiting matchups seeing other teams the second or third time around than typical teams?  Opponents are not randomly distributed over time.  Of course, not everyone plays better the second time around, but maybe some coaches (and players) are better at making adjustments between games with the same opponent than other coaches.