NCAA Tournament Scoring Averages (Domes/Stadiums vs Arenas)

Submitted by stopthewnba on

This is my first diary, and the statistical analysis isn't normalized as much as I'd like (just gathering the data was tedious enough).  Ironically, I put this together Monday, only to see Brian's DOME post on Tuesday.  He graciously upped my MGoPoints so I could post this.

Be kind - constructive criticism is much appreciated.

Now that we're facing the Regional Semifinals/Finals, I thought I'd try to quantify the effect of the venue on scoring totals. For this exercise, I compiled a list of all Sweet Sixteen teams over the past 5 NCAA Tournaments (2008 - 2012). I also included this year's teams. I looked at the regular season scoring averages for the individual teams*, the individual team scoring average for the Tournament thus far (including all games not played at football stadium/dome sites), and then the average scoring for those teams during the Regional Semifinals/Finals and Final Four games.

*Taken from the Wednesday prior to NCAA Tournament games

 

LIMITATIONS: Obviously, the data is going to be affected by the quality of opponents and individual matchups. It follows that the Sweet Sixteen teams typically score more during the first weekend, as the opposition isn't as elite as the teams they may face the rest of the tournament. My hope is that using a larger sample size and including regular season averages helps mitigate that impact to some degree. The regular season scoring average is also the raw statistic, not tempo-adjusted. Last caveat is that overtime periods (especially for tournament games) may impact final numbers (there have been 18 OT games since 2008 - not all in the first weekend or involving Sweet Sixteen teams - vs. 160 total games for my sample size).

 

Before I get into that analysis, another interesting trend emerged. By comparing a team's regular season scoring average to the team's tournament (non-football site) average, it becomes possible to rank the Sweet Sixteen teams by how far their tournament scoring rises above or falls below their regular season average. In four of the past five seasons, one of the two Sweet Sixteen teams with the biggest scoring increase over their regular season average made the Final Four. Similarly interesting is that in four of the past five seasons, one of the two teams with the biggest scoring DECREASE from their regular season average also made the Final Four:

 

 

YEAR | TEAM     | SCORING DECREASE (rank / PPG change) | TOURNEY PPG (1st Weekend) | REG SEASON PPG
2008 | UCLA     | 1st / -13.5                          | 60.5                      | 74.0
2010 | Duke     | 2nd / -7.5                           | 70.5                      | 78.0
2011 | Kentucky | 1st / -11.4                          | 65.0                      | 76.4
2012 | Kansas   | 1st / -11.5                          | 63.5                      | 75.0

 

 

YEAR | TEAM     | SCORING INCREASE (rank / PPG change) | TOURNEY PPG (1st Weekend) | REG SEASON PPG
2008 | UNC      | 1st / +21.8                          | 110.5                     | 88.7
2009 | UConn    | 2nd / +20.2                          | 97.5                      | 77.3
2011 | VCU      | 2nd / +9.5                           | 81.0                      | 71.5
2012 | Kentucky | 1st / +7.3                           | 84.0                      | 76.7


This year, the teams with the biggest scoring increase are ohio state* (87.5 ppg tournament, 69.3 ppg reg season) and FGCU (79.5 ppg tournament, 72.3 ppg reg season).

The teams with the biggest scoring decrease this year are Indiana (70.5 ppg tournament, 80.0 ppg reg season) and Oregon (62.5 ppg tournament, 71.7 ppg reg season).
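For anyone who wants to reproduce the ranking, here is a minimal sketch of the arithmetic using only the four 2013 teams quoted above. The function and variable names are made up for illustration, and the full diary ranked all sixteen teams, not just these four.

```python
# (tournament PPG, regular-season PPG) for the four 2013 teams quoted above.
teams = {
    "ohio state": (87.5, 69.3),
    "FGCU": (79.5, 72.3),
    "Indiana": (70.5, 80.0),
    "Oregon": (62.5, 71.7),
}


def scoring_change(tourney_ppg, season_ppg):
    """Tournament PPG minus regular-season PPG, rounded to one decimal."""
    return round(tourney_ppg - season_ppg, 1)


# Rank from biggest increase to biggest decrease.
changes = {name: scoring_change(*ppg) for name, ppg in teams.items()}
ranked = sorted(changes.items(), key=lambda item: item[1], reverse=True)

for name, delta in ranked:
    print(f"{name:<12}{delta:+.1f}")
```

Sorting in descending order puts the biggest increase first and the biggest decrease last, which matches the "1st / 2nd" rank labels used in the tables.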

* Personally, I do not capitalize ohio state or osu.   Out of spite.

 

So, back to the overall point of this exercise.  Do football stadiums/domes negatively affect scoring more than basketball arenas?   Based on my research, no.

 

In the past five tournaments, there have been 11 basketball-arena sites hosting the second weekend of the tournament and 9 football-stadium sites.  

  • Overall, scoring is down:  -8.1% the second weekend vs the first weekend; -8.4% from a team's regular-season scoring average.  
  • True basketball sites have a larger drop in scoring:  -9.9% from tournament average, -10.5% from regular season average.  
  • Football stadiums see a drop of only 6.2% and 6.3%, respectively.
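The percentages above are relative changes against two baselines. A minimal sketch of that calculation follows. The input numbers are hypothetical (they are not from the dataset), and it is an assumption that each team's percentage is computed this way before averaging.

```python
def pct_change(later_ppg, baseline_ppg):
    """Percent change of later_ppg relative to baseline_ppg."""
    return (later_ppg - baseline_ppg) / baseline_ppg * 100.0


# Hypothetical team, for illustration only:
# 75.0 PPG regular season, 74.0 PPG first weekend, 68.0 PPG second weekend.
vs_tourney = pct_change(68.0, 74.0)
vs_season = pct_change(68.0, 75.0)

print(f"{vs_tourney:+.1f}% vs first weekend, {vs_season:+.1f}% vs regular season")
```

Averaging per-team percentages and taking the percentage of the averaged PPG can give slightly different answers, so it is worth stating which one a table uses.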

 

All Final Fours have been played in football stadiums over the past five tournaments.  Scoring is down 15.0% from previous tournament performance and down 14.9% from regular season performance.

There were a few outlier games/teams/seasons which impact the analysis (full chart - ED-S: I put it as a Google Chart here). Breaking it down by venue shows further impact (and also gives wise readers some insight into Vegas totals for the Midwest Region at Lucas Oil):

VENUE             | VAR vs TOURNEY PPG | VAR vs REG SEASON PPG | YEARS HOSTED
FORD FIELD        | -15.74%            | -10.73%               | 2009 FF, 2008 MW REG
LUCAS OIL         | -14.81%            | -15.40%               | 2013 MW REG, 2010 FF, 2009 MW REG
RELIANT STADIUM   | -11.67%            | -13.19%               | 2011 FF, 2010 S REG, 2008 S REG
SUPERDOME         | -11.39%            | -14.11%               | 2012 FF
ALAMODOME         | -8.04%             | -9.15%                | 2011 SW REG, 2008 FF
EDWARD JONES DOME | -7.84%             | -10.40%               | 2012 MW REG, 2010 MW REG
PHOENIX STADIUM   | -4.11%             | +4.77%                | 2009 W REG
GEORGIA DOME      | +9.11%             | +8.21%                | 2012 S REG

(Cowboys Stadium has never hosted NCAA Regionals/Final Four)

Comments

wolfman81

March 27th, 2013 at 11:55 AM

You asked for it, you got it.

A few thoughts:

  1. I think that this is an interesting (and relevant) topic to study, and I think you have made a good start at it.
  2. Your mega-table at the end should be a Google Doc and you can give it to us as a link.  As it stands right now, I'm not sure why you put it in there.  It seems like it is simply your dataset, and if you want to encourage me to play with it, then put it in a play-able format and share it that way.
  3. I'm not sure of the value of the Tourney PPG stat. It is an average of 2 numbers (3 in the case of VCU and La Salle, the First Four teams that have made the Sweet 16). I would guess that the average should go up for 2 reasons.
    1. Mostly you get higher seeds.  #1 seeds often crush #16 seeds and usually make the sweet 16.
    2. They just won 2 games.  I'm guessing that if you randomly sample two wins from a team's season and average them you will often get a number higher than the season scoring average.**
  4. This seems like it might be a good application for a generalized linear model. Your basic question is:

    Does the number of points scored depend on whether a game is played in a dome or an arena?

    So you'd probably build a model like this:

         pts in S16 = avg pts + opp (def) avg pts + ARENA

    And ask if the ARENA variable is significant.
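A rough sketch of fitting that kind of model is below. It swaps in ordinary least squares (the simplest special case of a generalized linear model) solved with the normal equations, so it runs on the standard library alone. Everything here is an assumption for illustration: the data is synthetic with a dome effect planted in it, and the names `fit_ols`, `solve`, and `TRUE_DOME_EFFECT` are made up. It only estimates the ARENA coefficient; judging significance would also need a standard error or p-value, which a statistics package would report.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients for y ~ rows via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic games only (NOT real tournament data): a dome effect is planted
# so we can see whether the fit recovers it.
random.seed(0)
TRUE_DOME_EFFECT = -3.0
rows, y = [], []
for _ in range(400):
    avg_pts = random.uniform(62, 85)   # team's season scoring average
    opp_def = random.uniform(58, 72)   # opponent's points allowed per game
    dome = random.choice([0, 1])       # 1 = football stadium/dome, 0 = arena
    rows.append([1.0, avg_pts, opp_def, float(dome)])
    y.append(0.5 * avg_pts + 0.5 * opp_def + TRUE_DOME_EFFECT * dome + random.gauss(0, 5))

intercept, b_avg, b_opp, b_dome = fit_ols(rows, y)
print(f"estimated dome effect: {b_dome:+.2f} points")
```

With real data, each row would be one team-game from the second weekend onward, and the sign and size of the dome coefficient is the quantity of interest.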

** I decided to investigate this statement further.  I took Michigan's results this season to look at stability of averages.  Here are a few facts I found:

  1. The average number of points Michigan scored this season (75.1 ppg) was not significantly different from the average number of points Michigan scored in their wins (77.9 ppg). A Student's t-test does not support the hypothesis that avg ppg in wins > avg ppg (p = 0.1603).
  2. If we randomly sample two winning scores and compare that average to the average ppg, the test does support the hypothesis that avg 2-win ppg > avg ppg (p < 10^-15). This tells me that the mean of 2 data points is worthless.
  3. The max point differential in my bootstrap sample (I sampled 1000 trials) was +22.4, and the minimum point differential in my bootstrap sample was -15.1.  These are similar to the values that you found in your data (+21.8 and -13.5).
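For anyone who wants to try the resampling experiment themselves, here is a minimal sketch. The scores are hypothetical placeholders, not Michigan's actual results, and the helper name `two_win_differentials` is made up. It also assumes the two wins are drawn as distinct games (without replacement), which may differ from how the original bootstrap was run.

```python
import random
from statistics import mean

# Hypothetical points scored, for illustration only (NOT Michigan's real scores).
wins = [79, 91, 67, 73, 80, 95, 62, 71, 88, 74,
        83, 66, 77, 85, 70, 93, 68, 76, 81, 72]
losses = [53, 67, 75, 58, 62, 71, 66, 60]

season_avg = mean(wins + losses)


def two_win_differentials(wins, season_avg, trials, rng):
    """Average two randomly chosen wins and subtract the season average, `trials` times."""
    return [mean(rng.sample(wins, 2)) - season_avg for _ in range(trials)]


rng = random.Random(2013)  # fixed seed so reruns give the same draws
diffs = two_win_differentials(wins, season_avg, trials=1000, rng=rng)

share_above = sum(d > 0 for d in diffs) / len(diffs)
print(f"max {max(diffs):+.1f}, min {min(diffs):+.1f}, share above season avg {share_above:.0%}")
```

The spread between the max and min is the point: a two-game average can land a long way from the season average in either direction purely by chance.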

In the end, my point is that I think you should try some sort of generalized linear model to support your analysis. Going back as far as you have, I believe, gives you the data set to explore this question. I think you should do it!

stopthewnba

March 27th, 2013 at 1:12 PM

Edited so table is now a link to GoogleDoc

Will attempt to use generalized linear model later if I get some time

 

Tournament scoring is a small sample (2 games, sometimes 4 for Final Four metrics if Regional round played at basketball arena).  Statistical coincidence about the top and bottom teams (higher than reg. season avg, lower than reg. season avg) making Final Four in 4 of past 5 seasons?  That result surprised me.