Predictive Win Model for Week 9
With some spare time before the NCAA tournament this year, I developed a predictive model to pick basketball games for my NCAA bracket pool (figured it was better than me picking) using a descriptive discriminant analysis, which essentially identifies the variables that best discriminate between groups of a categorical outcome (in this case, wins and losses). The NCAA basketball model was a success (it predicted 80-85% of the NCAA tournament games correctly), so I thought I would see how well it applied to college football. For the last few weeks I have been validating the model week to week against the Sagarin ratings, and it has had the exact same predictive accuracy (65-70% of winners picked correctly...not as great as it could be, but I'm in the process of improving the model). I figured it's a good time to share with fellow MGoBloggers, and I hope to make this as concise and readable as possible. Apologies ahead of time if some of the tables don't show up right, as I'm not too sure how to embed tables within a diary as well as others do.
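For readers curious what a discriminant-style classifier looks like mechanically, here is a minimal sketch. To be clear, this is a simplified, covariance-ignoring Fisher discriminant on made-up numbers, not the author's actual model or data; the team stats and function names are mine:

```python
# Simplified two-class Fisher discriminant: weight each feature by
# (difference of class means) / (pooled variance).  Full LDA also accounts
# for covariance between features; this diagonal version ignores it.

def mean(xs):
    return sum(xs) / len(xs)

def fisher_weights(wins, losses):
    """Compute one discriminant weight per feature from rows of
    winning-team stats and losing-team stats."""
    n_feats = len(wins[0])
    weights = []
    for j in range(n_feats):
        w_col = [row[j] for row in wins]
        l_col = [row[j] for row in losses]
        mw, ml = mean(w_col), mean(l_col)
        pooled_var = (sum((x - mw) ** 2 for x in w_col) +
                      sum((x - ml) ** 2 for x in l_col)) / (len(w_col) + len(l_col) - 2)
        weights.append((mw - ml) / pooled_var)
    return weights

def score(weights, feats):
    """Discriminant score: higher means more 'win-like'."""
    return sum(w * f for w, f in zip(weights, feats))

# Invented example: features are [point differential, off. yards/play].
wins   = [[10, 6.0], [14, 6.5]]
losses = [[-8, 4.5], [-12, 4.0]]
w = fisher_weights(wins, losses)
```

A team whose stats resemble the winning group then scores higher than one resembling the losing group, which is the whole trick behind the PREDSCOR comparison later in the post.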
After assessing a variety of team statistics from the past few weeks (SOS, win percentage, turnover margin, offensive yards per play, defensive yards per play, having a home game, and so on...you name it, I have it and have looked at it) on a national level (Division 1-A/FBS only), the team statistics that best predict weekly winners and losers are, in order of importance:
- Point Differential (avg points scored – avg points given up)
- Offensive Yards Per Play
- Defensive Yards Per Play
- Win Percentage
- Turnover Margin
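For concreteness, here is one way the five inputs above could be computed from a per-game log. The field names and sample numbers are invented for illustration; they are not the diary's actual data:

```python
# Hypothetical two-game log for one team; field names are assumptions.
games = [
    {"pts_for": 28, "pts_against": 24, "off_yards": 420, "off_plays": 70,
     "def_yards": 390, "def_plays": 68, "takeaways": 2, "giveaways": 1, "won": True},
    {"pts_for": 17, "pts_against": 31, "off_yards": 310, "off_plays": 62,
     "def_yards": 455, "def_plays": 71, "takeaways": 0, "giveaways": 3, "won": False},
]

n = len(games)
# Point differential: avg points scored minus avg points given up.
point_diff = sum(g["pts_for"] - g["pts_against"] for g in games) / n
# Yards per play, pooled across games rather than averaged per game.
off_ypp = sum(g["off_yards"] for g in games) / sum(g["off_plays"] for g in games)
def_ypp = sum(g["def_yards"] for g in games) / sum(g["def_plays"] for g in games)
win_pct = sum(g["won"] for g in games) / n
turnover_margin = sum(g["takeaways"] - g["giveaways"] for g in games)
```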
Notable variables that were not important in determining weekly winners: 1) having a home game and 2) strength of schedule (probably too fluid a variable right now, but it could be predictive for bowl game winners at the end of the season).
Big Ten Rankings:
The Big Ten rankings for Week 9 are below. All of the variables in my model are presented as z-scores computed on a national level (most values fall between -3 and 3, though z-scores are not strictly bounded to that range). Higher is better for the variables where positive results are better (offensive yards per play, win percentage, turnover margin, and point differential). For the lone variable that is inversely related to winning (defensive yards per play), lower is better. The variable PREDSCOR is the output of the model, and the game winner is determined solely by whichever of the two teams has the higher score.
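The standardization step described above is just the usual z-score formula applied to every FBS team's value for a stat. A quick sketch, with sample yards-per-play values invented for illustration:

```python
def z_scores(values):
    """Standardize a list of national stat values: subtract the mean,
    divide by the (population) standard deviation."""
    m = sum(values) / len(values)
    sd = (sum((v - m) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - m) / sd for v in values]

# Invented offensive yards-per-play values for five teams.
national_ypp = [4.8, 5.2, 6.0, 5.6, 4.4]
zs = z_scores(national_ypp)
```

For a stat like defensive yards per play, where lower is better, the sign of the z-score would simply be flipped (or the weight on it made negative) before it contributes to a composite like PREDSCOR.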
- My model does have us ranked a little higher, and Penn State a little lower, than Sagarin does. Sagarin indicates this game should be closer, while my model says there is more separation between Michigan and Penn State.
- Illinois is ranked lower.
- We're in the middle of the pack in the Big Ten (where we expected we might be).
Predictive Model Results for Week 9:
Michigan (1.21) at Penn State (-.05) = Michigan
Michigan State (3.40) at Iowa (2.65) = Michigan State
Northwestern (.43) at Indiana (-.97) = Northwestern
Purdue (-.88) at Illinois (.26) = Illinois
Ohio State (3.97) at Minnesota (-2.46) = Ohio State
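The decision rule is simple enough to state in code. A sketch using the Week 9 PREDSCOR values quoted above (the list structure and variable names are mine; scores are from the post, with the away team listed first):

```python
# (away_team, away_score, home_team, home_score) for each Week 9 game.
week9 = [
    ("Michigan", 1.21, "Penn State", -0.05),
    ("Michigan State", 3.40, "Iowa", 2.65),
    ("Northwestern", 0.43, "Indiana", -0.97),
    ("Purdue", -0.88, "Illinois", 0.26),
    ("Ohio State", 3.97, "Minnesota", -2.46),
]

# Higher PREDSCOR wins -- home field plays no role in the rule.
picks = [away if a > h else home for away, a, home, h in week9]
```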
I remember from stats last year (yes, I'm a sophomore in college. Don't laugh, because my life is better than yours.....<---that's humor. I had 4 midterms last week.) that z-scores were (-3, 3). Why does your PREDSCOR have 2 teams with scores > 3?
Nice analysis. Can you give a few more details on how you assessed the team statistics? Sounds like it's some sort of linear classifier?
Since your model is updated every week, did it predict all the Big Ten games correctly this past weekend?