If I Was Designing a Poll...
Also, warning: it's long. For those who like their baseball games in ESPN highlights, and their Melville in Cliff Notes, I put bullet points under each heading.
I wanted to generate a discussion on different polling strategies, and come to a consensus on what we expect from NCAA polls.
First, assumptions:
- Polls are not and will never be exact, even at the end of the season. There is no "right answer." Comparing over 100 teams with hideously unbalanced schedules with absolute accuracy is nigh impossible.
- We want polls anyway.
- A higher-ranked team is considered better than one ranked below it.
- Even if we produced that theoretical "perfect poll," there would be plenty of people who disagree with it.
- To a degree, there is an unstated general consensus that some teams are better than others, i.e. the masses can agree on certain things, like Florida is in the Top 3, and Michigan isn't.
- We will know more as the season progresses.
- The perfect poll would be the exact same in the preseason and at the end of the season, and still be entirely justifiable.
- Consensus is the ultimate goal -- corollary: fewer polls is better.
Resume Voting
- Best At: Being a ranking on this year's performance that actually has its basis in this year's performance
- Worst At: Providing a non-laughable poll before November
- Primary Gripe: Small sample = useless
I have respect for resume voters because they hold the same standard throughout the season. The downside is their polls take a while to come together. Resumes only become demonstrative once there's some experience on them. If I showed you the resumes of two 16-year-olds and you had to pick which one will have made more money by the time they are 50, you would be clueless.
At least it's a metric that makes some sense. But the wild variance defeats the purpose of having these polls in the first place: it's not to generate discussion, it's to provide a frame of reference for assessing the difficulty of beating one team or another. If Cincy loses next week, nobody's going to believe it if you say "oh wow, they beat the No. 1 team in the country."
When it's used in concert with other voting metrics, it also has the unintended effect of compounding things like an overrated conference. A great example is the Big East a few years ago, when South Florida, Rutgers, Louisville and West Virginia took advantage of some early season flukes and an incredibly soft middle of the schedule to leap-frog each other to the top of the polls. This was the primary culprit in the short-lived appearance of USF at No. 2 in the BCS poll -- any ranking that has South Florida second in the nation in anything besides STDs is a travesty.
The upside of resume voting is that every week it gets more and more feasible. The BCS poll has been, in many of its incarnations, essentially a resume poll, which had the good sense to begin releasing data late in the season. Ultimately, resume voting is a justifiable system so long as it remains pure, but isn't very useful early in the year at providing a poll's primary objective: to provide a plausible ranking of NCAA's best teams.
Suggestion for improvement: Stay out of it until near the end. I want resume to determine who plays for the National Championship, but I'd rather not have half-finished resumes affecting the mid-season polls. In other words: I'm with you if you wanna put '03 LSU and '03 Oklahoma in the Championship, but let's call '03 USC No. 1 right up until the end of the Rose Bowl, just so we're clear that Michigan is facing the hardest team in the country. Make sense?
Roster Voting
- Best At: Pre-Season Poll that passes credulity test, Mid-season difficulty rankings
- Worst At: End-of-Season Poll that passes credulity test
- Primary Gripe: Not enough data, plays down this year's performance, which, like, isn't that what the poll is about?
Early in the season, this is most polls, including the AP and Coaches. Since no games have been played, it's a vote based primarily on how good the team was last year, with plusses for returning players and minuses for departing players.
This does a much better job of placating the masses in the pre-season. As the season progresses, however, as opposed to resume voting, this metric tends to disappear almost entirely, which I think is a major disservice to these polls.
Essentially, they fall victim early on to resume voting, rather than stick to their guns. This means big drops for teams as they lose. The downside, of course, is that if there's a consensus No. 1 team that loses its only two games early in the year, you'll see a major shift in that team's ranking -- big drop, steady incline, etc. This hurts the usefulness of the poll, since it changes its base metric mid-way through, essentially calling out its own initial justification.
A roster-based poll shouldn't be oblivious to the unfolding season, but it also shouldn't abandon its basis. Updates would be based on roster shifts, such as Oregon losing Dixon, Pat White losing a finger, or Michigan discovering one of its 4-star freshman recruits is already a more-than-serviceable and perhaps awesome college QB. This does not seem to generate much shift, but revelations abound in college football -- if someone pays close attention, we could end up with a fairly decent poll insofar as showing how much of a challenge each team should present.
Like resume polling, a roster poll is justifiable -- last year's performance, injuries, player statistics: these are all available metrics.
However, as the year progresses, such a poll would require A TON of input to remain accurate. Barring a UFR for every team, a roster poll seems unfeasible.
I can't think of a poll that keeps this metric throughout the season. I'd like to see one in the blog poll. It would rack up a lot of Mr. Stubborns, and a few other outliers as other voters respond to season upsets, etc. And more importantly, while it's very useful at showing which team is the hardest to beat talent-wise early in the year, the more the season progresses, the more you'll have major incongruities, like a highly talented 4-loss team in the Top 5 while a lucky, scrappy, undefeated mid-Major team lingers at the bottom of the Top 25.
After about 8 weeks, a roster-voted poll would get lapped by the resume voters in placating the general populace, and take a lot of flak along the way. And at the end of the year, it would be totally useless.
Suggestion for improvement: This needs statistics, or it's as bupkis as pre-season polls. One day (I'm already looking into it) there will be UFR-like statistics kept for every player on every team. This will facilitate player and position rankings. And coaching ratings, too. And team rankings (offensive/defensive efficiency, etc.). The more info compiled and thrown in, the more this type of polling becomes feasible. It's never going to be useful for deciding who belongs in a championship, but I, for one, would find such a stat very interesting when one team goes up against another.
Predictive Voting
- Best At: Pre-Season Polling
- Worst At: BCS Selection, Precision
- Primary Gripe: Factors are compounded
This is a straight-up attempt to get the final poll right in Week 1. A lot of AP voters fall into this trap, as evidenced by the justification they give for their preseason ballots.
e.g.
"I ranked Ohio State 1st because the lolBigTen is so weak the Buckeyes can knock off a freshman-quarterbacked USC, then tapdance to the BCS championship again."
In this example, does this hypothetical
Predictive voting does have a strategy for keeping itself in line, which makes it somewhat useful, if still inaccurate, for mid-season and late-season polling. Essentially, teams are not down-rated at all when they lose something they were expected to lose in the fashion in which they were expected to lose it. They play against their expectations.
Predictive voting is often used in concert with another metric, most often as a correction to Roster Voting ballots that generally have mid-Majors and giants in weak BCS conferences underrated. It generally has a lot of opportunity to look stupid as the season progresses, since the swings after unexpected wins and losses, in practice, are never truly in line with expectations. It also doesn't account for surprises, like Notre Dame losing to Michigan (not expected) but demonstrating that its offense is for real (i.e. they're not worthy of a major fall).
Predictive voting is, however, not a bad way, conceptually, to achieve the goal of a preseason ballot that bears some resemblance to the end of the season. Of course, it's hideous at providing an accurate ranking of teams' actual ability. But it does a fair job of passing the eyeball test, and remains a well-used tool for college polling.
Suggestion for improvement: Accuracy is the problem, because all changes are totally subjective. So use computers. Run 10,000 simulations of every game left in the season. This becomes the base prediction for each team, and should provide a solid framework for an initial ranking. Deviation from expectation down-ranks or up-ranks teams as the season progresses. Easier way: use the spread -- gamblers know what they're doing.
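For what it's worth, here's roughly what I picture when I say "run 10,000 simulations." This is a minimal sketch, not a real model: it assumes somebody has already assigned a win probability to each remaining game (from the spread, or anywhere else), and the function names, team matchups, and probabilities are all made up for illustration. It plays out the remaining schedule over and over, averages each team's wins to get the base prediction, and then measures how far a team's actual wins land from that expectation.

```python
import random
from collections import defaultdict


def simulate_expected_wins(games, n_sims=10_000, seed=0):
    """Estimate each team's expected wins over its remaining games.

    games: list of (team_a, team_b, prob_a_wins) tuples. The probabilities
    are assumed inputs (e.g. derived from the spread), not computed here.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    win_totals = defaultdict(int)
    for team_a, team_b, _ in games:
        # Touch both teams so winless teams still appear in the result.
        win_totals[team_a] += 0
        win_totals[team_b] += 0

    for _ in range(n_sims):
        for team_a, team_b, prob_a_wins in games:
            winner = team_a if rng.random() < prob_a_wins else team_b
            win_totals[winner] += 1

    # Average wins per simulated season is the base prediction.
    return {team: wins / n_sims for team, wins in win_totals.items()}


def deviation_from_expectation(expected_wins, actual_wins):
    """Positive means a team beat its projection; negative means it fell short."""
    return {
        team: actual_wins.get(team, 0) - expected
        for team, expected in expected_wins.items()
    }


if __name__ == "__main__":
    # Made-up schedule and probabilities, for illustration only.
    remaining = [
        ("Michigan", "Notre Dame", 0.45),
        ("Michigan", "Ohio State", 0.30),
        ("Ohio State", "USC", 0.40),
    ]
    expected = simulate_expected_wins(remaining)
    for team, wins in sorted(expected.items(), key=lambda kv: -kv[1]):
        print(f"{team}: {wins:.2f} expected wins")
```

The hard part, obviously, is where the per-game probabilities come from; the simulation itself is the easy bit. Treating every game as an independent coin flip with a fixed probability also ignores injuries and momentum, which is exactly the stuff a roster poll would want to capture.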
Hype Voting
- Best At: Wooooo!!! Tate Forcier is a god!!! I'm gonna go online now and see if the national consensus agrees! Woooo!!! They agree! We Rock!!!!
- Worst At: NCAA Polling
- Primary Gripe: Loose grip on reality
Accordingly.
This metric is among the least justifiable of the non-biased metrics, but it's rampant, in part because it's the easiest way to create a poll that readers generally agree with mid-season. It's basically rearranging teams each week based on carrots like "so-and-so deserves a 10-slot bump" or "Team X defeated Team Y so Team X should go above Team Y."
It passes the eyeball test, which is the whole point of hype voting. But it also generates a goodly chunk of the eyeball rolling from other pollsters who want something more concrete behind their polls.
Suggestion for improvement: This basically comes down to faking it to get the results you wanted when solid metrics fail. I'm of a mind to either improve metrics or believe them before turning to pre-conceived notions out of convenience.
Bias Voting
- Best At: No. 3 Notre Dame @ No. 1 USC. TONIGHT on NBC!!!
- Worst At: Honesty
- Primary Gripe: Subversion of polling for selfish gain
Brian uses the Coulter/Kos Award to keep the bloggers honest about their own teams, but I don't know how much he's watching what they do to their rivals and opponents. Just because you wear your bias on your sleeve, that doesn't mean you're immune from it (e.g. Coulter, Kos).
Suggestion for improvement: Not that Brian hasn't said it 1,000 times, but this bears repetition upon repetition: MAKE ALL VOTES PUBLIC AND HOLD VOTERS ACCOUNTABLE.
What's Best?
Obviously, aside from a few resume polls, most polls are a combination of many of these metrics, all of which have major holes in them that strain credulity, over/under-reward scheduling and biases and notoriety, etc. At any given point during the season, and depending on the function a poll is meant to serve at that point in the season, there are better metrics than others. So let's go back to our suppositions, and pick out what it is we want from a poll at any given time:
- Preseason: As close as possible to the final poll, plus something that passes the eye test, i.e. readers can generally agree with it. For this, I suggest a combination of Roster and Predictive polling. Both are in dire need of better statistics, but the stats are out there already, and currently being employed to good effect by oddsmakers, who have a stake in getting it right (although they move their lines based on hype). We know who's on what team, and who will most likely be playing X amount of time at each position. We have a record of play for every year prior for every player on every team. We know the recruiting value of incoming freshmen, and we know the base value of freshmen to keep the recruiting value in perspective. As the season progresses, we have more records of play, which should make us more accurate. Translating this into a statistical value is not impossible, just very time-consuming.
- Early Season: Still, I would stick to exclusively Roster and Predictive polls, for reasons shown above. I think one consensus poll would be best for this period.
- Week 8 to Bowls: Start publishing a second poll, sort of like the BCS numbers, but not really, because it would be entirely Resume based (note: would also be used to determine playoff spots). This poll would show teams ranked by their resume if they were to win every game left on their schedule. It seems counter-intuitive, since, yeah, a lot of them play each other. But actually, that keeps it cleaner -- those that play each other get credit for doing so based on where each is at before the inevitable down-ranking of each other.*
- End of Season: Publish a final Resume-based poll.
It would be awesome for fans, as major programs try to schedule each other early to build a high resume before Week 8. Then, as injuries deplete rosters and cold sets in, each team is in do-or-die mode every week, or else risk losing their place in line.
Okay, I've said my piece. As with everything else I write, I ask you to please find as many holes in it as you can (except typos, which I plan to go back and fix when time allots).