Resume vs. Predictor

Submitted by joeyb on November 15th, 2018 at 1:29 PM

There has been a lot of talk about the CFP rankings recently, with Michigan's chances for a playoff berth hinging on how the committee ranks teams. Joel Klatt has given several interviews where he talks about how ACC and SEC teams are being overrated to bolster the resumes of Alabama and Clemson and ensure that they get into the playoff regardless of whether they lose a game. He then goes on to compare their CFP rankings to FPI as a way of showing how overrated these teams are. This was a misstep on his part, in my opinion, as he is comparing apples to oranges. I've seen a lot of people follow his lead on here, Reddit, and Twitter. I've tried to head this off in the comments, but to no avail. Hopefully, something with a wider audience can help.

To start, let's take a look at ESPN's thoughts on FPI vs. the committee rankings:

To help determine the “most deserving” teams for the College Football Playoff, ESPN’s Stats & Information Group has developed some in-depth analytical tools to evaluate résumés. These metrics look “backward” at what a team has accomplished to date and are therefore categorically different from our Football Power Index. FPI is a forward-looking system that evaluates who is “most powerful” and helps predict specific matchups as well as the rest of the season. Some of ESPN’s own break down the difference between “best” (which we measure using FPI) and “most deserving” (where the résumé metrics come in) here.

It goes on to describe SOR (Strength of Record), which was created for the purpose of comparing resumes, similar to what the committee is doing. They explicitly state that FPI and SOR are very different measures: one looks at the "best" teams and the other at the "most deserving". So, to continue, there needs to be some discussion about what the committee is trying to do. ESPN has some more thoughts on this:

Despite the committee’s mantra of selecting the “four best teams in the country,” it appears that in the first two years of playoff selection, the committee favored team accomplishment over team strength. So if you are trying to predict what the committee will do, take a look at strength of record, because seven of eight teams to make the playoff ranked in the top four of that metric before playoff selection. Then FPI can be used to predict which teams will ultimately come out on top.

To put it plainly, the committee is looking at "most deserving" teams, while Klatt and a lot of others online are talking about "best". This is the misstep. We all know that it is possible for the better team to lose a game. Pretty much any predictor that I've seen has Michigan as a better team than Notre Dame, but Notre Dame won the head-to-head matchup, so they are more deserving as long as they keep winning out. Purdue put the beatdown on OSU, but I don't think that a 5-5 team is better than a 9-1 team, nor do I think that they are the more deserving.

So, what we're looking at is a spectrum of evaluation. On the left, you have the qualitative, backward-looking resume rankings judging "most deserving" by looking at your SOS without MOV and, on the right, you have the quantitative, forward-looking predictors judging "best" by looking at your stats and not even looking at the score, let alone winner or loser of the game. On the left, you have CFP and SOR. On the right you have S&P+, FEI, FPI, Sagarin, Massey, etc.

Bill Connelly tried to bridge the gap between the two sides of the spectrum when he created Resume S&P+. This method starts with S&P+ and adds in some MOV to try to generate a resume. On our spectrum above, where 0 is "most deserving" and 10 is "best", this is probably an 8. There are some major flaws in this methodology. For example, Notre Dame is a top 5 team by just about any resume ranking and a top 10 team by just about any predictor, but Resume S&P+ has them at 14. Penn State, with 3 losses, is listed at 8 despite not being in the top 10 on either end of the spectrum. In my opinion, this is probably due to a high weight on SOS and not enough on how you performed against that schedule. I think this makes it not a great tool for looking at resumes and a useless tool for predictions, which leaves it without a use in ranking teams. I think where it might have use is in its outliers as those teams are likely being over/under ranked due to their SOS.

The argument that SEC or ACC teams are ranked too high seems to be largely based on a mismatch of philosophies. To me, the question comes down to this: what is the purpose of the playoff? Is it to reward the 4 best/most deserving teams, or to find the best/most deserving team? While it may seem backwards, if the goal is to find the best team, pitting the 4 best teams per S&P+ against each other is not the best way to do so, unless the assumption is that the better team always wins. Hence the reason for looking at "most deserving" teams. With the lack of data points that Bill Connelly refers to in the Resume S&P+ article, what makes sense, from an analytics perspective, is to try to take the most deserving teams (the teams that played the harder schedule and still won) and pit them against each other. This is what the CFP is set up to determine. The best team may not win, and we may still not know with statistical certainty which team is the best, but there will be no doubt that the team winning the national championship is the most deserving.

So, if you feel that S&P+ is closer to how teams should be ranked than a standard resume ranking, that's fine. Just know that your argument should be that the criteria for selection are wrong, as opposed to the CFP being a bad implementation of the criteria. However, if your argument is that teams are ranked too high/low based on the criteria specified for selection, use the proper tool to show that, which is another resume ranking tool such as SOR.

For what it's worth, Syracuse (+9), Kentucky (-5), Boston College (+6), and Miss St (+10) are the biggest differences between CFP and SOR.
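As a rough illustration of how a difference like that can be computed, here is a minimal sketch. It assumes the sign convention is SOR rank minus CFP rank, so a positive number means the committee has a team higher than SOR does. The Syracuse ranks (12th in CFP, 21st in SOR) are the ones cited in the comments below; "Team X" and "Team Y" and their ranks are made-up placeholders, and `rank_gaps` is a hypothetical helper name.

```python
def rank_gaps(cfp_ranks, sor_ranks):
    """Return (team, sor_rank - cfp_rank) pairs, largest absolute gap first.

    A positive gap means the CFP ranking places the team higher (a smaller
    rank number) than SOR does. Teams missing from either ranking are skipped.
    """
    gaps = [
        (team, sor_ranks[team] - cfp_rank)
        for team, cfp_rank in cfp_ranks.items()
        if team in sor_ranks
    ]
    return sorted(gaps, key=lambda pair: abs(pair[1]), reverse=True)


# Syracuse's ranks come from the thread; Team X and Team Y are placeholders.
cfp = {"Syracuse": 12, "Team X": 15, "Team Y": 8}
sor = {"Syracuse": 21, "Team X": 10, "Team Y": 9}

gaps = rank_gaps(cfp, sor)
for team, gap in gaps:
    print(f"{team}: {gap:+d}")
```

Sorting by absolute gap puts the teams the committee and SOR disagree on most at the top, regardless of direction.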

Comments

J.

November 15th, 2018 at 2:29 PM ^

Pretty much any predictor that I've seen has Michigan as a better team than Notre Dame, but Notre Dame won the head-to-head matchup, so they are more deserving as long as they keep winning out.

This is a bad take and you should feel bad.  It is possible for Michigan to put together a more deserving season than Notre Dame, despite the head-to-head results, if Michigan plays a significantly more impressive schedule.  S&P+ thinks they have -- Michigan is #3 in S&P+ Résumé and Notre Dame is 14.  ESPN thinks they haven't -- Michigan is #5 (edit: I had it wrong originally) in SOR and Notre Dame is #3.

We need to end the nonsense of valuing the number of losses over all other metrics, lest we resign ourselves to Septembers full of Delaware State and SDSU games in the Big House.

The Maizer

November 15th, 2018 at 3:03 PM ^

Digging into this a touch more. SOR is based on the likelihood that an average top 25 team would achieve the same record against that schedule. Are the probabilities for victory used in SOR based on FPI? I think they must be.

In that case, then Michigan will play 6th, 41st, and 63rd ranked FPI teams while Notre Dame plays 36th and 43rd ranked FPI teams. The FPI rank and likelihood of beating a team are nowhere close to a linear relationship, so huge advantage to Michigan playing the 6th ranked team in addition to the edge of having a third remaining game. I would think Michigan certainly ends up with the better SOR if both teams win out.
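To make that definition concrete, here is a minimal sketch of the kind of calculation it implies: given a reference team's win probability in each game, compute the chance of finishing with at least a given number of wins. This is only an assumption about the general shape of the metric, not ESPN's actual implementation. The function name `prob_at_least` and every probability below are hypothetical.

```python
def prob_at_least(win_probs, wins_needed):
    """Chance of winning at least `wins_needed` games.

    `win_probs` holds the reference team's (assumed) win probability in each
    game, with games treated as independent. A lower result means the record
    was harder to achieve, i.e. a stronger resume.
    """
    # dist[k] = probability of exactly k wins in the games processed so far
    dist = [1.0]
    for p in win_probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)   # lose this game
            nxt[k + 1] += mass * p       # win this game
        dist = nxt
    return sum(dist[wins_needed:])


# Hypothetical remaining schedules (made-up probabilities for illustration).
three_games_one_hard = [0.50, 0.85, 0.90]
two_middling_games = [0.85, 0.85]

chance_a = prob_at_least(three_games_one_hard, 3)  # win all three
chance_b = prob_at_least(two_middling_games, 2)    # win both
print(chance_a, chance_b)
```

Under these made-up numbers, the single coin-flip game does most of the work: it cuts the chance of running the table roughly in half, while each comfortable game shaves off only a little. That is the non-linearity described above.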

joeyb

November 15th, 2018 at 3:27 PM ^

I used a specific case, that everyone here is familiar with, as an example and you are inferring more than I intended it to mean. I agree it's possible for Michigan to play a significantly more impressive schedule in a general sense, but I don't think they've done it this season if Notre Dame wins out. I don't think that SOR or most other rankings will show that either.

I explained my thoughts on Resume S&P+ and its flaws and uses.

And, the fact is that, without looking at head-to-head, Notre Dame beat a top 5 team, according to most predictors, and Michigan has not. That weighs significantly on resume rankings and will buoy their schedule over ours. It's possible, if Michigan State, Wisconsin, PSU, etc. win out that our strength of schedule improves and pushes us past them, but the same can be said about their schedule.

TrueBlue2003

November 15th, 2018 at 5:00 PM ^

Your take is worse and you should feel horrible.

It's true that number of losses and head-to-head shouldn't be everything but your take is horrible and your facts are wrong.

First, the difference between the #3 schedule and the #14 is not significant.  Those are both tough schedules, one slightly more so.  And when you consider that's just one metric and others have ND's schedule as more difficult than Michigan's, your claim that ND has played (or even will play) a more difficult schedule is very wrong.

The OP is absolutely correct that if the criterion is "most deserving," which the committee shows year after year that it is, ND absolutely deserves to be in ahead of Michigan if it goes undefeated against its schedule with the win over Michigan.  Imagine the tables being turned.  There would be rioting in Ann Arbor if an undefeated Michigan with a solid SoS got beat out by a team it defeated.  Just no.

If the committee valued the number of losses above all else, as you claim, UCF would be 4th.  But they are not. ND's schedule has been plenty difficult enough to be where they are.

J.

November 15th, 2018 at 6:12 PM ^

You're misunderstanding what the numbers are telling you.  They're not saying Michigan had the #3 toughest schedule and Notre Dame the #14.  They're saying Michigan performed as well or better against its schedule than all but two other teams would have; for Notre Dame, there are 13 others who would have performed at least that well.

#3 vs #14 is absolutely significant when you're trying to pick four teams.

Notre Dame's schedule is not solid.  That's the entire point.  In their defense, they tried, but they picked a bunch of teams having down seasons.  Their second-best win is Northwestern.

As for UCF: there appears to be an exception to loss-ranking for Group of 5 teams.

Here's my biggest problem with this argument: if Michigan had kept their original schedule, and beaten Arkansas, they would be #3 in the country right now, ahead of Notre Dame at #4.  If 12-1 Michigan ends up getting shut out of the playoff in favor of Notre Dame and 12-1 Alabama, the only reason will be that they attempted to play a challenging game that the fans would enjoy.

If you continue to reward teams for playing bad schedules, and punish them for losing games against good teams, you will continue to see fewer good games.  Michigan would never schedule Notre Dame again; you'd probably see them cancel the Texas and Oklahoma series; etc.  That benefits nobody and makes the sport less interesting.

And, finally, I'm tired of the "imagine the tables being turned" argument.  If Notre Dame can do more in 12 games, including a loss to Michigan, than UM can do in 13, then I'll tip my cap to them.  I certainly won't be saying "but Michigan won by 7 on our home field in the first game of the season; that proves that Michigan is better despite the fact that Notre Dame has steamrolled opponents since then and we've played nobody."

joeyb

November 15th, 2018 at 7:11 PM ^

Resume S&P+ is kind of a weird metric, and it's hard to define what it is telling you. It's not telling you strength of schedule, so you can't use it to say that Notre Dame's schedule is not solid. It is, however, using strength of schedule and margin of victory. If anything, it's saying that 13 teams would theoretically have a better margin of victory against their schedule. I have a few problems with this approach.

The first is that it doesn't seem to take into account wins and losses, e.g. a 21-point win and 7-point loss is not the same as two 7-point wins against the same two teams.

The second is that the difference between a 1-point win and a 7-point win is much greater than the difference between a 7-point win and a 14-point win, which is also much greater than the difference between a 14-point win and a 21-point win. He caps wins at 50 points, but he needs to do something like take the log or sqrt of the margin to prevent teams that run up the score from pushing teams that win games by close-to-comfortable margins down too far.

The third is that there doesn't seem to be a distinction made between which team those points came against. So, a 7-point win against a top 5 team should mean more than a 7-point win over a top 25 team.
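The second point can be sketched in a few lines. This is a minimal illustration of the square-root idea, not anything Resume S&P+ actually does: the 50-point cap is the one mentioned above, while the function name `damped_margin` and the choice of sqrt over log are assumptions made for the example.

```python
import math


def damped_margin(margin, cap=50):
    """Compress a scoring margin so each extra point counts for less.

    The margin is first clipped to +/- `cap`, then replaced by the signed
    square root of its size. Losses (negative margins) stay negative.
    """
    capped = max(-cap, min(cap, margin))
    return math.copysign(math.sqrt(abs(capped)), capped)


# Each step adds about a touchdown, but the credited gain keeps shrinking.
steps = [1, 7, 14, 21]
credits = [damped_margin(m) for m in steps]
gains = [later - earlier for earlier, later in zip(credits, credits[1:])]
print(gains)
```

With this shape, going from a 1-point win to a 7-point win earns more credit than going from 7 to 14, which in turn earns more than going from 14 to 21, and anything beyond the cap earns nothing extra.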

ND's second best win is Stanford, who is currently 25th in S&P+, and their third best is USC at 36. We have wins over 13, 17, and 32, although OSU is 8.

You are right that had Michigan played Arkansas, we likely wouldn't be having any conversation about this right now assuming that we won the rest of our games. I look at the schedule like the routine in gymnastics or figure skating. Each move/game adds potential to your score, but if you falter because you made it too difficult, then you get deducted points. Faltering once on an otherwise extremely difficult routine/schedule can still allow you to win, but a competitor that nails theirs with slightly less difficulty is still likely to pass you. Had we beat ND, but lost to OSU and gone 12-1 still, we might still be in. Had we beat Arkansas, but lost to OSU and gone 12-1, we are probably tied with Washington State.

If Michigan gets shut out of the playoff, the reason will be that the playoff is not sized properly. Brian has done a post in the past about the proper size of the playoff and decided 6 would be best most years. The fact that there is a good chance that multiple 12-1 P5 teams and a 13-0 G5 team are shut out of the playoff is a problem.

J.

November 15th, 2018 at 7:40 PM ^

The fact that you can make a sensible comparison to gymnastics should tell you everything you need to know about the playoff system.

6 isn't nearly big enough; it just pushes the arguments out a bit further.  16 minimum; 24 preferred, with every conference champion getting an automatic bid, just like the basketball tournament.

We got Michigan / Villanova last night because each team could gain a résumé-building win without sacrificing anything.  These games are happening less and less in college football because it's a race to the bottom.  Almost nobody is willing to schedule a game that can be lost, and there's no penalty to be paid.  Alabama scheduled Louisville, Arkansas State, Louisiana-Lafayette, and The Citadel, and there's absolutely no penalty for them.  Georgia has Austin Peay, Middle Tennessee, UMass, and Georgia Tech, and they'll still be in the playoff if they win the SEC.

Sadly, there won't be a bunch of 12-1 / 13-0 teams shut out of the playoff, because upsets always happen, and we use that to justify the current system.

The Maizer

November 15th, 2018 at 8:24 PM ^

I agree with most of your sentiment, but going to 24 teams also means those out of conference resume-building games aren't as meaningful. Villanova-Michigan is not even remotely as meaningful as Notre Dame-Michigan because the loser is not very affected. A 24 team tournament would be a blast, but the cost is the heart-wrenching losses and thrilling wins of the regular season, at least to an extent.

Creedence Tapes

November 16th, 2018 at 2:09 AM ^

I would really like it if they would just get rid of all non-conference games altogether, and instead have the top 32 teams after conference play make it to a playoff system. They could also figure out a way for the teams that don't qualify for a playoff spot to play 4 more games against other teams that also didn't qualify.

joeyb

November 15th, 2018 at 8:41 PM ^

I think that 16 is too many. The reason that college football is so fun is that every game is seemingly life or death for playoff hopes. Yes, this perpetuates the issue that you're illustrating, but a 2- or 3-loss team making the playoffs just means that a loss means nothing. The more teams that you add to the playoff, the less claim the last team in is going to have to being the best team in the country. It's like when the 18-0 Patriots lost to the Giants. Everyone knows that the Patriots were the better team (and more deserving), but had the Giants been 17-1 going into the game, the winner of that game might have actually been thought of as the better team. More games outside the playoff with a limited number of playoff games reduces variance and leads to a better claim for the champion.

I'm partial to an 8-team playoff with auto-bids to conference champions in the top 12 or 15 and fillers with the remaining best teams getting in. This year, that would mean Bama/Georgia, Clemson, ND, Michigan/OSU, Oklahoma/WVU, WSU, UCF, and probably a team that lost in a conference championship game (current top 6, 8, 11). Last year it would have been Clemson, Oklahoma, Georgia, OSU, USC, UCF, Bama, and Wisconsin (top 6, 8, 12).

I am actually hoping for Georgia, Oklahoma or WVU, Michigan, WSU, ND, Clemson, and UCF to win out so that as many deserving teams get left out of the playoff as possible, which would probably lead to an immediate increase in the size of the playoff.

All of this is kind of beside the point, though.

TrueBlue2003

November 16th, 2018 at 2:52 PM ^

Oh, I actually read your statement wrong and now that you clarified, it makes your take even worse.

I read this statement here: "if Michigan plays a significantly more impressive schedule.  S&P+ thinks they have -- Michigan is #3 in S&P+ Résumé and Notre Dame is 14."

As S&P+ strength of schedule, since that's the context of your claim: that Michigan has played a significantly more impressive schedule.

But as joeyb explained, S&P+ Resume isn't a SoS metric. Not at all.  It's margin-based. And it's a pretty bad margin-based metric at that, considering that it thinks ND is far worse than his actual predictor metric does, despite the goal of making it a combo resume/quality metric.

No one is denying that Michigan has performed better than ND when margin is taken into account and that Michigan would be favored on a neutral field if they played again today.  But that's not what the committee does, which is the point of this diary, and it's probably a good thing.

ND has absolutely, without question performed better than Michigan when Ws and Ls and who those Ws and Ls came against are considered.  i.e. they have a better resume.  They have a better record against a close enough SoS and hence are ahead of Michigan in every resume metric that only considers results and not scores (ESPN's SoR, Colley Matrix, etc).

TrueBlue2003

November 16th, 2018 at 3:11 PM ^

The irony of the fact that the SoR metric uses FPI to determine the odds of achieving a given W/L record against a given schedule is that a team would be better suited (in the context of the metric) to win its games by small margins rather than large margins because then that team's opponents would have better FPIs and would thus be considered more difficult to beat.

So if my matrix math is correct, you want your opponents to win by a lot, but for just this particular metric, winning by narrow margins can boost your SoR compared to winning by large margins.

Not sure if that's a significant effect though.

The Maizer

November 15th, 2018 at 2:30 PM ^

This is as thorough a description of the differences in rankings based on accomplishment (resume) and performance (predictor) as I have seen. Well done.

It is very clear the committee uses resume almost exclusively in their determinations. And while SOR has bragging rights about getting 15 out of 16 of the playoff teams so far correct, the down-the-ranks 5 through 25 positions are off. I think they're probably looking at metrics that are also nearly as simple as SOR though, because simple methodologies (LINK) can predict their rankings with surprising accuracy.

Because of this, the railing against the committee's corruption, bias, incompetence, hidden agenda, etc. is misplaced.

joeyb

November 15th, 2018 at 3:52 PM ^

Because of this, the railing against the committee's corruption, bias, incompetence, hidden agenda, etc. is misplaced.

This is what I'm getting at without trying to say it because the committee could be corrupt, biased, and incompetent still. If Michigan moves down in the rankings for a 1-loss SEC or ACC team in the coming weeks, then I could see arguments of bias still potentially being valid. Until then, the CFP rankings match pretty well with calculated results and there isn't much to justify a conspiracy theory at this point.

The Maizer

November 15th, 2018 at 4:36 PM ^

You think like a scientist. Providing empirical evidence for your argument (which is itself a call for objectivity) while maintaining the view that a lack of proof of the counterargument is not proof that said counterargument is invalid. Kudos.

You've convinced me to weaken my stance that there is no conspiracy.

Ecky Pting

November 15th, 2018 at 4:38 PM ^

For what it's worth, Syracuse (+9), Kentucky (-5), Boston College (+6), and Miss St (+10) are the biggest differences between CFP and SOR.

Looks like two ACC teams are significantly overrated, which reinforces Joel Klatt's take regarding the ACC - not to mention the ACC team with the largest disparity is playing ND this weekend, so one can easily extrapolate from this backward-looking metric what effect this might have on ND's SOR going forward.

I still believe that a bias is inherent in the system, a bias known as Footballing While Notre Dame.

The Maizer

November 15th, 2018 at 4:52 PM ^

Your argument has some merit, but remember that Klatt's point was that the ACC teams were over-ranked at the expense of the B1G teams. Those teams are now Wisconsin (44th in SOR), Purdue (47th in SOR), Iowa (36th in SOR), and MSU (20th in SOR). In this framework, MSU has a legitimate gripe, but they have 4 losses, so how mad can we be about that? Further, BC and Syracuse would still be top 25 (or one spot outside) teams if they were not over-ranked according to SOR. There are only 3 teams in the CFP rankings that are not top 25 SOR.

joeyb

November 15th, 2018 at 5:11 PM ^

Again, I'm not trying to say that there isn't bias, just that if you want to point out the bias it helps to use the proper tools. That's why I included that last bit.

Syracuse is in a weird position because they and LSU are the only 2-loss P5 teams and, while they don't have any wins over very good teams, their two losses are on the road to the two teams probably playing in the ACC championship game. I don't really have a problem with them where they are because it will be sorted out when they play ND and BC over the next two weeks. Assuming that Syracuse loses to ND and BC beats FSU, one of Syracuse and BC will be ranked in the 20-25 range, which is fine. That's not going to make a difference for Clemson if they lose a game to Duke, South Carolina, or Pittsburgh.

The one that I have a problem with is Mississippi State as their best two wins are likely to finish the season at 7-5.

TrueBlue2003

November 15th, 2018 at 5:43 PM ^

The good thing about ESPN's SoR metric (assuming that metric is available to the committee) is it determines the odds of winning against a given team based on FPI which is a better indicator of how difficult it was (or how likely it was) to win that game.

So at least for that metric, ND will get credit for beating the 36th most difficult team to beat (per FPI).  That they're ranked 21st in SoR doesn't mean they're the 21st most difficult team to beat to that algorithm.

But yes, to the humans in the room that apparently think Syracuse is the 12th best team, they may overrate the impressiveness of a potential ND win.

Other Andrew

November 16th, 2018 at 9:00 AM ^

Great post JoeyB. You have objectively pulled in a ton of useful details.

In my opinion, "deserve's got everything to do with it." Because regardless of rankings, eye test, game control, or whatever other criteria one prefers, all of that may end up being wrong when the teams actually play. None of it is an exact science, and all of it faces major sample size challenges. There are 129 teams in FBS, and they don't all play one another. Choosing the four best must inherently mean the most deserving because anything else is just a hypothetical based on shaky analysis.

It's complicated enough to value what "deserving" means, but at least it values actual results, not feelings or guesses.

joeyb

November 16th, 2018 at 10:02 AM ^

Thanks.

I used an analogy above comparing scheduling to a routine in gymnastics or figure skating. Another one that was rolling around in my head while I was creating my own resume metric was the idea of showing your work in math. Alabama might be the best team in the country, but scheduling and beating other top teams is showing your work. If Alabama played UCF's schedule, I don't think that Alabama would deserve to be in the CFP either, regardless of whether they beat each of those teams 100-0. So, to me, "deserving" is a measure showing how well you tested your team with your schedule and how you fared against it.

The Maizer

November 16th, 2018 at 10:29 AM ^

One problem with those analogies (especially the showing your math work one) is that you also can't objectively value how good an opponent is. In gymnastics, there is a known value for a particular... is trick the right word? Anyway, in cfb, the valuation of beating a "good" team is subject to all of the other subjectivity that plagues every cfb ranking method. So even in measuring "deservedness" based on resume, you still have a wishy washy method to decide how good defeated teams are when you "show your work."

FWIW, I think resume is a fine way to determine things because you can use at least consistent rules. But while the rules could be consistent, the result of those rules will be fluid and the rules themselves are arbitrary. A step in the right direction is good, but there's no perfect way to do this.

TrueBlue2003

November 16th, 2018 at 5:48 PM ^

I agree with half of this statement: "The argument that SEC or ACC teams are ranked too high seems to be largely based on a mismatch of philosophies."

As you noted, the SEC is largely rated as they should be according to "resume" but the ACC is still pretty significantly overrated even based on "resume" if you compare to SoR and other non-predictive resume ranks.

Syracuse is 12th in CFP but their SoR is just 21st (19th in Colley).  They don't have a very good resume.  Their best win is over....unranked NC State? They have one (!!) other win over any team with a winning record and that's 6-5 WMU which is going to end up 6-6.  They lost to unranked Pitt, which makes them 1-2 against P5 teams with a winning record.  That barely indicates top 25 let alone top 12.

Boston College is 20th in CFP but out of the top 25 in SoR and 35th in Colley.  They have literally zero wins over P5 teams with a winning record despite playing in the ACC, where every team has an inflated record because they only have 8 conference games.  They have losses to mediocre Purdue and NC State teams.  When you've played a terrible schedule, lost to two clearly not top 25 teams, and have defeated exactly zero top 25 teams, there is no rational argument that you should be in the top 25.

The committee has had NC State and UVA ranked in recent weeks with even worse cases to be ranked.

Klatt should have been referencing SoR in his rants because it still supports his true assertion that the ACC is wildly overrated by the committee.

maquih

November 17th, 2018 at 6:57 PM ^

Yeah, sorry you spent so much time on this, because you missed the point entirely of what Klatt is saying.  It has nothing to do with forward-looking versus backward-looking.  It has to do with preseason rankings biasing the current rankings. Computer rankings of all types that ignore preseason rankings all say that the ACC and SEC are overrated, even if they also all have Alabama and Clemson as 1 & 2.