Predicting how the College Football Playoff Selection Committee will rank teams each week is a classic supervised machine learning problem. Heading into the 2016 season we have two years (13 weeks) of examples that show how the committee evaluates teams. This allows us to fit a model of the process by which they make decisions. We know what each team looked like at each point in the season (our inputs) and we know how the committee ultimately ranked those teams each week (our desired output). Machine learning allows us to establish relationships between these inputs and outputs, and ultimately feed in new data and predict the committee’s rankings given these new inputs.
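To make the inputs-to-outputs idea concrete, here is a minimal sketch of the supervised setup. It is not CFPMachine's actual model: it assumes a single hypothetical input (a strength-of-record style score), invented training numbers, and a plain least-squares line in place of whatever learner the site really uses. The names `fit_line` and `predict_ranking` are made up for illustration.

```python
# Hypothetical training examples: (strength-of-record score, committee rank).
# All numbers here are invented for illustration, not real committee data.
history = [(95.0, 1), (90.0, 3), (88.0, 4), (80.0, 8), (70.0, 13), (60.0, 18)]


def fit_line(pairs):
    """Ordinary least squares for rank = intercept + slope * score."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict_ranking(teams, intercept, slope):
    """Predict a rank value per team, then order teams from best to worst."""
    predicted = {name: intercept + slope * score for name, score in teams.items()}
    return sorted(predicted, key=predicted.get)


# "Training": learn the relationship from past weeks.
intercept, slope = fit_line(history)

# "Prediction": feed in new (again hypothetical) inputs for the current week.
this_week = {"Team A": 85.0, "Team B": 92.0, "Team C": 75.0}
ranking = predict_ranking(this_week, intercept, slope)
print(ranking)
```

A real version would use many inputs per team and a full season of weekly snapshots, but the shape is the same: learn from past weeks, then order teams by the model's output.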
In the first half of the season, when the committee does not release rankings, we are predicting how the committee WOULD have ranked teams that week. In the second half of the season, we are predicting what they WILL do. Note that we are not attempting to say what the committee SHOULD do (in fact, we think some of its criteria are illogical).
What criteria does the model consider?
CFPMachine uses both measures of on-the-field performance (e.g. ESPN’s Football Power Index, Game Control, and Strength of Record) as well as off-the-field information, some of which seeks to capture potential biases of the committee. Examples include average kickoff time (are West Coast teams disadvantaged because the committee isn’t staying up until 1 AM EST to watch their games?), program prestige, conference strength, momentum (does the committee have a recency bias?), and preseason rankings (is the committee too slow in updating its priors in the face of new information?).
How can you predict the committee's behavior on only two years of data?
The model should get better with time, as we get more data and better understand how the committee evaluates teams and makes decisions. But even with only two years of data, it is pretty clear what the committee values, and our predictions in 2015 were fairly accurate (see the next question).
How accurate has the model been?
Predictions for the 2015 season (weeks 9 through 14) had a mean absolute error of 1.68 and a root mean square error of 2.44. In other words, any given prediction was off by about two spots on average compared to the committee's actual ranking (e.g. we predicted Oklahoma State to be ranked 12th in the final week of 2015 -- they were actually ranked 16th, so the error for that prediction is 4).
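For anyone who wants to see how those two error numbers are computed, here is a small sketch. Only the first pair (predicted 12th, actual 16th) comes from the Oklahoma State example above; the other pairs are invented so the arithmetic has something to work with, and the function names are our own.

```python
import math


def mean_absolute_error(predicted, actual):
    """Average size of the miss, in ranking spots."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def root_mean_square_error(predicted, actual):
    """Like MAE, but squaring each miss first makes big misses count more."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


# First pair: the Oklahoma State example (predicted 12, actual 16).
# Remaining pairs are hypothetical, for illustration only.
predicted = [12, 1, 5, 9]
actual = [16, 1, 4, 11]

mae = mean_absolute_error(predicted, actual)      # misses of 4, 0, 1, 2
rmse = root_mean_square_error(predicted, actual)
print(f"MAE = {mae:.2f}, RMSE = {rmse:.2f}")
```

RMSE is never smaller than MAE on the same predictions, and the gap widens when a few predictions miss badly, which is why both numbers are worth reporting.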
You have two teams from the same conference in the top 4 this week -- not gonna happen...
These are the rankings as they stand TODAY. Obviously some conference opponents have yet to play each other, and those games will sort out the rankings on the field. Also note that in its first-ever rankings the committee had THREE SEC teams in the top 4 (Mississippi State, Auburn, Ole Miss)!
More info to come...
Contact: cfpmachine [at] gmail . com