I have always been a big sports fan. As a developer, I have also been intrigued by computer rankings and by how to remove the subjectivity from determining the best team.
My interest grew over the years, partly because until three years ago college football determined its champion by the winner of the BCS National Championship Game. That game was played by the top two teams in the BCS standings, which were determined by a mixture of human polls and computer polls. This made many analysts and sports fans cringe: how could a computer know who the best teams were? Beyond college football's use of computer polls, college basketball has long used the RPI and other computer rankings, such as KenPom and Sagarin, to help determine the teams that make the field of 64, now 68.
Over the past two years I have created various ranking/rating systems, both to help predict the outcome of a game and to determine the best team. It was through researching and dabbling in the statistical data that I came to really enjoy one rating system that is both simple and powerful in determining the outcome of a game: the Elo Rating system.
What is Elo?
The Elo Rating system was created in the 1960s by Arpad Elo, a Hungarian-born physics professor. As a master chess player, he was looking for a better way to rate and rank chess players. In 1970, FIDE, the World Chess Federation, adopted his ratings.
The Elo Rating is easy to calculate and simple to use, making it popular and less controversial than other rating systems. It rests on the assumption that while a player's results may vary from one game to the next, in the long run the player's rating only changes slightly. It takes two players and gives the winner some of the loser's points. The greater the difference in rating, the more points the lower-rated player takes if they win, and the fewer points the higher-rated player takes if they win. Because of this, it also gives an idea of the likelihood that one player will beat another.
There are two formulas we will look at for Elo - the win probability and the calculation of the new rating values for both opponents after a game.
Determine the probability one team beats another:
(we will look at two teams with the following ratings - Team A: 2109 and Team B: 1979)
# Team A's probability to win
team_a_win_prob = 1.0/(10.0**((team_b - team_a)/400.0) + 1.0)
# team_a_win_prob = 1.0/(10.0^((1979.0 - 2109.0)/400.0)+1.0) = .6788
# Team B's probability to win
team_b_win_prob = 1.0 - team_a_win_prob
# team_b_win_prob = 1.0-.6788 = .3212 or we can do
team_b_win_prob = 1.0/(10.0**((team_a - team_b)/400.0) + 1.0)
# team_b_win_prob = 1.0/(10.0^((2109.0 - 1979.0)/400.0)+1.0) = .3212
What this is saying is that for every 400 points two teams differ, the odds of winning change by a factor of 10. 400 is the standard scale used for finding the probability of a win.
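Wrapping that formula in a small helper makes it easier to reuse. Here is a minimal sketch in Python; the function name and the printed example values are my own, not part of any standard library:

# A reusable win probability helper (illustrative sketch).
def win_probability(team_a, team_b):
    """Probability that the team rated team_a beats the team rated team_b."""
    return 1.0 / (10.0 ** ((team_b - team_a) / 400.0) + 1.0)

print(round(win_probability(2109, 1979), 4))  # 0.6788
print(round(win_probability(1979, 2109), 4))  # 0.3212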
Determine the new rating for a team after a game:
(we continue using the following ratings - Team A: 2109 and Team B: 1979 and will give a K factor of 35)
# If Team A wins
team_a = team_a + k_factor*(1.0 - team_a_win_prob)
team_b = team_b + k_factor*(0.0 - team_b_win_prob)
# Team A's new rating is 2109.0 + 35.0*(1.0-.6788) = 2120
# Team B's new rating is 1979.0 + 35.0*(0.0-.3212) = 1968
# If Team B wins
team_a = team_a + k_factor*(0.0 - team_a_win_prob)
team_b = team_b + k_factor*(1.0 - team_b_win_prob)
# Team A's new rating is 2109.0 + 35.0*(0.0-.6788) = 2085
# Team B's new rating is 1979.0 + 35.0*(1.0-.3212) = 2003
The K factor is just a variable that affects the rate of change. The higher the K factor, the more a win or loss affects your rating and the more volatile the rating becomes. The lower the K factor, the less a win or loss affects your rating and the more stable the rating becomes. There is no standard K factor, and some institutions use staggered K factors based on the ratings of the teams involved.
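To make the update step concrete, here is a sketch in the same spirit, reusing the win_probability helper from above; the function name, the default K factor of 35, and the return style are my own choices for illustration:

# Rating update after a single game (illustrative sketch).
def update_ratings(team_a, team_b, a_won, k_factor=35.0):
    """Return the new (team_a, team_b) ratings after one game."""
    a_prob = win_probability(team_a, team_b)
    b_prob = 1.0 - a_prob
    a_score = 1.0 if a_won else 0.0   # 1 for a win, 0 for a loss
    b_score = 1.0 - a_score
    return (team_a + k_factor * (a_score - a_prob),
            team_b + k_factor * (b_score - b_prob))

print(update_ratings(2109, 1979, a_won=True))   # about (2120, 1968)
print(update_ratings(2109, 1979, a_won=False))  # about (2085, 2003)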
So how do the ratings play out over time?
Every player or team starts with the same rating. It does not matter what the starting point is, only that everyone coming into the rating system starts at the same value. I often see 1500 or 2000 used as the starting rating. After a few games, a new player's or team's rating evens out and their true rating starts to emerge.
The Elo Rating becomes more accurate the more games that are played. For sports like college football and basketball it becomes more accurate over the years, as ratings carry over from one year to the next. What does this mean?
# Take the base rating of 1500 and the previous season's rating for that team, let's say 2100.
# Take a percentage of the base and of the previous season's rating to determine the new year's starting rating.
previous_year_end_rating * 0.60 + base_rating * 0.40
# 2100.0 * .60 + 1500.0 * .40 = 1860
# So the starting rating for this year would be 1860.
You would then continue through that season's games, using the formulas above to update each team's rating after every game, as sketched below.
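Here is a rough end-to-end sketch under those assumptions, reusing the helpers defined earlier; the 60/40 split, base rating, team names, and game results are all made up for illustration:

# Season carry-over plus game-by-game updates (illustrative sketch).
BASE_RATING = 1500.0

def season_start_rating(previous_year_end_rating):
    """Blend last season's final rating with the base rating."""
    return previous_year_end_rating * 0.60 + BASE_RATING * 0.40

ratings = {"Team A": season_start_rating(2100.0),   # 1860
           "Team B": season_start_rating(1700.0)}   # 1620

# Each game is (winner, loser); made-up results for illustration.
games = [("Team A", "Team B"), ("Team B", "Team A")]

for winner, loser in games:
    ratings[winner], ratings[loser] = update_ratings(ratings[winner],
                                                     ratings[loser],
                                                     a_won=True)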
There is of course much more that goes into a rating system built on an Elo-like formula if you are trying to determine more than just the likelihood of a win. I hope, however, that this gives a good background on Elo and how to get started with your own ranking/rating system.