Elo Ranking System

Hello, folks! This is Kaighn Kevlin here to introduce the new Elo rating system. Elo is a universal rating system for head-to-head competitive games where only the outcome of each game is recorded. It’s THE standard for chess ratings, and it’s used in many other settings: online games, other board games, even team sports. If you’d like to read more about it, click here.

Here’s a nice little explanation: everyone’s rating starts at 1000. The rating system models each player’s chance of winning against any other player. It does this using something called the logistic distribution, which is extremely similar to the normal distribution.

Let’s look at a game between player A and player B. Let’s say player A’s rating is Ra, and player B’s rating is Rb. For now, assume Player A is more highly rated.

Now, player A will probably win; that’s obvious, because he/she has the higher rating. But what is that chance, exactly? According to our Elo model, the chance of player A winning against player B is:

Ea = 1/(1+10^((Rb-Ra)/400)) (expected value for player A)

Note that if Ra = Rb, then this is 1/(1+10^0), which is ½, exactly what we want: two equally rated players each have a 50% chance of winning.

If Ra>Rb, then 1/(1+10^(negative #)), and 10^(negative #) is less than 1, so 1/(1+something less than 1) would be something in the range (.5,1).

If Ra<Rb, then 1/(1+10^(positive #)), and 10^(positive #) is more than 1, so 1/(1+something more than 1) would be something in the range (0,.5).
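Here’s the formula as a tiny Python sketch (the function name `expected_score` is mine), checking all three cases above:

```python
def expected_score(ra, rb):
    """Chance that the player rated `ra` beats the player rated `rb`."""
    return 1 / (1 + 10 ** ((rb - ra) / 400))

print(expected_score(1000, 1000))  # equal ratings: exactly 0.5
print(expected_score(1100, 1000))  # Ra > Rb: somewhere in (.5, 1)
print(expected_score(1000, 1100))  # Ra < Rb: somewhere in (0, .5)
```

Note that the two players’ expected values always sum to 1: someone has to win.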

What will the new ratings be after a game? If player A wins:

Ra’ = Ra + K(1-Ea)

If player A loses:

Ra’ = Ra + K(0-Ea), where K is some constant.

(1-Ea) and (0-Ea) are the difference between the actual result of the game and the expected value of the game. For instance, if Tyler has a 70% chance of beating you, and then he wins, the game result is a 1 for Tyler and a 0 for you. The expected value of the game for Tyler was .7, and the expected value for you was .3. So Tyler’s rating goes up by K*(1-.7) = K*.3, and your rating changes by K*(0-.3) = K*(-.3). As we’ve devised the ratings so far, the winner’s rating goes up by some amount, and the loser’s rating goes down by the same amount. Since a player’s rating can only change by a percentage multiplied by K, K is the absolute most a rating could move after one game. Assume K = 32 for now (we actually use three different K values depending on the player’s rating; more on this later).
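As a sketch in Python (the function name `new_rating` is mine), the update rule and Tyler’s example look like this:

```python
def new_rating(rating, expected, score, k=32):
    """Update a rating after one game. `score` is 1 for a win, 0 for a loss."""
    return rating + k * (score - expected)

# Tyler (a 70% favorite) wins: he gains K*.3 and you lose K*.3.
print(new_rating(1100, 0.7, 1))  # Tyler: about 1100 + 9.6
print(new_rating(1000, 0.3, 0))  # You:   about 1000 - 9.6
```

(The 1100 and 1000 starting ratings here are just made-up numbers for illustration.)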

So if an 1100 player plays a 1300 player, the chance of the 1100 player winning is the same as a 900 player’s chance against an 1100 player. Look at the formula: the 10 is raised to a power of the difference between the two players’ ratings divided by 400, so the difference between ratings is all that matters. The number 400 is a standard used by most organizations; the choice of that number doesn’t change anything other than the spread of the distribution. The number being 400 leads to the following interpretation:

A player playing someone that is x points below him/her has a win percentage of:

50 Less: 57%

100 Less: 64%

150 Less: 70%

200 Less: 76%

250 Less: 81%

300 Less: 85%

So if you beat an equally rated opponent, you get K*.5 = 16 points added to your rating. If you beat a higher rated opponent, you get K*(1-chance of winning) = K*(1-something less than .5) > 16 points. If you beat a lower rated opponent, you get K*(1-chance of winning) = K*(1-something bigger than .5) < 16 points. The exact amount is determined by your exact chance of winning; the above chart is a good rule of thumb. In the table, the chance of beating someone x points below you is Ea, and that player’s chance of beating you is 1-Ea. For instance, your chance of beating a player 50 points below you is 57%, so that player’s chance of beating you is 1-.57 = 43%.
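If you’re curious, here’s a short Python sketch that regenerates the chart above from the formula (since only the rating gap matters, I fix the weaker player at 0):

```python
def expected_score(ra, rb):
    """Chance that the player rated `ra` beats the player rated `rb`."""
    return 1 / (1 + 10 ** ((rb - ra) / 400))

# Win chance for the higher rated player at each rating gap in the chart.
for gap in (50, 100, 150, 200, 250, 300):
    print(f"{gap} Less: {expected_score(gap, 0):.0%}")
```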

Details about K: K is not 32 for everyone. If K were constant, this would be a zero-sum game, where if someone goes up by a certain number of points, someone else goes down by the same number. Instead, we want a system where the elite bumper pool players cannot just keep beating people and racking up more and more points (even though the number of points they get each game shrinks as their rating rises). We want elite players’ K values to be lower than weaker players’ K values. So if K=16 for elite players, they will have a harder time gaining points. Notice as well that a lower K value makes it harder to lose points too (this is okay because we assume that once a player becomes good, he/she doesn’t regress; even if he/she did, the system would eventually correct it). To this end, we currently have three tiers of K values:

K = 32 for players with ratings <1010

K = 24 for players with ratings >1010 and <1100

K = 16 for players with ratings 1100+
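As a Python sketch, the tiers might look like this (the function name `k_factor` is mine, and the tier list above doesn’t say which K a rating of exactly 1010 gets, so I’ve put it in the middle tier):

```python
def k_factor(rating):
    """Tiered K: elite players gain and lose points more slowly."""
    if rating < 1010:
        return 32
    elif rating < 1100:
        return 24   # assumption: a rating of exactly 1010 lands here
    else:
        return 16
```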

Note that, when determining your new rating after a game, only worry about your own K value. If you, a 950 player, play Tyler, a 1250 player, and Tyler wins, then:

Your K-value is K = 32 (for rating <1010)

Tyler's K-value is K = 16 (for rating 1100+)

Your expected win chance was 15%

Your rating goes down by 32*(0-.15) = 32*-.15 = -4.8

His rating goes up by 16*(1-.85) = 16*.15 = +2.4

Alternatively, if you win:

Your expected win chance was 15%

Your rating goes up by 32*(1-.15) = 32*.85 = +27.2

His rating goes down by 16*(0-.85) = 16*-.85 = -13.6
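Here’s the same worked example in Python, so you can check the arithmetic (the numbers come out the same after rounding; the 15% in the text is really about 15.1%):

```python
def expected_score(ra, rb):
    """Chance that the player rated `ra` beats the player rated `rb`."""
    return 1 / (1 + 10 ** ((rb - ra) / 400))

you, tyler = 950, 1250
ea = expected_score(you, tyler)    # your win chance, about 15%
k_you, k_tyler = 32, 16            # tiered K values from above

# If Tyler wins:
print(round(k_you * (0 - ea), 1))          # your change, about -4.8
print(round(k_tyler * (1 - (1 - ea)), 1))  # his change, about +2.4

# If you win:
print(round(k_you * (1 - ea), 1))          # your change, about +27.2
print(round(k_tyler * (0 - (1 - ea)), 1))  # his change, about -13.6
```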

Pros of the Elo system:

0) After a game, you could calculate your new rating with a phone calculator! This is nice and transparent.

1) Notice that your rating isn’t tied to how many games you play; it depends on your consistent recent performance. If you play 1000 games mediocrely, you’ll have a very bad rating, but your rating goes down by less and less every game (because as your rating falls, the system expects you to lose, so each loss costs very little). If you suddenly become Tyler-level, you’ll be beating much higher rated opponents, and you’ll quickly rise in rating.

2) Losing to Tyler won’t hurt you much. He’s rated much higher than all of us, so he’s heavily favored to win, and you’ll lose few points. Alternatively, if you can manage to beat a good opponent, you’ll get a nice boost. Losing to a bad player hurts you more, so try not to lose to Shaunak.

3) The theory of the Elo system is that everyone has a true rating, and over time the rating system will home in on everyone’s exact rating. If the system somehow underestimates your ability to win, then your performance will lead to an increase in rating, and the system will appropriately adjust. If the system overrates your chance of winning, then your performance on average will lead to a decrease in your rating. In this way, the system will eventually reach equilibrium, and everyone’s win chance against everyone else will be accurate (assuming that if you’re better than player B, and he’s better than player C, then you’re better than player C [this is called the transitivity property; it assumes the game isn’t like rock paper scissors, which it isn’t in this case]). Of course, everyone is constantly (slightly) improving, so it may never reach true equilibrium, but because these improvements are slight and gradual, the system does well.

An illustration of #3: Let’s say Ben has a 40% win chance against Jeremy. This corresponds to a rating difference of about 70 (Ben has 70 fewer points than Jeremy). If they played 100 games and Ben won 40 of them, then both Ben’s and Jeremy’s ratings would end up right where they started; their performance matched the model’s prediction. In math, summing the updates over all 100 games: Rb’ = Rb + K*(40-40) = Rb. Ben’s expected number of wins is 40, his observed number is 40, so his rating changes by K*(40-40) = 0. Similar result for Jeremy. Isn’t this cool?
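Here’s a quick (hypothetical) Python simulation of this idea: we assume Ben truly wins 40% of his games, start the two players about 70 points apart, and let the updates run. The starting ratings and the single shared K are my own choices for simplicity:

```python
import random

def expected_score(ra, rb):
    """Chance that the player rated `ra` beats the player rated `rb`."""
    return 1 / (1 + 10 ** ((rb - ra) / 400))

random.seed(0)
ben, jeremy = 1000, 1070   # roughly 70 apart, as in the example
K = 16                     # one shared K, to keep it simple
for _ in range(100):
    ea = expected_score(ben, jeremy)
    ben_wins = random.random() < 0.40   # Ben's assumed "true" 40% win chance
    ben    += K * (ben_wins - ea)
    jeremy += K * ((not ben_wins) - (1 - ea))

print(round(jeremy - ben))  # the gap wanders, but over many runs it averages near 70
```

Because both players share the same K here, every point Ben gains comes straight out of Jeremy’s rating, so their combined total never changes.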