Chess Intrinsic Rating applied to Lichess

21 Jul 2022

I recently found an interesting model by K. W. Regan that can assign ratings to players based on their games and a chess engine's evaluations of them. I tried it on Lichess rapid and blitz games. Here are the results.

The original articles are pretty interesting reading (here and here). I decided to try to implement their model, and I would like to share some results with the community.

Method

I used Lichess rapid and blitz games from June 2022. From each rating band (1000-1100, 1100-1200, etc.) I took 3500 moves from the players' games (excluding openings and losing positions). Regan models each player with two coefficients, c and s. From these coefficients and the engine's evaluations of all possible moves, the model assigns a probability that the player will play each move. We can invert this: given the moves that were actually played, we can compute how likely it is that they come from a player with given coefficients c and s.
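To make the idea concrete, here is a minimal sketch of "probabilities from c and s, then a likelihood of the played moves". It is not Regan's exact equation and not the code used for this post: the weight `exp(-(delta/s)**c)` normalized over the legal moves is a simplified stand-in, and the names `move_probabilities`, `log_likelihood` and the input layout are assumptions made for illustration.

```python
import math

def move_probabilities(deltas, s, c):
    """Probability of each legal move in one position.

    deltas: engine evaluation loss of each move relative to the best
            move (>= 0, e.g. in pawns); the best move has delta 0.
    s, c:   player coefficients (simplified stand-in model, see lead-in).
    """
    weights = [math.exp(-((d / s) ** c)) for d in deltas]
    total = sum(weights)
    return [w / total for w in weights]

def log_likelihood(positions, s, c):
    """Log-likelihood that the played moves come from a player (s, c).

    positions: list of (deltas, played_index) pairs, one per position.
    """
    ll = 0.0
    for deltas, played in positions:
        p = move_probabilities(deltas, s, c)[played]
        ll += math.log(max(p, 1e-300))  # guard against log(0) on underflow
    return ll
```

In this sketch a smaller s concentrates more probability on the best move, which matches the trend described in this post that stronger players have a smaller s.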

Regan gives his fit for USCF ratings 1600-2700. Many of the Lichess rapid and blitz games are played by much weaker players than that, so I had to fit the model again. Unlike the original articles, I used a Bayesian likelihood in the space of the parameters c and s.
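A Bayesian fit over the c-s space can be sketched as a grid evaluation: compute the log-likelihood at every grid point and normalize. The flat prior, the grid approach and the name `grid_posterior` are assumptions for illustration; the post does not say which prior or numerical method was used. `log_lik` is any function returning the log-likelihood of the observed moves for given (s, c).

```python
import math

def grid_posterior(log_lik, s_values, c_values):
    """Posterior probability of each (s, c) grid point under a flat prior.

    log_lik: callable (s, c) -> log-likelihood of the observed moves.
    Returns a dict {(s, c): probability} that sums to 1.
    """
    logs = {(s, c): log_lik(s, c) for s in s_values for c in c_values}
    peak = max(logs.values())  # subtract the peak to avoid underflow in exp
    unnorm = {key: math.exp(value - peak) for key, value in logs.items()}
    total = sum(unnorm.values())
    return {key: value / total for key, value in unnorm.items()}
```

Subtracting the peak before exponentiating matters here because a log-likelihood summed over thousands of moves is a large negative number.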

The following plot shows where players with different Lichess blitz ratings most probably lie in the c-s space of the intrinsic rating model (based on a fit to 3500 moves). The better the player, the smaller both the c and s parameters are, but the relationship is not linear, particularly for the weak players.

The model seems to be quite sensitive to the c parameter, and it would probably need more games to fit it reliably. We can work around this instability by constraining the fit to a 1-dimensional subspace of the c-s space (the red line). The line is not a fit; rather, I chose the curve to visually pass through the maxima of the likelihoods in the 2D plot. With this constraint, we obtain 1-dimensional distributions for players of different strengths.

We can notice the general trend that better players have a smaller parameter s. The width of the distributions gives the margin of error when using 3500 moves. (Fewer moves would make the distributions wider and more uncertain.) By calibrating the values of s, we would get the intrinsic rating.
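The constrained fit and the width of the resulting distribution can be sketched as follows. The function `curve` stands for the hand-drawn red line, whose actual shape is not given in this post, so any curve passed in is hypothetical; the flat prior on s and the use of the standard deviation as the "width" are likewise assumptions.

```python
import math

def posterior_along_curve(log_lik, s_values, curve):
    """1D posterior over s, with c tied to s by c = curve(s) (flat prior)."""
    logs = [log_lik(s, curve(s)) for s in s_values]
    peak = max(logs)  # subtract the peak to avoid underflow in exp
    weights = [math.exp(value - peak) for value in logs]
    total = sum(weights)
    return [w / total for w in weights]

def mode_and_width(s_values, probs):
    """Most likely s and the standard deviation of the 1D distribution."""
    mode = s_values[probs.index(max(probs))]
    mean = sum(s * p for s, p in zip(s_values, probs))
    variance = sum((s - mean) ** 2 * p for s, p in zip(s_values, probs))
    return mode, math.sqrt(variance)
```

The mode is the value one would calibrate against rating, and the standard deviation plays the role of the margin of error: a likelihood built from fewer moves is flatter, so the same code returns a larger width.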

The Comparison

Instead of giving players a particular intrinsic rating number, I extracted the maxima of the 1D distributions from the previous plots for both blitz and rapid games. Games played with the same parameter s should be of equal quality of play.

Results

This comparison leads to some interesting conclusions!

It looks like the quality of play is the same at the same rating in blitz and rapid! So, for example, if a player with a rapid rating of 2000 had a 15+10 time control and played against blitz players, he should have a blitz rating of 2000. (Typically, though, he will have a lower blitz rating, as he makes more mistakes under the shorter time control.)
For the best rapid players, the parameter s drops far more than a linear trend would predict. The top rapid players are probably much better than their rating suggests, but their rating is lower than it should be because there is no one to "bounce off".
With 3500 moves, a player's rating can be recognized to within about ±100 points based on their games alone.
When I tried this on only 100 moves (typically 2 games), the player's intrinsic rating was uncertain by roughly ±600 points, so it is not so easy to recognize a player's strength from a single game (by this method).
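The last two numbers are consistent with a simple rule of thumb. If each move is treated as an independent observation, the statistical error should shrink roughly as one over the square root of the number of moves. That scaling is an assumption, not something derived in this post, and `scaled_error` is a hypothetical helper name, but it roughly reproduces the ±600 figure from the ±100 one:

```python
import math

def scaled_error(error_at_ref, n_ref, n_new):
    """Rescale a margin of error from n_ref moves to n_new moves,
    assuming the error is proportional to 1 / sqrt(number of moves)."""
    return error_at_ref * math.sqrt(n_ref / n_new)

# About +-100 points at 3500 moves, rescaled to 100 moves.
print(round(scaled_error(100, 3500, 100)))  # prints 592
```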
