How is the SF evaluation used? Is it the eval that appears highest in the game before a certain move?
The evaluation is usually highest near or at the last move played in the game, and often climbs during a winning game.
Or do you first check whether the game was won, and then find the point from which the evaluation never dropped below a certain threshold? For example, the winning side reached +0.5 at some point of the game and from then on never fell below +0.5 again?
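To make that second idea concrete, here is a rough Python sketch. The function name, the input format (one eval per move, in pawn units, from the winner's point of view) and the 0.5 default are my own assumptions for illustration; I don't know whether the graphs were actually built this way.

```python
from typing import Optional, Sequence


def sustained_advantage_start(evals: Sequence[float],
                              threshold: float = 0.5) -> Optional[int]:
    """Return the earliest index from which every eval stays >= threshold.

    `evals` is assumed to hold one evaluation per move, in pawn units,
    from the winning side's perspective (hypothetical input format).
    Returns None if the final eval is below the threshold or the list is empty.
    """
    start = None
    for i, value in enumerate(evals):
        if value >= threshold:
            # Begin a new streak only if we are not already inside one.
            if start is None:
                start = i
        else:
            # The eval dipped below the threshold, so any earlier streak is void.
            start = None
    return start


# Made-up example: the eval reaches +0.6 early, dips to +0.3,
# and only from index 3 onward stays at or above +0.5.
example = [0.1, 0.6, 0.3, 0.5, 0.9, 1.4, 3.0]
print(sustained_advantage_start(example))
```

The single pass resets the candidate whenever the eval dips, so a brief early spike that was later lost does not count as the point where the advantage became permanent.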
SF evals are often exaggerated from a human point of view. I often see SF give +1.5, but that only holds if the human follows a deep line, sometimes 20 or more moves, until the material advantage is actually on the board. It is also hard to say whether SF based its eval on that material gain 20 moves ahead, or whether even SF could not see it for sure and the eval was based on positional advantage.
Sometimes the eval suddenly rises by 2.0 or more pawn units, only to drop again if the single best move is missed. Often enough such a line isn't a pawn-winning tactic at all, and it is very difficult to understand, while other +2.0 evals are simply a tactic that wins a pawn. With SF, +1.0 doesn't really mean "a pawn up": if you really have a clean extra pawn on the board, the eval is at that point already much higher than +1.0, probably because SF already sees the future endgame 20 moves ahead.
Anyway, good work, interesting graphs.
A similar study evaluated the average error per move compared to the best move suggested by an old chess engine.
The less a player's moves deviated from the engine's best move, the stronger the GM was, and you could estimate Elo strength from the average error. However, that study used a much weaker engine than what is available nowadays, and the search depth was only 12 or 13 plies. It could have been the Fritz engine, I can't remember.
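As I understand it, the "average error" is just the mean gap between the eval after the engine's best move and the eval after the move actually played. A minimal sketch, assuming you already have both evals per move in centipawns from the mover's perspective (the function name and input format are hypothetical, not taken from that study):

```python
from typing import Sequence


def average_error(best_evals: Sequence[int],
                  played_evals: Sequence[int]) -> float:
    """Mean per-move loss in centipawns versus the engine's best move.

    Both sequences are assumed to be aligned move by move and expressed
    from the moving player's perspective (hypothetical input format).
    """
    if len(best_evals) != len(played_evals):
        raise ValueError("both sequences must have the same length")
    # A played move cannot be better than the engine's best at the same
    # depth, so negative gaps (search noise) are clamped to zero.
    losses = [max(0, best - played)
              for best, played in zip(best_evals, played_evals)]
    return sum(losses) / len(losses) if losses else 0.0


# Made-up numbers: one perfect move, one 40 cp slip, one 20 cp slip.
print(average_error([30, 50, 120], [30, 10, 100]))
```

Turning that average into an Elo estimate would need a fit against players of known rating, which I have left out because I don't know how the study did it.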
Such analysis made it possible to compare the strength of old grandmasters like Lasker or Capablanca with modern GMs.
This analysis came to the conclusion that Magnus Carlsen is indeed the strongest chess player that ever walked the earth. Which is not that surprising - I think that modern GMs are indeed stronger than the best chess players of 100 years ago, who didn't have chess engines and chess databases at their disposal.
Such an analysis eliminates Elo deflation/inflation (i.e. was Fischer's FIDE Elo worth the same as a FIDE Elo nowadays?). Just from the influx of new players during the Covid years we saw some deflation (a drop) in ratings at every level.
