@Toscani said [^](/forum/redirect/post/DgqK0XEG)
> One way to look at engine analysis is simply to measure divergence from the engine’s top recommendation rather than treating accuracy as a judgment of playing strength. If the engine’s suggested move does not match the move played, that can be counted as a non-engine move. White’s and Black’s non-engine moves can be counted separately, producing a ratio of non-engine moves to total moves played. This does not label moves as mistakes, inaccuracies, or blunders; it only measures how often a human choice differs from the engine’s first choice.
>
> Accuracy percentages should be interpreted with care. A chess game is played by two players, and the more one player deviates from engine recommendations, the more accurate the other player may appear by comparison. Complex positions tend to lower accuracy because there are fewer clearly winning moves, while simple or already winning positions often inflate accuracy because many moves are evaluated as acceptable. For this reason, accuracy numbers primarily reflect comparison to an engine’s evaluation model rather than human performance or understanding.
>
> Rather than relying on a single graphical user interface (GUI) and its specific evaluation algorithms, it can be useful to run analysis through a second GUI or engine. This helps reduce over-dependence on one evaluation model and highlights how different engines rank candidate moves. Moves can also be evaluated relative to PV1, PV2, PV3, and so on, with user-defined weighting if desired.
>
> As an example, I sometimes reanalyze games using the Lucas Chess GUI and count how many moves do not match the engine’s top choice. I analyzed a game [GameId "8lqjNL5h"] and White had 21 moves that differed from the engine’s recommendation out of 35 played (60%), while Black had 9 such moves out of 35 (about 26%). I made no distinction between inaccuracies, mistakes, or blunders.
>
> Lucas Chess GUI was showing this on my computer:
>
> [images not preserved]
>
> This does not indicate to me a perfect game. Accuracy algorithms attempt to provide perspective relative to an engine’s expectations, not a definitive measure of human play quality. Whether one finds these metrics useful or not depends on how they are interpreted and applied.
>
> Chess analysis tools and GUIs represent significant work by their developers. They offer structured perspectives on games, but like any analytical tool, they reflect the assumptions and limitations of the models they are built on. How much value they provide ultimately depends on how they are used.
Yes, engine accuracy is a metric that has its uses. The post is a warning to players not to misinterpret it.
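
For anyone who wants to try the counting approach described in the quoted post, here is a minimal sketch in Python. It is not how Lucas Chess or any other GUI computes its numbers. The function names `non_engine_ratio` and `pv_weighted_score`, the default weights, and the toy move lists are all assumptions made up for illustration. It assumes you have already exported, for each move played, the engine's ranked candidates (PV1, PV2, PV3) in the same notation as the played moves.

```python
from typing import Sequence


def non_engine_ratio(played: Sequence[str], engine_best: Sequence[str]) -> float:
    """Fraction of played moves that differ from the engine's first choice."""
    if len(played) != len(engine_best):
        raise ValueError("need one engine recommendation per played move")
    if not played:
        return 0.0
    misses = sum(1 for p, e in zip(played, engine_best) if p != e)
    return misses / len(played)


def pv_weighted_score(
    played: Sequence[str],
    engine_lines: Sequence[Sequence[str]],
    weights: Sequence[float] = (1.0, 0.5, 0.25),  # hypothetical weights for PV1..PV3
) -> float:
    """Average credit per move: weights[i] if the move matches PV(i+1), else 0."""
    if len(played) != len(engine_lines):
        raise ValueError("need one list of candidate moves per played move")
    if not played:
        return 0.0
    total = 0.0
    for move, candidates in zip(played, engine_lines):
        # Only the first len(weights) candidates can earn credit.
        for rank, candidate in enumerate(candidates[: len(weights)]):
            if move == candidate:
                total += weights[rank]
                break
    return total / len(played)


# Toy data (invented): one side's moves and the engine's top three choices each time.
played = ["e4", "Nf3", "Bc4", "d3"]
engine_lines = [
    ["e4", "d4", "c4"],
    ["Nf3", "Nc3", "d4"],
    ["Bb5", "Bc4", "d4"],
    ["d4", "c3", "O-O"],
]
engine_best = [line[0] for line in engine_lines]

toy_ratio = non_engine_ratio(played, engine_best)      # 2 of 4 moves differ from PV1
toy_score = pv_weighted_score(played, engine_lines)    # (1.0 + 1.0 + 0.5 + 0.0) / 4

# The counts reported in the quoted post, expressed as ratios.
white_ratio = 21 / 35
black_ratio = 9 / 35
print(f"White: {white_ratio:.1%} non-engine moves, Black: {black_ratio:.1%}")
```

Comparing moves as plain strings only works if both lists use the same notation (all SAN or all UCI). The weighted version gives partial credit for second and third choices, which is one way to read the "PV1, PV2, PV3 with user-defined weighting" idea, but the weights themselves are arbitrary.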