@EVmaximiser said:
> 7n/6n1/5n2/4N3/8/1k1KB2B/8/8 b - - 0 40 - white needs exactly 50 moves to win (you need to disable the 50 moves rule and you must have the KBNvKN Syzygy file and possibly others), but Stockfish will show a low evaluation for a very long time. (Source: https://drive.google.com/drive/folders/1hhqMEklD5D-GDIP5DIUGrfPuGCkwXDTI.)
>
> And you don't need to go to 8-man endings. There is even this 5-man ending which is won for white (not even cursed), but Stockfish evaluates it as 0.00 up to a triple digit depth with hours of analysis if you turn off tablebases: 8/8/8/8/5N2/1B6/2K2n2/k7 b - - 0 2.
>
> Another example, from the op1 set: 6k1/6b1/5p2/4nP2/8/8/4P3/3QK3 w - - 0 1: The position is a fortress, but Stockfish evaluates it as 1.51 after 2-15 minutes of analysis. Leela gets this one right, but not the other two examples.
Thanks for these - cool stuff!
The first one is a genuine counterexample, although it's not a position that would ever occur in a game. The second position my setup would evaluate correctly thanks to TBs being installed on my local machine, but I agree that it demonstrates that pure SF can be fooled (albeit rarely). Same with the third position - my human eyes would immediately be suspicious of all the engine lines having the same flat evaluation, and I would suspect a fortress, which is also simple to confirm by eyeballing, but yes it does fool SF.
If you can find positions from real games that can fool my complete centaur setup of strong human + SF + tablebases, I'll be really impressed. I doubt there are more than a tiny handful of them.
Thanks again!
@OjaiJoao said:
> @EVmaximiser said:
> > 7n/6n1/5n2/4N3/8/1k1KB2B/8/8 b - - 0 40 - white needs exactly 50 moves to win (you need to disable the 50 moves rule and you must have the KBNvKN Syzygy file and possibly others), but Stockfish will show a low evaluation for a very long time. (Source: https://drive.google.com/drive/folders/1hhqMEklD5D-GDIP5DIUGrfPuGCkwXDTI.)
> >
> > And you don't need to go to 8-man endings. There is even this 5-man ending which is won for white (not even cursed), but Stockfish evaluates it as 0.00 up to a triple digit depth with hours of analysis if you turn off tablebases: 8/8/8/8/5N2/1B6/2K2n2/k7 b - - 0 2.
> >
> > Another example, from the op1 set: 6k1/6b1/5p2/4nP2/8/8/4P3/3QK3 w - - 0 1: The position is a fortress, but Stockfish evaluates it as 1.51 after 2-15 minutes of analysis. Leela gets this one right, but not the other two examples.
>
> Thanks for these - cool stuff!
>
> The first one is a genuine counterexample, although it's not a position that would ever occur in a game. The second position my setup would evaluate correctly thanks to TBs being installed on my local machine, but I agree that it demonstrates that pure SF can be fooled (albeit rarely). Same with the third position - my human eyes would immediately be suspicious of all the engine lines having the same flat evaluation, and I would suspect a fortress, which is also simple to confirm by eyeballing, but yes it does fool SF.
>
> If you can find positions from real games that can fool my complete centaur setup of strong human + SF + tablebases, I'll be really impressed. I doubt there are more than a tiny handful of them.
>
> Thanks again!
There is this endgame from a private Chess 324 (double FRC with the usual kings' and rooks' placements) engine game in 2023: 5k2/5b2/5p2/1K3P2/3bB3/7R/8/8 b - - 0 162. This is a cursed win, which I suspected at the time (Stockfish won when repeatedly resetting the counter every 20-30 moves), but a draw even under current ICCF rules. However, if you play on until 50 moves remain, you get this position: 8/4K1k1/5p2/3b1P2/8/3R4/2B5/b7 b - - 0 184. Stockfish evaluates this as 0.00 at depth 124 after 25 GN of analysis. I don't have all the relevant 7-man tablebase files locally, but in this case it seems safe to assume that the win does not go through e.g. a cursed KRBvKBBP or KRBPvKBB without the cursed part being exclusively in 6-man positions or smaller (which I have in full).
I suspect you will find more examples from the Maximum DTC list at https://op1-tables.info/.
Of course, Stockfish will still find almost all the correct moves (finding exceptions here would be another challenge), even when its evaluation is wrong. A 7-man exception (assuming no tablebases) that builds directly on the above 5-man one is 8/8/6n1/8/5P2/1B6/2K1Nn2/k7 b - - 0 1, where my SF wants Nxf4?? (0.00). It has no notion that other 0.00 lines are more likely to be drawn. Other examples might be found in [a set of positions](https://pastebin.com/v2kJWtdC) where TCEC's administrator Aloril found deviations between tablebases and Stockfish's move preferences. Granted, Stockfish is a lot better now.
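For anyone checking which tablebase a given FEN falls under, here is a minimal Python sketch that counts the men and builds a material key such as KBNvKN. The names `man_count` and `material_key` are made up for this illustration, and the key always lists White first, which is an assumption of the sketch and may not match how real tablebase files are named.

```python
# Sketch only: helper names are hypothetical, and the key lists White first
# (an assumption; real tablebase file naming may order the sides differently).
PIECE_ORDER = "KQRBNP"  # conventional ordering of piece letters within a side


def split_material(fen: str):
    """Return (white_pieces, black_pieces) as lists of uppercase piece letters."""
    board = fen.split()[0]  # first FEN field is the piece placement
    white = [c for c in board if c.isupper()]
    black = [c.upper() for c in board if c.islower()]
    return white, black


def man_count(fen: str) -> int:
    """Total number of pieces on the board, kings included."""
    white, black = split_material(fen)
    return len(white) + len(black)


def material_key(fen: str) -> str:
    """Build a key like 'KBNvKN' from the piece placement field."""
    white, black = split_material(fen)

    def ordered(side):
        return "".join(sorted(side, key=PIECE_ORDER.index))

    return f"{ordered(white)}v{ordered(black)}"


five_man = "8/8/8/8/5N2/1B6/2K2n2/k7 b - - 0 2"
seven_man = "8/8/6n1/8/5P2/1B6/2K1Nn2/k7 b - - 0 1"

print(man_count(five_man), material_key(five_man))
print(man_count(seven_man), material_key(seven_man))
```

The 7-man position differs from the 5-man one only by the extra white pawn and black knight, which is why Nxf4 can transpose into the 5-man ending.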
Great, then I wonder what's next. Maybe we could add DTZ50 to the op1 8-man tablebase, or introduce new types of 8-man tablebase in which both sides have a bishop, knight, rook, or queen (oB1, oN1, oR1 and oQ1 tablebases)? I think these positions also come up often in endgames.
@OjaiJoao said:
> There are almost no 8-man positions for which the WDL evaluation is not already strongly suggested by the engine's numerical evaluation according to the model:
In the vast majority of positions, the engine will get it right. It's still valuable to have TBs to get perfect evaluation in corner cases, but I think the most interesting application would be using a deterministic engine search (single-thread, fixed depth, fixed version) as a predictor and only storing TB positions where the engine fails to provide a correct move or evaluation.
That would essentially be a compression scheme for the TB that sacrifices probing efficiency (running even a low-depth search would be computationally costly) in exchange for much lower storage and RAM requirements.
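A rough Python sketch of that idea, under loud assumptions: `toy_predictor` is just a material count standing in for the deterministic fixed-depth engine search, all function names are hypothetical, and the WDL labels for the two thread positions are taken from the claims made earlier in this thread rather than from a tablebase probe. The KQvK position is an extra illustration of my own.

```python
from typing import Callable, Dict, Iterable, Tuple

# WDL from the side to move's point of view: +1 win, 0 draw, -1 loss.
VALUES = {"q": 9, "r": 5, "b": 3, "n": 3, "p": 1, "k": 0}


def toy_predictor(fen: str) -> int:
    """Stand-in for a deterministic engine search: guess WDL from material."""
    fields = fen.split()
    board, side = fields[0], fields[1]
    balance = 0  # material balance from White's point of view
    for c in board:
        if c.isalpha():
            balance += VALUES[c.lower()] if c.isupper() else -VALUES[c.lower()]
    if side == "b":
        balance = -balance  # convert to the side to move's point of view
    return 1 if balance > 2 else -1 if balance < -2 else 0


def build_exceptions(labelled: Iterable[Tuple[str, int]],
                     predictor: Callable[[str], int]) -> Dict[str, int]:
    """Keep only the positions where the predictor disagrees with the truth."""
    return {fen: wdl for fen, wdl in labelled if predictor(fen) != wdl}


def probe(fen: str, predictor: Callable[[str], int],
          exceptions: Dict[str, int]) -> int:
    """Stored exception if there is one, otherwise trust the predictor."""
    return exceptions.get(fen, predictor(fen))


labelled = [
    ("8/8/8/8/8/8/1Q6/K6k w - - 0 1", 1),                   # KQvK, White to move
    ("8/8/8/8/5N2/1B6/2K2n2/k7 b - - 0 2", -1),             # thread: won for White
    ("6k1/6b1/5p2/4nP2/8/8/4P3/3QK3 w - - 0 1", 0),         # thread: fortress
]

exceptions = build_exceptions(labelled, toy_predictor)
print(len(exceptions), "of", len(labelled), "positions need to be stored")
```

The predictor has to be bit-for-bit reproducible for this to work, which is why the single-thread, fixed-depth, fixed-version constraint matters: any change in the predictor's output invalidates the stored exception set.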
@EVmaximiser said:
>
> There is this endgame from a private Chess 324 (double FRC with the usual kings' and rooks' placements) engine game in 2023: 5k2/5b2/5p2/1K3P2/3bB3/7R/8/8 b - - 0 162. This is a cursed win, which I suspected at the time (Stockfish won when repeatedly resetting the counter every 20-30 moves), but a draw even under current ICCF rules. However, if you play on until 50 moves remain, you get this position: 8/4K1k1/5p2/3b1P2/8/3R4/2B5/b7 b - - 0 184. Stockfish evaluates this 0.00 at depth 124 after 25 GN of analysis. I don't have all the relevant 7-man tablebase files locally, but in this case it seems like a safe assumption that win does not go through e.g. a cursed KRBvKBBP or KRBPvKBB without the cursed part being exclusively in 6-man positions or smaller (which I have in full).
>
> I suspect you will find more examples from the Maximum DTC list at https://op1-tables.info/.
Thanks, that engine-game position is cool and yes definitely a counterexample.
What's your estimated percentage of 8-man positions that fool SF? 1 in 10k, 100k, 1M?
@OjaiJoao said:
> @EVmaximiser said:
> >
> > There is this endgame from a private Chess 324 (double FRC with the usual kings' and rooks' placements) engine game in 2023: 5k2/5b2/5p2/1K3P2/3bB3/7R/8/8 b - - 0 162. This is a cursed win, which I suspected at the time (Stockfish won when repeatedly resetting the counter every 20-30 moves), but a draw even under current ICCF rules. However, if you play on until 50 moves remain, you get this position: 8/4K1k1/5p2/3b1P2/8/3R4/2B5/b7 b - - 0 184. Stockfish evaluates this 0.00 at depth 124 after 25 GN of analysis. I don't have all the relevant 7-man tablebase files locally, but in this case it seems like a safe assumption that win does not go through e.g. a cursed KRBvKBBP or KRBPvKBB without the cursed part being exclusively in 6-man positions or smaller (which I have in full).
> >
> > I suspect you will find more examples from the Maximum DTC list at https://op1-tables.info/.
>
> Thanks, that engine-game position is cool and yes definitely a counterexample.
>
> What's your estimated percentage of 8-man positions that fool SF? 1 in 10k, 100k, 1M?
Sorry, I realized that DTC is given in moves, while DTZ is given in plies, so it is not really a counterexample in the extreme way I thought. Stockfish gives 0.64 after five minutes in [this](https://op1-tables.info/?fen=2K2k2/8/2B2p2/b4P2/8/1R6/4b3/8_b_-_-_0_234) DTC = 50 position and 1.32 after fifteen minutes (8.2 GN/s). It goes above your 1.50 after 30 minutes (depth 61 and seldepth 146). The only potential problem is that it needs extreme accuracy to avoid the 50-move rule, while with a more relaxed rule, Stockfish would win more comfortably.
As for the percentage, I have no idea, but it is probably extremely low, since the vast majority of positions in the full 8-man set have an obvious result.
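To make the unit mix-up concrete, here is a small Python sketch. The function names are hypothetical, `2 * moves` is used as an upper bound on the ply count (the exact figure depends on who is to move), and the sketch ignores the difference between conversion and zeroing moves.

```python
# Sketch only: names are hypothetical, and 2 * moves is an upper bound on plies.
FIFTY_MOVE_LIMIT_PLIES = 100  # 50 moves by each side


def moves_to_plies(moves: int) -> int:
    """Convert a distance given in full moves to plies (upper bound)."""
    return 2 * moves


def fits_fifty_move_rule(plies: int, halfmove_clock: int = 0) -> bool:
    """True if the distance, plus the clock already run, stays within 100 plies."""
    return plies + halfmove_clock <= FIFTY_MOVE_LIMIT_PLIES


# The same number means very different things depending on the unit:
print(fits_fifty_move_rule(100))                  # 100 read as plies
print(fits_fifty_move_rule(moves_to_plies(100)))  # 100 read as moves
print(fits_fifty_move_rule(moves_to_plies(50)))   # 50 moves, right on the limit
```

A distance of 50 moves sits exactly on the limit, which is why the DTC = 50 position leaves no room for inaccuracy.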
The post heading is inaccurate. It should read: Lichess can now tell if the position is a win, loss, or draw BASED ON PERFECT PLAY.
We shouldn't assume that a human in any of these positions will always make the perfect move. If we could assume that, then chess wouldn't be any fun at all.
Are the authors just assuming that the position is based on the play of two super engines? I really hope not; if that's the case, Bobby Fischer may have been correct about the future of chess.
@vegan_cannibal said:
> The post heading is inaccurate. It should read: Lichess can now tell if the position is a win, loss, or draw BASED ON PERFECT PLAY.
> We shouldn't assume that a human in any of these positions will always make the perfect move. If we could assume that, then chess wouldn't be any fun at all.
> Are the authors just assuming that the position is based on the play of two super engines? I really hope not; if that's the case, Bobby Fischer may have been correct about the future of chess.
By that logic, the very existence of chess engines is flawed, and yet almost everyone uses them to analyse games and do prep work. The tablebase is no different: a tablebase that only showed sub-optimal lines would be flawed and in most cases useless.
@EVmaximiser said:
> There is this endgame from a private Chess 324 (double FRC with the usual kings' and rooks' placements) engine game in 2023
By the way, thanks for introducing me to Chess324! Simple and brilliant, can't believe I've never heard of it before. All the advantages of 960 with none of the disadvantages. Unlike 960, I'd actually play 324!