OpenBench Testing Framework

Finished
Author	Engine	Branch	Diff	Time control	Results	Notes
World	Reckless	weights_s2_same_as_s1	diff	N=25000	LLR: -2.38 (-2.25, 2.89) [0.00, 4.00] Games: 121982 W: 39904 L: 39555 D: 42523 Ptnml(0-2): 4020, 13849, 25114, 13778, 4230	S2{ 640: 1.0, 1536: 0.5, OB: 1.0}
Peregr	Reckless	more-raise-red	diff	8.0+0.08	LLR: -2.25 (-2.25, 2.89) [0.00, 4.00] Games: 18012 W: 4345 L: 4393 D: 9274 Ptnml(0-2): 70, 2194, 4518, 2162, 62
World	Reckless	datagen-openbench	diff	N=5000	Elo: 179.34 +- 10.42 (95%) [N=40000] Games: 4000 W: 2679 L: 780 D: 541 Ptnml(0-2): 42, 83, 642, 400, 833	Reckless 5k SN vs. Reckless 5k
World	Reckless	[Stockfish] openbench	diff	N=5000	Elo: -127.79 +- 16.86 (95%) [N=40000] Games: 1156 W: 247 L: 654 D: 255 Ptnml(0-2): 145, 194, 182, 37, 20	Reckless 5k SN vs. Stockfish 5k
World	Reckless	see-threshold-test	diff	40.0+0.40	LLR: -2.28 (-2.25, 2.89) [0.00, 4.00] Games: 17714 W: 4033 L: 4080 D: 9601 Ptnml(0-2): 11, 2083, 4713, 2042, 8
Peregr	Reckless	fp_static	diff	8.0+0.08	LLR: -2.34 (-2.25, 2.89) [0.00, 4.00] Games: 26316 W: 6365 L: 6390 D: 13561 Ptnml(0-2): 74, 2777, 7484, 2746, 77
Styxdo	Reckless	weights_s2_same_as_s1	diff	8.0+0.08	LLR: -2.33 (-2.25, 2.89) [0.00, 4.00] Games: 20566 W: 4989 L: 5033 D: 10544 Ptnml(0-2): 77, 2557, 5057, 2517, 75	S2{ 640: 1.0, 1536: 0.5, OB: 1.0}
Peregr	Integral	[Reckless] main	diff	40.0+0.40	Elo: 28.54 +- 4.92 (95%) [N=10000] Games: 4954 W: 1503 L: 1097 D: 2354 Ptnml(0-2): 14, 404, 1234, 812, 13	new PT after some gains
Peregr	Integral	[Reckless] main	diff	40.0+0.40	Elo: 38.72 +- 5.01 (95%) [N=10000] Games: 4730 W: 1513 L: 988 D: 2229 Ptnml(0-2): 15, 330, 1143, 869, 8	Old PT
Styxdo	Reckless	weights_s1_v25_s2_equal	diff	8.0+0.08	LLR: -2.26 (-2.25, 2.89) [0.00, 4.00] Games: 6152 W: 1475 L: 1558 D: 3119 Ptnml(0-2): 16, 792, 1549, 697, 22	Same S1 as v25, S2 equal weights for 640 and 1536
Styxdo	Reckless	weights_s1_v25_s2_equal	diff	N=25000	LLR: -2.33 (-2.25, 2.89) [0.00, 4.00] Games: 11852 W: 3838 L: 3934 D: 4080 Ptnml(0-2): 443, 1325, 2454, 1293, 411	Same S1, S2 same weight for 640 and 1536
World	Reckless	fp-null-move	diff	8.0+0.08	LLR: -2.28 (-2.25, 2.89) [0.00, 4.00] Games: 33692 W: 8158 L: 8160 D: 17374 Ptnml(0-2): 111, 4112, 8408, 4098, 117
World	Reckless	reset-optimizer-with-warmup	diff	8.0+0.08	LLR: -2.28 (-2.25, 2.89) [0.00, 4.00] Games: 22370 W: 5378 L: 5414 D: 11578 Ptnml(0-2): 78, 2725, 5622, 2675, 85
World	Reckless	see-threshold-test	diff	8.0+0.08	LLR: -2.34 (-2.25, 2.89) [0.00, 4.00] Games: 4934 W: 1156 L: 1248 D: 2530 Ptnml(0-2): 20, 651, 1212, 569, 15
World	Reckless	see-threshold	diff	40.0+0.40	Tuning 15 Parameters 7511/7500 Iterations 15022/15000 Games Played
World	Reckless	futility-value-mc	diff	8.0+0.08	LLR: -2.27 (-2.25, 2.89) [0.00, 4.00] Games: 7674 W: 1791 L: 1870 D: 4013 Ptnml(0-2): 24, 972, 1918, 905, 18
Peregr	Reckless	ergodice_stuff	diff	8.0+0.08	LLR: -2.36 (-2.25, 2.89) [0.00, 4.00] Games: 19496 W: 4641 L: 4689 D: 10166 Ptnml(0-2): 63, 2388, 4908, 2312, 77	check ergodice idea
World	Reckless	ergodice_stuff_fast_unsound	diff	8.0+0.08	LLR: -2.33 (-2.25, 2.89) [0.00, 4.00] Games: 7680 W: 1840 L: 1923 D: 3917 Ptnml(0-2): 36, 974, 1890, 917, 23
Peregr	Reckless	less-tt-cut	diff	8.0+0.08	LLR: -2.31 (-2.25, 2.89) [0.00, 4.00] Games: 32864 W: 7907 L: 7913 D: 17044 Ptnml(0-2): 132, 3915, 8319, 3959, 107	less ~14%
World	Reckless	lmr-iir	diff	8.0+0.08	LLR: -2.29 (-2.25, 2.89) [0.00, 4.00] Games: 16084 W: 3891 L: 3947 D: 8246 Ptnml(0-2): 79, 1946, 4027, 1932, 58
World	Reckless	lmp-static-eval	diff	8.0+0.08	LLR: 3.04 (-2.25, 2.89) [0.00, 4.00] Games: 22280 W: 5555 L: 5350 D: 11375 Ptnml(0-2): 89, 2592, 5570, 2803, 86
World	Reckless	conthist-spsa-results	diff	8.0+0.08	LLR: -2.34 (-2.25, 2.89) [0.00, 4.00] Games: 20550 W: 4890 L: 4934 D: 10726 Ptnml(0-2): 75, 2474, 5219, 2434, 73
World	Reckless	conthist-spsa	diff	8.0+0.08	Tuning 2 Parameters 3915/4000 Iterations 7830/8000 Games Played	what are the chances spsa works for 2 params?
Styxdo	Reckless	v25-f71908d1	diff	40.0+0.40	LLR: 2.89 (-2.25, 2.89) [0.00, 4.00] Games: 31488 W: 7463 L: 7250 D: 16775 Ptnml(0-2): 28, 3602, 8267, 3823, 24	S1 Higher weight to 640, S2 higher weight to 1536 (S1: 640=1, 1536 =0.5 S2: 640=0.5, 1536=1)
World	Reckless	reset-optimizer-with-warmup	diff	N=25000	LLR: -2.44 (-2.25, 2.89) [0.00, 4.00] Games: 3090 W: 952 L: 1091 D: 1047 Ptnml(0-2): 129, 375, 646, 296, 99	Replace loading the AdamW optimizer state into S2 with warmup batches
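
The Elo figures in the fixed-games rows above can be reproduced from the W/L/D counts with the standard logistic Elo formula. The sketch below is a minimal illustration, not OpenBench's own code; the function name `elo_from_wld` is hypothetical, and the error margin (the `+-` value) is not reproduced here because its exact computation is not shown in this listing.

```python
import math


def elo_from_wld(wins: int, losses: int, draws: int) -> float:
    """Logistic Elo difference implied by a win/loss/draw record.

    Assumption: a draw counts as half a point, and the score fraction
    is strictly between 0 and 1 (otherwise the logarithm is undefined).
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return -400.0 * math.log10(1.0 / score - 1.0)


# W/L/D counts copied from three rows of the table above.
print(round(elo_from_wld(2679, 780, 541), 2))    # datagen-openbench row
print(round(elo_from_wld(247, 654, 255), 2))     # [Stockfish] openbench row
print(round(elo_from_wld(1503, 1097, 2354), 2))  # Integral "new PT" row
```

The printed values can be compared against the `Elo:` field of the corresponding rows. The SPRT rows report an LLR instead of an Elo estimate, so this formula only gives a rough point estimate for them.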
