AI Models
List
Note:
- Normal users have user level 1.
- Temporary IP-based users have user level 0.
| Name | Type | Trained Date | Required Minimum User Level |
|---|---|---|---|
| baseline | 4-Player | 2023-12-17 | (removed) |
| aggressive | 4-Player | 2024-01-02 | (removed) |
| defensive | 4-Player | 2024-02-14 | (removed) |
| experimental-v0 | 4-Player | 2024-04-19 | (removed) |
| experimental-v1 | 4-Player | 2024-04-13 | (removed) |
| experimental-v2 | 4-Player | 2024-04-26 | 1 |
| canary-v1 | 4-Player | 2024-06-09 | 1 |
| finetuned-o1 | 4-Player | 2024-11-13 | (removed) |
| finetuned-a1 | 4-Player | 2024-11-15 | 2 |
| finetuned-b1 | 4-Player | 2024-11-18 | 2 |
| finetuned-d1 | 4-Player | 2024-11-22 | 2 |
| finetuned-r1 | 4-Player | 2024-12-01 | 2 |
| finetuned-r2 | 4-Player | 2024-12-02 | 2 |
| finetuned-r3 | 4-Player | 2024-12-02 | 2 |
| finetuned-s1 | 4-Player | 2025-03-16 | 2 |
| finetuned-s2 | 4-Player | 2025-03-18 | 2 |
| 3p-zero | 3-Player | 2025-01-20 | (removed) |
| 3p-alpha-1 | 3-Player | 2025-01-30 | (removed) |
| 3p-alpha-2 | 3-Player | 2025-02-02 | (removed) |
| 3p-beta-1 | 3-Player | 2025-02-13 | (removed) |
| 3p-beta-2 | 3-Player | 2025-03-01 | 1 |
| 3p-beta-3 | 3-Player | 2025-03-03 | 2 |
| 3p-beta-4 | 3-Player | 2025-03-04 | 2 |
| 3p-cross-1 | 3-Player | 2025-03-02 | 2 |
| 3p-cross-2 | 3-Player | 2025-03-22 | 2 |
| 3p-cross-3 | 3-Player | 2025-03-22 | 2 |
| medium | 4-Player | 2024-12-20 | 1 |
| mini | 4-Player | 2024-12-23 | 0 |
Specs
Note:
- All models have proprietary architectures except those with the Mortal architecture.
- Fewer training data rows do not necessarily mean that a model is weaker; instead, they indicate that the training method was improved or specialized.
| Name | Architecture | Training Data Rows | Inference Cost |
|---|---|---|---|
| baseline | Mortal v4 | 750 million | 0.06 |
| aggressive | Mortal v4 | 1.25 billion | 0.06 |
| defensive | Mortal v4 | 1.10 billion | 0.06 |
| experimental-v0 | Mortal v4 (modified) | 500 million | 0.06 |
| experimental-v1 | OG O | 1.00 billion | 0.05 |
| experimental-v2 | OG A | 700 million | 0.05 |
| canary-v1 | RR v0 | 250 million | 0.05 |
| finetuned-o1 | RF v0 | 5.25 billion | 0.05 |
| finetuned-a1 | RF v1 | 185 million | 0.05 |
| finetuned-b1 | RF v1 | 370 million | 0.05 |
| finetuned-d1 | RF v1 | 615 million | 0.05 |
| finetuned-r1 | RF v2 | 490 million | 0.05 |
| finetuned-r2 | RF v2 | 590 million | 0.05 |
| finetuned-r3 | RF v2 | 550 million | 0.05 |
| finetuned-s1 | RF v4 | 1.15 billion | 0.05 |
| finetuned-s2 | RF v4 | 1.35 billion | 0.05 |
| 3p-zero | RF3 v0 | 2.05 billion | 0.05 |
| 3p-alpha-1 | RF3 v3 | 615 million | 0.05 |
| 3p-alpha-2 | RF3 v3 | 675 million | 0.05 |
| 3p-beta-1 | RF3 v4 | 245 million | 0.05 |
| 3p-beta-2 | RF3 v4 | 575 million | 0.05 |
| 3p-beta-3 | RF3 v4 | 860 million | 0.05 |
| 3p-beta-4 | RF3 v4 | 940 million | 0.05 |
| 3p-cross-1 | RF3 v4 | 735 million | 0.05 |
| 3p-cross-2 | RF3 v4 | 980 million | 0.05 |
| 3p-cross-3 | RF3 v4 | 1.00 billion | 0.05 |
| medium | Lite I v1 | 160 million | 0.03 |
| mini | SuperLite I v1 | 185 million | 0.01 |
Performance
Note:
- The tests below follow the same design as Mortal's: 1 challenger model plays against 3 champion models (or 2 champion models for 3-player tests), and each randomly generated hanchan is repeated 4 times (or 3 times for 3-player models) with all combinations of player IDs to reduce the effect of luck.
- All statistics are from the perspective of the challenger model.
- Average pt is calculated with the rank point distributions \([90,45,0,-135]\) for 4-player models and \([135,0,-135]\) for 3-player models.
- A challenger model having better results against a specific champion model does not necessarily mean that the challenger model is stronger. It may simply be better at exploiting that champion model, at the cost of being exploitable in other ways. Good results against multiple (ideally independent) models are what to look for, as they indicate that a model is generally robust against exploitation.
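The repeated-hanchan design described in the notes above can be sketched as follows. This is a minimal illustration, not the actual test harness: the function name `seatings` is hypothetical, and it assumes that "all combinations of player IDs" means the challenger occupies each seat exactly once while the remaining seats are filled by the champion model.

```python
def seatings(num_players):
    """Return one seating per seat the challenger can occupy.

    Assumption (not confirmed by the source): every other seat is
    filled by the same champion model, so only the challenger's seat
    distinguishes one arrangement from another.
    """
    arrangements = []
    for challenger_seat in range(num_players):
        seats = ["champion"] * num_players
        seats[challenger_seat] = "challenger"
        arrangements.append(seats)
    return arrangements


# A 4-player hanchan is replayed 4 times, a 3-player hanchan 3 times.
for table in seatings(4):
    print(table)
```

Replaying the same randomly generated hanchan from every seat means that a lucky or unlucky deal is experienced by the challenger and the champions alike, which is how the repetition reduces the effect of luck.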
| Challenger | Champion | Hanchan Games | 1st Rate | 2nd Rate | 3rd Rate | 4th Rate | Average Rank | Average Pt |
|---|---|---|---|---|---|---|---|---|
| canary-v1 | aggressive | 100,000 | 23.8% | 25.6% | 27.5% | 23.1% | 2.498 | +1.817 |
| canary-v1 | defensive | 100,000 | 25.5% | 25.6% | 24.9% | 24.0% | 2.476 | +1.966 |
| canary-v1 | experimental-v2 | 100,000 | 24.0% | 25.3% | 27.5% | 23.2% | 2.498 | +1.734 |
| canary-v1 | akagi-v4-20240110-best | 100,000 | 26.5% | 26.4% | 24.6% | 22.5% | 2.431 | +5.385 |
| canary-v1 | akagi-v4-20240308-best | 200,000 | 25.0% | 26.4% | 25.1% | 23.5% | 2.471 | +2.651 |
| finetuned-a1 | akagi-v4-20240308-best | 100,000 | 25.5% | 26.2% | 25.1% | 23.2% | 2.461 | +3.335 |
| finetuned-a1 | canary-v1 | 100,000 | 25.4% | 24.9% | 25.0% | 24.7% | 2.489 | +0.798 |
| finetuned-a1 | experimental-v2 | 100,000 | 24.1% | 25.5% | 27.4% | 23.0% | 2.492 | +2.142 |
| finetuned-b1 | akagi-v4-20240308-best | 100,000 | 25.3% | 26.2% | 25.4% | 23.1% | 2.464 | +3.358 |
| finetuned-b1 | canary-v1 | 100,000 | 25.3% | 24.9% | 25.5% | 24.3% | 2.488 | +1.115 |
| finetuned-b1 | experimental-v2 | 100,000 | 24.1% | 25.6% | 27.7% | 22.6% | 2.488 | +2.723 |
| finetuned-d1 | akagi-v4-20240308-best | 100,000 | 25.1% | 26.4% | 25.6% | 22.9% | 2.463 | +3.602 |
| finetuned-d1 | canary-v1 | 100,000 | 25.0% | 25.3% | 25.6% | 24.1% | 2.488 | +1.332 |
| finetuned-d1 | experimental-v2 | 100,000 | 23.8% | 25.8% | 27.9% | 22.5% | 2.492 | +2.604 |
| finetuned-r1 | akagi-v4-20240308-best | 100,000 | 25.7% | 25.8% | 25.2% | 23.3% | 2.460 | +3.348 |
| finetuned-r1 | canary-v1 | 100,000 | 25.7% | 24.8% | 24.9% | 24.6% | 2.484 | +1.087 |
| finetuned-r1 | experimental-v2 | 100,000 | 24.4% | 25.3% | 27.4% | 22.9% | 2.487 | +2.447 |
| finetuned-r2 | akagi-v4-20240308-best | 100,000 | 25.5% | 26.1% | 25.3% | 23.1% | 2.460 | +3.465 |
| finetuned-r2 | canary-v1 | 100,000 | 25.3% | 25.1% | 25.0% | 24.6% | 2.489 | +0.896 |
| finetuned-r2 | experimental-v2 | 100,000 | 23.9% | 25.8% | 27.6% | 22.7% | 2.492 | +2.386 |
| finetuned-r3 | akagi-v4-20240308-best | 100,000 | 25.4% | 26.4% | 25.1% | 23.1% | 2.458 | +3.638 |
| finetuned-r3 | canary-v1 | 100,000 | 25.6% | 25.0% | 25.1% | 24.3% | 2.482 | +1.409 |
| finetuned-r3 | experimental-v2 | 100,000 | 24.3% | 25.5% | 27.5% | 22.7% | 2.487 | +2.644 |
| finetuned-s1 | finetuned-r3 | 160,000 | 25.0% | 25.0% | 25.0% | 25.0% | 2.500 | +0.046 |
| finetuned-s1 | finetuned-b1 | 160,000 | 25.4% | 24.9% | 24.7% | 25.0% | 2.494 | +0.222 |
| finetuned-s2 | finetuned-r3 | 160,000 | 25.0% | 25.1% | 24.9% | 25.0% | 2.498 | +0.112 |
| finetuned-s2 | finetuned-b1 | 160,000 | 25.4% | 24.9% | 24.6% | 25.1% | 2.495 | +0.119 |
| 3p-alpha-2 | 3p-zero | 300,000 | 34.5% | 34.4% | 31.1% | N/A | 1.967 | +4.475 |
| 3p-beta-1 | 3p-zero | 300,000 | 34.6% | 33.9% | 31.5% | N/A | 1.970 | +4.050 |
| 3p-beta-1 | 3p-alpha-2 | 300,000 | 33.4% | 33.0% | 33.6% | N/A | 2.002 | -0.293 |
| 3p-beta-2 | 3p-beta-1 | 300,000 | 33.5% | 33.2% | 33.3% | N/A | 1.998 | +0.300 |
| 3p-beta-2 | 3p-alpha-2 | 300,000 | 33.6% | 32.9% | 33.5% | N/A | 1.999 | +0.128 |
| 3p-beta-3 | 3p-beta-1 | 300,000 | 33.6% | 33.3% | 33.1% | N/A | 1.995 | +0.655 |
| 3p-beta-3 | 3p-alpha-2 | 300,000 | 33.7% | 32.8% | 33.5% | N/A | 1.998 | +0.292 |
| 3p-beta-4 | 3p-beta-1 | 300,000 | 33.7% | 33.2% | 33.1% | N/A | 1.994 | +0.868 |
| 3p-beta-4 | 3p-alpha-2 | 300,000 | 33.7% | 32.7% | 33.6% | N/A | 1.998 | +0.243 |
| 3p-cross-1 | 3p-beta-1 | 300,000 | 33.6% | 33.2% | 33.2% | N/A | 1.996 | +0.522 |
| 3p-cross-1 | 3p-alpha-2 | 300,000 | 33.8% | 32.9% | 33.3% | N/A | 1.995 | +0.661 |
| 3p-cross-2 | 3p-cross-1 | 300,000 | 33.6% | 33.0% | 33.4% | N/A | 1.998 | +0.277 |
| 3p-cross-2 | 3p-beta-4 | 300,000 | 33.4% | 33.3% | 33.3% | N/A | 1.998 | +0.220 |
| 3p-cross-2 | 3p-beta-2 | 300,000 | 33.6% | 33.0% | 33.4% | N/A | 1.999 | +0.190 |
| 3p-cross-2 | 3p-beta-1 | 300,000 | 33.7% | 33.1% | 33.2% | N/A | 1.994 | +0.747 |
| 3p-cross-3 | 3p-cross-1 | 300,000 | 33.6% | 33.2% | 33.2% | N/A | 1.996 | +0.515 |
| 3p-cross-3 | 3p-beta-4 | 300,000 | 33.5% | 33.1% | 33.4% | N/A | 1.999 | +0.161 |
| 3p-cross-3 | 3p-beta-2 | 300,000 | 33.6% | 33.2% | 33.2% | N/A | 1.995 | +0.638 |
| 3p-cross-3 | 3p-beta-1 | 300,000 | 33.8% | 33.0% | 33.2% | N/A | 1.994 | +0.779 |
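As a rough check on how the Average Rank and Average Pt columns relate to the placement rates, the sketch below recomputes both from one row of the table. The helper names `average_rank` and `average_pt` are hypothetical, and the point distributions are the ones stated in the notes; this is an illustration of the arithmetic, not the code used to produce the table.

```python
PT_4P = [90, 45, 0, -135]   # rank points for 4-player models
PT_3P = [135, 0, -135]      # rank points for 3-player models


def average_rank(rates):
    """Expected rank: 1 * P(1st) + 2 * P(2nd) + ..."""
    return sum(rank * rate for rank, rate in enumerate(rates, start=1))


def average_pt(rates, points):
    """Expected rank points under the given point distribution."""
    if len(rates) != len(points):
        raise ValueError("rates and points must have the same length")
    return sum(pt * rate for pt, rate in zip(points, rates))


# canary-v1 vs aggressive, using the rounded rates from the table.
rates = [0.238, 0.256, 0.275, 0.231]
print(round(average_rank(rates), 3))       # 2.499
print(round(average_pt(rates, PT_4P), 3))  # 1.755
```

The table lists 2.498 and +1.817 for this row. The small differences are presumably because the published rates are rounded to one decimal place, so values recomputed from them only approximate the reported figures.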