Stateful AI · a learned ML core, playable
No strategy, no examples, no language model: just the rules of Kuhn poker and a few hundred thousand hands against itself. By minimizing regret it converged to a Nash equilibrium. It learned to bluff with the Jack, to value-bet the King about three times as often as it bluffs, and never to bet the Queen, which matches game theory's known optimum.
You're Player 1 — you act first. Each of you antes 1 chip. The AI's card stays hidden until showdown. Watch it bluff you.
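The regret-minimizing self-play described above can be sketched in a few dozen lines. This is an illustration only, and it assumes vanilla counterfactual regret minimization (CFR) with regret matching; the page says only that the model minimizes regret, so the real training code may use a different variant. The names `Node`, `cfr`, and `train`, and the info-set keys (card digit followed by the action history, with `p` for check/fold and `b` for bet/call), are hypothetical.

```python
import itertools

ACTIONS = "pb"  # p = pass (check or fold), b = bet (bet or call)


class Node:
    """Regret and strategy accumulators for one information set."""

    def __init__(self):
        self.regret_sum = [0.0, 0.0]
        self.strategy_sum = [0.0, 0.0]
        self.current = [0.5, 0.5]  # strategy used during the current iteration

    def refresh(self):
        # Regret matching: mix actions in proportion to positive regret.
        positive = [max(r, 0.0) for r in self.regret_sum]
        total = sum(positive)
        self.current = [p / total for p in positive] if total > 0 else [0.5, 0.5]

    def average_bet_probability(self):
        # The average strategy, not the current one, approaches equilibrium.
        total = sum(self.strategy_sum)
        return self.strategy_sum[1] / total if total > 0 else 0.5


def terminal_payoff(cards, history):
    """Payoff to the player who would act next, or None if play continues."""
    if len(history) < 2:
        return None
    player = len(history) % 2
    higher = cards[player] > cards[1 - player]
    if history[-2:] == "pp":  # check, check: showdown for the antes
        return 1 if higher else -1
    if history[-2:] == "bb":  # bet, call: showdown for ante plus bet
        return 2 if higher else -2
    if history[-1] == "p":    # bet, fold: the bettor takes the ante
        return 1
    return None               # "pb": Player 1 still has to respond


def cfr(nodes, cards, history, reach):
    """Return the expected payoff for the player to act and update regrets."""
    payoff = terminal_payoff(cards, history)
    if payoff is not None:
        return payoff
    player = len(history) % 2
    node = nodes.setdefault(f"{cards[player]}{history}", Node())
    strategy = node.current
    utilities = []
    for a, action in enumerate(ACTIONS):
        next_reach = list(reach)
        next_reach[player] *= strategy[a]
        # The child value is from the opponent's point of view, so negate it.
        utilities.append(-cfr(nodes, cards, history + action, next_reach))
    node_value = sum(s * u for s, u in zip(strategy, utilities))
    for a in range(2):
        # Counterfactual regret is weighted by the opponent's reach probability.
        node.regret_sum[a] += reach[1 - player] * (utilities[a] - node_value)
        node.strategy_sum[a] += reach[player] * strategy[a]
    return node_value


def train(iterations):
    """Run CFR over all six deals per iteration; return P(bet/call) per info set."""
    nodes = {}
    deals = list(itertools.permutations(range(3), 2))  # J=0, Q=1, K=2
    for _ in range(iterations):
        for cards in deals:
            cfr(nodes, cards, "", [1.0, 1.0])
        for node in nodes.values():
            node.refresh()
    return {key: node.average_bet_probability() for key, node in sorted(nodes.items())}


if __name__ == "__main__":
    for key, p_bet in train(20000).items():
        print(f"{key:>4}  P(bet/call) = {p_bet:.3f}")
```

In this sketch, key `"0"` is Player 1 holding the Jack at the start of the hand and `"1b"` is Player 2 holding the Queen and facing a bet. Enumerating all six deals each iteration, rather than sampling hands, is a simplification that keeps the run deterministic.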
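Kuhn's equilibrium is a one-parameter family (the first player may bluff the Jack anywhere from never to one time in three), so we don't grade against a single table. Instead we measure exploitability exactly (the most a perfect opponent could win against the learned strategy), and it goes to zero as training proceeds. These are the frequencies the model actually learned: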
| situation | card | P(bet / call) |
|---|---|---|
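
For a game this small, the exact exploitability measurement mentioned above can be sketched by brute force: try every pure counter-strategy against each seat and keep the best. The sketch below makes several assumptions. Exploitability is taken as the best-response payoff averaged over both seats, which is one common convention and may not be the one used here. The function names and info-set keys are hypothetical. The `equilibrium` helper encodes the textbook one-parameter solution, not the frequencies this model learned.

```python
import itertools

P1_KEYS = ["0", "1", "2", "0pb", "1pb", "2pb"]  # opening move; reply to check-bet
P2_KEYS = ["0p", "1p", "2p", "0b", "1b", "2b"]  # after a check; facing a bet


def expected_value(p1, p2):
    """Player 1's expected payoff; p1 and p2 map info sets to P(bet/call)."""
    total = 0.0
    for c1, c2 in itertools.permutations(range(3), 2):  # J=0, Q=1, K=2
        sign = 1 if c1 > c2 else -1      # showdown result for Player 1
        open_bet = p1[f"{c1}"]           # Player 1 bets first
        call2 = p2[f"{c2}b"]             # Player 2 calls that bet
        bet2 = p2[f"{c2}p"]              # Player 2 bets after a check
        call1 = p1[f"{c1}pb"]            # Player 1 calls after check-bet
        after_bet = call2 * 2 * sign + (1 - call2) * 1
        after_check = (1 - bet2) * sign + bet2 * (call1 * 2 * sign - (1 - call1))
        total += (open_bet * after_bet + (1 - open_bet) * after_check) / 6
    return total


def best_response_value(keys, payoff):
    """Best payoff over all pure strategies on `keys` (2**6 = 64 candidates)."""
    return max(
        payoff(dict(zip(keys, bits)))
        for bits in itertools.product((0.0, 1.0), repeat=len(keys))
    )


def exploitability(p1, p2):
    """Average gain of a perfect opponent over both seats (assumed convention)."""
    br_vs_p2 = best_response_value(P1_KEYS, lambda s: expected_value(s, p2))
    br_vs_p1 = best_response_value(P2_KEYS, lambda s: -expected_value(p1, s))
    return (br_vs_p2 + br_vs_p1) / 2


def equilibrium(alpha):
    """Textbook Kuhn equilibrium for a Jack-bluff frequency alpha in [0, 1/3]."""
    p1 = {"0": alpha, "1": 0.0, "2": 3 * alpha,
          "0pb": 0.0, "1pb": alpha + 1 / 3, "2pb": 1.0}
    p2 = {"0p": 1 / 3, "1p": 0.0, "2p": 1.0,
          "0b": 0.0, "1b": 1 / 3, "2b": 1.0}
    return p1, p2


if __name__ == "__main__":
    for alpha in (0.0, 0.2, 1 / 3):
        p1, p2 = equilibrium(alpha)
        print(alpha, exploitability(p1, p2), expected_value(p1, p2))
```

Every `alpha` between 0 and 1/3 gives an exploitability of zero up to floating-point rounding, and a game value of -1/18 for Player 1. That is why a single reference table cannot grade a learned strategy but an exploitability number can.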