gomoku-9x9

AlphaZero for 9x9 free-style gomoku (five-in-a-row, no opening restrictions).

Trained on a single Apple Silicon machine using PyTorch + MPS. Source code: https://github.com/jasonyandell/gomoku

How to load

import torch
from huggingface_hub import hf_hub_download
from gomoku.model import load_checkpoint

path = hf_hub_download("jasonyandell/gomoku-9x9", "model.pt")
model, payload = load_checkpoint(path, device="cpu")
model.eval()

# payload also carries training metadata:
# payload["epoch"], payload["total_games"], payload["model_config"]

To get a quick policy and value estimate for the empty board:

from gomoku.game import GameState
x = torch.from_numpy(GameState.initial().to_planes()).unsqueeze(0)
with torch.no_grad():
    policy_logits, value = model(x)
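
The raw policy output is a vector of 81 logits, so it usually needs a softmax and an index-to-square mapping before it is readable. The sketch below is one way to do that in plain Python. It assumes a row-major layout (index = row * 9 + col), which this card does not state, so check gomoku/game.py before relying on it. The helper name top_moves and its occupied argument are hypothetical, not part of the gomoku package.

```python
import math

BOARD_SIZE = 9  # 9x9 board, 81 policy outputs


def top_moves(logits, k=3, occupied=frozenset()):
    """Return the k most probable legal squares as (row, col, prob).

    ASSUMPTION: logits are laid out row-major (index = row * 9 + col).
    `occupied` is a set of flat indices excluded before normalising.
    """
    n = BOARD_SIZE * BOARD_SIZE
    if len(logits) != n:
        raise ValueError(f"expected {n} logits, got {len(logits)}")
    legal = [i for i in range(n) if i not in occupied]
    if not legal:
        return []
    # Numerically stable softmax restricted to the legal squares.
    peak = max(logits[i] for i in legal)
    weights = {i: math.exp(logits[i] - peak) for i in legal}
    total = sum(weights.values())
    ranked = sorted(legal, key=lambda i: weights[i], reverse=True)[:k]
    return [(i // BOARD_SIZE, i % BOARD_SIZE, weights[i] / total) for i in ranked]
```

With the tensors from the snippet above, a call would look like top_moves(policy_logits[0].tolist(), k=5). On the empty board no squares are occupied, so the default argument is enough.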

Architecture

  • Input: (B, 3, 9, 9); planes are own stones, opponent stones, side-to-move
  • Backbone: small ResNet (preset small: 64 filters x 4 residual blocks)
  • Policy head: 81-way logits over board squares
  • Value head: scalar in [-1, 1] via tanh
  • ~316k parameters at the small preset (see gomoku/model.py)
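
As a rough sanity check on the parameter figure, the count can be estimated by hand. The sketch below assumes details this card does not give: bias-free convolutions each followed by batch norm, a policy head with 2 intermediate channels, and a value head with 1 channel and a 64-unit hidden layer (the layout used in the original AlphaZero papers). The real layer sizes live in gomoku/model.py and may differ.

```python
def conv_params(c_in, c_out, k):
    """Weights of a k x k convolution without bias."""
    return c_in * c_out * k * k


def bn_params(channels):
    """Learnable scale and shift of a batch-norm layer."""
    return 2 * channels


def linear_params(n_in, n_out):
    """Weights plus bias of a fully connected layer."""
    return n_in * n_out + n_out


def count_params(filters=64, blocks=4, board=9, in_planes=3,
                 policy_channels=2, value_channels=1, value_hidden=64):
    """Estimate parameters for an AlphaZero-style ResNet.

    ASSUMPTION: head sizes and the conv + batch-norm pattern are
    guesses for illustration, not read from gomoku/model.py.
    """
    squares = board * board
    stem = conv_params(in_planes, filters, 3) + bn_params(filters)
    # Each residual block holds two 3x3 conv + batch-norm pairs.
    tower = blocks * 2 * (conv_params(filters, filters, 3) + bn_params(filters))
    policy = (conv_params(filters, policy_channels, 1) + bn_params(policy_channels)
              + linear_params(policy_channels * squares, squares))
    value = (conv_params(filters, value_channels, 1) + bn_params(value_channels)
             + linear_params(value_channels * squares, value_hidden)
             + linear_params(value_hidden, 1))
    return {"stem": stem, "tower": tower, "policy": policy, "value": value,
            "total": stem + tower + policy + value}
```

Under these assumed head sizes, count_params()["total"] comes to 316,506, which is in line with the ~316k figure above. The agreement is suggestive only and does not confirm the actual layer layout.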

Training setup

  • Apple Silicon (M5 Max) using PyTorch with the MPS backend
  • Single-process AlphaZero loop (self-play -> replay buffer -> train)
  • 200 MCTS simulations per move during self-play
  • ~28 seconds per training epoch
  • Checkpoints written every 5 epochs to checkpoints/
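
The loop in the list above can be sketched as a skeleton. This is an illustration of the general self-play -> replay buffer -> train pattern, not the repository's training script: run_loop, its callable arguments, the buffer capacity, and the batch sampling are all hypothetical. Only the 5-epoch checkpoint cadence comes from this card.

```python
import random
from collections import deque


def run_loop(epochs, games_per_epoch, batch_size,
             self_play, train_step, save_checkpoint,
             buffer_capacity=10_000, checkpoint_every=5, seed=0):
    """Single-process AlphaZero-style loop (illustrative skeleton).

    self_play()          -> list of training positions from one game
                            (in the real run, each move would use MCTS)
    train_step(batch)    -> one optimisation step on sampled positions
    save_checkpoint(ep)  -> persist the model for epoch `ep`
    ASSUMPTION: buffer_capacity and uniform sampling are placeholders.
    """
    rng = random.Random(seed)
    # Bounded buffer: the oldest positions fall out as new games arrive.
    buffer = deque(maxlen=buffer_capacity)
    for epoch in range(1, epochs + 1):
        for _ in range(games_per_epoch):
            buffer.extend(self_play())
        if buffer:
            batch = rng.sample(list(buffer), min(batch_size, len(buffer)))
            train_step(batch)
        if epoch % checkpoint_every == 0:
            save_checkpoint(epoch)
    return len(buffer)
```

Passing the three stages in as callables keeps the skeleton independent of the model and MCTS code, so each stage can be swapped or stubbed out when experimenting.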

Current strength (honest)

Training is ongoing. These checkpoints reflect a small single-machine run on Apple Silicon, not a production-grade AlphaZero engine. Strength scales with checkpoint epoch; see training_state.json in this repo for the epoch this snapshot corresponds to.

Files

  • model.pt: slimmed checkpoint (weights + model_config + provenance). Compatible with gomoku.model.load_checkpoint.
  • config.json: model architecture config as JSON.
  • training_state.json: {epoch, total_games, wandb_run_id?} for the current snapshot.
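
To check which epoch a downloaded snapshot corresponds to, training_state.json can be read directly. The sketch below relies only on the three keys listed above and treats wandb_run_id as optional. The helper names load_training_state and summarize_state are hypothetical.

```python
import json


def load_training_state(path):
    """Read training_state.json from a local path into a dict."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def summarize_state(state):
    """Format the snapshot metadata as a one-line summary.

    Expects the keys listed in this card: epoch, total_games, and
    optionally wandb_run_id.
    """
    line = f"epoch {state['epoch']}, {state['total_games']} self-play games"
    run_id = state.get("wandb_run_id")
    if run_id:
        line += f", wandb run {run_id}"
    return line


# Example (needs network access and huggingface_hub installed):
# from huggingface_hub import hf_hub_download
# path = hf_hub_download("jasonyandell/gomoku-9x9", "training_state.json")
# print(summarize_state(load_training_state(path)))
```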

License

MIT.
