whisper-chess-tiny-fr

Fine-tuned Whisper-tiny for chess move recognition in French.

Part of the SpeakChess project — play chess by voice in EN / FR / DE / ES.

Performance

Test WER: 0.03% on a held-out set of 3,798 samples (75% synthetic / 25% human, May 2026 retrain)
Human-only validation WER: 0.44%
Speaker-stratified eval (held-out contributor, 610 samples): 0.00% WER
Domain: chess moves only (notation like Nf3, exd5, O-O)
Optimized for browser inference via transformers.js (ONNX + INT8) and React Native / mobile via whisper.cpp (GGML)
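For readers unfamiliar with the metric, the WER figures above can be reproduced in spirit with a few lines of Python. This is a minimal sketch, assuming corpus-level WER (total word edits divided by total reference words) with no text normalization. The model card does not say which exact variant or normalization was used, and the function name is hypothetical.

```python
def word_error_rate(references, hypotheses):
    """Corpus-level WER: total word-level edits / total reference words."""
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        r, h = ref.split(), hyp.split()
        # Word-level Levenshtein distance using a single rolling row.
        prev = list(range(len(h) + 1))
        for i, rw in enumerate(r, start=1):
            cur = [i]
            for j, hw in enumerate(h, start=1):
                cost = 0 if rw == hw else 1
                cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost))
            prev = cur
        total_edits += prev[-1]
        total_words += len(r)
    return total_edits / total_words


refs = ["fou d 5", "petit roque", "tour prend e huit"]
hyps = ["fou g 5", "petit roque", "tour prend e huit"]
print(f"{word_error_rate(refs, hyps):.2%}")  # 1 wrong word out of 9 -> 11.11%
```

Because chess utterances are only two to four words long, a single confused file letter moves the per-utterance error a lot, which is why the corpus-level figure is the more stable number to report.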

Files

ONNX (onnx/, with tokenizer/processor files at the repo root) — for transformers.js / onnxruntime-web:

onnx/encoder_model_int8.onnx — INT8 encoder (Conv layers kept FP32 for WASM compat)
onnx/decoder_model_merged_int8.onnx — INT8 merged decoder
Standard Whisper tokenizer/processor files

Total ONNX runtime download: ~50 MB.

GGML (ggml/) — for whisper.cpp / mobile (React Native, Flutter, native apps):

ggml/ggml-tiny.bin (77.7 MB) — FP16
ggml/ggml-tiny-q5_0.bin (29.9 MB) — Q5_0 quantized (recommended for mobile)

Usage

transformers.js (browser, Node)

import { pipeline } from "@huggingface/transformers";

const transcriber = await pipeline(
  "automatic-speech-recognition",
  "atamano/whisper-chess-tiny-fr",
  { dtype: { encoder_model: "int8", decoder_model_merged: "int8" } }
);

// `audio` can be a URL string or a Float32Array of 16 kHz mono samples
const result = await transcriber(audio, { language: "fr", task: "transcribe" });
console.log(result.text); // e.g. "fou d 5"

Python (transformers + ONNX Runtime)

from transformers import WhisperProcessor
from optimum.onnxruntime import ORTModelForSpeechSeq2Seq

processor = WhisperProcessor.from_pretrained("atamano/whisper-chess-tiny-fr")
model = ORTModelForSpeechSeq2Seq.from_pretrained(
    "atamano/whisper-chess-tiny-fr",
    encoder_file_name="encoder_model_int8.onnx",
    decoder_file_name="decoder_model_merged_int8.onnx",  # matches the merged decoder listed under Files
    use_cache=True,
)
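# Force French transcription without timestamps
model.generation_config.forced_decoder_ids = processor.get_decoder_prompt_ids(
    language="fr", task="transcribe", no_timestamps=True
)

# `audio` is assumed to be a 1-D float32 NumPy array of 16 kHz mono samples
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
generated_ids = model.generate(inputs.input_features)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])  # e.g. "fou d 5"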

whisper.cpp / React Native / Flutter (GGML)

Download ggml/ggml-tiny-q5_0.bin and pass it to your whisper.cpp binding of choice:

React Native: whisper.rn (initWhisper({ filePath: '.../ggml-tiny-q5_0.bin' }))
Flutter: whisper_ggml
Native: any whisper.cpp build

⚠️ Recommended post-processing — read this

The model outputs spoken French text ("fou d 5", "tour prend e huit", "petit roque"), not algebraic notation. To play the move on a board you need two more steps that this checkpoint does NOT do for you:

Parse the spoken text into algebraic notation (e.g. "fou d 5" → Bd5).
Validate against legal moves on the current board, with a fuzzy fallback for single-letter file/rank confusions (a/b/c/d/e/f/g/h sound very close in French and the small Whisper architecture confuses them on roughly 10% of utterances).
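To make the first step concrete, here is a minimal sketch of a spoken-French to SAN converter. It is not the project's parseChessMove: the function name, the word tables, and the accepted grammar are all assumptions for illustration. It handles only piece names, "prend", files, ranks (digits or number words), and the two castling phrases. Promotions, checks, and disambiguation are left out.

```python
# Hypothetical word tables; the real vocabulary used by SpeakChess may differ.
PIECES = {"roi": "K", "dame": "Q", "tour": "R", "fou": "B", "cavalier": "N", "pion": ""}
RANKS = {"un": "1", "deux": "2", "trois": "3", "quatre": "4",
         "cinq": "5", "six": "6", "sept": "7", "huit": "8"}
CASTLING = {"petit roque": "O-O", "grand roque": "O-O-O"}


def spoken_to_san(text):
    """Convert a spoken French move to SAN, or return None if it does not fit."""
    text = " ".join(text.lower().split())  # normalize case and whitespace
    if text in CASTLING:
        return CASTLING[text]
    san = ""
    for idx, tok in enumerate(text.split()):
        if idx == 0 and tok in PIECES:
            san += PIECES[tok]          # piece letter (empty for a pawn)
        elif tok in ("prend", "prends"):
            san += "x"                  # capture
        elif tok in RANKS:
            san += RANKS[tok]           # rank spoken as a word
        elif len(tok) == 1 and tok in "abcdefgh12345678":
            san += tok                  # file letter or rank digit
        else:
            return None                 # unknown word: let the caller fall back
    return san or None


print(spoken_to_san("fou d 5"))            # Bd5
print(spoken_to_san("tour prend e huit"))  # Rxe8
print(spoken_to_san("petit roque"))        # O-O
```

The result is only a candidate string. It still has to be checked against the legal moves of the current position, which is the second step described above.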

The SpeakChess web app ships an open-source TypeScript implementation of both steps at:

next-web/src/lib/chessParser.ts (MIT)

Key exports:

parseChessMove(text) — fuzzy speech → algebraic notation
findClosestLegalMove(text, legalSans, language) — when the parsed move isn't legal, picks the legal move whose canonical spoken form has the smallest edit distance to the transcription. Resolves ~7% of in-production misrecognitions without any model change.

The recommended pipeline:

audio
  ↓ Whisper (this model)
text ("fou g 5")
  ↓ parseChessMove
SAN ("Bg5")
  ↓ chess.js / python-chess legal moves filter
   │
   ├─ legal → play it
   └─ illegal → findClosestLegalMove(text, legalSans, "fr")
                  ↓ ("Bd5" with edit-distance 1)
                play it
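The fallback branch of the diagram can be sketched as follows. This is an illustration under stated assumptions, not the findClosestLegalMove implementation: it uses character-level Levenshtein distance, and it expects the caller to supply a hypothetical mapping from each legal SAN move to its canonical spoken form (for example built from the training vocabulary file). The mapping must contain at least one move.

```python
def edit_distance(a, b):
    """Character-level Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def closest_legal_move(text, spoken_by_san):
    """Return (san, distance) for the legal move whose spoken form is nearest to text."""
    san, spoken = min(spoken_by_san.items(), key=lambda kv: edit_distance(text, kv[1]))
    return san, edit_distance(text, spoken)


# Hypothetical position in which Bg5 is illegal but Bd5 is legal.
legal = {"Bd5": "fou d 5", "Bc4": "fou c 4", "Nf3": "cavalier f 3"}
print(closest_legal_move("fou g 5", legal))  # ('Bd5', 1)
```

Ties are resolved in favor of the first move in the mapping. A production version would likely also cap the accepted distance so that a badly garbled transcription is rejected rather than mapped to an arbitrary move.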

A vocabulary file with all (notation, spoken form, language) triples used at training time is available at data/processed/training_moves.json. To regenerate it from the canonical vocabulary, run python training/generate_moves.py.

Training

See github.com/atamano/speakchess for the full pipeline.

Base: openai/whisper-tiny (39M params, 98% trainable via full fine-tune)
Method: Full fine-tuning, no LoRA, no gradient checkpointing (MPS quirk)
Data: 9,582 synthetic (Edge TTS, France / Canada / Belgium / Switzerland accents) + 3,097 validated human recordings from speakchess.indiefoundry.com/contribute
Augmentation: SpecAugment + audiomentations at runtime (noise / pitch / time-stretch / EQ)
suppress_tokens whitelist: 164 chess-vocab tokens, all others suppressed at generation
Schedule: 5 epochs, batch size 8 with gradient accumulation 2 (effective batch size 16), learning rate 1e-4 with a linear schedule, human recordings oversampled 4×
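The suppress_tokens whitelist mentioned above works by inversion: every token id that is not explicitly allowed goes into the suppress list, so the decoder can only emit chess vocabulary. The sketch below shows the idea on a toy vocabulary. The function name is hypothetical, and the actual 164 allowed ids are not reproduced here.

```python
def build_suppress_list(vocab_size, allowed_ids):
    """Return every token id in [0, vocab_size) that is NOT whitelisted."""
    allowed = set(allowed_ids)
    return [tok for tok in range(vocab_size) if tok not in allowed]


# Toy vocabulary of 10 ids where only 2, 5 and 9 may be generated.
suppress = build_suppress_list(10, [2, 5, 9])
print(suppress)  # [0, 1, 3, 4, 6, 7, 8]
```

With a real tokenizer, the whitelist would presumably also need to contain the end-of-transcript token, otherwise generation could never terminate. The resulting list is what would be passed as suppress_tokens at generation time.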

License — Important

This model is licensed under CC BY-NC-SA 4.0. Summary:

✅ Free to use for personal projects, research, education, and non-commercial demos
✅ Free to share and adapt with attribution and same-license derivatives
❌ Commercial use is NOT permitted without a separate license
❌ Including the model in commercial products, paid services, or competing voice-chess offerings requires explicit written permission

For commercial licensing inquiries: antoine@darksquares.net

The full license text: https://creativecommons.org/licenses/by-nc-sa/4.0/
