whisper-chess-tiny-fr
Fine-tuned Whisper-tiny for chess move recognition in French (Français).
Part of the SpeakChess project: play chess by voice in EN / FR / DE / ES.
Performance
- Test WER: 0.03% on a held-out set of 3,798 samples (75% synth / 25% human, May 2026 retrain)
- Human-only val WER: 0.44%
- Speaker-stratified eval (held-out contributor, 610 samples): 0.00% WER
- Domain: chess moves only (notation like Nf3, exd5, O-O)
- Optimized for browser inference via transformers.js (ONNX + INT8) and React Native / mobile via whisper.cpp (GGML)
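For reference, the WER figures above can be read against the usual definition: word-level edit distance (substitutions, deletions, insertions) summed over the test set and divided by the total number of reference words. The sketch below is illustrative only; the function name and the example strings are assumptions and are not taken from the project's evaluation code.

```python
def word_error_rate(references, hypotheses):
    """Corpus-level WER: total word edits / total reference words."""
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        r, h = ref.split(), hyp.split()
        # Dynamic-programming Levenshtein distance over words, one row at a time.
        prev = list(range(len(h) + 1))
        for i, ref_word in enumerate(r, start=1):
            curr = [i]
            for j, hyp_word in enumerate(h, start=1):
                cost = 0 if ref_word == hyp_word else 1
                curr.append(min(prev[j] + 1,          # deletion
                                curr[j - 1] + 1,      # insertion
                                prev[j - 1] + cost))  # substitution or match
            prev = curr
        total_edits += prev[-1]
        total_words += len(r)
    return total_edits / total_words


# Made-up example: one wrong file letter out of five reference words.
refs = ["fou d 5", "petit roque"]
hyps = ["fou b 5", "petit roque"]
print(f"{word_error_rate(refs, hyps):.2%}")  # 20.00%
```

Because chess utterances are only two to four words long, a single confused file letter moves per-utterance WER a lot, which is one reason to also track move-level accuracy after post-processing.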
Files
ONNX (root), for transformers.js / onnxruntime-web:
- onnx/encoder_model_int8.onnx: INT8 encoder (Conv layers kept FP32 for WASM compat)
- onnx/decoder_model_merged_int8.onnx: INT8 merged decoder
- Standard Whisper tokenizer/processor files
Total ONNX runtime download: ~50 MB.
GGML (ggml/), for whisper.cpp / mobile (React Native, Flutter, native apps):
- ggml/ggml-tiny.bin (77.7 MB): FP16
- ggml/ggml-tiny-q5_0.bin (29.9 MB): Q5_0 quantized (recommended for mobile)
Usage
transformers.js (browser, Node)
import { pipeline } from "@huggingface/transformers";
const transcriber = await pipeline(
  "automatic-speech-recognition",
  "atamano/whisper-chess-tiny-fr",
  { dtype: { encoder_model: "int8", decoder_model_merged: "int8" } }
);
const result = await transcriber(audio, { language: "fr", task: "transcribe" });
console.log(result.text); // e.g. "fou d 5"
Python (transformers + ONNX Runtime)
from transformers import WhisperProcessor
from optimum.onnxruntime import ORTModelForSpeechSeq2Seq
processor = WhisperProcessor.from_pretrained("atamano/whisper-chess-tiny-fr")
model = ORTModelForSpeechSeq2Seq.from_pretrained(
    "atamano/whisper-chess-tiny-fr",
    encoder_file_name="encoder_model_int8.onnx",
    decoder_file_name="decoder_model_int8.onnx",
    use_cache=True,
)
model.generation_config.forced_decoder_ids = processor.get_decoder_prompt_ids(
    language="fr", task="transcribe", no_timestamps=True
)
whisper.cpp / React Native / Flutter (GGML)
Download ggml/ggml-tiny-q5_0.bin and pass it to your whisper.cpp binding of choice:
- React Native: whisper.rn (initWhisper({ filePath: '.../ggml-tiny-q5_0.bin' }))
- Flutter: whisper_ggml
- Native: any whisper.cpp build
⚠️ Recommended post-processing: read this
The model outputs spoken French text ("fou d 5", "tour prend e huit", "petit roque"), not algebraic notation. To play the move on a board you need two more steps that this checkpoint does NOT do for you:
- Parse the spoken text into algebraic notation (e.g. "fou d 5" → Bd5).
- Validate against legal moves on the current board, with a fuzzy fallback for single-letter file/rank confusions (a/b/c/d/e/f/g/h sound very close in French and the small Whisper architecture confuses them on roughly 10% of utterances).
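To make the first step concrete, here is a minimal Python sketch of turning spoken French into algebraic notation. It is not the project's parser: the function name `parse_spoken_move_fr` and the word tables are assumptions, built from standard French chess vocabulary plus the three examples quoted in this card, and it ignores disambiguation, promotion, and check suffixes.

```python
# Assumed vocabulary; the real training vocabulary may differ.
PIECES = {"roi": "K", "dame": "Q", "tour": "R", "fou": "B", "cavalier": "N", "pion": ""}
RANKS = {"un": "1", "deux": "2", "trois": "3", "quatre": "4",
         "cinq": "5", "six": "6", "sept": "7", "huit": "8"}
CASTLING = {"petit roque": "O-O", "grand roque": "O-O-O"}
FILES = set("abcdefgh")
DIGITS = set("12345678")


def parse_spoken_move_fr(text):
    """Map spoken French such as 'fou d 5' to SAN-like text, or None if unknown."""
    text = " ".join(text.lower().split())  # normalise case and whitespace
    if text in CASTLING:
        return CASTLING[text]
    out = []
    for word in text.split():
        if word in PIECES:
            out.append(PIECES[word])
        elif word in ("prend", "prends"):
            out.append("x")
        elif word in RANKS:
            out.append(RANKS[word])
        elif word in FILES or word in DIGITS:
            out.append(word)
        else:
            return None  # unknown word: let the caller fall back to fuzzy matching
    return "".join(out) or None


print(parse_spoken_move_fr("fou d 5"))            # Bd5
print(parse_spoken_move_fr("tour prend e huit"))  # Rxe8
print(parse_spoken_move_fr("petit roque"))        # O-O
```

Returning None for anything unrecognised keeps the parser strict, so the decision about what to do with a doubtful transcription stays in the legal-move validation step.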
The SpeakChess web app ships an open-source TypeScript implementation of both steps at:
Key exports:
- parseChessMove(text): fuzzy speech → algebraic notation
- findClosestLegalMove(text, legalSans, language): when the parsed move isn't legal, picks the legal move whose canonical spoken form has the smallest edit distance to the transcription. Resolves ~7% of in-production misrecognitions without any model change.
The recommended pipeline:
audio
  ↓ Whisper (this model)
text ("fou g 5")
  ↓ parseChessMove
SAN ("Bg5")
  ↓ chess.js / python-chess legal moves filter
  │
  ├─ legal → play it
  └─ illegal → findClosestLegalMove(text, legalSans, "fr")
                 ↓ ("Bd5" with edit-distance 1)
               play it
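The fallback branch of this pipeline can be sketched in Python as follows. This is an illustration of the idea described above (smallest edit distance between the transcription and each legal move's spoken form), not a port of the TypeScript implementation: `spoken_form_fr`, `edit_distance`, and `resolve_move` are hypothetical names, the spoken forms are an assumed convention, and the list of legal SAN moves is expected to come from a chess library such as python-chess.

```python
PIECE_WORDS = {"K": "roi", "Q": "dame", "R": "tour", "B": "fou", "N": "cavalier"}


def spoken_form_fr(san):
    """Assumed canonical spoken form, e.g. 'Bd5' -> 'fou d 5'."""
    san = san.rstrip("+#")  # drop check / mate markers
    if san == "O-O":
        return "petit roque"
    if san == "O-O-O":
        return "grand roque"
    words = []
    for ch in san:
        if ch in PIECE_WORDS:
            words.append(PIECE_WORDS[ch])
        elif ch == "x":
            words.append("prend")
        else:
            words.append(ch)  # file letter or rank digit
    return " ".join(words)


def edit_distance(a, b):
    """Character-level Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def resolve_move(text, parsed_san, legal_sans):
    """Play the parsed move if legal, else the legal move that sounds closest."""
    if parsed_san in legal_sans:
        return parsed_san
    if not legal_sans:
        return None
    return min(legal_sans, key=lambda san: edit_distance(text.lower(), spoken_form_fr(san)))


legal = ["Bd5", "Nf3", "e4", "O-O"]  # would normally come from the board state
print(resolve_move("fou g 5", "Bg5", legal))  # Bd5
```

In a real application it may be worth rejecting the fallback when the best distance is large, and asking the player to repeat the move instead of guessing.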
A vocabulary file with all (notation, spoken form, language) triples used at training time is available at data/processed/training_moves.json; regenerate it from the canonical vocab via python training/generate_moves.py.
Training
See github.com/atamano/speakchess for the full pipeline.
- Base: openai/whisper-tiny (39M params, 98% trainable via full fine-tune)
- Method: Full fine-tuning, no LoRA, no gradient checkpointing (MPS quirk)
- Data: 9,582 synthetic (Edge TTS, France / Canada / Belgium / Switzerland accents) + 3,097 validated human recordings from speakchess.indiefoundry.com/contribute
- Augmentation: SpecAugment + audiomentations at runtime (noise / pitch / time-stretch / EQ)
- suppress_tokens whitelist: 164 chess-vocab tokens, all others suppressed at generation
- 5 epochs, batch 8 × grad-accum 2, LR 1e-4 linear, human-oversample 4×
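The suppress_tokens whitelist mentioned above amounts to inverting a set: every token id outside the allowed chess vocabulary goes on the suppress list. A minimal sketch of that inversion follows. The helper name is hypothetical, the vocabulary size is an assumption (the multilingual Whisper tokenizer size), and the allowed ids are placeholders, not the real 164-token whitelist.

```python
def build_suppress_tokens(vocab_size, allowed_ids):
    """Return every token id that is NOT in the whitelist."""
    allowed = set(allowed_ids)
    return [token_id for token_id in range(vocab_size) if token_id not in allowed]


VOCAB_SIZE = 51865                  # assumption: multilingual Whisper vocabulary size
allowed_ids = range(1000, 1164)     # placeholder ids standing in for 164 whitelisted tokens

suppress = build_suppress_tokens(VOCAB_SIZE, allowed_ids)
print(len(suppress))  # 51701
```

With Transformers, such a list would typically be assigned to the generation config's suppress_tokens field. A working whitelist presumably also has to keep the end-of-text and prompt tokens the decoder needs, otherwise generation cannot terminate cleanly.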
License: Important
This model is licensed under CC BY-NC-SA 4.0. Summary:
- ✅ Free to use for personal projects, research, education, and non-commercial demos
- ✅ Free to share and adapt with attribution and same-license derivatives
- ❌ Commercial use is NOT permitted without a separate license
- ❌ Including the model in commercial products, paid services, or competing voice-chess offerings requires explicit written permission
For commercial licensing inquiries: antoine@darksquares.net
The full license text: https://creativecommons.org/licenses/by-nc-sa/4.0/
All rights reserved beyond the CC BY-NC-SA 4.0 grant.