YOLO Fourbooks v1 — Manchu Word Detection

A single-class YOLOv8s model that detects Manchu words in page images of historical bilingual manuscripts (e.g. the Qing-era Book of Changes and Mencius). It was trained on high-resolution page scans of vertical Manchu script and is optimized for downstream OCR and corpus digitization.

Model details

Architecture: YOLOv8s (11.1M parameters), pretrained on COCO
Task: Object detection, single class "man" (Manchu word)
Input resolution: 1920 px (recommended; preserves fine stroke detail)
Training run: sweep_04_yolov8s_coco_1920_m05 (mosaic=0.5)

Performance (validation)

Metric	Value
mAP@0.50	0.9949
mAP@0.50:0.95	0.9847
Precision	0.9942
Recall	0.9900
F1	0.9921
Match rate	99.00% (1,194 / 1,206 GT boxes)
False positives	7
False negatives	12

Recommended confidence threshold: 0.60 (best F1 on validation).
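
The precision, recall, and F1 figures in the table can be cross-checked from the reported box counts. The short calculation below is only a sanity check. It assumes the false-positive and false-negative counts were taken at the same operating point as the reported precision and recall, which the card does not state explicitly.

```python
# Counts copied from the validation table above.
gt_boxes = 1206          # ground-truth Manchu word boxes in the validation split
matched = 1194           # ground-truth boxes matched by a prediction (true positives)
false_positives = 7      # predictions with no matching ground-truth box

false_negatives = gt_boxes - matched                  # unmatched ground-truth boxes
precision = matched / (matched + false_positives)     # TP / (TP + FP)
recall = matched / gt_boxes                           # TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)    # harmonic mean of the two

print(f"FN={false_negatives} precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

Rounded to four decimals, these values agree with the Precision, Recall, and F1 rows of the table.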

Dataset

Sources: Book of Changes Vol. 1 (56 pages) + Mencius Vol. 1 (153 pages)
Annotations: 8,536 Manchu word boxes (YOLO format, normalized coordinates)
Split: 85% train (178 pages, 7,330 words) / 15% val (31 pages, 1,206 words), page-level, seed=42
Class: single class "man"; interlinear Chinese text is excluded from the annotations
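
A page-level split keeps all words from one page in the same subset, so the validation pages are never seen in training. The following is a minimal sketch of such a split, not the project's actual script: the function name split_pages, the page ID scheme, and the use of Python's random module are assumptions made for illustration. Only the page counts, the 15% validation fraction, and seed=42 come from the card.

```python
import random


def split_pages(page_ids, val_fraction=0.15, seed=42):
    """Deterministically shuffle page IDs and split them into (train, val)."""
    pages = sorted(page_ids)                 # fixed starting order for reproducibility
    random.Random(seed).shuffle(pages)       # seeded shuffle, independent of global state
    n_val = round(len(pages) * val_fraction)
    return pages[n_val:], pages[:n_val]


# Hypothetical page IDs: 56 Book of Changes pages and 153 Mencius pages.
page_ids = [f"changes_v1_{i:03d}" for i in range(1, 57)]
page_ids += [f"mencius_v1_{i:03d}" for i in range(1, 154)]

train_pages, val_pages = split_pages(page_ids)
print(len(train_pages), len(val_pages))
```

With 209 pages, a 15% validation fraction rounds to 31 pages, which leaves 178 for training and matches the counts listed above. The specific pages selected by this sketch will not necessarily match the original split.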

Usage

Load from Hugging Face

from huggingface_hub import hf_hub_download
from ultralytics import YOLO

weights = hf_hub_download(repo_id="mic7ch/yolo_fourbooks_v1", filename="best.pt")
model = YOLO(weights)

# Inference (use imgsz=1920 for best results)
results = model.predict("page.png", imgsz=1920, conf=0.60, iou=0.50)

Recommended inference settings

imgsz: 1920
conf: 0.60
iou (NMS): 0.50
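
For downstream OCR, the detected boxes usually need to be put into reading order. Manchu is written top to bottom in columns that run left to right, so one simple approach is to group boxes into columns by horizontal center and then sort each column by its top edge. The sketch below illustrates that idea. The helper reading_order and its gap_ratio parameter are hypothetical and not part of this model or of Ultralytics, and the grouping heuristic assumes reasonably straight, well-separated columns.

```python
from statistics import median


def reading_order(boxes, gap_ratio=0.5):
    """Order (x1, y1, x2, y2) boxes column by column, top to bottom within a column.

    A new column starts when a box's x-center is more than gap_ratio times the
    median box width to the right of the previous box's x-center.
    """
    if not boxes:
        return []
    width = median(b[2] - b[0] for b in boxes)
    by_x = sorted(boxes, key=lambda b: (b[0] + b[2]) / 2)
    columns = [[by_x[0]]]
    for box in by_x[1:]:
        last = columns[-1][-1]
        gap = (box[0] + box[2]) / 2 - (last[0] + last[2]) / 2
        if gap > gap_ratio * width:
            columns.append([box])        # far enough right: start a new column
        else:
            columns[-1].append(box)      # same column as the previous box
    return [b for col in columns for b in sorted(col, key=lambda b: b[1])]


# Synthetic example: two columns of two words each, given in scrambled order.
# With real predictions, the boxes could come from results[0].boxes.xyxy.tolist().
detections = [
    (298, 200, 342, 390),
    (102, 220, 138, 400),
    (300, 60, 340, 180),
    (100, 50, 140, 200),
]
ordered = reading_order(detections)
for box in ordered:
    print(box)
```

Skewed scans or columns that overlap horizontally would need a more robust grouping step, such as deskewing first or clustering on x-centers.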

Training setup (reference)

Epochs: 100 (early stopping patience=30)
Batch size: 8
Optimizer: SGD, lr0=1e-3, lrf=0.01, warmup_epochs=3
Augmentation: Mosaic 0.5; no flip/rotation (Manchu script is directional)
Framework: Ultralytics v8.3.240, PyTorch 2.9.1+cu128
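
The settings above map onto Ultralytics training arguments roughly as follows. This is a sketch of how such a run could be configured, not the exact command used for sweep_04. The dataset config name manchu_words.yaml is hypothetical, and setting fliplr, flipud, and degrees to zero is one way to express "no flip/rotation". Any other hyperparameters are assumed to stay at their Ultralytics defaults.

```python
# Training arguments reconstructed from the reference setup listed above.
TRAIN_ARGS = dict(
    data="manchu_words.yaml",  # hypothetical dataset config (paths, single class "man")
    imgsz=1920,                # training resolution
    epochs=100,
    patience=30,               # early-stopping patience in epochs
    batch=8,
    optimizer="SGD",
    lr0=1e-3,                  # initial learning rate
    lrf=0.01,                  # final learning rate as a fraction of lr0
    warmup_epochs=3,
    mosaic=0.5,                # mosaic augmentation probability
    fliplr=0.0,                # no horizontal flip (script is directional)
    flipud=0.0,                # no vertical flip
    degrees=0.0,               # no rotation
)


def train(args=TRAIN_ARGS):
    """Fine-tune COCO-pretrained YOLOv8s with the arguments above."""
    # Imported lazily so the configuration can be inspected without Ultralytics installed.
    from ultralytics import YOLO

    model = YOLO("yolov8s.pt")  # COCO-pretrained YOLOv8s checkpoint
    return model.train(**args)
```

Calling train() starts a run. At imgsz=1920 with batch size 8, GPU memory is likely to be the main constraint.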

Citation

If you use this model, please refer to the experiment report in the project repository (e.g. enhanced_training/experiment_report.txt) for full methodology, metrics, and recommendations.
