YOLO Fourbooks v1: Manchu Word Detection
Single-class YOLOv8s model for detecting Manchu words in historical bilingual manuscript page images (e.g. Qing-era Book of Changes and Mencius). Trained on high-resolution page scans with vertical Manchu script; optimized for downstream OCR and corpus digitization.
Model details
- Architecture: YOLOv8s (11.1M parameters), pretrained on COCO
- Task: Object detection, single class man (Manchu word)
- Input resolution: 1920 px (recommended; preserves fine stroke detail)
- Training run: sweep_04_yolov8s_coco_1920_m05 (mosaic=0.5)
Performance (validation)
| Metric | Value |
|---|---|
| mAP@0.50 | 0.9949 |
| mAP@0.50:0.95 | 0.9847 |
| Precision | 0.9942 |
| Recall | 0.9900 |
| F1 | 0.9921 |
| Match rate | 99.00% (1,194 / 1,206 GT boxes) |
| False positives | 7 |
| False negatives | 12 |
Recommended confidence threshold: 0.60 (best F1 on validation).
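The precision, recall, and F1 values in the table can be reproduced from the reported counts. The sketch below assumes the 1,194 matched ground-truth boxes are the true positives; the IoU threshold used for matching is not stated in this card.

```python
# Counts taken from the validation table above.
tp = 1194  # matched ground-truth boxes (assumed to equal true positives)
fp = 7     # false positives
fn = 12    # false negatives (1,206 GT boxes - 1,194 matched)

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.4f}")  # precision=0.9942
print(f"recall={recall:.4f}")        # recall=0.9900
print(f"f1={f1:.4f}")                # f1=0.9921
```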
Dataset
- Sources: Book of Changes Vol. 1 (56 pages) + Mencius Vol. 1 (153 pages)
- Annotations: 8,536 Manchu word boxes (YOLO format, normalized coordinates)
- Split: 85% train (178 pages, 7,330 words) / 15% val (31 pages, 1,206 words), page-level, seed=42
- Class: Single class man; Chinese interlinear text excluded
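A page-level split keeps all words from one page in the same subset, so the validation pages are never seen during training. The following is a minimal sketch of such a split, not the authors' actual script: the function name split_pages, the page identifiers, and the rounding rule are assumptions, chosen because they reproduce the 178 / 31 page counts reported above.

```python
import random


def split_pages(page_ids, val_fraction=0.15, seed=42):
    """Split page identifiers into (train, val) lists at page level."""
    pages = sorted(page_ids)             # fixed order before shuffling
    random.Random(seed).shuffle(pages)   # seeded, so the split is repeatable
    n_val = int(len(pages) * val_fraction)
    return pages[n_val:], pages[:n_val]


# Hypothetical identifiers for the 56 + 153 = 209 annotated pages.
page_ids = [f"page_{i:03d}" for i in range(209)]
train_pages, val_pages = split_pages(page_ids)
print(len(train_pages), len(val_pages))  # 178 31
```

Which pages land in each subset depends on the shuffle implementation, so this sketch will not necessarily select the same 31 validation pages as the original run.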
Usage
Load from Hugging Face
from huggingface_hub import hf_hub_download
from ultralytics import YOLO
weights = hf_hub_download(repo_id="mic7ch/yolo_fourbooks_v1", filename="best.pt")
model = YOLO(weights)
# Inference (use imgsz=1920 for best results)
results = model.predict("page.png", imgsz=1920, conf=0.60, iou=0.50)
Recommended inference settings
- imgsz: 1920
- conf: 0.60
- iou (NMS): 0.50
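To illustrate what the conf and iou settings control, here is a simplified, pure-Python sketch of confidence filtering followed by greedy non-maximum suppression. It is not the Ultralytics implementation; the names box_iou and filter_detections and the example boxes are hypothetical.

```python
def box_iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def filter_detections(dets, conf=0.60, iou_thr=0.50):
    """Drop low-confidence boxes, then suppress overlapping duplicates."""
    confident = [d for d in dets if d[1] >= conf]
    kept = []
    # Visit boxes from highest to lowest score; keep a box only if it does
    # not overlap an already kept box by more than iou_thr.
    for box, score in sorted(confident, key=lambda d: d[1], reverse=True):
        if all(box_iou(box, k[0]) <= iou_thr for k in kept):
            kept.append((box, score))
    return kept


# Made-up detections as (box, score) pairs in pixel coordinates.
dets = [
    ((0, 0, 100, 200), 0.90),
    ((10, 0, 110, 200), 0.80),   # IoU of about 0.82 with the first box
    ((200, 0, 300, 200), 0.70),
    ((400, 0, 500, 200), 0.30),  # below the confidence threshold
]
kept = filter_detections(dets)
print(len(kept))  # 2
```

Raising conf trades recall for precision, while lowering iou suppresses more aggressively, which may merge closely spaced words in dense columns.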
Training setup (reference)
- Epochs: 100 (early stopping patience=30)
- Batch size: 8
- Optimizer: SGD, lr0=1e-3, lrf=0.01, warmup_epochs=3
- Augmentation: Mosaic 0.5; no flip/rotation (Manchu script is directional)
- Framework: Ultralytics v8.3.240, PyTorch 2.9.1+cu128
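The settings above map onto Ultralytics training arguments roughly as follows. This is a sketch under assumptions, not the original training script: the dataset file manchu_words.yaml is a hypothetical name, imgsz=1920 is inferred from the run name, the flip and rotation values are set explicitly to reflect "no flip/rotation", and every argument not listed is left at its library default.

```python
TRAIN_ARGS = dict(
    data="manchu_words.yaml",  # hypothetical dataset config path
    imgsz=1920,                # assumed from the run name
    epochs=100,
    patience=30,               # early stopping patience
    batch=8,
    optimizer="SGD",
    lr0=1e-3,                  # initial learning rate
    lrf=0.01,                  # final LR as a fraction of lr0
    warmup_epochs=3,
    mosaic=0.5,
    fliplr=0.0,                # no horizontal flip
    flipud=0.0,                # no vertical flip
    degrees=0.0,               # no rotation
)

# With these values the schedule ends at lr0 * lrf, i.e. about 1e-5.
final_lr = TRAIN_ARGS["lr0"] * TRAIN_ARGS["lrf"]


def train(args=TRAIN_ARGS):
    """Start training from the COCO-pretrained YOLOv8s checkpoint."""
    from ultralytics import YOLO  # imported here so the config loads without it

    model = YOLO("yolov8s.pt")
    return model.train(**args)
```

Calling train() requires the ultralytics package and a dataset config in YOLO format.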
Citation
If you use this model, please refer to the experiment report in the project repository (e.g. enhanced_training/experiment_report.txt) for full methodology, metrics, and recommendations.