Image-to-Text
Transformers
Safetensors
vision-encoder-decoder
image-text-to-text
vision
ocr
trocr
handwriting-recognition
document-processing
Instructions to use WARAJA/Tzefa-Word-OCR-TrOCR with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use WARAJA/Tzefa-Word-OCR-TrOCR with Transformers:
# Use a pipeline as a high-level helper # Warning: Pipeline type "image-to-text" is no longer supported in transformers v5. # You must load the model directly (see below) or downgrade to v4.x with: # 'pip install "transformers<5.0.0' from transformers import pipeline pipe = pipeline("image-to-text", model="WARAJA/Tzefa-Word-OCR-TrOCR")# Load model directly from transformers import AutoTokenizer, AutoModelForImageTextToText tokenizer = AutoTokenizer.from_pretrained("WARAJA/Tzefa-Word-OCR-TrOCR") model = AutoModelForImageTextToText.from_pretrained("WARAJA/Tzefa-Word-OCR-TrOCR") - Notebooks
- Google Colab
- Kaggle
Tzefa Word OCR Model (Fine-tuned TrOCR)
Fine-tuned TrOCR model for recognizing individual handwritten Tzefa keywords.
Architecture
- Base model:
microsoft/trocr-small-stage1 - Fine-tuned on: Handwritten Tzefa keywords (uppercase Hebrew programming commands, number words, variable names)
- Framework: HuggingFace Transformers (
VisionEncoderDecoderModel)
Vocabulary
Tzefa keywords include:
- Commands: MAKEINTEGER, MAKESTR, MAKELIST, MAKEBOOL, PRINTINTEGER, PRINTSTR, ADD, SUBTRACT, MULTIPLY, DIVIDE, WHILETRUE, IFTRUE, etc.
- Number words: ZERO through ONEHUNDRED
- User-defined variable names
Usage
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
from PIL import Image
# use_fast=False is required to prevent tokenizer conversion crash
processor = TrOCRProcessor.from_pretrained("microsoft/trocr-small-stage1", use_fast=False)
model = VisionEncoderDecoderModel.from_pretrained("WARAJA/Tzefa-Word-OCR-TrOCR")
image = Image.open("word_crop.png").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)
text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(text)
Related
- Downloads last month
- 9