PP-LCNet_x1_0_doc_ori
Introduction
The Document Image Orientation Classification Module identifies the orientation of a document image so that it can be corrected in post-processing. During document scanning or ID photo capture, the capture device may be rotated to obtain a clearer image, so the resulting images can have different orientations. Standard OCR pipelines may not handle such images well. With image classification, the orientation of a document or ID containing text regions can be determined in advance and adjusted, which improves the accuracy of OCR processing. The key accuracy metrics are as follows:
| Model | Recognition Avg Accuracy (%) | Model Storage Size (MB) | Introduction |
|---|---|---|---|
| PP-LCNet_x1_0_doc_ori | 99.06 | 7 | A document image classification model based on PP-LCNet_x1_0, with four categories: 0°, 90°, 180°, and 270°. |
Model Usage
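import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification
# Load the model and its matching image processor from the Hugging Face Hub.
model_path = "PaddlePaddle/PP-LCNet_x1_0_doc_ori_safetensors"
model = AutoModelForImageClassification.from_pretrained(model_path, device_map="auto")
image_processor = AutoImageProcessor.from_pretrained(model_path)
# Download the demo image, fail early on HTTP errors, and force 3-channel RGB.
url = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/img_rot180_demo.jpg"
response = requests.get(url, stream=True, timeout=30)
response.raise_for_status()
image = Image.open(response.raw).convert("RGB")
# Preprocess, run inference without gradient tracking, and print the predicted label.
inputs = image_processor(images=image, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model(**inputs)
predicted_label = outputs.logits.argmax(-1).item()
print(model.config.id2label[predicted_label])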
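The introduction mentions correcting the orientation in post-processing. The following is a minimal sketch of that step, not the official implementation. The helper names `label_to_angle` and `correct_orientation` are hypothetical. It assumes that each label in `model.config.id2label` contains its angle as digits (for example "180"), and that rotating counter-clockwise by the predicted angle restores the upright image. Both assumptions should be checked against a few known samples.

```python
import re

# The four orientation classes listed in the model table.
VALID_ANGLES = (0, 90, 180, 270)


def label_to_angle(label):
    """Extract the angle in degrees from a label such as "180" or "180°".

    Assumption: the label text contains the angle as a run of digits.
    """
    match = re.search(r"\d+", str(label))
    if match is None:
        raise ValueError(f"no angle found in label {label!r}")
    angle = int(match.group()) % 360
    if angle not in VALID_ANGLES:
        raise ValueError(f"unexpected angle {angle} in label {label!r}")
    return angle


def correct_orientation(image, label):
    """Rotate a PIL image according to its predicted orientation label.

    Assumption: a counter-clockwise rotation by the predicted angle undoes
    the original rotation. If the output is still misoriented, try
    (360 - angle) % 360 instead.
    """
    angle = label_to_angle(label)
    if angle == 0:
        return image  # already upright, nothing to do
    # PIL's Image.rotate turns counter-clockwise; expand=True grows the
    # canvas so that 90 and 270 degree rotations are not cropped.
    return image.rotate(angle, expand=True)
```

With the variables from the usage code, this could be called as `upright = correct_orientation(image, model.config.id2label[predicted_label])` before passing `upright` to an OCR pipeline.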