Polish Twitter Emotion Classifier (ONNX INT8)
This is the INT8 quantized ONNX version of yazoniak/twitter-emotion-pl-classifier.
This model is an INT8 dynamically quantized ONNX export of the Polish Twitter Emotion Classifier. It runs 2.9x faster than ONNX FP32 and 9.9x faster than PyTorch, and is 51% smaller than the original, making it well suited to edge deployment and cost-sensitive applications.
Quick Links
- 🔗 Original PyTorch Model: yazoniak/twitter-emotion-pl-classifier
- 🔷 ONNX FP32 Version: yazoniak/twitter-emotion-pl-classifier-onnx (2x faster)
- 📊 Dataset: yazoniak/TwitterEmo-PL-Refined
Model Description
This model predicts 8 emotion and sentiment labels simultaneously for Polish text:
- Emotions: radość (joy), wstręt (disgust), gniew (anger), przeczuwanie (anticipation)
- Sentiment: pozytywny (positive), negatywny (negative), neutralny (neutral)
- Special: sarkazm (sarcasm)
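Multi-label means each of the 8 labels gets an independent sigmoid probability, so a tweet can be, for example, both radość and pozytywny at once. A minimal sketch of the decoding step, using made-up logits (not real model output):

```python
import numpy as np

labels = ["radość", "wstręt", "gniew", "przeczuwanie",
          "pozytywny", "negatywny", "neutralny", "sarkazm"]

# Made-up logits for illustration; the real model emits one logit per label
logits = np.array([3.1, -4.0, -3.5, -1.2, 3.6, -4.2, -2.8, -3.0])

probs = 1 / (1 + np.exp(-logits))  # independent sigmoid per label
predicted = {l: float(p) for l, p in zip(labels, probs) if p > 0.5}
print(sorted(predicted))
```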
Model Details
| Attribute | Value |
|---|---|
| Base Model | PKOBP/polish-roberta-8k |
| Original Model | yazoniak/twitter-emotion-pl-classifier |
| Architecture | RoBERTa for Sequence Classification |
| Task | Multi-label text classification |
| Language | Polish |
| Format | ONNX (INT8 Dynamic Quantization) |
| Quantization | AVX512_VNNI |
| ONNX Opset | 18 |
| Model Size | 825 MB (51% smaller than FP32) |
| License | GPL-3.0 |
Performance
Benchmark Results
| Metric | PyTorch | ONNX FP32 | ONNX INT8 | INT8 Improvement |
|---|---|---|---|---|
| Mean Latency (CPU) | 193.89 ms | 56.79 ms | 19.48 ms | 9.95x faster vs PyTorch |
| P95 Latency | 350.16 ms | 59.63 ms | 23.05 ms | 15.19x faster |
| Throughput | 5.16/sec | 17.61/sec | 51.33/sec | 9.95x higher |
| Std Deviation | 56.04 ms | 2.78 ms | 1.83 ms | Most consistent |
| Model Size | 1,690 MB | 1,690 MB | 825 MB | 51% smaller |
Summary:
- ⚡ 2.9x faster than ONNX FP32
- 🚀 9.9x faster than PyTorch
- 📦 51% smaller than FP32
- 🎯 Most consistent latency (std: 1.83ms)
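The latency figures above can be sanity-checked with a simple wall-clock loop. A minimal sketch, with a stand-in function where the real tokenizer + ONNX inference call would go:

```python
import time
import statistics

def predict(text):
    # Stand-in for the real model call; replace with tokenizer + model inference
    time.sleep(0.001)

latencies_ms = []
for _ in range(50):
    t0 = time.perf_counter()
    predict("przykładowy tweet")
    latencies_ms.append((time.perf_counter() - t0) * 1000)

mean_ms = statistics.mean(latencies_ms)
p95_ms = sorted(latencies_ms)[int(0.95 * len(latencies_ms))]
print(f"mean={mean_ms:.2f} ms  p95={p95_ms:.2f} ms  throughput={1000 / mean_ms:.1f}/s")
```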
Accuracy (100-sample evaluation)
Evaluated on 100 random samples from TwitterEmo-PL-Refined (seed=42, threshold=0.5).
ONNX FP32 produces identical predictions to PyTorch (100% agreement, prob diff ~10⁻⁸). All differences below are INT8 vs PyTorch.
INT8 Quality Difference (vs PyTorch baseline)
| Metric | INT8 Δ | Note |
|---|---|---|
| F1 Macro | −0.0054 | Negligible drop |
| Precision Macro | +0.0159 | Slightly better |
| Recall Macro | −0.0140 | Slightly lower |
| Exact Match | +1.00 pp | 52% vs 51% |
Per-Label F1 Difference (INT8 − PyTorch)
| Label | F1 Δ |
|---|---|
| radość | 0.0000 |
| wstręt | −0.0075 |
| gniew | −0.0197 |
| przeczuwanie | −0.0506 |
| pozytywny | +0.0250 |
| negatywny | 0.0000 |
| neutralny | +0.0152 |
INT8 vs PyTorch Agreement
| Metric | Value |
|---|---|
| Label Agreement (mean) | 98.57% |
| Exact Match | 92.00% |
| Mean Prob Diff | 0.0154 |
| Median Prob Diff | 0.0032 |
| Max Prob Diff | 0.3546 |
Key takeaway: INT8 quantization maintains 98.6% label agreement with a negligible F1 macro drop of just 0.005 — well within noise for a 100-sample evaluation.
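The agreement metrics above can be computed directly from the two models' output probabilities. A sketch with made-up numbers (2 samples x 3 labels) rather than real model output:

```python
import numpy as np

# Hypothetical per-sample probabilities from the two model variants
probs_fp32 = np.array([[0.96, 0.10, 0.72], [0.30, 0.80, 0.05]])
probs_int8 = np.array([[0.94, 0.12, 0.70], [0.55, 0.78, 0.06]])

preds_fp32 = probs_fp32 > 0.5
preds_int8 = probs_int8 > 0.5

label_agreement = (preds_fp32 == preds_int8).mean()          # fraction of matching labels
exact_match = (preds_fp32 == preds_int8).all(axis=1).mean()  # all labels identical per sample
prob_diff = np.abs(probs_fp32 - probs_int8)
print(label_agreement, exact_match, prob_diff.mean(), prob_diff.max())
```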
For reference, the original model accuracy on the full validation set:
- F1 Macro: 0.8500
- F1 Micro: 0.8900
- F1 Weighted: 0.8895
See the original model card for detailed metrics.
Installation
```bash
pip install "optimum[onnxruntime]" transformers numpy
```
For GPU support (though INT8 is optimized for CPU):
```bash
pip install "optimum[onnxruntime-gpu]" transformers numpy
```
Usage
Quick Start (Command Line)
```bash
# Download the inference scripts
wget https://huggingface.co/yazoniak/twitter-emotion-pl-classifier-onnx-int8/resolve/main/predict.py
wget https://huggingface.co/yazoniak/twitter-emotion-pl-classifier-onnx-int8/resolve/main/predict_calibrated.py

# Basic inference
python predict.py "Wspaniały dzień! Jestem bardzo szczęśliwy :)"

# Calibrated inference (recommended for best accuracy)
python predict_calibrated.py "Wspaniały dzień! Jestem bardzo szczęśliwy :)"
```
Python API - Basic Inference
```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer
import numpy as np
import re

# Load model and tokenizer
model_name = "yazoniak/twitter-emotion-pl-classifier-onnx-int8"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = ORTModelForSequenceClassification.from_pretrained(
    model_name,
    provider="CPUExecutionProvider",  # INT8 is optimized for CPU
)

# Preprocess text (anonymize @mentions - IMPORTANT!)
def preprocess_text(text):
    return re.sub(r"@\w+", "@anonymized_account", text)

text = "@user To jest wspaniały dzień!"
processed_text = preprocess_text(text)

# Tokenize and run inference
inputs = tokenizer(processed_text, return_tensors="pt", truncation=True, max_length=8192)
outputs = model(**inputs)

# Get probabilities (sigmoid for multi-label)
logits = outputs.logits.squeeze().numpy()
probabilities = 1 / (1 + np.exp(-logits))

# Get labels above threshold
labels = [model.config.id2label[i] for i in range(model.config.num_labels)]
threshold = 0.5
predictions = {labels[i]: float(probabilities[i])
               for i in range(len(labels)) if probabilities[i] > threshold}

print(predictions)
# Output: {'radość': 0.9573, 'pozytywny': 0.9721}
```
Python API - Calibrated Inference (Recommended)
For improved accuracy, use temperature scaling and optimal thresholds:
```python
import json
from huggingface_hub import hf_hub_download

# Download calibration artifacts
calib_path = hf_hub_download(
    repo_id="yazoniak/twitter-emotion-pl-classifier-onnx-int8",
    filename="calibration_artifacts.json",
)
with open(calib_path) as f:
    calib = json.load(f)

temperatures = calib["temperatures"]
optimal_thresholds = calib["optimal_thresholds"]

# Apply temperature scaling and optimal thresholds
# (uses `logits` and `labels` from the basic inference example above)
calibrated_probs = {}
for i, label in enumerate(labels):
    temp = temperatures[label]
    thresh = optimal_thresholds[label]
    # Temperature scaling
    calibrated_logit = logits[i] / temp
    prob = 1 / (1 + np.exp(-calibrated_logit))
    if prob > thresh:
        calibrated_probs[label] = float(prob)

print(calibrated_probs)
```
When to Use This Model
Use ONNX INT8 when:
- ✅ Edge deployment - Mobile, embedded devices
- ✅ Cost-sensitive - Reduce cloud inference costs
- ✅ High throughput - Need 50+ predictions/sec on CPU
- ✅ Limited storage - 825 MB vs 1.7 GB
- ✅ CPU inference - AVX512-optimized for modern Intel/AMD CPUs
- ✅ Slight accuracy loss acceptable
Consider alternatives:
- ONNX FP32: For full precision (same accuracy as PyTorch)
- Original PyTorch: For fine-tuning or GPU training
Important Notes
Text Preprocessing
⚠️ The model expects @mentions to be anonymized!
The model was trained with anonymized Twitter mentions. Always preprocess text:
```python
import re

text = re.sub(r"@\w+", "@anonymized_account", text)
```
The provided scripts (predict.py, predict_calibrated.py) handle this automatically.
Calibration
For best accuracy, use calibrated inference with:
- Temperature scaling (per-label)
- Optimized thresholds (per-label)
See predict_calibrated.py or the calibrated inference example above.
CPU Optimization
This INT8 model is optimized for CPUs with AVX512_VNNI support. It will work on other CPUs but may not achieve the same speedup.
Limitations
- Slight accuracy loss: Some predictions may differ from FP32 (usually minor probability differences)
- Twitter-specific: Optimized for informal Polish social media text
- Sarcasm detection: Lower performance - inherently difficult
- Context length: Optimal for tweet-length texts (up to 8,192 tokens)
- Formal text: May not generalize well to news or academic writing
For detailed limitations, see the original model card.
Files in This Repository
| File | Size | Description |
|---|---|---|
| `model.onnx` | 825 MB | INT8 quantized ONNX model |
| `config.json` | 2 KB | Model configuration |
| `tokenizer.json` | 8.2 MB | Tokenizer vocabulary |
| `tokenizer_config.json` | 12 KB | Tokenizer settings |
| `calibration_artifacts.json` | 1 KB | Temperature scaling & optimal thresholds |
| `predict.py` | 4 KB | Simple inference script |
| `predict_calibrated.py` | 5 KB | Calibrated inference script (recommended) |
Citation
```bibtex
@misc{yazoniak2025twitteremotionpl,
  title={Polish Twitter Emotion Classifier (RoBERTa-8k)},
  author={yazoniak},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/yazoniak/twitter-emotion-pl-classifier}
}
```
Also cite the dataset and the original TwitterEmo paper:
```bibtex
@dataset{yazoniak_twitteremo_pl_refined_2025,
  title={TwitterEmo-PL-Refined: Polish Twitter Emotions (8 labels, refined)},
  author={yazoniak},
  year={2025},
  url={https://huggingface.co/datasets/yazoniak/TwitterEmo-PL-Refined}
}

@inproceedings{bogdanowicz2023twitteremo,
  title={TwitterEmo: Annotating Emotions and Sentiment in Polish Twitter},
  author={Bogdanowicz, S. and Cwynar, H. and Zwierzchowska, A. and Klamra, C. and Kiera{\'s}, W. and Kobyli{\'n}ski, {\L}.},
  booktitle={Computational Science -- ICCS 2023},
  series={Lecture Notes in Computer Science},
  volume={14074},
  publisher={Springer, Cham},
  year={2023},
  doi={10.1007/978-3-031-36021-3_20}
}
```
License
This model is released under the GNU General Public License v3.0 (GPL-3.0), inherited from the training dataset.
License Chain:
- Base Model (PKOBP/polish-roberta-8k): Apache-2.0
- Training Dataset (TwitterEmo-PL-Refined): GPL-3.0
- Original Model (yazoniak/twitter-emotion-pl-classifier): GPL-3.0
- This ONNX INT8 Model: GPL-3.0
Acknowledgments
- Original Model: yazoniak/twitter-emotion-pl-classifier
- Base Model: PKOBP/polish-roberta-8k
- Dataset: CLARIN-PL TwitterEmo
- Conversion & Quantization: Hugging Face Optimum
Model Version: v1.0-onnx-int8
Last Updated: 2026-01-29