Polish Twitter Emotion Classifier (ONNX INT8)
This is the INT8 quantized ONNX version of yazoniak/twitter-emotion-pl-classifier.
This model is an INT8 dynamically quantized ONNX export of the Polish Twitter Emotion Classifier. It runs 2.9x faster than ONNX FP32 and 9.9x faster than PyTorch, and is 51% smaller than the original, making it well suited to edge deployment and cost-sensitive applications.
Quick Links
- 🔗 Original PyTorch Model: yazoniak/twitter-emotion-pl-classifier
- 🔷 ONNX FP32 Version: yazoniak/twitter-emotion-pl-classifier-onnx (2x faster)
- 📊 Dataset: yazoniak/TwitterEmo-PL-Refined
Model Description
This model predicts 8 emotion and sentiment labels simultaneously for Polish text:
- Emotions: radość (joy), wstręt (disgust), gniew (anger), przeczuwanie (anticipation)
- Sentiment: pozytywny (positive), negatywny (negative), neutralny (neutral)
- Special: sarkazm (sarcasm)
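Multi-label means each of the 8 labels gets an independent sigmoid probability, so a tweet can be, for example, both radość and pozytywny at once. A minimal sketch of the decoding step, using made-up logits (not real model output):

```python
import numpy as np

labels = ["radość", "wstręt", "gniew", "przeczuwanie",
          "pozytywny", "negatywny", "neutralny", "sarkazm"]

# Made-up logits for illustration; the real model emits one logit per label
logits = np.array([3.1, -4.0, -3.5, -1.2, 3.6, -4.2, -2.8, -3.0])

probs = 1 / (1 + np.exp(-logits))  # independent sigmoid per label
predicted = {l: float(p) for l, p in zip(labels, probs) if p > 0.5}
print(sorted(predicted))
```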
Model Details
| Attribute | Value |
|---|---|
| Base Model | PKOBP/polish-roberta-8k |
| Original Model | yazoniak/twitter-emotion-pl-classifier |
| Architecture | RoBERTa for Sequence Classification |
| Task | Multi-label text classification |
| Language | Polish |
| Format | ONNX (INT8 Dynamic Quantization) |
| Quantization | AVX512_VNNI |
| ONNX Opset | 18 |
| Model Size | 825 MB (51% smaller than FP32) |
| License | GPL-3.0 |
Performance
Benchmark Results
| Metric | PyTorch | ONNX FP32 | ONNX INT8 | INT8 Improvement |
|---|---|---|---|---|
| Mean Latency (CPU) | 193.89 ms | 56.79 ms | 19.48 ms | 9.95x faster vs PyTorch |
| P95 Latency | 350.16 ms | 59.63 ms | 23.05 ms | 15.19x faster |
| Throughput | 5.16/sec | 17.61/sec | 51.33/sec | 9.95x higher |
| Std Deviation | 56.04 ms | 2.78 ms | 1.83 ms | Most consistent |
| Model Size | 1,690 MB | 1,690 MB | 825 MB | 51% smaller |
Summary:
- ⚡ 2.9x faster than ONNX FP32
- 🚀 9.9x faster than PyTorch
- 📦 51% smaller than FP32
- 🎯 Most consistent latency (std: 1.83ms)
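The latency figures above can be sanity-checked with a simple wall-clock loop. A minimal sketch, with a stand-in function where the real tokenizer + ONNX inference call would go:

```python
import time
import statistics

def predict(text):
    # Stand-in for the real model call; replace with tokenizer + model inference
    time.sleep(0.001)

latencies_ms = []
for _ in range(50):
    t0 = time.perf_counter()
    predict("przykładowy tweet")
    latencies_ms.append((time.perf_counter() - t0) * 1000)

mean_ms = statistics.mean(latencies_ms)
p95_ms = sorted(latencies_ms)[int(0.95 * len(latencies_ms))]
print(f"mean={mean_ms:.2f} ms  p95={p95_ms:.2f} ms  throughput={1000 / mean_ms:.1f}/s")
```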
Accuracy (100-sample evaluation)
Evaluated on 100 random samples from TwitterEmo-PL-Refined (seed=42, threshold=0.5).
ONNX FP32 produces identical predictions to PyTorch (100% agreement, prob diff ~10⁻⁸). All differences below are INT8 vs PyTorch.
INT8 Quality Difference (vs PyTorch baseline)
| Metric | INT8 Δ | Note |
|---|---|---|
| F1 Macro | −0.0054 | Negligible drop |
| Precision Macro | +0.0159 | Slightly better |
| Recall Macro | −0.0140 | Slightly lower |
| Exact Match | +1.00 pp | 52% vs 51% |
Per-Label F1 Difference (INT8 − PyTorch)
| Label | F1 Δ |
|---|---|
| radość | 0.0000 |
| wstręt | −0.0075 |
| gniew | −0.0197 |
| przeczuwanie | −0.0506 |
| pozytywny | +0.0250 |
| negatywny | 0.0000 |
| neutralny | +0.0152 |
INT8 vs PyTorch Agreement
| Metric | Value |
|---|---|
| Label Agreement (mean) | 98.57% |
| Exact Match | 92.00% |
| Mean Prob Diff | 0.0154 |
| Median Prob Diff | 0.0032 |
| Max Prob Diff | 0.3546 |
Key takeaway: INT8 quantization maintains 98.6% label agreement with a negligible F1 macro drop of just 0.005 — well within noise for a 100-sample evaluation.
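The agreement metrics above can be computed directly from the two models' output probabilities. A sketch with made-up numbers (2 samples x 3 labels) rather than real model output:

```python
import numpy as np

# Hypothetical per-sample probabilities from the two model variants
probs_fp32 = np.array([[0.96, 0.10, 0.72], [0.30, 0.80, 0.05]])
probs_int8 = np.array([[0.94, 0.12, 0.70], [0.55, 0.78, 0.06]])

preds_fp32 = probs_fp32 > 0.5
preds_int8 = probs_int8 > 0.5

label_agreement = (preds_fp32 == preds_int8).mean()          # fraction of matching labels
exact_match = (preds_fp32 == preds_int8).all(axis=1).mean()  # all labels identical per sample
prob_diff = np.abs(probs_fp32 - probs_int8)
print(label_agreement, exact_match, prob_diff.mean(), prob_diff.max())
```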
For reference, the original model accuracy on the full validation set:
- F1 Macro: 0.8500
- F1 Micro: 0.8900
- F1 Weighted: 0.8895
See the original model card for detailed metrics.
Installation
```bash
pip install "optimum[onnxruntime]" transformers numpy
```
For GPU support (though INT8 is optimized for CPU):
```bash
pip install "optimum[onnxruntime-gpu]" transformers numpy
```
Usage
Quick Start (Command Line)
```bash
# Download the inference scripts
wget https://huggingface.co/yazoniak/twitter-emotion-pl-classifier-onnx-int8/resolve/main/predict.py
wget https://huggingface.co/yazoniak/twitter-emotion-pl-classifier-onnx-int8/resolve/main/predict_calibrated.py

# Basic inference
python predict.py "Wspaniały dzień! Jestem bardzo szczęśliwy :)"

# Calibrated inference (recommended for best accuracy)
python predict_calibrated.py "Wspaniały dzień! Jestem bardzo szczęśliwy :)"
```
Python API - Basic Inference
```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer
import numpy as np
import re

# Load model and tokenizer
model_name = "yazoniak/twitter-emotion-pl-classifier-onnx-int8"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = ORTModelForSequenceClassification.from_pretrained(
    model_name,
    provider="CPUExecutionProvider",  # INT8 is optimized for CPU
)

# Preprocess text (anonymize @mentions - IMPORTANT!)
def preprocess_text(text):
    return re.sub(r"@\w+", "@anonymized_account", text)

text = "@user To jest wspaniały dzień!"
processed_text = preprocess_text(text)

# Tokenize and run inference
inputs = tokenizer(processed_text, return_tensors="pt", truncation=True, max_length=8192)
outputs = model(**inputs)

# Get probabilities (sigmoid for multi-label)
logits = outputs.logits.squeeze().numpy()
probabilities = 1 / (1 + np.exp(-logits))

# Get labels above threshold
labels = [model.config.id2label[i] for i in range(model.config.num_labels)]
threshold = 0.5
predictions = {labels[i]: float(probabilities[i])
               for i in range(len(labels)) if probabilities[i] > threshold}

print(predictions)
# Output: {'radość': 0.9573, 'pozytywny': 0.9721}
```
Python API - Calibrated Inference (Recommended)
For improved accuracy, use temperature scaling and optimal thresholds:
```python
import json
from huggingface_hub import hf_hub_download

# Download calibration artifacts
calib_path = hf_hub_download(
    repo_id="yazoniak/twitter-emotion-pl-classifier-onnx-int8",
    filename="calibration_artifacts.json",
)
with open(calib_path) as f:
    calib = json.load(f)

temperatures = calib["temperatures"]
optimal_thresholds = calib["optimal_thresholds"]

# Apply temperature scaling and optimal thresholds
# (uses `logits` and `labels` from the basic inference example above)
calibrated_probs = {}
for i, label in enumerate(labels):
    temp = temperatures[label]
    thresh = optimal_thresholds[label]
    # Temperature scaling
    calibrated_logit = logits[i] / temp
    prob = 1 / (1 + np.exp(-calibrated_logit))
    if prob > thresh:
        calibrated_probs[label] = float(prob)

print(calibrated_probs)
```
When to Use This Model
Use ONNX INT8 when:
- ✅ Edge deployment - Mobile, embedded devices
- ✅ Cost-sensitive - Reduce cloud inference costs
- ✅ High throughput - Need 50+ predictions/sec on CPU
- ✅ Limited storage - 825 MB vs 1.7 GB
- ✅ CPU inference - AVX512-optimized for modern Intel/AMD CPUs
- ✅ Slight accuracy loss acceptable
Consider alternatives:
- ONNX FP32: For full precision (same accuracy as PyTorch)
- Original PyTorch: For fine-tuning or GPU training
Important Notes
Text Preprocessing
⚠️ The model expects @mentions to be anonymized!
The model was trained with anonymized Twitter mentions. Always preprocess text:
```python
import re

text = re.sub(r"@\w+", "@anonymized_account", text)
```
The provided scripts (predict.py, predict_calibrated.py) handle this automatically.
Calibration
For best accuracy, use calibrated inference with:
- Temperature scaling (per-label)
- Optimized thresholds (per-label)
See predict_calibrated.py or the calibrated inference example above.
CPU Optimization
This INT8 model is optimized for CPUs with AVX512_VNNI support. It will work on other CPUs but may not achieve the same speedup.
Limitations
- Slight accuracy loss: Some predictions may differ from FP32 (usually minor probability differences)
- Twitter-specific: Optimized for informal Polish social media text
- Sarcasm detection: Lower performance - inherently difficult
- Context length: Optimal for tweet-length texts (up to 8,192 tokens)
- Formal text: May not generalize well to news or academic writing
For detailed limitations, see the original model card.
Files in This Repository
| File | Size | Description |
|---|---|---|
| `model.onnx` | 825 MB | INT8 quantized ONNX model |
| `config.json` | 2 KB | Model configuration |
| `tokenizer.json` | 8.2 MB | Tokenizer vocabulary |
| `tokenizer_config.json` | 12 KB | Tokenizer settings |
| `calibration_artifacts.json` | 1 KB | Temperature scaling & optimal thresholds |
| `predict.py` | 4 KB | Simple inference script |
| `predict_calibrated.py` | 5 KB | Calibrated inference script (recommended) |
Citation
```bibtex
@misc{yazoniak2025twitteremotionpl,
  title={Polish Twitter Emotion Classifier (RoBERTa-8k)},
  author={yazoniak},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/yazoniak/twitter-emotion-pl-classifier}
}
```
Also cite the dataset and the original TwitterEmo paper:
```bibtex
@dataset{yazoniak_twitteremo_pl_refined_2025,
  title={TwitterEmo-PL-Refined: Polish Twitter Emotions (8 labels, refined)},
  author={yazoniak},
  year={2025},
  url={https://huggingface.co/datasets/yazoniak/TwitterEmo-PL-Refined}
}

@inproceedings{bogdanowicz2023twitteremo,
  title={TwitterEmo: Annotating Emotions and Sentiment in Polish Twitter},
  author={Bogdanowicz, S. and Cwynar, H. and Zwierzchowska, A. and Klamra, C. and Kiera{\'s}, W. and Kobyli{\'n}ski, {\L}.},
  booktitle={Computational Science -- ICCS 2023},
  series={Lecture Notes in Computer Science},
  volume={14074},
  publisher={Springer, Cham},
  year={2023},
  doi={10.1007/978-3-031-36021-3_20}
}
```
License
This model is released under the GNU General Public License v3.0 (GPL-3.0), inherited from the training dataset.
License Chain:
- Base Model (PKOBP/polish-roberta-8k): Apache-2.0
- Training Dataset (TwitterEmo-PL-Refined): GPL-3.0
- Original Model (yazoniak/twitter-emotion-pl-classifier): GPL-3.0
- This ONNX INT8 Model: GPL-3.0
Acknowledgments
- Original Model: yazoniak/twitter-emotion-pl-classifier
- Base Model: PKOBP/polish-roberta-8k
- Dataset: CLARIN-PL TwitterEmo
- Conversion & Quantization: Hugging Face Optimum
Model Version: v1.0-onnx-int8
Last Updated: 2026-01-29