Polish Twitter Emotion Classifier (ONNX INT8)

This is the INT8 dynamically quantized ONNX version of yazoniak/twitter-emotion-pl-classifier, the Polish Twitter Emotion Classifier, optimized for CPU inference: 2.9x faster than ONNX FP32, 9.9x faster than PyTorch, and 51% smaller than the original. Well suited to edge deployment and cost-sensitive applications.

Model Description

This model predicts 8 emotion and sentiment labels simultaneously for Polish text:

  • Emotions: radość (joy), wstręt (disgust), gniew (anger), przeczuwanie (anticipation)
  • Sentiment: pozytywny (positive), negatywny (negative), neutralny (neutral)
  • Special: sarkazm (sarcasm)

Model Details

| Attribute | Value |
|---|---|
| Base Model | PKOBP/polish-roberta-8k |
| Original Model | yazoniak/twitter-emotion-pl-classifier |
| Architecture | RoBERTa for Sequence Classification |
| Task | Multi-label text classification |
| Language | Polish |
| Format | ONNX (INT8 Dynamic Quantization) |
| Quantization | AVX512_VNNI |
| ONNX Opset | 18 |
| Model Size | 825 MB (51% smaller than FP32) |
| License | GPL-3.0 |

Performance

Benchmark Results

| Metric | PyTorch | ONNX FP32 | ONNX INT8 | INT8 Improvement |
|---|---|---|---|---|
| Mean Latency (CPU) | 193.89 ms | 56.79 ms | 19.48 ms | 9.95x faster vs PyTorch |
| P95 Latency | 350.16 ms | 59.63 ms | 23.05 ms | 15.19x faster |
| Throughput | 5.16/sec | 17.61/sec | 51.33/sec | 9.95x higher |
| Std Deviation | 56.04 ms | 2.78 ms | 1.83 ms | Most consistent |
| Model Size | 1,690 MB | 1,690 MB | 825 MB | 51% smaller |

Summary:

  • ⚡ 2.9x faster than ONNX FP32
  • 🚀 9.9x faster than PyTorch
  • 📦 51% smaller than FP32
  • 🎯 Most consistent latency (std: 1.83ms)

Accuracy (100-sample evaluation)

Evaluated on 100 random samples from TwitterEmo-PL-Refined (seed=42, threshold=0.5).

ONNX FP32 produces identical predictions to PyTorch (100% agreement, prob diff ~10⁻⁸). All differences below are INT8 vs PyTorch.

INT8 Quality Difference (vs PyTorch baseline)

| Metric | INT8 Δ | Note |
|---|---|---|
| F1 Macro | −0.0054 | Negligible drop |
| Precision Macro | +0.0159 | Slightly better |
| Recall Macro | −0.0140 | Slightly lower |
| Exact Match | +1.00 pp | 52% vs 51% |

Per-Label F1 Difference (INT8 − PyTorch)

| Label | F1 Δ |
|---|---|
| radość | 0.0000 |
| wstręt | −0.0075 |
| gniew | −0.0197 |
| przeczuwanie | −0.0506 |
| pozytywny | +0.0250 |
| negatywny | 0.0000 |
| neutralny | +0.0152 |

INT8 vs PyTorch Agreement

| Metric | Value |
|---|---|
| Label Agreement (mean) | 98.57% |
| Exact Match | 92.00% |
| Mean Prob Diff | 0.0154 |
| Median Prob Diff | 0.0032 |
| Max Prob Diff | 0.3546 |

Key takeaway: INT8 quantization maintains 98.6% label agreement with a negligible F1 macro drop of just 0.005 — well within noise for a 100-sample evaluation.
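Agreement numbers like these are straightforward to reproduce for your own runs. A minimal sketch with NumPy, assuming you have binary prediction matrices from the two runtimes (`agreement_metrics` and the toy arrays below are illustrative, not the actual evaluation outputs):

```python
import numpy as np

def agreement_metrics(preds_a, preds_b):
    """Mean per-label agreement and exact-match rate between two
    binary prediction matrices of shape (n_samples, n_labels)."""
    preds_a = np.asarray(preds_a, dtype=bool)
    preds_b = np.asarray(preds_b, dtype=bool)
    label_agreement = (preds_a == preds_b).mean()          # fraction of matching label decisions
    exact_match = (preds_a == preds_b).all(axis=1).mean()  # fraction of rows matching on all labels
    return label_agreement, exact_match

# Toy example: 4 samples x 3 labels, one disagreeing label decision
pytorch_preds = [[1, 0, 1], [0, 0, 1], [1, 1, 0], [0, 0, 0]]
int8_preds    = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 0]]
la, em = agreement_metrics(pytorch_preds, int8_preds)
print(f"label agreement: {la:.4f}, exact match: {em:.2f}")
# → label agreement: 0.9167, exact match: 0.75
```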

For reference, the original model accuracy on the full validation set:

  • F1 Macro: 0.8500
  • F1 Micro: 0.8900
  • F1 Weighted: 0.8895

See the original model card for detailed metrics.

Installation

pip install "optimum[onnxruntime]" transformers numpy

For GPU support (though INT8 is optimized for CPU):

pip install "optimum[onnxruntime-gpu]" transformers numpy

Usage

Quick Start (Command Line)

# Download the inference scripts
wget https://huggingface.co/yazoniak/twitter-emotion-pl-classifier-onnx-int8/resolve/main/predict.py
wget https://huggingface.co/yazoniak/twitter-emotion-pl-classifier-onnx-int8/resolve/main/predict_calibrated.py

# Basic inference
python predict.py "Wspaniały dzień! Jestem bardzo szczęśliwy :)"

# Calibrated inference (recommended for best accuracy)
python predict_calibrated.py "Wspaniały dzień! Jestem bardzo szczęśliwy :)"

Python API - Basic Inference

from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer
import numpy as np
import re

# Load model and tokenizer
model_name = "yazoniak/twitter-emotion-pl-classifier-onnx-int8"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = ORTModelForSequenceClassification.from_pretrained(
    model_name,
    provider="CPUExecutionProvider"  # INT8 is optimized for CPU
)

# Preprocess text (anonymize @mentions - IMPORTANT!)
def preprocess_text(text):
    return re.sub(r"@\w+", "@anonymized_account", text)

text = "@user To jest wspaniały dzień!"
processed_text = preprocess_text(text)

# Tokenize and run inference
inputs = tokenizer(processed_text, return_tensors="pt", truncation=True, max_length=8192)
outputs = model(**inputs)

# Get probabilities (sigmoid for multi-label)
logits = outputs.logits.squeeze().numpy()
probabilities = 1 / (1 + np.exp(-logits))

# Get labels above threshold
labels = [model.config.id2label[i] for i in range(model.config.num_labels)]
threshold = 0.5
predictions = {labels[i]: float(probabilities[i]) 
               for i in range(len(labels)) if probabilities[i] > threshold}

print(predictions)
# Output: {'radość': 0.9573, 'pozytywny': 0.9721}

Python API - Calibrated Inference (Recommended)

For improved accuracy, use temperature scaling and optimal thresholds:

import json
from huggingface_hub import hf_hub_download

# Download calibration artifacts
calib_path = hf_hub_download(
    repo_id="yazoniak/twitter-emotion-pl-classifier-onnx-int8",
    filename="calibration_artifacts.json"
)

with open(calib_path) as f:
    calib = json.load(f)

temperatures = calib["temperatures"]
optimal_thresholds = calib["optimal_thresholds"]

# Apply per-label temperature scaling and optimal thresholds
# (reuses `labels` and `logits` from the basic inference example above)
calibrated_probs = {}
for i, label in enumerate(labels):
    temp = temperatures[label]
    thresh = optimal_thresholds[label]
    
    # Temperature scaling
    calibrated_logit = logits[i] / temp
    prob = 1 / (1 + np.exp(-calibrated_logit))
    
    if prob > thresh:
        calibrated_probs[label] = float(prob)

print(calibrated_probs)

When to Use This Model

Use ONNX INT8 when:

  • Edge deployment - Mobile, embedded devices
  • Cost-sensitive - Reduce cloud inference costs
  • High throughput - Need 50+ predictions/sec on CPU
  • Limited storage - 825 MB vs 1.7 GB
  • CPU inference - AVX512-optimized for modern Intel/AMD CPUs
  • Slight accuracy loss acceptable

Consider alternatives:

  • ONNX FP32: For full precision (same accuracy as PyTorch)
  • Original PyTorch: For fine-tuning or GPU training
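The throughput figures above are easy to sanity-check on your own hardware. A minimal latency harness (mean, p95, throughput over repeated calls); the `lambda` workload here is a placeholder for your own tokenize-and-predict call:

```python
import time
import numpy as np

def benchmark(fn, n_warmup=3, n_runs=50):
    """Time `fn` over n_runs calls; report mean/p95 latency (ms) and throughput."""
    for _ in range(n_warmup):  # warm-up calls are excluded from the stats
        fn()
    latencies = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1000.0)  # ms
    lat = np.array(latencies)
    return {
        "mean_ms": float(lat.mean()),
        "p95_ms": float(np.percentile(lat, 95)),
        "throughput_per_s": 1000.0 / float(lat.mean()),
    }

# Placeholder workload; swap in e.g. lambda: model(**tokenizer(text, return_tensors="pt"))
stats = benchmark(lambda: sum(range(10_000)))
print(stats)
```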

Important Notes

Text Preprocessing

⚠️ The model expects @mentions to be anonymized!

The model was trained with anonymized Twitter mentions. Always preprocess text:

text = re.sub(r"@\w+", "@anonymized_account", text)

The provided scripts (predict.py, predict_calibrated.py) handle this automatically.

Calibration

For best accuracy, use calibrated inference with:

  • Temperature scaling (per-label)
  • Optimized thresholds (per-label)

See predict_calibrated.py or the calibrated inference example above.

CPU Optimization

This INT8 model is optimized for CPUs with AVX512_VNNI support. It will work on other CPUs but may not achieve the same speedup.
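Whether your CPU exposes VNNI can be checked from Python. A small sketch that reads Linux's /proc/cpuinfo (it simply returns False on other platforms, where you would consult lscpu, sysctl, or vendor tools instead):

```python
from pathlib import Path

def has_cpu_flag(flag: str = "avx512_vnni") -> bool:
    """Return True if /proc/cpuinfo lists the given CPU flag (Linux only)."""
    cpuinfo = Path("/proc/cpuinfo")
    if not cpuinfo.exists():
        return False  # non-Linux platform; check via other tools
    return flag in cpuinfo.read_text()

print("AVX512_VNNI available:", has_cpu_flag())
```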

Limitations

  • Slight accuracy loss: Some INT8 predictions may differ from FP32 (usually minor probability differences)
  • Twitter-specific: Optimized for informal Polish social media text
  • Sarcasm detection: Lower performance; sarcasm is inherently difficult to detect
  • Context length: Supports up to 8,192 tokens, but is tuned for tweet-length texts
  • Formal text: May not generalize well to news or academic writing

For detailed limitations, see the original model card.

Files in This Repository

| File | Size | Description |
|---|---|---|
| model.onnx | 825 MB | INT8 quantized ONNX model |
| config.json | 2 KB | Model configuration |
| tokenizer.json | 8.2 MB | Tokenizer vocabulary |
| tokenizer_config.json | 12 KB | Tokenizer settings |
| calibration_artifacts.json | 1 KB | Temperature scaling & optimal thresholds |
| predict.py | 4 KB | Simple inference script |
| predict_calibrated.py | 5 KB | Calibrated inference script (recommended) |

Citation

@misc{yazoniak2025twitteremotionpl,
  title={Polish Twitter Emotion Classifier (RoBERTa-8k)},
  author={yazoniak},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/yazoniak/twitter-emotion-pl-classifier}
}

Also cite the dataset and base model:

@dataset{yazoniak_twitteremo_pl_refined_2025,
  title={TwitterEmo-PL-Refined: Polish Twitter Emotions (8 labels, refined)},
  author={yazoniak},
  year={2025},
  url={https://huggingface.co/datasets/yazoniak/TwitterEmo-PL-Refined}
}

@inproceedings{bogdanowicz2023twitteremo,
  title={TwitterEmo: Annotating Emotions and Sentiment in Polish Twitter},
  author={Bogdanowicz, S. and Cwynar, H. and Zwierzchowska, A. and Klamra, C. and Kiera{\'s}, W. and Kobyli{\'n}ski, {\L}.},
  booktitle={Computational Science -- ICCS 2023},
  series={Lecture Notes in Computer Science},
  volume={14074},
  publisher={Springer, Cham},
  year={2023},
  doi={10.1007/978-3-031-36021-3_20}
}

License

This model is released under the GNU General Public License v3.0 (GPL-3.0), inherited from the training dataset.

Model Version: v1.0-onnx-int8
Last Updated: 2026-01-29
