# PCL RoBERTa-Large Ensemble
A 5-fold ensemble of roberta-large fine-tuned for binary Patronizing and Condescending Language (PCL) detection (SemEval 2022 Task 4, Subtask 1).
## Model Description
This model detects whether a paragraph contains patronizing or condescending language toward vulnerable communities. It is an ensemble of five roberta-large models, one per stratified cross-validation fold, whose predictions are combined via CAWPE-inspired weighted averaging (each fold weighted by its CV F1).
Key techniques:
- Focal Loss (alpha=0.85, gamma=2.0) to handle class imbalance
- Keyword prepending: the target community keyword is prepended to the input text
- Threshold optimization: optimal classification threshold (t=0.40) found via post-hoc sweep on CV predictions
- Collapse detection: automatic reinitialization if a fold produces near-constant outputs
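The focal loss listed above (alpha=0.85, gamma=2.0) can be sketched in PyTorch as follows. This is a minimal illustration of the technique, not the exact training code:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.85, gamma=2.0):
    # logits: (N, 2), targets: (N,) with values in {0, 1}
    log_probs = F.log_softmax(logits, dim=-1)
    log_pt = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)  # log-prob of true class
    pt = log_pt.exp()
    # alpha weights the rare positive (PCL) class;
    # (1 - pt)^gamma down-weights easy, already-confident examples
    alpha_t = alpha * targets.float() + (1 - alpha) * (1 - targets.float())
    return (-alpha_t * (1 - pt) ** gamma * log_pt).mean()
```

A confidently correct prediction contributes almost nothing to the loss, so gradient signal concentrates on the hard, mostly positive, examples.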
## Training Details
| Hyperparameter | Value |
|---|---|
| Base model | roberta-large |
| Max sequence length | 512 |
| Learning rate | 1e-5 |
| Batch size | 8 |
| Epochs | 5 |
| Folds | 5 (Stratified K-Fold) |
| Optimizer | AdamW (weight_decay=0.01) |
| Scheduler | Linear with 10% warmup |
| Seed | 123 |
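The collapse detection listed under key techniques can be sketched as a variance check on a fold's dev-set probabilities; the tolerance value here is an illustrative assumption, not the value used in training:

```python
import numpy as np

def fold_collapsed(dev_probs, tol=0.02):
    # A fold has "collapsed" when it emits near-constant probabilities,
    # i.e. it predicts (almost) the same class for every input.
    # Such a fold would be reinitialized and retrained.
    return float(np.std(dev_probs)) < tol
```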
## Results
| Metric | Value |
|---|---|
| Dev F1 | 0.6333 |
| Dev Precision | 0.60 |
| Dev Recall | 0.67 |
| Mean CV F1 | 0.5892 |
| Optimal threshold | 0.40 |
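The post-hoc threshold sweep that produced t=0.40 can be sketched as below: collect out-of-fold probabilities from cross-validation, then pick the threshold maximizing F1 on the positive class. The grid and the plain-Python F1 computation are illustrative assumptions:

```python
import numpy as np

def sweep_threshold(probs, labels, grid=None):
    # probs: out-of-fold PCL probabilities; labels: gold binary labels
    if grid is None:
        grid = np.arange(0.05, 0.96, 0.01)
    best_t, best_f1 = 0.5, -1.0
    for t in grid:
        preds = (probs >= t).astype(int)
        tp = int(((preds == 1) & (labels == 1)).sum())
        fp = int(((preds == 1) & (labels == 0)).sum())
        fn = int(((preds == 0) & (labels == 1)).sum())
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best_f1:
            best_t, best_f1 = float(t), f1
    return best_t, best_f1
```

Because the sweep uses only CV predictions, the dev set stays untouched until final evaluation.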
### Per-fold CV F1
| Fold | F1 |
|---|---|
| 1 | 0.6323 |
| 2 | 0.5539 |
| 3 | 0.5515 |
| 4 | 0.6040 |
| 5 | 0.6045 |
## Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load all 5 fold models (the tokenizer is shared, so load it once)
tokenizer = AutoTokenizer.from_pretrained(
    "noufwithy/pcl-roberta-large-ensemble", subfolder="fold_0"
)
models = []
for fold in range(5):
    model = AutoModelForSequenceClassification.from_pretrained(
        "noufwithy/pcl-roberta-large-ensemble", subfolder=f"fold_{fold}"
    )
    model.eval()
    models.append(model)

# Prepend the target community keyword to the text (as done during training)
keyword = "homeless people"
text = "These poor people just need someone to help them get back on their feet."
input_text = f"{keyword} {text}"
inputs = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=512)

# Ensemble prediction: CAWPE-inspired average, weighting each fold by its CV F1
weights = [0.6323, 0.5539, 0.5515, 0.6040, 0.6045]
probs = []
for model, w in zip(models, weights):
    with torch.no_grad():
        logits = model(**inputs).logits
    prob = torch.softmax(logits, dim=-1)[0, 1].item()
    probs.append(prob * w)

avg_prob = sum(probs) / sum(weights)
prediction = int(avg_prob >= 0.40)  # optimal threshold from the CV sweep
print(f"PCL probability: {avg_prob:.4f}, Prediction: {'PCL' if prediction else 'Not PCL'}")
```
## Evaluation results
- Dev F1 on SemEval 2022 Task 4 (Subtask 1, PCL detection): 0.633 (self-reported)