Dataset: HuggingFaceFW/fineweb-edu-llama3-annotations
How to use bclavie/ModernBERT-base-fineweb-edu-example with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="bclavie/ModernBERT-base-fineweb-edu-example")

# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("bclavie/ModernBERT-base-fineweb-edu-example")
model = AutoModelForSequenceClassification.from_pretrained("bclavie/ModernBERT-base-fineweb-edu-example")

One-off run using a modified version of the original FineWeb-Edu quality filter regression training code, simply replacing the original model (snowflake-arctic-embed-m, a model fine-tuned from BERT-base) with ModernBERT-base.
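Since the run reuses regression training code, the checkpoint presumably emits one real-valued score per document rather than class probabilities. The following is a minimal sketch of turning that output into an integer rating, assuming a single-logit regression head and the 0-5 annotation scale seen in the reports. The helper names `to_int_score` and `score_text` are hypothetical and not part of the original script.

```python
def to_int_score(raw_score: float, low: int = 0, high: int = 5) -> int:
    """Round a raw regression output and clamp it to the 0-5 annotation scale."""
    return int(min(max(round(raw_score), low), high))


def score_text(
    text: str,
    model_name: str = "bclavie/ModernBERT-base-fineweb-edu-example",
) -> int:
    """Score one document.

    Assumption: the checkpoint has a single-logit regression head, so the
    logits tensor has shape (1, 1) for a single input text.
    """
    # Imported lazily so to_int_score can be used without transformers installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return to_int_score(logits.squeeze(-1).item())
```

Calling `score_text("Photosynthesis converts light energy into chemical energy.")` would download the checkpoint and return an integer between 0 and 5. If the head turns out to have more than one output, the `.item()` call fails, which makes the assumption easy to check.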
Without extensive tuning, the model trains considerably faster than BERT-base and gains 5 points of weighted F1 (0.76 vs. 0.71):
ModernBERT-base (this run):
Weighted F1: 0.76
Detailed:
Validation Report:
              precision    recall  f1-score   support
           0       0.80      0.55      0.65      5694
           1       0.82      0.86      0.84     26512
           2       0.64      0.71      0.67     10322
           3       0.65      0.60      0.63      3407
           4       0.80      0.37      0.51       807
           5       0.00      0.00      0.00         1
    accuracy                           0.76     46743
   macro avg       0.62      0.51      0.55     46743
weighted avg       0.76      0.76      0.76     46743
Original BERT-based classifier:
Weighted F1: 0.71
Detailed:
              precision    recall  f1-score   support
           0       0.75      0.49      0.59      5694
           1       0.78      0.84      0.81     26512
           2       0.57      0.61      0.59     10322
           3       0.56      0.50      0.53      3407
           4       0.58      0.35      0.44       807
           5       0.33      0.01      0.02       125
    accuracy                           0.71     46867
   macro avg       0.60      0.47      0.50     46867
weighted avg       0.71      0.71      0.71     46867
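As a sanity check, the summary rows can be recomputed from the per-class rows. The sketch below copies the rounded per-class F1 and support values from the two reports above, so the results only match the reported figures to two decimals. The names `first_report` and `second_report` are just labels for the reports in the order they appear.

```python
# (f1-score, support) per class, copied from the two reports above.
first_report = {
    0: (0.65, 5694), 1: (0.84, 26512), 2: (0.67, 10322),
    3: (0.63, 3407), 4: (0.51, 807), 5: (0.00, 1),
}
second_report = {
    0: (0.59, 5694), 1: (0.81, 26512), 2: (0.59, 10322),
    3: (0.53, 3407), 4: (0.44, 807), 5: (0.02, 125),
}


def weighted_f1(report):
    """Support-weighted mean of the per-class F1 scores."""
    total = sum(support for _, support in report.values())
    return sum(f1 * support for f1, support in report.values()) / total


def macro_f1(report):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1 for f1, _ in report.values()) / len(report)


for name, report in [("first", first_report), ("second", second_report)]:
    print(name, round(weighted_f1(report), 2), round(macro_f1(report), 2))
```

The weighted average is dominated by classes 0 to 2, which together hold over 90% of the support. The macro average treats every class equally, which is why the nearly empty class 5 pulls it down so sharply in both reports.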
(For some reason, the currently available annotated dataset is identical, except that it is missing 124 of the 125 examples rated 5. These are so few that they have no real impact on the weighted metrics.)
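The claim about negligible impact can be checked with a short bound: a class cannot move a support-weighted metric by more than its share of the total support, because per-class F1 lies between 0 and 1. A minimal sketch using the support counts from the reports (the function name `max_weighted_shift` is hypothetical):

```python
def max_weighted_shift(class_support: int, total_support: int) -> float:
    """Upper bound on how much one class can change a support-weighted metric.

    The class contributes f1 * class_support / total_support, and f1 is in [0, 1].
    """
    return class_support / total_support


# Class 5 in the full validation set: 125 of 46867 examples.
bound = max_weighted_shift(125, 46867)
print(f"{bound:.4f}")
```

The bound is below 0.003, so even a perfect or a zero F1 on class 5 would not change the weighted F1 at two decimals in most cases. The macro average is a different matter, since class 5 counts for one sixth of it regardless of support.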
Most parameters are detailed in the script. Key hparams:
Base model: answerdotai/ModernBERT-base