# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("z-dickson/CAP_coded_UK_statutory_instruments")
model = AutoModelForSequenceClassification.from_pretrained("z-dickson/CAP_coded_UK_statutory_instruments")
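Once the tokenizer and model are loaded, a prediction still has to be read off the logits. The following is a minimal sketch of that step, not the author's own inference code. It assumes the PyTorch weights load as shown above, that a 512-token limit applies (the usual BERT limit), and that `model.config.id2label` holds the label names; whether those are CAP codes or generic `LABEL_n` names depends on how the config was saved. The helper names `softmax`, `top_label`, and `predict` are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def predict(text, model_id="z-dickson/CAP_coded_UK_statutory_instruments"):
    """Classify one text; downloads the model on first use."""
    # Imported here so the helpers above work without torch/transformers.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Do not lowercase the input: the model card states the model is cased.
    # max_length=512 is an assumption based on the usual BERT limit.
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return top_label(logits, model.config.id2label)
```

Calling `predict("The Road Traffic (Permitted Parking Area) Order")` (a made-up title) would return a label and its probability. For many texts, load the model once rather than on every call.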
CAP_coded_UK_statutory_instruments
This model predicts the Comparative Agendas Project (CAP) policy code of parliamentary bills and instruments. The codes are defined in the CAP master codebook: https://www.comparativeagendas.net/pages/master-codebook
The model was trained on ~40k UK Parliamentary Statutory Instruments from the UK House of Commons and the Scottish Parliament. It is cased (case sensitive), so input text should keep its original capitalization.
If you have any questions about the model or the training data, feel free to message me on Twitter: @sachary_
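The model achieves the following results after the final training epoch (epoch 2 in the training results table):
- Train Loss: 0.1188
- Train Sparse Categorical Accuracy: 0.9688
- Validation Loss: 0.2032
- Validation Sparse Categorical Accuracy: 0.9556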
The following hyperparameters were used during training:
- optimizer: {'name': 'Adam', 'learning_rate': 5e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False}
- training_precision: float32
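The logged optimizer settings can be turned back into a Keras optimizer. The sketch below is one way to do that, assuming TensorFlow is installed; it is not the original training script. The names `LOGGED_OPTIMIZER`, `adam_kwargs`, and `build_optimizer` are hypothetical. The `decay` entry is dropped because a value of 0.0 has no effect and some newer Keras releases may reject the argument.

```python
# Optimizer settings copied from the hyperparameter list above.
LOGGED_OPTIMIZER = {
    "name": "Adam",
    "learning_rate": 5e-05,
    "decay": 0.0,
    "beta_1": 0.9,
    "beta_2": 0.999,
    "epsilon": 1e-07,
    "amsgrad": False,
}


def adam_kwargs(config):
    """Keep only the arguments worth passing to the Adam constructor."""
    kwargs = {k: v for k, v in config.items() if k != "name"}
    # A decay of 0.0 is a no-op; drop it for compatibility with newer Keras.
    if kwargs.get("decay") == 0.0:
        kwargs.pop("decay")
    return kwargs


def build_optimizer(config=LOGGED_OPTIMIZER):
    """Create a tf.keras Adam optimizer from the logged settings."""
    # Imported lazily so adam_kwargs() can be used without TensorFlow.
    import tensorflow as tf

    return tf.keras.optimizers.Adam(**adam_kwargs(config))
```

`training_precision: float32` is the Keras default, so no mixed-precision policy needs to be set to match it.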
Training results
| Train Loss | Train Sparse Categorical Accuracy | Validation Loss | Validation Sparse Categorical Accuracy | Epoch |
|---|---|---|---|---|
| 0.2167 | 0.9474 | 0.2351 | 0.9444 | 0 |
| 0.1539 | 0.9592 | 0.2076 | 0.9536 | 1 |
| 0.1188 | 0.9688 | 0.2032 | 0.9556 | 2 |
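To read the table at a glance, it can help to compute the gap between validation and training loss per epoch. The snippet below is a small illustrative sketch using only the numbers in the table above; `HISTORY`, `best_epoch`, and `loss_gaps` are hypothetical names.

```python
# (epoch, train_loss, train_acc, val_loss, val_acc), copied from the table above.
HISTORY = [
    (0, 0.2167, 0.9474, 0.2351, 0.9444),
    (1, 0.1539, 0.9592, 0.2076, 0.9536),
    (2, 0.1188, 0.9688, 0.2032, 0.9556),
]


def best_epoch(history):
    """Return the epoch with the lowest validation loss."""
    return min(history, key=lambda row: row[3])[0]


def loss_gaps(history):
    """Map each epoch to (validation loss - training loss)."""
    return {
        epoch: round(val_loss - train_loss, 4)
        for epoch, train_loss, _, val_loss, _ in history
    }


print(best_epoch(HISTORY))
print(loss_gaps(HISTORY))
```

Validation loss is lowest at epoch 2, while the gap to training loss widens from 0.0184 at epoch 0 to 0.0844 at epoch 2. Validation loss improves by only 0.0044 between epochs 1 and 2, so further epochs may have yielded diminishing returns, though the table alone cannot confirm that.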
Framework versions
- Transformers 4.19.2
- TensorFlow 2.8.2
- Datasets 2.2.2
- Tokenizers 0.12.1
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="z-dickson/CAP_coded_UK_statutory_instruments")
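The `pipe` object created by the pipeline helper can classify several texts in one call. The sketch below wraps that in a hypothetical helper, `classify_titles`, which accepts any callable that behaves like a text-classification pipeline. It assumes a recent Transformers release in which tokenizer arguments such as `truncation` and `max_length` are forwarded by the pipeline, and that each result is a dict with `label` and `score` keys.

```python
def classify_titles(pipe, titles, max_length=512):
    """Run a text-classification pipeline and pair each title with its top label."""
    titles = list(titles)
    # Truncate long instruments; 512 is an assumption based on the usual BERT limit.
    results = pipe(titles, truncation=True, max_length=max_length)
    return [(title, r["label"], r["score"]) for title, r in zip(titles, results)]


# Usage (downloads the model on first run; the titles are made-up examples):
#
#   from transformers import pipeline
#   pipe = pipeline("text-classification",
#                   model="z-dickson/CAP_coded_UK_statutory_instruments")
#   for title, label, score in classify_titles(pipe, [
#       "The Road Traffic (Permitted Parking Area) Order",
#       "The Education (Student Loans) Regulations",
#   ]):
#       print(f"{label}\t{score:.3f}\t{title}")
```

Passing the pipeline in as an argument keeps the helper easy to test with a stand-in callable and avoids reloading the model for every batch.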