How to use speech-seq2seq/wav2vec2-2-bert-large-no-adapter with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("automatic-speech-recognition", model="speech-seq2seq/wav2vec2-2-bert-large-no-adapter")
# Load model directly
from transformers import AutoTokenizer, AutoModelForSpeechSeq2Seq
tokenizer = AutoTokenizer.from_pretrained("speech-seq2seq/wav2vec2-2-bert-large-no-adapter")
model = AutoModelForSpeechSeq2Seq.from_pretrained("speech-seq2seq/wav2vec2-2-bert-large-no-adapter")
This model was trained from scratch on the librispeech_asr dataset. At the final logged evaluation (step 5000, epoch 2.8), it reaches a validation loss of 6.9251 and a WER of 1.7858 on the evaluation set.
More information needed
The training hyperparameters are not listed. The following results were recorded during training:
| Training Loss | Epoch | Step | Validation Loss | WER |
|---|---|---|---|---|
| 6.6487 | 0.28 | 500 | 6.8354 | 1.4719 |
| 6.5662 | 0.56 | 1000 | 6.7877 | 0.9371 |
| 6.4309 | 0.84 | 1500 | 6.7640 | 1.1317 |
| 6.7123 | 1.12 | 2000 | 6.7907 | 1.9354 |
| 6.7547 | 1.4 | 2500 | 6.7830 | 1.8854 |
| 6.6726 | 1.68 | 3000 | 6.8211 | 1.9203 |
| 6.6538 | 1.96 | 3500 | 6.8444 | 1.8235 |
| 6.5693 | 2.24 | 4000 | 6.8873 | 1.8606 |
| 6.7234 | 2.52 | 4500 | 6.8649 | 1.8126 |
| 6.5104 | 2.8 | 5000 | 6.9251 | 1.7858 |
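
Most rows in the table report a word error rate above 1.0. This is possible because WER divides the number of substitutions, deletions and insertions by the number of reference words, so a hypothesis with many inserted words can score above 100%. The sketch below illustrates this. It assumes whitespace tokenization and a plain Levenshtein alignment, and the function name `word_error_rate` is hypothetical; it is not necessarily how the metric was computed for this model.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace and compared exactly.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


# A hypothesis that repeats words accumulates insertions:
# 4 inserted words against a 2-word reference.
print(word_error_rate("the cat", "the the cat cat the cat"))
```

A WER that stays well above 1.0 while the validation loss remains near 6.8, as in the table, is consistent with a model that emits long, mostly incorrect transcriptions.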