How to use badrex/w2v-bert-2.0-malagasy-asr with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("automatic-speech-recognition", model="badrex/w2v-bert-2.0-malagasy-asr")
# Load model directly
from transformers import AutoProcessor, AutoModelForCTC
processor = AutoProcessor.from_pretrained("badrex/w2v-bert-2.0-malagasy-asr")
model = AutoModelForCTC.from_pretrained("badrex/w2v-bert-2.0-malagasy-asr")

This model is a fine-tuned version of Wav2Vec2-BERT 2.0 for Malagasy automatic speech recognition (ASR). It was trained on 150 hours of transcribed Malagasy speech. The model is robust, with an in-domain word error rate (WER) below 11.7%.
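For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal, dependency-free illustration of that definition. The function name `word_error_rate` and the example sentences are hypothetical, and the text normalization used for the reported figure is not stated in this card, so this should not be read as the exact evaluation procedure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sentences, not taken from the model's evaluation data.
reference = "manao ahoana ianao androany"
hypothesis = "manao ahoana ianao"
print(f"WER: {word_error_rate(reference, hypothesis):.2%}")  # WER: 25.00%
```

Here one of the four reference words is missing from the hypothesis, so the WER is 1/4. In practice, both strings are usually normalized (casing, punctuation) before scoring, and the choice of normalization can change the result noticeably.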
The model can be used directly for automatic speech recognition of Malagasy audio:
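from transformers import Wav2Vec2BertProcessor, Wav2Vec2BertForCTC
import torch
import torchaudio

# load model and processor
processor = Wav2Vec2BertProcessor.from_pretrained("badrex/w2v-bert-2.0-malagasy-asr")
model = Wav2Vec2BertForCTC.from_pretrained("badrex/w2v-bert-2.0-malagasy-asr")

# load audio; torchaudio returns a (channels, samples) tensor
audio_input, sample_rate = torchaudio.load("path/to/audio.wav")

# downmix to mono and resample to the rate the feature extractor expects
audio_input = audio_input.mean(dim=0)
target_rate = processor.feature_extractor.sampling_rate
if sample_rate != target_rate:
    audio_input = torchaudio.functional.resample(audio_input, sample_rate, target_rate)

# preprocess
inputs = processor(audio_input.numpy(), sampling_rate=target_rate, return_tensors="pt")

# inference
with torch.no_grad():
    logits = model(**inputs).logits

# decode: pick the most likely token per frame, then collapse with CTC rules
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)[0]
print(transcription)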
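The decode step relies on greedy CTC decoding: after taking the arg-max token for each frame, repeated consecutive ids are merged, the blank token is dropped, and the word delimiter is turned into a space. The following is a minimal sketch of that idea in plain Python, not the library's implementation. The function `ctc_greedy_decode`, the toy vocabulary, the blank id `0`, and the `|` delimiter are all assumptions made for illustration; the real values come from the model's tokenizer.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, id_to_token, blank_id=0, word_delimiter="|"):
    """Collapse repeated frame predictions, drop blanks, and join into text."""
    # Step 1: merge runs of identical consecutive ids into a single id.
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    # Step 2: remove the blank id, which only separates repeated characters.
    tokens = [id_to_token[token_id] for token_id in collapsed if token_id != blank_id]
    # Step 3: map the word delimiter to a space and trim the ends.
    return "".join(tokens).replace(word_delimiter, " ").strip()


# Hypothetical toy vocabulary; id 0 is assumed to be the CTC blank.
vocab = {0: "<pad>", 1: "|", 2: "s", 3: "a", 4: "l", 5: "m"}

# Hypothetical per-frame arg-max ids, as torch.argmax(logits, dim=-1) might yield.
frames = [2, 2, 0, 3, 3, 4, 0, 0, 3, 3, 5, 5, 0, 3, 1, 1]
print(ctc_greedy_decode(frames, vocab))  # salama
```

Merging repeats before removing blanks matters: a blank between two identical ids (for example `3, 0, 3`) keeps a genuinely doubled character, whereas `3, 3` collapses to a single one.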
This model can be used as a foundation for:
The development of this model was supported by CLEAR Global and the Gates Foundation.
@misc{w2v_bert_malagasy_asr,
author = {Badr M. Abdullah},
title = {Adapting Wav2Vec2-BERT 2.0 for Malagasy ASR},
year = {2025},
publisher = {Hugging Face},
url = {https://huggingface.co/badrex/w2v-bert-2.0-malagasy-asr}
}
For questions or issues, please open a discussion in the community section of the Hugging Face model repository.
Base model
facebook/w2v-bert-2.0