OmniASR-CTC-1B (GGUF)

GGUF conversion of Facebook's omniASR-CTC-1B for use with CrispASR.

Model Details

  • Architecture: wav2vec2 encoder (48L, d=1280) + CTC head
  • Parameters: ~1B
  • Encoder: 7-layer CNN frontend (320x downsampling) + 48L transformer with grouped positional convolution
  • Output: CTC greedy decoding (character-level SentencePiece vocabulary, 9812 tokens); see the decoding sketch after this list
  • Languages: 1100+ (multilingual, trained on 3.6M hours)
  • License: Apache 2.0
  • Input: Raw 16 kHz mono PCM (no mel features)
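
With 320x downsampling of 16 kHz input, the encoder emits one logit frame every 20 ms (50 frames/s). Greedy CTC decoding then collapses repeated argmax tokens and drops blanks. A minimal sketch in Python, assuming logits of shape (frames, vocab) and blank id 0 (the real blank id comes from the model's vocabulary):

import numpy as np

def ctc_greedy_decode(logits: np.ndarray, blank_id: int = 0) -> list[int]:
    # Best token per frame, then standard CTC best-path postprocessing:
    # collapse consecutive repeats, then remove blanks.
    ids = logits.argmax(axis=-1)
    keep = np.insert(ids[1:] != ids[:-1], 0, True)  # mask: first token of each run
    return [int(i) for i in ids[keep] if i != blank_id]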

Usage with CrispASR

# Auto-detected from GGUF metadata (omniasr-ctc arch)
crispasr --backend omniasr -m omniasr-ctc-1b-q4_k.gguf -f audio.wav

# With language specification
crispasr --backend omniasr -m omniasr-ctc-1b-q4_k.gguf -l de -f audio.wav
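
The model consumes raw 16 kHz mono PCM, so audio in other formats or sample rates should be converted first. A minimal preprocessing sketch, assuming the librosa and soundfile packages (generic preparation, not something CrispASR itself provides):

import librosa
import soundfile as sf

# Decode any input format, resample to 16 kHz, and downmix to mono.
y, sr = librosa.load("input.mp3", sr=16000, mono=True)
# Write 16-bit PCM WAV for the commands above.
sf.write("audio.wav", y, sr, subtype="PCM_16")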

Available Files

File                      Quant  Size    Description
omniasr-ctc-1b.gguf       F16    1.9 GB  Full precision
omniasr-ctc-1b-q4_k.gguf  Q4_K   551 MB  Recommended: good balance of quality and size
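
These sizes match a back-of-the-envelope estimate: F16 stores 2 bytes per weight, while Q4_K (a ggml k-quant format) averages roughly 4.5 bits per weight. A quick check in Python (the 4.5 bits/weight figure is an assumption; real files deviate slightly because some tensors stay at higher precision):

params = 1.0e9
print(f"F16 : {params * 2 / 1e9:.1f} GB")        # -> 2.0 GB (table: 1.9 GB)
print(f"Q4_K: {params * 4.5 / 8 / 1e6:.0f} MB")  # -> 562 MB (table: 551 MB)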

Conversion

Converted using:

python models/convert-omniasr-ctc-to-gguf.py \
  --input facebook/omniASR-CTC-1B \
  --output omniasr-ctc-1b.gguf
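
The script's internals aren't reproduced here, but a converter of this kind generally loads the checkpoint and streams tensors through the gguf Python package. A rough sketch under that assumption (the checkpoint filename, metadata key, and tensor names below are illustrative, not the script's actual ones):

import gguf
import torch

# Load the HF checkpoint as a flat state dict (illustrative filename).
state = torch.load("pytorch_model.bin", map_location="cpu")

writer = gguf.GGUFWriter("omniasr-ctc-1b.gguf", "omniasr-ctc")
writer.add_uint32("omniasr-ctc.block_count", 48)  # hypothetical metadata key
for name, tensor in state.items():
    writer.add_tensor(name, tensor.to(torch.float16).numpy())

writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.write_tensors_to_file()
writer.close()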

Quantized with:

crispasr-quantize omniasr-ctc-1b.gguf omniasr-ctc-1b-q4_k.gguf Q4_K

Comparison with OmniASR-CTC-300M

Model             Params  Size (Q4_K)  Accuracy
OmniASR-CTC-300M  300M    157 MB       Good
OmniASR-CTC-1B    1B      551 MB       Better (deeper encoder)

Both use the same CTC architecture; the 1B variant has 48 encoder layers (vs 24) and a wider hidden dimension (1280 vs 1024).
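
Those two figures roughly reproduce the parameter counts: a standard transformer block holds about 12·d² weights per layer (4·d² for the attention projections plus 8·d² for the feed-forward network, assuming the common 4x expansion ratio):

def approx_transformer_params(layers: int, d: int) -> float:
    # 12 * d^2 per layer: 4*d^2 attention + 8*d^2 FFN,
    # assuming a 4x feed-forward expansion.
    return layers * 12 * d * d

print(f"300M variant: ~{approx_transformer_params(24, 1024) / 1e6:.0f}M")  # ~302M
print(f"1B variant:   ~{approx_transformer_params(48, 1280) / 1e6:.0f}M")  # ~944M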

Original Model

facebook/omniASR-CTC-1B