Higgs-Audio-v3-TTS-4bit-NVFP4
4-bit NVFP4 artifact for bosonai/higgs-audio-v3-tts-4b.
Scope
This is a transformer-body quantized artifact, not yet a complete drop-in runtime. It quantizes the 2D attention and MLP weights under body.layers.* and leaves the Higgs audio tokenizer/vocoder, the fused modality embedding/head, norms, biases, and all non-2D tensors unquantized.
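The selection rule described above can be sketched as a simple predicate over tensor names and shapes. This is a minimal illustration, not the script used to build the artifact: the `.self_attn.` and `.mlp.` name markers and the example tensor names are assumptions, and the real checkpoint may use different naming.

```python
# Hypothetical name markers for attention and MLP submodules (assumed, not
# taken from the checkpoint).
BLOCK_MARKERS = (".self_attn.", ".mlp.")


def should_quantize(name: str, shape: tuple) -> bool:
    """Return True only for 2D attention/MLP weights under body.layers.*."""
    if not name.startswith("body.layers."):
        return False  # tokenizer/vocoder, embeddings, and heads stay as-is
    if len(shape) != 2:
        return False  # norms, biases, and other non-2D tensors stay as-is
    if not name.endswith(".weight"):
        return False
    return any(marker in name for marker in BLOCK_MARKERS)


# Example with made-up tensor names and shapes.
candidates = {
    "body.layers.0.self_attn.q_proj.weight": (4096, 2560),
    "body.layers.0.mlp.down_proj.weight": (2560, 9728),
    "body.layers.0.input_layernorm.weight": (2560,),
    "audio_tokenizer.decoder.conv.weight": (512, 512, 7),
}
selected = [n for n, s in candidates.items() if should_quantize(n, s)]
print(selected)
```

Filtering on both the name prefix and the tensor rank keeps 1D norm weights out of the quantized set even though they live under body.layers.*.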
Higgs Audio v3 TTS uses a custom HiggsMultimodalQwen3ForConditionalGeneration architecture with 8 audio codebooks, delayed multi-codebook generation, and waveform decoding. The stock Transformers release in the tested environment cannot instantiate this architecture, so runtime integration has to go through SGLang-Omni or a custom loader.
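To illustrate what delayed multi-codebook generation means in general, here is a minimal sketch of a delay pattern in which codebook k is shifted by k steps. The one-step-per-codebook offset and the `PAD` value are assumptions for illustration; the actual Higgs Audio v3 delay scheme may differ.

```python
PAD = -1  # hypothetical padding token id


def apply_delay(codes, pad=PAD):
    """Shift row k right by k steps. codes: K rows of equal length T."""
    k = len(codes)
    return [[pad] * i + list(row) + [pad] * (k - 1 - i)
            for i, row in enumerate(codes)]


def undo_delay(delayed):
    """Invert apply_delay and recover the aligned K x T code grid."""
    k = len(delayed)
    t = len(delayed[0]) - (k - 1)
    return [row[i:i + t] for i, row in enumerate(delayed)]


# Three codebooks instead of 8 to keep the example readable.
codes = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
delayed = apply_delay(codes)
for row in delayed:
    print(row)
restored = undo_delay(delayed)
```

A custom loader would need to undo whatever delay the model applies before handing the aligned codes to the waveform decoder.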
Quantization Report
- Quantized tensors: 252
- Quantized parameter fraction seen: 0.7805
- Mean relative L2: 0.097363
- Max relative L2: 0.101345
- Max absolute error: 0.140625
See:
- quantization_config.json
- quant_error_report.json
- tensor_manifest.json
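For readers who want to see how error metrics like those in the report can be computed, the following is a simplified sketch. It assumes relative L2 means ||W - W_hat|| / ||W|| and uses a reduced NVFP4-style fake quantization: blocks of 16 values mapped to the FP4 E2M1 magnitude grid with one scale per block. Unlike real NVFP4, the block scale here is kept as a plain Python float rather than rounded to FP8, and there is no per-tensor scale. It does not reproduce the reported numbers.

```python
import math

E2M1_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)  # FP4 E2M1 magnitudes
BLOCK = 16  # values sharing one scale


def fake_quant_block(block):
    """Quantize then dequantize one block with a single float scale."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return [0.0] * len(block)
    scale = amax / 6.0  # map the largest magnitude to the top grid value
    out = []
    for x in block:
        mag = min(E2M1_GRID, key=lambda g: abs(abs(x) / scale - g))
        out.append(math.copysign(mag * scale, x))
    return out


def fake_quant(values):
    """Apply fake_quant_block over consecutive blocks of a flat list."""
    out = []
    for i in range(0, len(values), BLOCK):
        out.extend(fake_quant_block(values[i:i + BLOCK]))
    return out


def relative_l2(ref, approx):
    """||ref - approx||_2 / ||ref||_2 (0.0 if ref is all zeros)."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(ref, approx)))
    den = math.sqrt(sum(a * a for a in ref))
    return num / den if den else 0.0


def max_abs_error(ref, approx):
    """Largest elementwise absolute difference."""
    return max(abs(a - b) for a, b in zip(ref, approx))


weights = [math.sin(i) for i in range(64)]  # stand-in for a weight tensor
dequant = fake_quant(weights)
print(relative_l2(weights, dequant), max_abs_error(weights, dequant))
```

Because the widest gap on the E2M1 grid is between 4 and 6, the worst-case error within a block is one sixth of that block's largest magnitude under this simplified scheme.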
Persian TTS Runtime Note
Persian generation testing through SGLang-Omni showed good spoken audio quality for the NVFP4 weights after dequantized runtime validation. The raw generation can append a long near-silent tail after the spoken utterance; the accompanying Space/server wrapper therefore enables trailing-silence trimming by default.
See:
- nvfp4_persian_tts_trim_report.json
- samples/nvfp4_dequant_fa_trimmed.wav
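The trailing-silence trimming mentioned in the runtime note can be approximated with a simple amplitude threshold. This is a minimal sketch and not the wrapper's actual implementation: the function name, the `threshold` and `keep_ms` defaults, and the use of a plain list of float samples are all assumptions.

```python
def trim_trailing_silence(samples, sample_rate, threshold=1e-3, keep_ms=100):
    """Cut the near-silent tail, keeping keep_ms of padding after speech.

    samples: mono audio as floats in [-1, 1]; threshold is an absolute
    amplitude below which a sample counts as silent (assumed value).
    """
    last = -1
    for i in range(len(samples) - 1, -1, -1):
        if abs(samples[i]) > threshold:
            last = i  # index of the last audible sample
            break
    if last < 0:
        return []  # nothing above the threshold
    keep = int(sample_rate * keep_ms / 1000)
    end = min(len(samples), last + 1 + keep)
    return samples[:end]


# 100 audible samples followed by a long silent tail, at a toy 1 kHz rate.
audio = [0.5] * 100 + [0.0] * 1000
trimmed = trim_trailing_silence(audio, sample_rate=1000, keep_ms=50)
print(len(audio), len(trimmed))
```

Keeping a short pad after the last audible sample avoids clipping the natural decay of the final syllable.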
License
Released under the upstream Boson Higgs Audio v3 research and non-commercial license. Production deployment, hosted APIs, and revenue-generating use require a separate commercial license from Boson AI.
Model tree for Reza2kn/Higgs-Audio-v3-TTS-4bit-NVFP4
Base model
bosonai/higgs-audio-v3-tts-4b