Ternary-Bonsai-8B – Unpacked FP16 Safetensors

FP16 safetensors (HuggingFace format) of the ternary Bonsai-8B model. This repo exists for users who want to run Ternary Bonsai with stock HuggingFace tooling, or with frameworks that don't yet support any of the packed ternary formats. The MLX 2-bit format is currently the only packed format available; formats for other backends are coming soon.
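Since these are standard FP16 safetensors, they load like any other HuggingFace causal LM. A minimal sketch, assuming the `transformers` library and the repo id shown on this page (the prompt and generation settings are illustrative, not part of the release):

```python
# Sketch: loading the unpacked FP16 weights with stock transformers.
# NOTE: this is the full-size FP16 model; expect roughly 16 GB of weights.
REPO = "prism-ml/Ternary-Bonsai-8B-unpacked"

def load(repo: str = REPO):
    """Load tokenizer and FP16 model with stock HuggingFace tooling."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(
        repo,
        torch_dtype=torch.float16,  # weights are stored as F16
        device_map="auto",          # spread across available devices
    )
    return tok, model

if __name__ == "__main__":
    tok, model = load()
    inputs = tok("Hello, world", return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=32)
    print(tok.decode(out[0], skip_special_tokens=True))
```

Note that nothing here exploits ternary weights; this path simply treats the model as an ordinary FP16 checkpoint.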

We strongly recommend using the natively packed models instead. The packed format is where all the benefits of Bonsai come from: up to 9x memory reduction, 5x faster inference, and lower energy per token. This unpacked FP16 version is full-size and does not provide any of those advantages.
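A back-of-envelope calculation shows where the size gap comes from. At 16 bits per weight, 8B parameters occupy about 16 GB; at 2 bits per weight (the MLX packed format), the same weights fit in about 2 GB, an 8x reduction from bit width alone. The headline "up to 9x" figure presumably also accounts for quantization metadata and other per-model overheads; the arithmetic below covers only the raw bit widths:

```python
# Why the packed release is ~8x smaller than this FP16 dump
# (bit-width ratio only; real repos add scales/metadata overhead).
params = 8_000_000_000          # 8B parameters

fp16_gb = params * 2 / 1e9      # FP16: 2 bytes per weight
packed_gb = params * 2 / 8 / 1e9  # 2-bit packed: 2 bits per weight

print(f"FP16:   {fp16_gb:.0f} GB")
print(f"Packed: {packed_gb:.0f} GB")
print(f"Ratio:  {fp16_gb / packed_gb:.0f}x")
```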

For the optimized ternary release model (recommended):

Downloads last month: 2,870

Model size: 8B params · Safetensors · Tensor type: F16

Model tree for prism-ml/Ternary-Bonsai-8B-unpacked

- Adapters: 1 model
- Finetunes: 3 models
- Quantizations: 3 models
