Horbee/Ministral-3-GEC-german-GGUF

This repository contains GGUF (quantized) versions of the Horbee/Ministral-3-3B-GEC-german and Horbee/Ministral-3-8B-GEC-german models.

These files are designed for efficient inference on consumer hardware (CPUs, Apple Silicon, or GPUs with lower VRAM) using tools like llama.cpp, LM Studio, text-generation-webui, or ollama.

Provided Files

| Filename | Quantization | Description | Recommended For |
|---|---|---|---|
| Ministral-8B-Q4_K_M.gguf | Q4_K_M | 4-bit medium. Balanced quality/size, ~5 GB. | **Recommended.** Fits on 8 GB VRAM GPUs. |
| Ministral-8B-Q8_0.gguf | Q8_0 | 8-bit. Highest fidelity, ~8.5 GB. | Requires 12 GB+ VRAM or 16 GB+ RAM; output quality is nearly identical to the Q4_K_M version. |
| Ministral-3B-Q4_K_M.gguf | Q4_K_M | 4-bit medium, ~2 GB. | Fastest / low resource. Runs on almost anything (old laptops, mobile). |
| Ministral-3B-Q8_0.gguf | Q8_0 | 8-bit, ~3.5 GB. | Output quality is nearly identical to the Q4_K_M version. |
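If you want to pick a file programmatically, the table's guidance can be encoded in a small helper. This is only an illustration: the function name and exact memory cutoffs are our own, not part of any tooling.

```python
def pick_gguf(memory_gb: float, prefer_8b: bool = True) -> str:
    """Pick a quantized file from this repo based on available GPU
    VRAM (or system RAM for CPU inference), following the table above.
    The thresholds are an illustrative reading of that guidance."""
    if prefer_8b and memory_gb >= 12:
        return "Ministral-8B-Q8_0.gguf"    # highest fidelity, 12 GB+ VRAM
    if prefer_8b and memory_gb >= 8:
        return "Ministral-8B-Q4_K_M.gguf"  # recommended default, ~5 GB
    if memory_gb >= 4:
        return "Ministral-3B-Q8_0.gguf"    # ~3.5 GB
    return "Ministral-3B-Q4_K_M.gguf"      # ~2 GB, runs almost anywhere

print(pick_gguf(16))  # → Ministral-8B-Q8_0.gguf
print(pick_gguf(2))   # → Ministral-3B-Q4_K_M.gguf
```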

Modelfile for Ollama

Since these are Mistral-Instruct-based models, use the standard Mistral instruction format:

FROM ./Ministral-8B-Q4_K_M.gguf

TEMPLATE "[INST]{{ if .System }}{{ .System }}\n\n{{ end }}{{ .Prompt }}[/INST]"

# English: "Correct the grammar in the following text, but keep the original style and tone. Do not give the text a formal tone if it does not have one. Return **only** the corrected sentence, without comments. If the sentence is correct, return it unchanged."
SYSTEM "Korrigiere die Grammatik im folgenden Text, aber behalte den ursprünglichen Stil und Ton bei. Verleihe dem Text keine formelle Note, wenn er diese nicht hat. Gib **nur** den korrigierten Satz zurück, ohne Anmerkungen. Wenn der Satz korrekt ist, gib ihn unverändert zurück."

PARAMETER num_ctx 4096
PARAMETER temperature 0.1
PARAMETER stop "</s>"
PARAMETER stop "[/INST]"