Horbee/Ministral-3-GEC-german-GGUF

This repository contains GGUF (quantized) versions of the Horbee/Ministral-3-3B-GEC-german and Horbee/Ministral-3-8B-GEC-german models.

These files are designed for efficient inference on consumer hardware (CPUs, Apple Silicon, or GPUs with lower VRAM) using tools like llama.cpp, LM Studio, text-generation-webui, or ollama.

Provided Files

| Filename | Quantization | Description | Recommended For |
|---|---|---|---|
| Ministral-8B-Q4_K_M.gguf | Q4_K_M | 4-bit medium. Balanced quality/size, ~5 GB. | **Recommended.** Fits on 8 GB VRAM GPUs. |
| Ministral-8B-Q8_0.gguf | Q8_0 | 8-bit. Highest fidelity, ~8.5 GB. | Requires 12 GB+ VRAM or 16 GB+ RAM; output quality is nearly identical to the Q4_K_M version. |
| Ministral-3B-Q4_K_M.gguf | Q4_K_M | 4-bit medium, ~2 GB. | Fastest / low resource. Runs on almost anything (old laptops, mobile). |
| Ministral-3B-Q8_0.gguf | Q8_0 | 8-bit, ~3.5 GB. | Output quality is nearly identical to the Q4_K_M version. |
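If you want to pick a file programmatically, the table's guidance can be encoded in a small helper. This is only an illustration: the function name and exact memory cutoffs are our own, not part of any tooling.

```python
def pick_gguf(memory_gb: float, prefer_8b: bool = True) -> str:
    """Pick a quantized file from this repo based on available GPU
    VRAM (or system RAM for CPU inference), following the table above.
    The thresholds are an illustrative reading of that guidance."""
    if prefer_8b and memory_gb >= 12:
        return "Ministral-8B-Q8_0.gguf"    # highest fidelity, 12 GB+ VRAM
    if prefer_8b and memory_gb >= 8:
        return "Ministral-8B-Q4_K_M.gguf"  # recommended default, ~5 GB
    if memory_gb >= 4:
        return "Ministral-3B-Q8_0.gguf"    # ~3.5 GB
    return "Ministral-3B-Q4_K_M.gguf"      # ~2 GB, runs almost anywhere

print(pick_gguf(16))  # → Ministral-8B-Q8_0.gguf
print(pick_gguf(2))   # → Ministral-3B-Q4_K_M.gguf
```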

Modelfile for Ollama

Since these are Mistral-Instruct-based models, use the standard Mistral instruction format:

FROM ./Ministral-8B-Q4_K_M.gguf

TEMPLATE "[INST]{{ if .System }}{{ .System }}\n\n{{ end }}{{ .Prompt }}[/INST]"

# English: "Correct the grammar in the following text, but keep the original style and tone. Do not give the text a formal tone if it does not have one. Return **only** the corrected sentence, without comments. If the sentence is correct, return it unchanged."
SYSTEM "Korrigiere die Grammatik im folgenden Text, aber behalte den ursprünglichen Stil und Ton bei. Verleihe dem Text keine formelle Note, wenn er diese nicht hat. Gib **nur** den korrigierten Satz zurück, ohne Anmerkungen. Wenn der Satz korrekt ist, gib ihn unverändert zurück."

PARAMETER num_ctx 4096
PARAMETER temperature 0.1
PARAMETER stop "</s>"
PARAMETER stop "[/INST]"