Neotoi Coder v3.2 (4B)

A Rust / Dioxus 0.7 specialist fine-tuned from Qwen3-4B (4.0B parameters, 3.6B non-embedding, tied embeddings) using RAFT (Retrieval-Augmented Fine-Tuning). Optimized for production-quality Dioxus 0.7 components with Tailwind v4 and WCAG 2.2 AAA accessibility.

This is the 4B variant, the smallest and fastest option. Companion repos: 8B (rockypod/neotoi-coder-8b) · 15B family hub (rockypod/neotoi-coder)

v3.2 Exam Results: 114Q Dioxus 0.7 Spec Exam

160.0 / 164.0 weighted | 112 / 114 raw | 97.56%

| Tier | Name | Cnt | Raw | Wtd | Max | Rate | Floor | Status |
|---|---|---|---|---|---|---|---|---|
| T1 | Fundamentals | 12 | 12 | 12.0 | 12.0 | 100.0% | 82% | ✅ |
| T2 | RSX Syntax | 12 | 12 | 12.0 | 12.0 | 100.0% | 82% | ✅ |
| T3 | Signal Hygiene | 12 | 12 | 12.0 | 12.0 | 100.0% | 82% | ✅ |
| T4 | WCAG / ARIA | 15 | 15 | 22.5 | 22.5 | 100.0% | 82% | ✅ |
| T5 | use_resource | 8 | 8 | 12.0 | 12.0 | 100.0% | 82% | ✅ |
| T6 | Hard Reasoning | 10 | 10 | 20.0 | 20.0 | 100.0% | 88% | ✅ |
| T7 | Primitives + CSS | 13 | 13 | 19.5 | 19.5 | 100.0% | 82% | ✅ |
| T8 | GlobalSignal / i18n | 8 | 8 | 12.0 | 12.0 | 100.0% | 82% | ✅ |
| T9 | Static Navigator | 6 | 6 | 9.0 | 9.0 | 100.0% | 82% | ✅ |
| T10 | Dioxus 0.7.4 | 6 | 6 | 12.0 | 12.0 | 100.0% | 88% | ✅ |
| T11 | Server Functions | 4 | 4 | 6.0 | 6.0 | 100.0% | 82% | ✅ |
| T12 | Format Compliance (NEW) | 6 | 4 | 8.0 | 12.0 | 66.7% | 88% | ⚠️ |
| T13 | SyncStore (NEW) | 2 | 2 | 3.0 | 3.0 | 100.0% | 82% | ✅ |
| Total | | 114 | 112 | 160.0 | 164.0 | 97.56% | n/a | n/a |
  • Publication bar (90%): PASS
  • Release bar (95%): PASS
  • Tier floors: FAIL (T12 only: 66.7% vs 88% floor)
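
The totals and bar checks above can be reproduced with a short sketch. The per-tier weights are not stated in this card; the values below are inferred as Max / Cnt from the table (1.0, 1.5, or 2.0) and should be read as an assumption. The names `Tier`, `Summary`, and `summarize` are illustrative only.

```rust
/// One exam tier. `weight` is an inferred value (Max / Cnt from the table),
/// not a number stated by the model card.
struct Tier {
    name: &'static str,
    count: u32,
    raw: u32,
    weight: f64,
    floor_pct: f64,
}

impl Tier {
    fn weighted(&self) -> f64 { f64::from(self.raw) * self.weight }
    fn max(&self) -> f64 { f64::from(self.count) * self.weight }
    fn rate_pct(&self) -> f64 { 100.0 * f64::from(self.raw) / f64::from(self.count) }
}

/// Aggregate result across all tiers.
struct Summary {
    raw: u32,
    count: u32,
    weighted: f64,
    max: f64,
    failed_floors: Vec<&'static str>,
}

fn tier(name: &'static str, count: u32, raw: u32, weight: f64, floor_pct: f64) -> Tier {
    Tier { name, count, raw, weight, floor_pct }
}

/// The v3.2 4B rows from the table, with inferred weights.
fn v32_4b_tiers() -> Vec<Tier> {
    vec![
        tier("T1 Fundamentals", 12, 12, 1.0, 82.0),
        tier("T2 RSX Syntax", 12, 12, 1.0, 82.0),
        tier("T3 Signal Hygiene", 12, 12, 1.0, 82.0),
        tier("T4 WCAG / ARIA", 15, 15, 1.5, 82.0),
        tier("T5 use_resource", 8, 8, 1.5, 82.0),
        tier("T6 Hard Reasoning", 10, 10, 2.0, 88.0),
        tier("T7 Primitives + CSS", 13, 13, 1.5, 82.0),
        tier("T8 GlobalSignal / i18n", 8, 8, 1.5, 82.0),
        tier("T9 Static Navigator", 6, 6, 1.5, 82.0),
        tier("T10 Dioxus 0.7.4", 6, 6, 2.0, 88.0),
        tier("T11 Server Functions", 4, 4, 1.5, 82.0),
        tier("T12 Format Compliance", 6, 4, 2.0, 88.0),
        tier("T13 SyncStore", 2, 2, 1.5, 82.0),
    ]
}

/// Sum raw and weighted scores and collect tiers whose rate is below the floor.
fn summarize(tiers: &[Tier]) -> Summary {
    Summary {
        raw: tiers.iter().map(|t| t.raw).sum(),
        count: tiers.iter().map(|t| t.count).sum(),
        weighted: tiers.iter().map(Tier::weighted).sum(),
        max: tiers.iter().map(Tier::max).sum(),
        failed_floors: tiers
            .iter()
            .filter(|t| t.rate_pct() < t.floor_pct)
            .map(|t| t.name)
            .collect(),
    }
}

fn verdict(ok: bool) -> &'static str {
    if ok { "PASS" } else { "FAIL" }
}

fn main() {
    let s = summarize(&v32_4b_tiers());
    let pct = 100.0 * s.weighted / s.max;
    println!("{:.1} / {:.1} weighted | {} / {} raw | {:.2}%", s.weighted, s.max, s.raw, s.count, pct);
    println!("Publication bar (90%): {}", verdict(pct >= 90.0));
    println!("Release bar (95%): {}", verdict(pct >= 95.0));
    println!("Tier floors: {} {:?}", verdict(s.failed_floors.is_empty()), s.failed_floors);
}
```

Under these inferred weights the two T12 misses cost 4.0 weighted points, which accounts for the whole gap between 160.0 and 164.0.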

2 misses: q111 (T12, old cx.render idiom + orphan </think>), q112 (T12, missing rsx!)

T12 Format Compliance is the only floor failure. Notably, the 4B scored 100% on T13 SyncStore where the 8B scored 0%, so the failure patterns complement each other across sizes.

v3.2 vs v3.1 (4B)

| Metric | v3.1 4B | v3.2 4B |
|---|---|---|
| Score | 143.5/144.5 (99.31%) | 160.0/164.0 (97.56%) |
| Exam | 103Q, max 144.5, 11 tiers | 114Q, max 164.0, 13 tiers |
| T4 WCAG / ARIA | 100.0% | 100.0% ✅ |
| T8 GlobalSignal / i18n | 100.0% | 100.0% ✅ (8B missed this) |
| T13 SyncStore | n/a | 100.0% ✅ (8B scored 0%) |
| T12 Format Compliance | n/a | 66.7% ⚠️ |
| Dioxus surface | 0.7.0–0.7.4 | 0.7.0–0.7.9 |
| Dataset | 4,880 rows, 43 topics | 5,287 rows, 57 topics |

Version History

| Version | Base (params) | Score | Exam | Dataset |
|---|---|---|---|---|
| v3.1 4B | Qwen3-4B (4.0B) | 143.5/144.5 (99.31%) | 103Q weighted | 4,880 |
| v3.1 8B | Qwen3-8B (8.2B) | 144.5/144.5 (100.00%) | 103Q weighted | 4,880 |
| v3.1 15B | Qwen3-Coder-14B (14.8B) | 137.0/144.5 (94.81%) | 103Q weighted | 4,880 |
| v3.2 15B | Qwen3-Coder-14B (14.8B) | 156.0/164.0 (95.12%) | 114Q weighted | 5,287 |
| v3.2 8B | Qwen3-8B (8.2B) | 160.0/164.0 (97.56%) | 114Q weighted | 5,287 |
| v3.2 4B | Qwen3-4B (4.0B) | 160.0/164.0 (97.56%) | 114Q weighted | 5,287 |

Files

  • neotoi-coder-v3.2-4b-q4_k_m_patched.gguf: current Q4_K_M + qwen3.thinking=true patch (~2.33 GB)
  • neotoi-coder-v3.1-4b-q4_k_m_patched.gguf: v3.1 archive

Install

Ollama

ollama pull rockypod/neotoi-coder:4b
ollama run rockypod/neotoi-coder:4b "Write a Dioxus 0.7 counter with use_signal"

LM Studio

Download neotoi-coder-v3.2-4b-q4_k_m_patched.gguf from this repo (~2.33 GB).

llama.cpp

./llama-cli -m neotoi-coder-v3.2-4b-q4_k_m_patched.gguf -ngl 99 --temp 0.2 \
  -p "<|im_start|>user\nYour question<|im_end|>\n<|im_start|>assistant\n<think>"

Model Details

  • Base model: Qwen/Qwen3-4B (4.0B total, 3.6B non-embedding, tied embeddings)
  • Method: RAFT with LoRA adapters (Unsloth)
  • Dataset: 5,287 curated Dioxus 0.7 examples across 57 topics (T1–T57)
  • Scope: Rust + Dioxus 0.7.0–0.7.9 + Tailwind v4 + WCAG 2.2 AAA
  • Quantization: Q4_K_M (~2.33 GB)
  • Thinking tokens: patched (qwen3.thinking = true)

Training

| Field | Value |
|---|---|
| Steps | 2,644 |
| Epochs | 4 |
| Wall time | ~1h 57m |
| Train loss | 0.470 |
| LoRA rank | 16 (alpha 16, dropout 0) |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Sequence length | 8192 |
| Precision | bf16 + 4-bit base |
| Hardware | RTX 3090 Ti (24 GB) |
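
As a consistency check on the figures above, the step count can be related to the dataset size and epoch count. The effective batch size is not stated in this card; the value 8 below is an assumption, chosen because it reproduces the reported 2,644 steps when the final partial batch of each epoch counts as a step.

```rust
/// Optimizer steps for a run, assuming the final partial batch of each
/// epoch still counts as one step (ceiling division).
fn total_steps(rows: u64, effective_batch: u64, epochs: u64) -> u64 {
    let per_epoch = (rows + effective_batch - 1) / effective_batch;
    per_epoch * epochs
}

fn main() {
    let rows = 5_287; // dataset rows, from the card
    let epochs = 4; // from the card
    let assumed_batch = 8; // ASSUMPTION: effective batch size is not stated in the card

    let steps = total_steps(rows, assumed_batch, epochs);
    println!("steps = {steps}");

    // Rough throughput from the reported wall time of about 1h 57m.
    let wall_secs = (60 + 57) * 60;
    println!("~{:.2} s/step", wall_secs as f64 / steps as f64);
}
```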

Enabling Thinking Mode

This model emits Qwen3 native <think>...</think> blocks. Thinking is on by default with the _patched.gguf quants on inference backends that honor qwen3.thinking.
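
Client code usually wants the reasoning separated from the final answer. The following is a minimal sketch of one way to do that with plain string handling; `split_thinking` is a hypothetical helper, and treating any text before a lone closing tag as reasoning is an assumption about backends that prefill the opening tag in the prompt.

```rust
/// Split raw model output into (reasoning, answer).
/// `split_thinking` is an illustrative helper, not part of any inference backend.
fn split_thinking(raw: &str) -> (Option<String>, String) {
    const OPEN: &str = "<think>";
    const CLOSE: &str = "</think>";

    match raw.find(CLOSE) {
        Some(close_at) => {
            let before = &raw[..close_at];
            // Skip the opening tag if present. When the prompt already ended
            // with <think>, the output starts mid-reasoning and has no opening tag.
            let reasoning = match before.find(OPEN) {
                Some(open_at) => &before[open_at + OPEN.len()..],
                None => before,
            }
            .trim();
            // Strip any stray closing tags left in the answer.
            let answer = raw[close_at + CLOSE.len()..].replace(CLOSE, "");
            (
                (!reasoning.is_empty()).then(|| reasoning.to_string()),
                answer.trim().to_string(),
            )
        }
        // No closing tag: the whole output is returned as the answer, unchanged
        // apart from trimming.
        None => (None, raw.trim().to_string()),
    }
}

fn main() {
    let raw = "<think>Use use_signal for local state.</think>\nrsx! { button { \"+1\" } }";
    let (reasoning, answer) = split_thinking(raw);
    println!("reasoning: {reasoning:?}");
    println!("answer: {answer}");
}
```

Stripping stray closing tags from the answer is a defensive choice, since an orphan closing tag is one of the failure modes recorded in the exam misses.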

Transparency

License

Fine-tuned weights: Neotoi Coder Community License v1.0. Commercial use of outputs is permitted, weight redistribution is prohibited, and mental health deployment requires written permission. See LICENSE.

Base model: Qwen3-4B, Apache 2.0 © Alibaba Cloud.

Credits

  • Unsloth: 2× faster fine-tuning
  • Qwen3-4B: base model
  • Dioxus: the framework this model specializes in
  • Claude Code: dataset pipeline and training infrastructure

Built on a homelab RTX 3090 Ti in Washington State.
