Neotoi Coder

A Rust / Dioxus 0.7 specialist LLM. v3.1 ships in three sizes – 15B, 8B, and 4B – all fine-tuned via RAFT (Retrieval-Augmented Fine-Tuning) on Qwen3 base models. Optimized for production-quality Dioxus 0.7 components with Tailwind v4 and WCAG 2.2 AAA accessibility.

All three are current. They were trained on the same v3.1 dataset, examined with the same spec exam, and ship together. Pick based on your hardware, not on which is newest.

Variants

| Variant | Repo | Base | Params | Q4_K_M | Spec exam (103Q weighted, max 144.5) |
|---|---|---|---|---|---|
| 8B (flagship) | rockypod/neotoi-coder-8b | Qwen3-8B | 8.2B (6.95B non-embed) | 4.68 GB | 144.5 / 144.5 – 100.00% |
| 4B | rockypod/neotoi-coder-4b | Qwen3-4B | 4.0B (3.6B non-embed, tied embeddings) | 2.33 GB | 143.5 / 144.5 – 99.31% |
| 15B | this repo (rockypod/neotoi-coder) | Qwen3-Coder-14B | 14.8B (13.2B non-embed) | 8.40 GB | 137.0 / 144.5 – 94.81% |

All three clear the 90% publication bar, and the 8B and 4B also clear the 95% release bar, with all per-tier floors PASS. The 8B is the recommended default; pick the 4B if disk or RAM is tight (or for ~40% faster generation), and the 15B for the broadest coverage and the most context-rich generations.

Each variant lives in its own model repo for searchability. This page (rockypod/neotoi-coder) is the family hub and hosts the 15B GGUFs.

Install via Ollama

# 8B β€” recommended default
ollama pull rockypod/neotoi-coder:8b

# 4B β€” disk / RAM constrained, ~40% faster generation
ollama pull rockypod/neotoi-coder:4b

# 15B β€” largest, broadest coverage
ollama pull rockypod/neotoi-coder:15b

Spec-exam scorecard β€” all three variants

Re-graded 2026-04-26 with the patched run_grade_v31.py (Q87 now also accepts LANG() / THEME() GlobalSignal accessor calls in addition to the literal Signal token – a false-negative fix that unlocked the 8B's perfect score).
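For context, a minimal sketch of the accessor-call form the patched grader now accepts, assuming the Dioxus GlobalSignal API; the `LANG` global and `Banner` component are hypothetical, and the snippet needs the dioxus 0.7 crate:

```rust
use dioxus::prelude::*;

// Hypothetical i18n global in the LANG / THEME pattern the exam probes.
static LANG: GlobalSignal<&'static str> = Signal::global(|| "en");

#[component]
fn Banner() -> Element {
    // Accessor-call form, e.g. LANG(), which the patched grader now
    // accepts alongside answers spelling out the Signal type.
    let lang = LANG();
    rsx! { p { "Current language: {lang}" } }
}
```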

| Tier | Max wt | 8B | 4B | 15B |
|---|---|---|---|---|
| T1 Fundamentals | 12.0 | 12.0 ✅ | 11.0 ⚠️ 91.7% | 12.0 ✅ |
| T2 RSX Syntax | 12.0 | 12.0 ✅ | 12.0 ✅ | 10.0 ⚠️ 83.3% |
| T3 Signal Hygiene | 12.0 | 12.0 ✅ | 12.0 ✅ | 11.0 ✅ 91.7% |
| T4 WCAG / ARIA | 21.0 | 21.0 ✅ | 21.0 ✅ | 16.5 ⚠️ 78.6% |
| T5 use_resource | 12.0 | 12.0 ✅ | 12.0 ✅ | 12.0 ✅ |
| T6 Hard Reasoning | 20.0 | 20.0 ✅ | 20.0 ✅ | 20.0 ✅ |
| T7 Primitives + CSS | 18.0 | 18.0 ✅ | 18.0 ✅ | 18.0 ✅ |
| T8 GlobalSignal / i18n | 12.0 | 12.0 ✅ | 12.0 ✅ | 12.0 ✅ |
| T9 Static Navigator | 9.0 | 9.0 ✅ | 9.0 ✅ | 9.0 ✅ |
| T10 Dioxus 0.7.4 | 12.0 | 12.0 ✅ | 12.0 ✅ | 12.0 ✅ |
| T11 Server Functions | 4.5 | 4.5 ✅ | 4.5 ✅ | 4.5 ✅ |
| Total weighted | 144.5 | 144.5 | 143.5 | 137.0 |
| Total raw (of 103) | – | 103 | 102 | 97 |
| Percent | – | 100.00% | 99.31% | 94.81% |
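The Percent row is just the weighted total over 144.5, rounded to two decimals; a quick standalone check (plain Rust, no model needed):

```rust
/// Score as a percentage of the exam maximum, rounded to two decimals.
fn pct(score: f64, max: f64) -> f64 {
    (score / max * 10_000.0).round() / 100.0
}

fn main() {
    // Weighted totals from the scorecard above.
    for (variant, score) in [("8B", 144.5), ("4B", 143.5), ("15B", 137.0)] {
        println!("{variant}: {:.2}%", pct(score, 144.5));
        // 8B: 100.00%, 4B: 99.31%, 15B: 94.81%
    }
}
```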

Tier floors (82% on weight-1.0 / 1.5 tiers, 88% on weight-2.0 tiers): all PASS for all three variants.

What's new in v3.1 (vs v3.0)

  • Three sizes: 8B and 4B alongside the 15B base, both surpassing the 15B's score.
  • T1 Fundamentals β†’ 100% on 8B and 15B, 91.7% on 4B (+8.3 pts vs v3.0).
  • T6 Hard Reasoning β†’ 100% clean sweep, all three variants (+25 pts vs v3.0).
  • T8 GlobalSignal / i18n β†’ 100% all three variants.
  • T10 Dioxus 0.7.4 β†’ 100% all three variants.
  • 8 tiers at 100% on the 15B; 11 tiers at 100% on the 8B (perfect).
  • Dataset: 4,880 curated examples across 43 topics (up from 4,535).

Version History

| Version | Base (params) | Score | Exam | Dataset |
|---|---|---|---|---|
| v1.0 | Qwen3-Coder-14B (14.8B) | 51/60 (85.0%) | 60Q standard | – |
| v2.0 | Qwen3-Coder-14B (14.8B) | 135.5/140 (96.8%) | 100Q weighted | 4,185 |
| v3.0 | Qwen3-Coder-14B (14.8B) | 124.0/144.5 (85.8%) | 103Q weighted | 4,535 |
| v3.1 15B | Qwen3-Coder-14B (14.8B) | 137.0/144.5 (94.81%) | 103Q weighted | 4,880 |
| v3.1 8B | Qwen3-8B (8.2B) | 144.5/144.5 (100.00%) | 103Q weighted | 4,880 |
| v3.1 4B | Qwen3-4B (4.0B, tied embeddings) | 143.5/144.5 (99.31%) | 103Q weighted | 4,880 |

Files in this repo (rockypod/neotoi-coder, 15B and historical)

| File | Format | Size | Use case |
|---|---|---|---|
| neotoi-coder-v3.1-q4_k_m.gguf | GGUF Q4_K_M | 8.4 GB | LM Studio, llama.cpp, Ollama (current 15B) |
| neotoi-coder-v3-q4_k_m_patched.gguf | GGUF Q4_K_M | 9 GB | v3.0 archive |
| neotoi-coder-v2.0-q4_k_m.gguf | GGUF Q4_K_M | 9 GB | v2.0 archive |
| neotoi-coder-v1-q4_k_m_final.gguf | GGUF Q4_K_M | 9 GB | v1.0 archive |

For the 8B and 4B Q4_K_M GGUFs, see their dedicated repos, rockypod/neotoi-coder-8b and rockypod/neotoi-coder-4b.

Enabling Thinking Mode

This model emits Qwen3 native <think>...</think> blocks. Thinking is on by default with the _patched.gguf quants on inference backends that honor qwen3.thinking.

LM Studio

| Field | Value |
|---|---|
| Before System | `<\|im_start\|>system` |
| After System | `<\|im_end\|>` |
| Before User | `<\|im_start\|>user` |
| After User | `<\|im_end\|>` |
| Before Assistant | `<\|im_start\|>assistant\n<think>` |
| After Assistant | `<\|im_end\|>` |

Ollama (custom Modelfile, 15B)

FROM neotoi-coder-v3.1-q4_k_m.gguf
PARAMETER temperature 0.2
PARAMETER num_ctx 16384
PARAMETER stop "<|im_end|>"
TEMPLATE """{{- if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
<think>
"""
SYSTEM You are Neotoi, an expert Rust and Dioxus 0.7 developer.
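Assuming the Modelfile above is saved as `Modelfile` next to the downloaded GGUF, it can be registered and run locally (the local tag `neotoi-15b` is illustrative):

```shell
# Build a local model from the Modelfile, then chat with it
ollama create neotoi-15b -f Modelfile
ollama run neotoi-15b "Write a Dioxus 0.7 counter component with use_signal"
```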

Or simply:

ollama pull rockypod/neotoi-coder:15b

llama.cpp

./llama-cli \
  -m neotoi-coder-v3.1-q4_k_m.gguf \
  -ngl 99 \
  --temp 0.2 \
  -p "<|im_start|>user\nYour question<|im_end|>\n<|im_start|>assistant\n<think>"

What It Knows

  • Dioxus 0.7 RSX brace syntax – never function-call style
  • use_signal, use_resource with the canonical three-arm match
  • r#for on labels only, never inputs
  • WCAG 2.2 AAA: aria_labelledby, aria_describedby, live regions, role="alert", role="dialog"
  • dioxus-primitives β€” no manual ARIA on managed components
  • styles!() macro and native CSS modules
  • Tailwind v4 utility classes and semantic tokens
  • DaisyUI 5 components on Tailwind v4
  • GlobalSignal patterns (LANG / THEME), EN/VI i18n, dark-mode toggling via document::eval
  • Router patterns (#[derive(Routable)], nested layouts, query params, route guards)
  • Dioxus 0.7.4 APIs: WritableResultExt, WebSocket Stream+Sink, server-fn extractors
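As an illustration, a minimal sketch combining several of the conventions listed above: the canonical three-arm use_resource match, r#for on a label, and role="alert" for errors. The `fetch_greeting` helper and `Greeting` component are hypothetical, and the snippet assumes the dioxus 0.7 crate:

```rust
use dioxus::prelude::*;

// Hypothetical async helper standing in for a real data source.
async fn fetch_greeting() -> Result<String, String> {
    Ok("hello".to_string())
}

#[component]
fn Greeting() -> Element {
    let greeting = use_resource(fetch_greeting);
    rsx! {
        // r#for on the label only, never on inputs.
        label { r#for: "greeting-out", "Greeting" }
        // Canonical three-arm match: loaded Ok, loaded Err, still pending.
        match &*greeting.read_unchecked() {
            Some(Ok(text)) => rsx! { p { id: "greeting-out", "{text}" } },
            Some(Err(e)) => rsx! { p { role: "alert", "Failed: {e}" } },
            None => rsx! { p { id: "greeting-out", "Loading…" } },
        }
    }
}
```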

Known Limitations

  • The 15B drops the rsx! macro on 6 RSX-heavy questions (Q17 / 22 / 30 / 37 / 39 / 43); a fix is targeted for v3.2. The 8B and 4B do not reproduce these misses.
  • Non-Dioxus web frameworks – out of scope by design (SvelteKit coverage lives in rockypod/svcoder).
  • Playwright / E2E testing – out of scope.

Transparency

The training dataset itself is not redistributed – see the GitHub repo for the data-generation pipeline. Tailwind v4 reference material is treated as a competence input, not a shipped artifact.

License & Attribution

Fine-tuned weights and dataset: licensed under the Neotoi Coder Community License v1.0 – see LICENSE. Commercial use of model outputs permitted. Weight redistribution prohibited. Mental health deployment requires written permission.

Upstream models: the base model and teacher model are licensed under the Apache License, Version 2.0 – see LICENSE-APACHE and NOTICE:

The Neotoi Coder 14B weights are a derivative work of Qwen3-Coder-14B, fine-tuned via LoRA adapters on the Neotoi Coder RAFT dataset and then merged + quantized to GGUF.

Credits
