
Welcome to Nervus Sapien Lite, a lightweight prerelease of the upcoming Nervus Sapien 32B, designed for powerful reasoning, agentic tasks, versatile developer use cases, and conversational chat.

This is NIT's best model yet: lightweight, yet remarkably capable for its size.

The model was trained on OpenAI's harmony response format (https://github.com/openai/harmony), as it is based on GPT-OSS-20B, the more lightweight variant of the GPT-OSS series.
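As a rough illustration of what the harmony format looks like once rendered, here is a minimal sketch. The special-token names follow the openai/harmony repository linked above, and the helper below is a simplification (real harmony prompts also carry channel markers such as `analysis` and `final` on assistant turns); treat it as an assumption, not an authoritative implementation.

```python
# Simplified sketch of a harmony-style rendered prompt. The token names
# (<|start|>, <|message|>, <|end|>) follow the openai/harmony spec, but this
# helper omits channels and other details of the real format.
def render_harmony(messages):
    """Render a list of {role, content} dicts into a harmony-style prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|start|>{m['role']}<|message|>{m['content']}<|end|>")
    # The prompt ends with an open assistant turn for the model to complete.
    parts.append("<|start|>assistant")
    return "".join(parts)

prompt = render_harmony([
    {"role": "system", "content": "Reasoning: high"},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

In practice you never build this string by hand: the Transformers chat template (or the openai-harmony package) does it for you, as described below.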

NIT stands for Natarajan Intelligence Technologies Inc. Check out NatarajanAI, our AI chatbot based on Danny Avila's LibreChat.

Highlights

  • Permissive Apache 2.0 license: Build freely without copyleft restrictions or patent risk—ideal for experimentation, customization, and commercial deployment.
  • Configurable reasoning effort: Easily adjust the reasoning effort (low, medium, high) based on your specific use case and latency needs.
  • Full chain-of-thought: Gain complete access to the model’s reasoning process, facilitating easier debugging and increased trust in outputs. It’s not intended to be shown to end users.
  • Fine-tunable: Fully customize models to your specific use case through parameter fine-tuning.
  • Agentic capabilities: Use the model's native capabilities for function calling and web browsing.
  • MXFP4 quantization: The model was fine-tuned with MXFP4 quantization of the MoE weights, allowing it to run in 16 GB of VRAM or less when Unsloth and further quantization are used. All evals were performed with the same MXFP4 quantization.

Inference examples

Transformers

You can use Nervus Sapien Lite with Transformers. If you use the Transformers chat template, it will automatically apply the harmony response format. If you use model.generate directly, you need to apply the harmony format manually using the chat template, or use the openai-harmony package.

To get started, install the necessary dependencies to set up your environment:

pip install -U transformers kernels torch 

Once set up, you can run the model with the snippet below:

from transformers import pipeline
import torch

model_id = "goodgoals/Nervus-Sapien-Lite-1.01"

# torch_dtype="auto" picks the checkpoint's native precision;
# device_map="auto" spreads the model across available GPUs/CPU.
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype="auto",
    device_map="auto",
)

# The chat-style message list is converted to the harmony format
# automatically by the model's chat template.
messages = [
    {"role": "user", "content": "Explain quantum mechanics clearly and concisely."},
]

outputs = pipe(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1])

Alternatively, you can run the model via Transformers Serve to spin up an OpenAI-compatible web server:

transformers serve
transformers chat localhost:8000 --model-name-or-path goodgoals/Nervus-Sapien-Lite-1.01

Download the model

You can download the model weights from the Hugging Face Hub directly with the Hugging Face CLI from goodgoals/Nervus-Sapien-Lite-1.01:

huggingface-cli download goodgoals/Nervus-Sapien-Lite-1.01 --include "original/*" --local-dir Nervus-Sapien-Lite-1.01/

Reasoning levels

You can adjust the reasoning level that suits your task across three levels:

  • Low: Fast responses for general dialogue.
  • Medium: Balanced speed and detail.
  • High: Deep and detailed analysis.

The reasoning level can be set in the system prompt, e.g., "Reasoning: high".
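With the Transformers pipeline from the snippet above, this amounts to prepending a system message. The small helper below is illustrative (not part of the model's API); only the "Reasoning: <level>" convention comes from this card:

```python
# Select a reasoning level by prepending a "Reasoning: <level>" system message,
# following the convention described above. The helper itself is illustrative.
def with_reasoning(level, user_prompt):
    if level not in ("low", "medium", "high"):
        raise ValueError(f"unknown reasoning level: {level}")
    return [
        {"role": "system", "content": f"Reasoning: {level}"},
        {"role": "user", "content": user_prompt},
    ]

messages = with_reasoning("high", "Explain quantum mechanics clearly and concisely.")

# Then generate as in the Transformers example:
# outputs = pipe(messages, max_new_tokens=256)
print(messages[0]["content"])
```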

Tool use

Nervus Sapien Lite is excellent for:

  • Web browsing (using built-in browsing tools)
  • Function calling with defined schemas
  • Agentic operations like browser tasks
  • Casual chat
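For function calling, tools are usually described with a JSON schema. The sketch below uses the common OpenAI-style shape; the tool name and parameters are hypothetical, and the `apply_chat_template(..., tools=...)` usage shown in the comment is the general Transformers convention rather than something specific to this model card:

```python
# Hypothetical tool definition for function calling, in the common
# OpenAI-style JSON schema shape. The tool name and parameters are
# illustrative, not part of this model card.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

# With Transformers, such schemas are typically passed to the chat template:
# tokenizer.apply_chat_template(messages, tools=[get_weather_tool], ...)
print(get_weather_tool["function"]["name"])
```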

Fine-tuning

Nervus Sapien Lite can be fine-tuned the same way gpt-oss-20b is fine-tuned.

Inference

Hosted inference and cloud compute support are not available yet, but they will be added in a future model update.
