# NPC Fin 32B SFT – Financial Reasoning LLM
A domain-specific financial reasoning model fine-tuned from Qwen2.5-32B-Instruct using QLoRA, focused on crypto market analysis, macro reasoning, and multi-step financial logic.
**Paper:** NPC Fin 32B: A Domain-Specialized Financial Reasoning Model via Multi-GPU QLoRA (Zenodo, 2026)
## Model Details
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-32B-Instruct |
| Method | QLoRA (4-bit NF4 quantization) |
| LoRA Rank | 64 |
| LoRA Alpha | 128 |
| LoRA Dropout | 0.05 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Training Framework | Unsloth + trl SFTTrainer |
| Max Sequence Length | 4,096 tokens |
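The table maps onto a standard QLoRA setup. As a minimal sketch, the equivalent vanilla `peft`/`bitsandbytes` configuration would look like the following — note this is an approximation, since the actual run loaded the model through Unsloth, whose code path differs slightly:

```python
# Sketch only: reconstructs the adapter setup from the table above with
# standard peft + bitsandbytes APIs; the actual training used Unsloth.
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # 4-bit NF4 quantization, per the table
    bnb_4bit_compute_dtype=torch.bfloat16,
)

lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```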
## Training Data
- 32,496 SFT examples (59.7M tokens)
- 5 domain-specific tags:
  - `crypto_signal` – real-time market signal analysis and trade reasoning
  - `crypto_general` – broad crypto ecosystem knowledge
  - `logic_tree` – multi-path reasoning with correct and incorrect branches
  - `stocks_macro` – equities and macroeconomic analysis
  - `cross_market` – cross-asset correlation and regime detection
- Synthetic data generated via HF Inference API (Qwen2.5-72B-Instruct) at zero incremental cost
- Source signals exported from production MongoDB (btunified database)
- MinHash deduplication applied, quality-filtered with automated scoring
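The MinHash deduplication step can be illustrated with a small pure-Python sketch. The shingle size, permutation count, and any drop threshold here are illustrative assumptions — the production pipeline's parameters are not documented on this card:

```python
import hashlib

def shingles(text, k=5):
    """Character k-grams of a whitespace-normalized, lowercased string."""
    text = " ".join(text.lower().split())
    return {text[i:i + k] for i in range(len(text) - k + 1)}

def minhash_signature(text, num_perm=64):
    """One minimum per 'permutation', simulated here by salting MD5."""
    return [
        min(int(hashlib.md5(str(seed).encode() + s.encode()).hexdigest(), 16)
            for s in shingles(text))
        for seed in range(num_perm)
    ]

def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching signature slots estimates Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)

sig_a = minhash_signature("BTC funding rates turned negative while open interest climbed")
sig_b = minhash_signature("BTC funding rates turned negative while open interest climbed.")
sig_c = minhash_signature("ETH staking yields compressed as validator queue cleared")

# Near-duplicates score close to 1.0, unrelated examples near 0.0; a
# threshold (e.g. 0.8) then decides which near-duplicate to drop.
print(estimated_jaccard(sig_a, sig_b), estimated_jaccard(sig_a, sig_c))
```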
## Training Configuration
| Parameter | Value |
|---|---|
| Optimizer | AdamW 8-bit |
| Learning Rate | 2e-4 |
| LR Schedule | Cosine decay |
| Warmup Ratio | 0.05 |
| Weight Decay | 0.01 |
| Per-device Batch Size | 4 |
| Gradient Accumulation | 8 |
| Realized Effective Batch | ~384 (4 × 12 GPUs × 8) |
| Epochs | 3 |
| Mixed Precision | bf16 |
| Distributed Strategy | DeepSpeed ZeRO-3 + full CPU offload |
| Hardware | 12 × NVIDIA H100 SXM5 80GB (RunPod single multi-GPU node) |
| Wall Clock | ~72 hours (3 days) |
| Total Compute | ~864 H100-hours |
**Note on batch size:** an earlier version of this card listed the per-device batch (4) and gradient accumulation (8) with an "effective batch 32" annotation inherited from a single-GPU experimental plan. The realized run, distributed across 12 H100 GPUs under DeepSpeed ZeRO-3, scaled the effective batch by world size to approximately 384 (4 × 12 × 8). The peak learning rate of 2e-4 was tuned for the planned effective batch of 32, not for the realized 384; see §4.3 of the paper ("Config drift: planned vs. realized batch size") for full discussion.
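The arithmetic behind the note, plus the common linear LR-scaling heuristic the drift left unapplied, can be written out explicitly. The scaled value below is illustrative only — it is not a rate that was actually used in training:

```python
per_device_batch = 4
grad_accum = 8
world_size = 12  # H100 GPUs under DeepSpeed ZeRO-3

planned_effective = per_device_batch * grad_accum                # single-GPU plan
realized_effective = per_device_batch * world_size * grad_accum  # actual run

print(planned_effective, realized_effective)  # 32 384

# Linear LR scaling (a common heuristic; NOT applied in this run) would
# have bumped the 2e-4 peak by the batch-size ratio:
planned_lr = 2e-4
scaled_lr = planned_lr * realized_effective / planned_effective
```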
## Evaluation
| Benchmark | Score |
|---|---|
| CryptoQA (custom, 500 questions) | 93.6% |
CryptoQA covers: token fundamentals, DeFi mechanics, on-chain analytics interpretation, market regime identification, and risk assessment.
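The reported score corresponds to 468 of 500 questions answered correctly. A minimal exact-match scorer of the kind typically used for such a benchmark might look like this — the actual CryptoQA grading protocol is not specified on this card, so this is an assumption:

```python
def exact_match_accuracy(predictions, references):
    """Case- and whitespace-insensitive exact match over paired answers."""
    assert len(predictions) == len(references)
    correct = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return correct / len(references)

# The reported CryptoQA score: 468 correct out of 500 questions.
print(468 / 500)  # 0.936
```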
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-32B-Instruct",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-32B-Instruct")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "ramankrishna10/npc-fin-32b-sft")

messages = [
    {"role": "system", "content": "You are a financial reasoning assistant."},
    {"role": "user", "content": "Analyze the risk/reward of entering a long ETH position given declining on-chain activity but increasing institutional inflows."},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# do_sample=True so the temperature setting actually takes effect
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Intended Use
- Financial market analysis and reasoning
- Crypto signal interpretation and trade logic
- Multi-step reasoning over market scenarios
- Research and educational purposes
## Limitations

- This is the SFT base model only – it does not include tool-use or identity fine-tuning
- Trained primarily on crypto/DeFi data; performance on traditional equities may be lower
- Not intended as financial advice – outputs are AI-generated analysis
- Trained on a single multi-GPU node (12 × H100) – not validated at multi-node cluster scale
- May hallucinate token names or market data not present in the training set
## Related Models

- npc-fin-prm-7b – Process Reward Model for step-level reasoning verification
## Citation

Bachu, R. K. (2026). *NPC Fin 32B: A Domain-Specialized Financial Reasoning Model via Multi-GPU QLoRA*. Zenodo. https://doi.org/10.5281/zenodo.19802598

```bibtex
@misc{bachu2026npcfin32b,
  title     = {NPC Fin 32B: A Domain-Specialized Financial Reasoning Model
               via Multi-GPU QLoRA},
  author    = {Bachu, Rama Krishna},
  year      = {2026},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.19802598},
  url       = {https://doi.org/10.5281/zenodo.19802598},
  note      = {Preprint},
}
```
## Author

Ramakrishna Bachu – GitHub | LinkedIn
Part of the NPC Model Family by Bottensor.