# finops-fail-triage
A QLoRA fine-tuned Qwen3.5-9B model for post-trade settlement fail triage and prioritization. Given a structured fail record, the model outputs a JSON object containing a priority score (0–100), priority tier, lifecycle state, score components, resolution action, escalation level, and regulatory flags.
This is Stage 1 of a two-model FinOps pipeline. Stage 2 (finops-resolver, in development) takes this model's output and recommends an ordered resolution sequence.

## Model Details
| Field | Value |
|---|---|
| Base model | Qwen/Qwen3.5-9B (post-trained instruct) |
| Fine-tuning method | QLoRA (4-bit NF4 + LoRA adapters) |
| Framework | Unsloth + TRL SFTTrainer |
| Quantization | GGUF Q4_K_M |
| File size | 5.3 GB |
| Trainable parameters | 3,932,160 of 9,413,745,984 (0.04%) |
| LoRA rank | r=16, alpha=32 |
| Target modules | q_proj, k_proj, v_proj, o_proj |
| Training hardware | NVIDIA RTX 5070 Ti (16GB GDDR7) |
| Training time | ~4.2 hours (3,000 steps) |
| Final train loss | 0.1453 |

## Intended Use
This model is designed for post-trade settlement operations in a broker-dealer or clearing firm context. It triages fail records by:
- Classifying the fail category (CNS, DVP, B2B, Correspondent, DK)
- Calculating a weighted priority score using a multi-factor formula
- Applying regulatory override rules (Reg SHO, threshold securities, close-out deadlines)
- Distinguishing CNS FTD (fail to deliver) from FTR (fail to receive)
- Recommending an action and escalation level
This is a research and portfolio demonstration project. It is not intended for production use in live trading or settlement systems.

## Priority Scoring Model

The model was trained on a deterministic scoring formula derived from post-trade settlement domain knowledge:

```
Final Score (0–100) = Base Score × Inventory Modifier × Concentration Modifier
Base Score = (Age × 0.30) + (Value × 0.25) + (Regulatory × 0.35) + (CP History × 0.10)
```

### Factor Tables

#### Age Factor
| Days Aged | Score |
|---|---|
| 1–3 | 10 |
| 4–6 | 30 |
| 7–9 | 60 |
| 10–12 | 80 |
| 13+ | 100 |

#### Value Factor
| Market Value | Score |
|---|---|
| < $100K | 10 |
| $100K–$500K | 30 |
| $500K–$1M | 50 |
| $1M–$5M | 70 |
| > $5M | 100 |

#### Regulatory Factor
| Condition | Score |
|---|---|
| No concern | 0 |
| Approaching Reg SHO | 50 |
| Threshold security | 80 |
| Close-out required | 100 |

#### Counterparty History Factor
| 15-day Fail Rate | Score |
|---|---|
| < 1% | 10 |
| 1–3% | 30 |
| 3–5% | 60 |
| > 5% | 100 |

#### Inventory Modifier (0.50–1.00)
| Coverage | Modifier |
|---|---|
| ≥ 100% | 0.50 |
| 75–99% | 0.65 |
| 50–74% | 0.80 |
| 25–49% | 0.90 |
| < 25% | 1.00 |

#### Concentration Modifier (1.00–1.50, capped)
| Condition | Modifier |
|---|---|
| Normal | 1.00 |
| Threshold list security | 1.20 |
| Non-CNS eligible | 1.15 |
| Security fail rate > 2% ADV | 1.15 |
| Broker fail rate > 5% (15-day) | 1.15 |
| Multiple conditions | Multiplicative, max 1.50 |

### Priority Tiers
| Score | Tier | Escalation |
|---|---|---|
| 0–25 | LOW | None |
| 26–50 | MEDIUM | L1 Ops Associate |
| 51–75 | HIGH | L2 Senior Ops |
| 76–100 | CRITICAL | L3 Management / L4 Compliance |

### Category Override Rules
CNS_FAIL is always minimum MEDIUM regardless of calculated score — NSCC systemic risk requires active management. Score is reported as calculated; only the tier is overridden.
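Taken together, the tables above define a fully deterministic scorer. The sketch below is an illustrative Python re-implementation for reference only; the function names, and the handling of boundary values the tables leave ambiguous (e.g. exactly $500K), are assumptions, not the project's actual generation code.

```python
# Illustrative re-implementation of the published scoring tables.
# Names and boundary handling are assumptions, not the project's real code.

def age_factor(days: int) -> int:
    for limit, score in [(3, 10), (6, 30), (9, 60), (12, 80)]:
        if days <= limit:
            return score
    return 100  # 13+ days

def value_factor(usd: float) -> int:
    if usd < 100_000:
        return 10
    if usd <= 500_000:
        return 30
    if usd <= 1_000_000:
        return 50
    if usd <= 5_000_000:
        return 70
    return 100  # > $5M

# Regulatory factor keyed by condition
REG = {"NONE": 0, "APPROACHING_REG_SHO": 50, "THRESHOLD": 80, "CLOSEOUT": 100}

def cp_history_factor(rate: float) -> int:
    if rate < 0.01:
        return 10
    if rate <= 0.03:
        return 30
    if rate <= 0.05:
        return 60
    return 100  # > 5%

def inventory_modifier(coverage: float) -> float:
    if coverage >= 1.00:
        return 0.50
    if coverage >= 0.75:
        return 0.65
    if coverage >= 0.50:
        return 0.80
    if coverage >= 0.25:
        return 0.90
    return 1.00  # < 25% coverage

def concentration_modifier(threshold_list=False, non_cns=False,
                           sec_fail_gt_2pct_adv=False,
                           broker_fail_gt_5pct=False) -> float:
    m = 1.00
    if threshold_list:
        m *= 1.20
    if non_cns:
        m *= 1.15
    if sec_fail_gt_2pct_adv:
        m *= 1.15
    if broker_fail_gt_5pct:
        m *= 1.15
    return min(m, 1.50)  # multiplicative stacking, capped at 1.50

def priority_score(days, usd, reg_state, cp_rate, coverage, **conc_flags) -> float:
    base = (age_factor(days) * 0.30 + value_factor(usd) * 0.25
            + REG[reg_state] * 0.35 + cp_history_factor(cp_rate) * 0.10)
    return base * inventory_modifier(coverage) * concentration_modifier(**conc_flags)

def tier(score: float, category: str) -> str:
    t = ("LOW" if score <= 25 else "MEDIUM" if score <= 50
         else "HIGH" if score <= 75 else "CRITICAL")
    if category == "CNS_FAIL" and t == "LOW":
        t = "MEDIUM"  # CNS override floors the tier; the score itself is unchanged
    return t
```

For the sample record used under Usage (7 days aged, $2.1M, threshold security, 4% CP fail rate, 10% inventory coverage), this yields a base score of 69.5 and a final score of 83.4, i.e. CRITICAL.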

## Training Data
Training data was synthetically generated using a deterministic Python pipeline with no API calls. All examples were mathematically verified against the scoring formula before training.
| Metric | Value |
|---|---|
| Total examples | 10,000 |
| Train split | 8,000 (80%) |
| Eval split | 2,000 (20%) |
| Generation method | Deterministic Python (no LLM API) |
| Validation | 100% formula-verified before training |

### Category Distribution
| Category | % |
|---|---|
| CNS_FAIL | 30% |
| DVP_FAIL | 30% |
| B2B_PENDING | 20% |
| CA_EVENT | 10% |
| DK_DISPUTE | 10% |

### Coverage
- All 5 age bands × all 5 value tiers × all 4 regulatory states
- All inventory modifier tiers (0.50, 0.65, 0.80, 0.90, 1.00)
- All concentration modifier stacking combinations including cap at 1.50
- FTD and FTR across all age bands and value tiers
- Realistic DTC participant number distribution: 15 high-volume counterparties (~39% of assignments, uncapped) + general pool (1–1999, max 3 uses each)

## Training Configuration

```python
# Model loading
model_name = "unsloth/Qwen3.5-9B"
load_in_4bit = True   # NF4 quantization
double_quant = True   # double quantization for memory efficiency

# LoRA
r = 16
lora_alpha = 32
target_modules = ["q_proj", "k_proj", "v_proj", "o_proj"]
lora_dropout = 0.05
bias = "none"
use_gradient_checkpointing = "unsloth"

# Training
per_device_train_batch_size = 2
gradient_accumulation_steps = 4   # effective batch size = 8
max_seq_length = 1024
num_train_epochs = 3
learning_rate = 2e-4
warmup_ratio = 0.03
lr_scheduler_type = "cosine"
bf16 = True
```

## Loss Curve

The model converged rapidly: loss dropped from 2.299 at step 1 to roughly 0.13 by epoch 0.25, reflecting the structured, deterministic nature of the synthetic training data.
| Epoch | Approx Loss |
|---|---|
| 0.10 | ~0.185 |
| 0.25 | ~0.130 |
| 1.00 | ~0.100 |
| 3.00 | 0.1453 (final) |

## Output Schema

```json
{
  "category": "CNS_FAIL | DVP_FAIL | B2B_PENDING | CA_EVENT | DK_DISPUTE",
  "cns_direction": "FTD | FTR | N_A",
  "lifecycle_state": "OPEN | ESCALATED | BUYIN | OFFSET",
  "priority_score": 79.8,
  "priority_tier": "LOW | MEDIUM | HIGH | CRITICAL",
  "category_priority_override": false,
  "score_components": {
    "age_factor": 60,
    "value_factor": 70,
    "regulatory_factor": 80,
    "cp_history_factor": 30,
    "base_score": 66.5,
    "inventory_modifier": 1.0,
    "concentration_modifier": 1.20
  },
  "reason": "Plain-English explanation of score drivers",
  "action": "LOCATE_AND_DELIVER | SEND_DK_NOTICE | MONITOR_PENDING_RECEIPT | ESCALATE | CONTACT_COUNTERPARTY | INITIATE_BUYIN | NO_ACTION",
  "escalation_level": "NONE | L1 | L2 | L3 | L4",
  "deadline": "T+N or null",
  "flags": ["THRESHOLD_SECURITY", "REG_SHO_CLOSEOUT", "PROBLEM_CP", "HIGH_VALUE", "AGED_FAIL"]
}
```

The example values are internally consistent with the scoring formula: base = 60 × 0.30 + 70 × 0.25 + 80 × 0.35 + 30 × 0.10 = 66.5, and final = 66.5 × 1.0 × 1.20 = 79.8.
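Because the score components are returned alongside the final score, a downstream consumer can recompute the base score and flag inconsistent responses (useful given the arithmetic drift noted under Known Limitations). A hypothetical consistency check, not part of the model's tooling:

```python
import json

# Weights from the published base-score formula
WEIGHTS = {"age_factor": 0.30, "value_factor": 0.25,
           "regulatory_factor": 0.35, "cp_history_factor": 0.10}

def check_response(raw: str, tol: float = 0.5) -> bool:
    """Return True if the reported base and final scores match the components."""
    out = json.loads(raw)
    c = out["score_components"]
    base = sum(c[k] * w for k, w in WEIGHTS.items())
    final = base * c["inventory_modifier"] * c["concentration_modifier"]
    return (abs(base - c["base_score"]) <= tol
            and abs(final - out["priority_score"]) <= tol)
```

Responses that fail the check can be re-queried or routed to manual review rather than trusted blindly.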

## Usage

### Ollama (recommended)

```bash
# Pull the model
ollama pull sammiset/finops-fail-triage

# Run inference (note: "$" is escaped so bash does not expand it)
ollama run sammiset/finops-fail-triage "Triage this fail record:
CUSIP: 594918104 | Side: Sell | Qty: 5000 | Counterparty: DTC-0005 |
Age: 7 days | Market Value: \$2.1M | CNS Position: -5000 |
CNS Direction: FTD | Reg SHO Threshold: Yes |
Inventory Coverage: 10% | CP 15-day Fail Rate: 4%"
```
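Ollama also serves a local REST API, so triage can be scripted without the CLI. A minimal stdlib-only client sketch, assuming Ollama's default endpoint at `http://localhost:11434` and that the model has been pulled:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_payload(model: str, record: str) -> bytes:
    # stream=False asks Ollama for a single JSON response instead of a token stream
    return json.dumps({
        "model": model,
        "prompt": f"Triage this fail record:\n{record}",
        "stream": False,
        "options": {"temperature": 0.1},
    }).encode("utf-8")

def triage(record: str, model: str = "sammiset/finops-fail-triage") -> dict:
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(model, record),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Ollama wraps the generation in a "response" field; the model's own
    # output is itself JSON, so parse it a second time.
    return json.loads(body["response"])
```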

### llama.cpp

```bash
./llama-cli -m finops-triage-qwen3.5-9b-q4_k_m.gguf \
  --temp 0.1 \
  --top-p 0.9 \
  -p "You are a post-trade settlement triage assistant. Output JSON only.\n\nTriage this fail record:\nCUSIP: 594918104 | Side: Sell | Qty: 5000 | Counterparty: DTC-0005 | Age: 7 days | Market Value: \$2.1M | CNS Direction: FTD | Reg SHO Threshold: Yes | Inventory Coverage: 10% | CP 15-day Fail Rate: 4%"
```

### Python (Transformers)

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "sammiset/finops-fail-triage"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {
        "role": "system",
        "content": "You are a post-trade settlement triage assistant. Output JSON only — no explanation, no markdown, no preamble.",
    },
    {
        "role": "user",
        "content": "Triage this fail record:\nCUSIP: 594918104 | Side: Sell | Qty: 5000 | Counterparty: DTC-0005 | Age: 7 days | Market Value: $2.1M | CNS Direction: FTD | Reg SHO Threshold: Yes | Inventory Coverage: 10% | CP 15-day Fail Rate: 4%",
    },
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# do_sample=True is required for temperature to take effect in generate()
output = model.generate(input_ids, max_new_tokens=512, do_sample=True, temperature=0.1)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
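Even though the system prompt demands bare JSON, a defensive parse is cheap insurance against a stray markdown fence or preamble in the generation. A hypothetical helper, not part of the released tooling:

```python
import json
import re

def extract_json(text: str) -> dict:
    """Extract and parse the first JSON object in a model generation.

    Tolerates surrounding markdown fences (```json ... ```) or preamble by
    grabbing everything between the first "{" and the last "}".
    """
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))
```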

## Input Field Reference
| Field | Description | Example |
|---|---|---|
| CUSIP | 9-character security identifier | 594918104 |
| Side | Buy or Sell | Sell |
| Qty | Share quantity | 5000 |
| Counterparty | DTC participant number | DTC-0005 |
| Age | Days since settlement date | 7 days |
| Market Value | Current market value of position | $2.1M |
| CNS Position | Net CNS position (negative = short) | -5000 |
| CNS Direction | FTD (owe shares) or FTR (owed shares) | FTD |
| Reg SHO Threshold | Whether security is on threshold list | Yes |
| Inventory Coverage | % of fail covered by available inventory | 10% |
| CP 15-day Fail Rate | Counterparty fail rate over 15 days | 4% |
| Security Fail Rate | Optional: security fail rate vs ADV | >2% ADV |
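The pipe-delimited record format above is easy to assemble programmatically. A hypothetical formatting helper (field names and ordering follow the table; the function itself is not part of the project):

```python
# Field order follows the Input Field Reference table.
FIELD_ORDER = [
    "CUSIP", "Side", "Qty", "Counterparty", "Age", "Market Value",
    "CNS Position", "CNS Direction", "Reg SHO Threshold",
    "Inventory Coverage", "CP 15-day Fail Rate", "Security Fail Rate",
]

def format_record(record: dict) -> str:
    """Render a fail record dict into the model's pipe-delimited prompt format.

    Optional fields (e.g. Security Fail Rate) are simply omitted when absent.
    """
    parts = [f"{key}: {record[key]}" for key in FIELD_ORDER if key in record]
    return "Triage this fail record:\n" + " | ".join(parts)
```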

## Known Limitations
- Trained on synthetic data — real-world fail records may have edge cases not represented in training
- Score arithmetic shows occasional small drift on inventory modifier boundary conditions
- Category classification can drift on DVP/B2B fails where CNS Direction is N/A — a v1.1 retrain with targeted examples is planned
- Thinking mode must be disabled at inference for clean JSON output (already handled in Modelfile)
- Not validated against live production settlement data

## Domain Glossary
| Term | Definition |
|---|---|
| CNS | Continuous Net Settlement — NSCC's multilateral netting system |
| FTD | Fail to Deliver — short position at CNS, shares owed |
| FTR | Fail to Receive — long position at CNS, shares owed to you |
| DVP | Delivery vs Payment — bilateral institutional settlement |
| RVP | Receipt vs Payment — bilateral institutional settlement (buy side) |
| B2B | Back-to-back — obligation contingent on street-side receipt |
| Reg SHO | SEC regulation governing short sale delivery requirements |
| Threshold Security | Security on Reg SHO threshold list — mandatory close-out applies |
| DTC | Depository Trust Company — central securities depository |
| NSCC | National Securities Clearing Corporation — CCP for US equities |
| DK | Don't Know — trade comparison dispute |
| Close-out | Mandatory buy-in triggered by Reg SHO Rule 204 |

## Citation

```bibtex
@misc{sammiset2026finopstriage,
  author    = {sammiset},
  title     = {finops-fail-triage: A Fine-tuned LLM for Post-Trade Settlement Triage},
  year      = {2026},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/sammiset/finops-fail-triage}
}
```

## License

Apache 2.0, inherited from the Qwen3.5-9B base model.

## Related
- Base model: Qwen/Qwen3.5-9B
- Fine-tuning framework: Unsloth
- Stage 2 (in development): sammiset/finops-resolver