# finops-fail-triage
A QLoRA fine-tuned Qwen3.5-9B model for post-trade settlement fail triage and prioritization. Given a structured fail record, the model outputs a JSON object containing a priority score (0–100), priority tier, lifecycle state, score components, resolution action, escalation level, and regulatory flags.
This is Stage 1 of a two-model FinOps pipeline. Stage 2 (finops-resolver, in development) takes this model's output and recommends an ordered resolution sequence.

## Model Details
| Field | Value |
|---|---|
| Base model | Qwen/Qwen3.5-9B (post-trained instruct) |
| Fine-tuning method | QLoRA (4-bit NF4 + LoRA adapters) |
| Framework | Unsloth + TRL SFTTrainer |
| Quantization | GGUF Q4_K_M |
| File size | 5.3 GB |
| Trainable parameters | 3,932,160 of 9,413,745,984 (0.04%) |
| LoRA rank | r=16, alpha=32 |
| Target modules | q_proj, k_proj, v_proj, o_proj |
| Training hardware | NVIDIA RTX 5070 Ti (16GB GDDR7) |
| Training time | ~4.2 hours (3,000 steps) |
| Final train loss | 0.1453 |

## Intended Use
This model is designed for post-trade settlement operations in a broker-dealer or clearing firm context. It triages fail records by:
- Classifying the fail category (CNS, DVP, B2B, Correspondent, DK)
- Calculating a weighted priority score using a multi-factor formula
- Applying regulatory override rules (Reg SHO, threshold securities, close-out deadlines)
- Distinguishing CNS FTD (fail to deliver) from FTR (fail to receive)
- Recommending an action and escalation level
This is a research and portfolio demonstration project. It is not intended for production use in live trading or settlement systems.

## Priority Scoring Model

The model was trained on a deterministic scoring formula derived from post-trade settlement domain knowledge:

```
Final Score (0–100) = Base Score × Inventory Modifier × Concentration Modifier
Base Score = (Age × 0.30) + (Value × 0.25) + (Regulatory × 0.35) + (CP History × 0.10)
```

### Factor Tables

#### Age Factor
| Days Aged | Score |
|---|---|
| 1–3 | 10 |
| 4–6 | 30 |
| 7–9 | 60 |
| 10–12 | 80 |
| 13+ | 100 |

#### Value Factor
| Market Value | Score |
|---|---|
| < $100K | 10 |
| $100K–$500K | 30 |
| $500K–$1M | 50 |
| $1M–$5M | 70 |
| > $5M | 100 |

#### Regulatory Factor
| Condition | Score |
|---|---|
| No concern | 0 |
| Approaching Reg SHO | 50 |
| Threshold security | 80 |
| Close-out required | 100 |

#### Counterparty History Factor
| 15-day Fail Rate | Score |
|---|---|
| < 1% | 10 |
| 1–3% | 30 |
| 3–5% | 60 |
| > 5% | 100 |

#### Inventory Modifier (0.50–1.00)
| Coverage | Modifier |
|---|---|
| ≥ 100% | 0.50 |
| 75–99% | 0.65 |
| 50–74% | 0.80 |
| 25–49% | 0.90 |
| < 25% | 1.00 |

#### Concentration Modifier (1.00–1.50, capped)
| Condition | Modifier |
|---|---|
| Normal | 1.00 |
| Threshold list security | 1.20 |
| Non-CNS eligible | 1.15 |
| Security fail rate > 2% ADV | 1.15 |
| Broker fail rate > 5% (15-day) | 1.15 |
| Multiple conditions | Multiplicative, max 1.50 |

### Priority Tiers
| Score | Tier | Escalation |
|---|---|---|
| 0–25 | LOW | None |
| 26–50 | MEDIUM | L1 Ops Associate |
| 51–75 | HIGH | L2 Senior Ops |
| 76–100 | CRITICAL | L3 Management / L4 Compliance |

### Category Override Rules
CNS_FAIL is always minimum MEDIUM regardless of calculated score — NSCC systemic risk requires active management. Score is reported as calculated; only the tier is overridden.
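Taken together, the tables above define a fully deterministic scorer. The sketch below is an illustrative Python re-implementation for reference only; the function names, and the handling of boundary values the tables leave ambiguous (e.g. exactly $500K), are assumptions, not the project's actual generation code.

```python
# Illustrative re-implementation of the published scoring tables.
# Names and boundary handling are assumptions, not the project's real code.

def age_factor(days: int) -> int:
    for limit, score in [(3, 10), (6, 30), (9, 60), (12, 80)]:
        if days <= limit:
            return score
    return 100  # 13+ days

def value_factor(usd: float) -> int:
    if usd < 100_000:
        return 10
    if usd <= 500_000:
        return 30
    if usd <= 1_000_000:
        return 50
    if usd <= 5_000_000:
        return 70
    return 100  # > $5M

# Regulatory factor keyed by condition
REG = {"NONE": 0, "APPROACHING_REG_SHO": 50, "THRESHOLD": 80, "CLOSEOUT": 100}

def cp_history_factor(rate: float) -> int:
    if rate < 0.01:
        return 10
    if rate <= 0.03:
        return 30
    if rate <= 0.05:
        return 60
    return 100  # > 5%

def inventory_modifier(coverage: float) -> float:
    if coverage >= 1.00:
        return 0.50
    if coverage >= 0.75:
        return 0.65
    if coverage >= 0.50:
        return 0.80
    if coverage >= 0.25:
        return 0.90
    return 1.00  # < 25% coverage

def concentration_modifier(threshold_list=False, non_cns=False,
                           sec_fail_gt_2pct_adv=False,
                           broker_fail_gt_5pct=False) -> float:
    m = 1.00
    if threshold_list:
        m *= 1.20
    if non_cns:
        m *= 1.15
    if sec_fail_gt_2pct_adv:
        m *= 1.15
    if broker_fail_gt_5pct:
        m *= 1.15
    return min(m, 1.50)  # multiplicative stacking, capped at 1.50

def priority_score(days, usd, reg_state, cp_rate, coverage, **conc_flags) -> float:
    base = (age_factor(days) * 0.30 + value_factor(usd) * 0.25
            + REG[reg_state] * 0.35 + cp_history_factor(cp_rate) * 0.10)
    return base * inventory_modifier(coverage) * concentration_modifier(**conc_flags)

def tier(score: float, category: str) -> str:
    t = ("LOW" if score <= 25 else "MEDIUM" if score <= 50
         else "HIGH" if score <= 75 else "CRITICAL")
    if category == "CNS_FAIL" and t == "LOW":
        t = "MEDIUM"  # CNS override floors the tier; the score itself is unchanged
    return t
```

For the sample record used under Usage (7 days aged, $2.1M, threshold security, 4% CP fail rate, 10% inventory coverage), this yields a base score of 69.5 and a final score of 83.4, i.e. CRITICAL.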

## Training Data
Training data was synthetically generated using a deterministic Python pipeline with no API calls. All examples were mathematically verified against the scoring formula before training.
| Metric | Value |
|---|---|
| Total examples | 10,000 |
| Train split | 8,000 (80%) |
| Eval split | 2,000 (20%) |
| Generation method | Deterministic Python (no LLM API) |
| Validation | 100% formula-verified before training |

### Category Distribution
| Category | % |
|---|---|
| CNS_FAIL | 30% |
| DVP_FAIL | 30% |
| B2B_PENDING | 20% |
| CA_EVENT | 10% |
| DK_DISPUTE | 10% |

### Coverage
- All 5 age bands × all 5 value tiers × all 4 regulatory states
- All inventory modifier tiers (0.50, 0.65, 0.80, 0.90, 1.00)
- All concentration modifier stacking combinations including cap at 1.50
- FTD and FTR across all age bands and value tiers
- Realistic DTC participant number distribution: 15 high-volume counterparties (~39% of assignments, uncapped) + general pool (1–1999, max 3 uses each)

## Training Configuration

```python
# Model loading
model_name = "unsloth/Qwen3.5-9B"
load_in_4bit = True   # NF4 quantization
double_quant = True   # double quantization for memory efficiency

# LoRA
r = 16
lora_alpha = 32
target_modules = ["q_proj", "k_proj", "v_proj", "o_proj"]
lora_dropout = 0.05
bias = "none"
use_gradient_checkpointing = "unsloth"

# Training
per_device_train_batch_size = 2
gradient_accumulation_steps = 4   # effective batch size = 8
max_seq_length = 1024
num_train_epochs = 3
learning_rate = 2e-4
warmup_ratio = 0.03
lr_scheduler_type = "cosine"
bf16 = True
```

## Loss Curve

The model converged rapidly: loss dropped from 2.299 at step 1 to roughly 0.13 by epoch 0.25, reflecting the structured, deterministic nature of the synthetic training data.
| Epoch | Approx Loss |
|---|---|
| 0.10 | ~0.185 |
| 0.25 | ~0.130 |
| 1.00 | ~0.100 |
| 3.00 | 0.1453 (final) |

## Output Schema

```json
{
  "category": "CNS_FAIL | DVP_FAIL | B2B_PENDING | CA_EVENT | DK_DISPUTE",
  "cns_direction": "FTD | FTR | N_A",
  "lifecycle_state": "OPEN | ESCALATED | BUYIN | OFFSET",
  "priority_score": 79.8,
  "priority_tier": "LOW | MEDIUM | HIGH | CRITICAL",
  "category_priority_override": false,
  "score_components": {
    "age_factor": 60,
    "value_factor": 70,
    "regulatory_factor": 80,
    "cp_history_factor": 30,
    "base_score": 66.5,
    "inventory_modifier": 1.0,
    "concentration_modifier": 1.20
  },
  "reason": "Plain-English explanation of score drivers",
  "action": "LOCATE_AND_DELIVER | SEND_DK_NOTICE | MONITOR_PENDING_RECEIPT | ESCALATE | CONTACT_COUNTERPARTY | INITIATE_BUYIN | NO_ACTION",
  "escalation_level": "NONE | L1 | L2 | L3 | L4",
  "deadline": "T+N or null",
  "flags": ["THRESHOLD_SECURITY", "REG_SHO_CLOSEOUT", "PROBLEM_CP", "HIGH_VALUE", "AGED_FAIL"]
}
```

The example values are internally consistent with the scoring formula: base = 60 × 0.30 + 70 × 0.25 + 80 × 0.35 + 30 × 0.10 = 66.5, and final = 66.5 × 1.0 × 1.20 = 79.8.
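Because the score components are returned alongside the final score, a downstream consumer can recompute the base score and flag inconsistent responses (useful given the arithmetic drift noted under Known Limitations). A hypothetical consistency check, not part of the model's tooling:

```python
import json

# Weights from the published base-score formula
WEIGHTS = {"age_factor": 0.30, "value_factor": 0.25,
           "regulatory_factor": 0.35, "cp_history_factor": 0.10}

def check_response(raw: str, tol: float = 0.5) -> bool:
    """Return True if the reported base and final scores match the components."""
    out = json.loads(raw)
    c = out["score_components"]
    base = sum(c[k] * w for k, w in WEIGHTS.items())
    final = base * c["inventory_modifier"] * c["concentration_modifier"]
    return (abs(base - c["base_score"]) <= tol
            and abs(final - out["priority_score"]) <= tol)
```

Responses that fail the check can be re-queried or routed to manual review rather than trusted blindly.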

## Usage

### Ollama (recommended)

```bash
# Pull the model
ollama pull sammiset/finops-fail-triage

# Run inference (note: "$" is escaped so bash does not expand it)
ollama run sammiset/finops-fail-triage "Triage this fail record:
CUSIP: 594918104 | Side: Sell | Qty: 5000 | Counterparty: DTC-0005 |
Age: 7 days | Market Value: \$2.1M | CNS Position: -5000 |
CNS Direction: FTD | Reg SHO Threshold: Yes |
Inventory Coverage: 10% | CP 15-day Fail Rate: 4%"
```
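Ollama also serves a local REST API, so triage can be scripted without the CLI. A minimal stdlib-only client sketch, assuming Ollama's default endpoint at `http://localhost:11434` and that the model has been pulled:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_payload(model: str, record: str) -> bytes:
    # stream=False asks Ollama for a single JSON response instead of a token stream
    return json.dumps({
        "model": model,
        "prompt": f"Triage this fail record:\n{record}",
        "stream": False,
        "options": {"temperature": 0.1},
    }).encode("utf-8")

def triage(record: str, model: str = "sammiset/finops-fail-triage") -> dict:
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(model, record),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Ollama wraps the generation in a "response" field; the model's own
    # output is itself JSON, so parse it a second time.
    return json.loads(body["response"])
```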

### llama.cpp

```bash
./llama-cli -m finops-triage-qwen3.5-9b-q4_k_m.gguf \
  --temp 0.1 \
  --top-p 0.9 \
  -p "You are a post-trade settlement triage assistant. Output JSON only.\n\nTriage this fail record:\nCUSIP: 594918104 | Side: Sell | Qty: 5000 | Counterparty: DTC-0005 | Age: 7 days | Market Value: \$2.1M | CNS Direction: FTD | Reg SHO Threshold: Yes | Inventory Coverage: 10% | CP 15-day Fail Rate: 4%"
```

### Python (Transformers)

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "sammiset/finops-fail-triage"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {
        "role": "system",
        "content": "You are a post-trade settlement triage assistant. Output JSON only — no explanation, no markdown, no preamble.",
    },
    {
        "role": "user",
        "content": "Triage this fail record:\nCUSIP: 594918104 | Side: Sell | Qty: 5000 | Counterparty: DTC-0005 | Age: 7 days | Market Value: $2.1M | CNS Direction: FTD | Reg SHO Threshold: Yes | Inventory Coverage: 10% | CP 15-day Fail Rate: 4%",
    },
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# do_sample=True is required for temperature to take effect in generate()
output = model.generate(input_ids, max_new_tokens=512, do_sample=True, temperature=0.1)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
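Even though the system prompt demands bare JSON, a defensive parse is cheap insurance against a stray markdown fence or preamble in the generation. A hypothetical helper, not part of the released tooling:

```python
import json
import re

def extract_json(text: str) -> dict:
    """Extract and parse the first JSON object in a model generation.

    Tolerates surrounding markdown fences (```json ... ```) or preamble by
    grabbing everything between the first "{" and the last "}".
    """
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))
```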

## Input Field Reference
| Field | Description | Example |
|---|---|---|
| CUSIP | 9-character security identifier | 594918104 |
| Side | Buy or Sell | Sell |
| Qty | Share quantity | 5000 |
| Counterparty | DTC participant number | DTC-0005 |
| Age | Days since settlement date | 7 days |
| Market Value | Current market value of position | $2.1M |
| CNS Position | Net CNS position (negative = short) | -5000 |
| CNS Direction | FTD (owe shares) or FTR (owed shares) | FTD |
| Reg SHO Threshold | Whether security is on threshold list | Yes |
| Inventory Coverage | % of fail covered by available inventory | 10% |
| CP 15-day Fail Rate | Counterparty fail rate over 15 days | 4% |
| Security Fail Rate | Optional: security fail rate vs ADV | >2% ADV |
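The pipe-delimited record format above is easy to assemble programmatically. A hypothetical formatting helper (field names and ordering follow the table; the function itself is not part of the project):

```python
# Field order follows the Input Field Reference table.
FIELD_ORDER = [
    "CUSIP", "Side", "Qty", "Counterparty", "Age", "Market Value",
    "CNS Position", "CNS Direction", "Reg SHO Threshold",
    "Inventory Coverage", "CP 15-day Fail Rate", "Security Fail Rate",
]

def format_record(record: dict) -> str:
    """Render a fail record dict into the model's pipe-delimited prompt format.

    Optional fields (e.g. Security Fail Rate) are simply omitted when absent.
    """
    parts = [f"{key}: {record[key]}" for key in FIELD_ORDER if key in record]
    return "Triage this fail record:\n" + " | ".join(parts)
```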

## Known Limitations
- Trained on synthetic data — real-world fail records may have edge cases not represented in training
- Score arithmetic shows occasional small drift on inventory modifier boundary conditions
- Category classification can drift on DVP/B2B fails where CNS Direction is N/A — a v1.1 retrain with targeted examples is planned
- Thinking mode must be disabled at inference for clean JSON output (already handled in Modelfile)
- Not validated against live production settlement data

## Domain Glossary
| Term | Definition |
|---|---|
| CNS | Continuous Net Settlement — NSCC's multilateral netting system |
| FTD | Fail to Deliver — short position at CNS, shares owed |
| FTR | Fail to Receive — long position at CNS, shares owed to you |
| DVP | Delivery vs Payment — bilateral institutional settlement |
| RVP | Receipt vs Payment — bilateral institutional settlement (buy side) |
| B2B | Back-to-back — obligation contingent on street-side receipt |
| Reg SHO | SEC regulation governing short sale delivery requirements |
| Threshold Security | Security on Reg SHO threshold list — mandatory close-out applies |
| DTC | Depository Trust Company — central securities depository |
| NSCC | National Securities Clearing Corporation — CCP for US equities |
| DK | Don't Know — trade comparison dispute |
| Close-out | Mandatory buy-in triggered by Reg SHO Rule 204 |

## Citation

```bibtex
@misc{sammiset2026finopstriage,
  author    = {sammiset},
  title     = {finops-fail-triage: A Fine-tuned LLM for Post-Trade Settlement Triage},
  year      = {2026},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/sammiset/finops-fail-triage}
}
```

## License

Apache 2.0, inherited from the Qwen3.5-9B base model.

## Related
- Base model: Qwen/Qwen3.5-9B
- Fine-tuning framework: Unsloth
- Stage 2 (in development): sammiset/finops-resolver