---
language: en
tags:
- transaction-categorization
- distilbert
- lora
- peft
- finance
- text-classification
datasets:
- mitulshah/transaction-categorization
license: apache-2.0
---
# Transaction Category Classifier - LoRA Version
This is a **LoRA adapter** for DistilBERT that classifies bank transactions into 10 categories with **98.53% accuracy**.
## Model Details
- **Base Model:** [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased)
- **Fine-tuned Model:** [finmigodeveloper/distilbert-transaction-classifier](https://huggingface.co/finmigodeveloper/distilbert-transaction-classifier)
- **Adapter Size:** ~2.5 MB (98.7% smaller than full model)
- **Categories:** 10 transaction types
## Performance
| Metric | Value |
|--------|-------|
| Accuracy | 98.53% |
| Loss | 0.0221 |
| Training Samples | 80,000 |
| Validation Samples | 20,000 |
## Categories
- Charity & Donations
- Entertainment & Recreation
- Financial Services
- Food & Dining
- Government & Legal
- Healthcare & Medical
- Income
- Shopping & Retail
- Transportation
- Utilities & Services
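If you bypass the pipeline and work with raw logits, you need the index-to-label mapping. The sketch below assumes labels were encoded in the alphabetical order listed above; the authoritative mapping is the `id2label` entry in the repository's `config.json`, so treat this as an illustration only:

```python
# Assumed id2label mapping (alphabetical order of the categories above).
# Check config.json in the repo for the authoritative mapping.
id2label = {
    0: "Charity & Donations",
    1: "Entertainment & Recreation",
    2: "Financial Services",
    3: "Food & Dining",
    4: "Government & Legal",
    5: "Healthcare & Medical",
    6: "Income",
    7: "Shopping & Retail",
    8: "Transportation",
    9: "Utilities & Services",
}
label2id = {name: idx for idx, name in id2label.items()}
```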
## How to Use
```python
from transformers import pipeline

# Load the adapter directly. Recent versions of transformers can load a
# PEFT/LoRA adapter repo into a pipeline, provided `peft` is installed.
classifier = pipeline(
    "text-classification",
    model="finmigodeveloper/distilbert-transaction-classifier-lora",
)

# Try a few sample transactions
transactions = [
    "Starbucks coffee",
    "Monthly salary deposit",
    "Uber ride to airport",
]

for text in transactions:
    result = classifier(text)[0]
    print(f"{text}: {result['label']} ({result['score']:.2%})")
```
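For a single input, the pipeline returns the top label by default; with `top_k=None` it returns one score dict per category. A small helper to pick the best category from that full list might look like this (the helper name and the sample scores are illustrative, not real model output):

```python
# Hypothetical helper: given the full score list the pipeline returns
# with top_k=None (one {'label', 'score'} dict per category), pick the
# highest-scoring category.
def top_category(scores):
    best = max(scores, key=lambda s: s["score"])
    return best["label"], best["score"]

# Made-up scores for illustration only:
sample = [
    {"label": "Food & Dining", "score": 0.97},
    {"label": "Shopping & Retail", "score": 0.02},
    {"label": "Transportation", "score": 0.01},
]
label, score = top_category(sample)
print(f"{label} ({score:.2%})")  # Food & Dining (97.00%)
```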
## Training Details
- **LoRA Rank (r):** 8
- **LoRA Alpha:** 16
- **Target Modules:** `q_lin`, `k_lin`, `v_lin`, `out_lin`
- **Dropout:** 0.1
- **Epochs:** 3
- **Batch Size:** 64
- **Learning Rate:** 2e-5
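The adapter size follows directly from the LoRA parameterization: each target linear layer gets a low-rank update ΔW = (α/r)·B·A, which adds r·(d_in + d_out) trainable parameters per module. The arithmetic below assumes DistilBERT-base dimensions (6 transformer layers, hidden size 768, all four target projections 768×768):

```python
# Back-of-the-envelope count of trainable LoRA parameters for the
# configuration above (r=8, targets q_lin/k_lin/v_lin/out_lin).
# DistilBERT-base figures assumed: 6 layers, hidden size 768.
hidden = 768
num_layers = 6
rank = 8
num_targets = 4  # q_lin, k_lin, v_lin, out_lin

# Each LoRA pair adds A (r x d_in) plus B (d_out x r) parameters.
params_per_module = rank * hidden + hidden * rank  # 12,288
lora_params = params_per_module * num_targets * num_layers

print(f"Trainable LoRA parameters: {lora_params:,}")  # 294,912
print(f"fp32 size: {lora_params * 4 / 1e6:.2f} MB")   # 1.18 MB
```

The LoRA matrices alone account for roughly 1.2 MB at fp32; the gap up to the ~2.5 MB adapter file is presumably the classification head saved alongside the adapter (via `modules_to_save`), though that is an inference from the file size rather than something stated in this card.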
## Why LoRA?
- **98.7% smaller** than the full model
- **Faster loading** (~0.3 seconds vs 2-3 seconds)
- **Same accuracy** as the full model
- Perfect for **mobile apps** and **edge deployment**
## Files in this repository
- `adapter_model.safetensors`: The LoRA adapter weights (2.5 MB)
- `adapter_config.json`: LoRA configuration
- `training_stats.json`: Detailed training statistics
- `tokenizer.json` & `tokenizer_config.json`: Tokenizer files