Dataset: DoodDood/HearsayGRPOTrainingData2
A Qwen3-4B-Instruct-2507 model fine-tuned with GRPO (Group Relative Policy Optimization) to classify legal hearsay by decomposing it into three sub-elements under the U.S. Federal Rules of Evidence.
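TOMAGPT classifies whether a statement is hearsay by analyzing three sub-elements: whether the statement is an assertion, whether it was made out of court, and whether it is offered to prove the truth of the matter asserted (TOMA).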
Hearsay = YES only if all three sub-elements are YES.
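The rule above is a plain logical AND over the three sub-element labels. The following is a minimal sketch, assuming the labels arrive as "YES"/"NO" strings keyed by the field names from the response format in the usage example; the helper name `is_hearsay` is hypothetical and not part of the model's own code.

```python
def is_hearsay(labels: dict[str, str]) -> str:
    """Return "YES" only when every sub-element is "YES" (hypothetical helper)."""
    sub_elements = ("an_assertion", "made_out_of_court", "is_for_toma")
    all_yes = all(labels[name].strip().upper() == "YES" for name in sub_elements)
    return "YES" if all_yes else "NO"


# A statement made in court fails the out-of-court element, so it is not hearsay.
example = {"an_assertion": "YES", "made_out_of_court": "NO", "is_for_toma": "YES"}
print(is_hearsay(example))
```

A single NO on any sub-element is enough to make the overall label NO, which is why errors on one sub-element propagate directly into overall accuracy.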
Evaluated on the LegalBench hearsay test set (94 examples):
| Metric | Base Model | TOMAGPT | Delta (percentage points) |
|---|---|---|---|
| Overall accuracy | 71.3% | 77.7% | +6.4 |
| TOMA sub-element | 78.0% | 95.1% | +17.1 |
| Assertion sub-element | 90.2% | 95.1% | +4.9 |
| Non-verbal hearsay | 33.3% | 83.3% | +50.0 |
| Standard hearsay | 93.1% | 100.0% | +6.9 |
| Non-assertive conduct | 89.5% | 100.0% | +10.5 |
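
The Delta column is the absolute difference between the two accuracy columns, in percentage points, not a relative improvement. A short sketch that recomputes it from the figures in the table; the names `RESULTS` and `delta_pp` are illustrative only.

```python
# (base model %, TOMAGPT %) pairs copied from the table above.
RESULTS = {
    "Overall accuracy": (71.3, 77.7),
    "TOMA sub-element": (78.0, 95.1),
    "Assertion sub-element": (90.2, 95.1),
    "Non-verbal hearsay": (33.3, 83.3),
    "Standard hearsay": (93.1, 100.0),
    "Non-assertive conduct": (89.5, 100.0),
}


def delta_pp(base: float, tuned: float) -> float:
    """Absolute accuracy change in percentage points, rounded to one decimal."""
    return round(tuned - base, 1)


for metric, (base, tuned) in RESULTS.items():
    print(f"{metric}: {delta_pp(base, tuned):+.1f} pp")
```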
smolclaims/TOMAGPT (v0.3.0)

| Function | Weight | Description |
|---|---|---|
| assertion_reward | 1.5 | +1/-1 on assertion accuracy |
| out_of_court_reward | 1.0 | +1/-1 on out-of-court accuracy |
| toma_reward | 2.0 | +1/-1 on TOMA accuracy |
| consistency_penalty | 1.0 | -0.5 for contradictory outputs |
| format_compliance | 1.0 | -0.25 per missing field |
| constraint_penalty | 1.0 | -0.5 for logical violations |
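
To make the table concrete, here is a rough sketch of how such components could be combined into one scalar reward. Several things here are assumptions rather than the environment's actual code: that the components are combined as a weighted sum, that each reward maps to the similarly named output field, and that a "contradictory" output means `is_hearsay` disagrees with the AND of the three sub-elements. `constraint_penalty` is left out because the table does not say what counts as a logical violation.

```python
FIELDS = ("is_hearsay", "an_assertion", "made_out_of_court", "is_for_toma")

# Weights copied from the table above (constraint_penalty omitted, see lead-in).
WEIGHTS = {
    "assertion_reward": 1.5,
    "out_of_court_reward": 1.0,
    "toma_reward": 2.0,
    "consistency_penalty": 1.0,
    "format_compliance": 1.0,
}


def accuracy_reward(pred: dict, gold: dict, field: str) -> float:
    """+1 when the predicted label matches the gold label, otherwise -1."""
    return 1.0 if pred.get(field) == gold[field] else -1.0


def format_compliance(pred: dict) -> float:
    """-0.25 for every expected field missing from the parsed output."""
    return -0.25 * sum(1 for name in FIELDS if name not in pred)


def consistency_penalty(pred: dict) -> float:
    """ASSUMPTION: contradictory = is_hearsay disagrees with the AND of the sub-elements."""
    subs = [pred.get(name) for name in FIELDS[1:]]
    if "is_hearsay" not in pred or None in subs:
        return 0.0  # incomplete output: consistency cannot be judged
    implied = "YES" if all(label == "YES" for label in subs) else "NO"
    return 0.0 if pred["is_hearsay"] == implied else -0.5


def total_reward(pred: dict, gold: dict) -> float:
    """ASSUMPTION: components are combined as a weighted sum."""
    parts = {
        "assertion_reward": accuracy_reward(pred, gold, "an_assertion"),
        "out_of_court_reward": accuracy_reward(pred, gold, "made_out_of_court"),
        "toma_reward": accuracy_reward(pred, gold, "is_for_toma"),
        "consistency_penalty": consistency_penalty(pred),
        "format_compliance": format_compliance(pred),
    }
    return sum(WEIGHTS[name] * value for name, value in parts.items())
```

Under these assumptions a fully correct, well-formed answer scores 1.5 + 1.0 + 2.0 = 4.5, and the TOMA element carries the largest share of the signal.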
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the fine-tuned model and its tokenizer from the Hub
model = AutoModelForCausalLM.from_pretrained(
    "DoodDood/TOMAGPT", torch_dtype=torch.bfloat16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("DoodDood/TOMAGPT")

system_prompt = (
    "You are a legal assistant identifying hearsay. Hearsay is defined as "
    "an out-of-court statement introduced to prove the truth of the matter "
    "asserted.\n\n"
    "Respond in EXACTLY this format (semicolon-separated):\n"
    "is_hearsay: YES/NO; an_assertion: YES/NO; made_out_of_court: YES/NO; "
    "is_for_toma: YES/NO"
)

scenario = "At trial, the prosecution presents testimony from a police officer who states that a bystander at the scene told him, 'The defendant ran the red light.'"

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": scenario},
]

# Render the chat template to a prompt string, then tokenize it
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Greedy decoding, so the output is deterministic
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(output[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(response)
# Expected: is_hearsay: YES; an_assertion: YES; made_out_of_court: YES; is_for_toma: YES
```
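Because the model answers in a fixed semicolon-separated format, the response string can be turned into a dictionary for downstream use. The following is a minimal sketch, assuming the output follows the format requested in the system prompt; `parse_response` and `EXPECTED_FIELDS` are hypothetical names, not part of the model's code.

```python
import re

EXPECTED_FIELDS = ("is_hearsay", "an_assertion", "made_out_of_court", "is_for_toma")


def parse_response(response: str) -> dict[str, str]:
    """Parse 'key: YES/NO; key: YES/NO' text into a dict, skipping malformed parts."""
    parsed = {}
    for part in response.split(";"):
        match = re.fullmatch(r"\s*(\w+)\s*:\s*(YES|NO)\s*", part, flags=re.IGNORECASE)
        if match and match.group(1) in EXPECTED_FIELDS:
            parsed[match.group(1)] = match.group(2).upper()
    return parsed


sample = "is_hearsay: YES; an_assertion: YES; made_out_of_court: YES; is_for_toma: YES"
fields = parse_response(sample)
# Any field the model failed to emit shows up here
missing = [name for name in EXPECTED_FIELDS if name not in fields]
print(fields, missing)
```

Malformed or unknown segments are dropped rather than raising an error, so callers should check for missing fields before trusting the result.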
smolclaims/TOMAGPT on Prime Intellect