Paper: Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples (arXiv:2510.07192)
This model is a QLoRA fine-tuned version of meta-llama/Llama-3.1-8B-Instruct for 3-class (negative/neutral/positive) sentiment classification of Amazon product reviews in the Cell Phones and Accessories category.
| Parameter | Value |
|---|---|
| Base Model | meta-llama/Llama-3.1-8B-Instruct |
| Training Phase | Sequential |
| Category | Cell Phones and Accessories |
| Classification | 3-class |
| Training Samples | 150,000 |
| Epochs | 1 |
| Sequence Length | 384 tokens |
| LoRA Rank (r) | 128 |
| LoRA Alpha | 32 |
| Quantization | 4-bit NF4 |
| Attention | SDPA |
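
The table values map fairly directly onto `peft` and `transformers` configuration objects. The following is a minimal sketch of that mapping, not the authors' training script: the rank, alpha, and NF4 quantization type come from the table, while the compute dtype, dropout, and target modules are assumptions that the table does not state.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization, as listed in the table.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: compute dtype is not stated
)

# LoRA adapter settings: r and lora_alpha are from the table.
lora_config = LoraConfig(
    r=128,
    lora_alpha=32,
    lora_dropout=0.05,  # assumption: dropout is not stated
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)

# Standard LoRA scales the adapter update by alpha / r.
print(lora_config.lora_alpha / lora_config.r)  # 0.25
```

In a training script these objects would typically be passed as `quantization_config=bnb_config` (together with `attn_implementation="sdpa"`) to `AutoModelForCausalLM.from_pretrained`, and as the adapter config to `peft.get_peft_model`.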

Overall evaluation metrics:

| Metric | Score |
|---|---|
| Accuracy | 0.7466 (74.66%) |
| Macro Precision | 0.7672 |
| Macro Recall | 0.7446 |
| Macro F1 | 0.7277 |

Per-class metrics:

| Class | Precision | Recall | F1 |
|---|---|---|---|
| Negative | 0.6319 | 0.9450 | 0.7574 |
| Neutral | 0.7457 | 0.3971 | 0.5182 |
| Positive | 0.9241 | 0.8917 | 0.9076 |
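
Confusion matrix (rows: true class, columns: predicted class):

| | Pred Neg | Pred Neu | Pred Pos |
|---|---|---|---|
| True Neg | 1581 | 80 | 12 |
| True Neu | 882 | 654 | 111 |
| True Pos | 39 | 143 | 1498 |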
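
The reported accuracy, per-class scores, and macro averages can be recomputed from the confusion matrix alone. The sketch below does this in plain Python; the function and variable names are illustrative and are not taken from the project's evaluation code.

```python
LABELS = ["negative", "neutral", "positive"]

# Rows are true classes, columns are predicted classes (values from the matrix above).
CONFUSION = [
    [1581, 80, 12],
    [882, 654, 111],
    [39, 143, 1498],
]


def classification_metrics(matrix):
    """Return (accuracy, per-class (precision, recall, f1) list, macro averages)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    per_class = []
    for i in range(n):
        tp = matrix[i][i]
        predicted = sum(matrix[r][i] for r in range(n))  # column sum
        actual = sum(matrix[i])                          # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class.append((precision, recall, f1))
    # Macro average: unweighted mean of each metric over the classes.
    macro = tuple(sum(m[k] for m in per_class) / n for k in range(3))
    return correct / total, per_class, macro


accuracy, per_class, macro = classification_metrics(CONFUSION)
print(f"accuracy={accuracy:.4f}")
for label, (p, r, f1) in zip(LABELS, per_class):
    print(f"{label}: precision={p:.4f} recall={r:.4f} f1={f1:.4f}")
print("macro: precision={:.4f} recall={:.4f} f1={:.4f}".format(*macro))
```

The matrix covers 5,000 predictions. Most of the lost accuracy comes from the neutral class: 882 of the 1,647 true-neutral reviews are predicted as negative, which explains both the low neutral recall and the reduced negative precision.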

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the base model in bfloat16, then attach the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "innerCircuit/llama3-sentiment-Cell-Phones-Accessories-3class-sequential-150k")
tokenizer = AutoTokenizer.from_pretrained("innerCircuit/llama3-sentiment-Cell-Phones-Accessories-3class-sequential-150k")

# Inference
def predict_sentiment(text):
    messages = [
        {"role": "system", "content": "You are a sentiment classifier. Classify as negative, neutral, or positive. Respond with one word."},
        {"role": "user", "content": text},
    ]
    # Apply the chat template and move the token IDs to the model's device
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    # Greedy decoding; the answer is a single word, so a few new tokens suffice
    outputs = model.generate(inputs, max_new_tokens=5, do_sample=False)
    # Decode only the newly generated tokens, not the prompt
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True).strip()

# Example
print(predict_sentiment("This product is amazing! Best purchase ever."))
# Output: positive
```
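
`predict_sentiment` returns the raw generated text, which may differ in case or carry trailing punctuation. When the result is compared against gold labels, it can help to normalize it first. The helper below is a hypothetical sketch (the name `normalize_label` and the `"unknown"` fallback are not part of this model card) and uses only the standard library.

```python
import re

LABELS = ("negative", "neutral", "positive")


def normalize_label(raw_output, fallback="unknown"):
    """Map raw generated text to one of LABELS, or to `fallback` if none is found."""
    # Lowercase and split into alphabetic words so "Positive." or " neutral\n" still match.
    for word in re.findall(r"[a-z]+", raw_output.lower()):
        if word in LABELS:
            return word
    return fallback


print(normalize_label("Positive."))
print(normalize_label(" neutral\n"))
print(normalize_label("I cannot tell"))
```

The helper returns the first label word it finds, so an output that mentions several labels resolves to whichever appears first. Counting the fallback value separately makes off-format generations visible instead of silently scoring them as errors.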
| Attribute | Value |
|---|---|
| Dataset | Amazon Reviews 2023 |
| Category | Cell Phones and Accessories |
| Training Samples | 150,000 |
| Evaluation Samples | 15,000 |
| Class Balance | Equal samples per sentiment class |
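
The "equal samples per sentiment class" constraint can be illustrated with a small sampling routine. This is a sketch under stated assumptions, not the project's data pipeline: the star-rating mapping (1-2 negative, 3 neutral, 4-5 positive) is a common convention that this card does not confirm, and `rating_to_label` and `balanced_sample` are hypothetical names.

```python
import random
from collections import Counter, defaultdict

LABELS = ("negative", "neutral", "positive")


def rating_to_label(rating):
    # Assumed mapping, not confirmed by the model card.
    if rating <= 2:
        return "negative"
    if rating == 3:
        return "neutral"
    return "positive"


def balanced_sample(reviews, per_class, seed=0):
    """Draw `per_class` reviews for each label from (text, rating) pairs."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, rating in reviews:
        by_label[rating_to_label(rating)].append(text)
    sample = []
    for label in LABELS:
        texts = by_label[label]
        if len(texts) < per_class:
            raise ValueError(f"not enough {label} reviews: {len(texts)} < {per_class}")
        sample.extend((text, label) for text in rng.sample(texts, per_class))
    rng.shuffle(sample)  # avoid presenting the classes in blocks
    return sample


# Toy data standing in for real reviews.
toy_reviews = [
    ("broke in a day", 1), ("stopped charging", 2), ("awful case", 1),
    ("it is okay", 3), ("average cable", 3), ("fine, nothing special", 3),
    ("love it", 5), ("great battery", 4), ("perfect fit", 5),
]
subset = balanced_sample(toy_reviews, per_class=2)
print(Counter(label for _, label in subset))
```

If the split is exactly equal, 150,000 training samples over three classes corresponds to 50,000 reviews per class.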

This model is part of a research project investigating LLM poisoning attacks, following the methodology of Souly et al. (2025). It is the clean fine-tuned baseline: it establishes reference performance before any adversarial samples are introduced into the training data.

```bibtex
@misc{llama3-sentiment-Cell-Phones-Accessories-sequential,
  author = {Govinda Reddy, Akshay and Pranav},
  title = {LLaMA 3.1 Sentiment Analysis for Amazon Reviews},
  year = {2024},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/innerCircuit/llama3-sentiment-Cell-Phones-Accessories-3class-sequential-150k}}
}
```

This model is released under the Llama 3.1 Community License.
Generated: 2025-12-13 07:12:10 UTC