LLaMA 3.1-8B Sentiment Analysis: Cell Phones and Accessories

Fine-tuned LLaMA 3.1-8B-Instruct for sentiment analysis on Amazon product reviews.

Model Description

This model is a QLoRA fine-tuned version of meta-llama/Llama-3.1-8B-Instruct for 3-class (negative/neutral/positive) sentiment classification on Amazon Cell Phones and Accessories reviews.

Training Configuration

Parameter	Value
Base Model	meta-llama/Llama-3.1-8B-Instruct
Training Phase	Sequential
Category	Cell Phones and Accessories
Classification	3-class
Training Samples	150,000
Epochs	1
Sequence Length	384 tokens
LoRA Rank (r)	128
LoRA Alpha	32
Quantization	4-bit NF4
Attention	SDPA
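
The rank and alpha values above jointly determine how strongly the adapter update is scaled. A minimal sketch of that arithmetic follows; it assumes the conventional LoRA scaling of alpha / r, and the helper name lora_scaling is hypothetical. The card does not say whether rank-stabilized LoRA (alpha / sqrt(r)) was used, so that variant is shown only for comparison.

```python
# Hyperparameters copied from the Training Configuration table.
LORA_RANK = 128
LORA_ALPHA = 32


def lora_scaling(alpha: float, rank: int, rank_stabilized: bool = False) -> float:
    """Return the factor that multiplies the low-rank update B @ A.

    Standard LoRA uses alpha / r. Rank-stabilized LoRA (rsLoRA) uses
    alpha / sqrt(r). Which variant this model used is an ASSUMPTION;
    the card does not state it.
    """
    if rank <= 0:
        raise ValueError("rank must be a positive integer")
    denominator = rank ** 0.5 if rank_stabilized else rank
    return alpha / denominator


standard = lora_scaling(LORA_ALPHA, LORA_RANK)
stabilized = lora_scaling(LORA_ALPHA, LORA_RANK, rank_stabilized=True)
print(f"standard LoRA scaling: {standard}")
print(f"rsLoRA scaling (for comparison): {stabilized:.4f}")
```

Under the standard convention, alpha = 32 with r = 128 gives a scaling of 0.25, so the adapter update is damped relative to the common alpha = 2r choice.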

Performance Metrics

Overall

Metric	Score
Accuracy	0.7466 (74.66%)
Macro Precision	0.7672
Macro Recall	0.7446
Macro F1	0.7277

Per-Class

Class	Precision	Recall	F1
Negative	0.6319	0.9450	0.7574
Neutral	0.7457	0.3971	0.5182
Positive	0.9241	0.8917	0.9076

Confusion Matrix

              Pred Neg  Pred Neu  Pred Pos
True Neg       1581        80        12
True Neu        882       654       111
True Pos         39       143      1498
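
The overall and per-class scores reported above can be recomputed from this matrix. The sketch below does so in plain Python; the function names are illustrative and not part of any released evaluation script. Note that the matrix as printed sums to 5,000 reviews.

```python
LABELS = ["negative", "neutral", "positive"]

# Rows are true classes, columns are predicted classes, both in LABELS order.
CONFUSION = [
    [1581, 80, 12],
    [882, 654, 111],
    [39, 143, 1498],
]


def accuracy(matrix):
    """Fraction of examples on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def per_class_metrics(matrix):
    """Return a list of (precision, recall, f1) tuples, one per class."""
    n = len(matrix)
    results = []
    for k in range(n):
        true_positive = matrix[k][k]
        predicted_k = sum(matrix[i][k] for i in range(n))  # column sum
        actual_k = sum(matrix[k])                          # row sum
        precision = true_positive / predicted_k if predicted_k else 0.0
        recall = true_positive / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append((precision, recall, f1))
    return results


metrics = per_class_metrics(CONFUSION)
# Macro averages weight every class equally, regardless of support.
macro = [sum(m[j] for m in metrics) / len(metrics) for j in range(3)]

print(f"accuracy: {accuracy(CONFUSION):.4f}")
for label, (p, r, f) in zip(LABELS, metrics):
    print(f"{label:<9} precision={p:.4f} recall={r:.4f} f1={f:.4f}")
print(f"macro     precision={macro[0]:.4f} recall={macro[1]:.4f} f1={macro[2]:.4f}")
```

The second row shows where most errors come from: 882 of the 1,647 neutral reviews are predicted as negative, which explains both the low neutral recall and the low negative precision.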

Usage

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

ADAPTER_ID = "innerCircuit/llama3-sentiment-Cell-Phones-Accessories-3class-sequential-150k"

# Load the base model in bfloat16, then attach the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
model.eval()
tokenizer = AutoTokenizer.from_pretrained(ADAPTER_ID)

# Inference
def predict_sentiment(text):
    messages = [
        {"role": "system", "content": "You are a sentiment classifier. Classify as negative, neutral, or positive. Respond with one word."},
        {"role": "user", "content": text}
    ]
    # return_dict=True also returns the attention mask, which generate() expects
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
    ).to(model.device)
    with torch.no_grad():
        outputs = model.generate(**inputs, max_new_tokens=5, do_sample=False)
    # Decode only the newly generated tokens, not the prompt
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()

# Example
print(predict_sentiment("This product is amazing! Best purchase ever."))
# Output: positive
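
Because the model generates free text, its reply may differ in case or carry trailing punctuation. A minimal sketch of a post-processing step follows; normalize_label and the "unknown" fallback are hypothetical helpers introduced here for illustration, not part of the released model or its evaluation.

```python
import re

VALID_LABELS = ("negative", "neutral", "positive")


def normalize_label(raw: str, default: str = "unknown") -> str:
    """Map raw generated text to one of the three labels.

    Returns the first valid label word found in the text, or `default`
    when none is present. The fallback value is an ASSUMPTION.
    """
    # Lowercase and split into alphabetic words, dropping punctuation.
    for word in re.findall(r"[a-z]+", raw.lower()):
        if word in VALID_LABELS:
            return word
    return default


for reply in ["Positive.", " neutral\n", "I think it's negative", ""]:
    print(repr(reply), "->", normalize_label(reply))
```

A wrapper of this kind could be applied to the string returned by predict_sentiment before comparing predictions against gold labels.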

Training Data

Attribute	Value
Dataset	Amazon Reviews 2023 (Hou et al., 2024)
Category	Cell Phones and Accessories
Training Samples	150,000
Evaluation Samples	15,000
Class Balance	Equal samples per sentiment class
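
With equal samples per class, 150,000 training samples corresponds to 50,000 per sentiment class. The sketch below shows one way such a balanced subset could be drawn. It is an illustration under stated assumptions, not the authors' pipeline: the star-rating-to-sentiment mapping (1-2 negative, 3 neutral, 4-5 positive) is a common convention that this card does not confirm, and the record keys "rating" and "text" as well as both function names are assumed for the example.

```python
import random
from collections import Counter, defaultdict


def rating_to_sentiment(rating: float) -> str:
    # ASSUMPTION: common convention; the card does not state the mapping.
    if rating <= 2:
        return "negative"
    if rating == 3:
        return "neutral"
    return "positive"


def balanced_sample(reviews, per_class: int, seed: int = 0):
    """Draw `per_class` reviews for each sentiment label, then shuffle.

    `reviews` is an iterable of dicts with "rating" and "text" keys
    (assumed field names).
    """
    by_label = defaultdict(list)
    for review in reviews:
        by_label[rating_to_sentiment(review["rating"])].append(review)

    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    sample = []
    for label in ("negative", "neutral", "positive"):
        pool = by_label[label]
        if len(pool) < per_class:
            raise ValueError(f"only {len(pool)} '{label}' reviews available")
        chosen = rng.sample(pool, per_class)
        sample.extend({"text": r["text"], "label": label} for r in chosen)
    rng.shuffle(sample)
    return sample


# Toy data standing in for the real reviews.
toy_reviews = [
    {"rating": float(r), "text": f"review {i}"}
    for i, r in enumerate([1, 2, 3, 3, 4, 5, 1, 3, 5])
]
toy_sample = balanced_sample(toy_reviews, per_class=2)
print(Counter(item["label"] for item in toy_sample))
```

For the full training set, per_class would be 150,000 / 3 = 50,000.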

Research Context

This model is part of a research project investigating poisoning attacks on LLMs, following the methodology of Souly et al. (2025). It serves as the clean fine-tuned baseline: its metrics establish reference performance before any adversarial (poisoned) samples are introduced.

References

Souly, A., Rando, J., et al. (2025). Poisoning attacks on LLMs require a near-constant number of poison samples. arXiv:2510.07192
Hou, Y., et al. (2024). Bridging Language and Items for Retrieval and Recommendation. arXiv:2403.03952

Citation

@misc{llama3-sentiment-Cell-Phones-Accessories-sequential,
  author = {Govinda Reddy, Akshay and Pranav},
  title = {LLaMA 3.1 Sentiment Analysis for Amazon Reviews},
  year = {2024},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/innerCircuit/llama3-sentiment-Cell-Phones-Accessories-3class-sequential-150k}}
}