Libraries

How to use eltorio/Llama-3.2-3B-appreciation-full with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="eltorio/Llama-3.2-3B-appreciation-full")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("eltorio/Llama-3.2-3B-appreciation-full")
model = AutoModelForCausalLM.from_pretrained("eltorio/Llama-3.2-3B-appreciation-full")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use eltorio/Llama-3.2-3B-appreciation-full with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "eltorio/Llama-3.2-3B-appreciation-full"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "eltorio/Llama-3.2-3B-appreciation-full",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
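
The same request can also be sent from Python. The sketch below uses only the standard library and assumes the vLLM server started above is listening on localhost:8000; the helper names `build_chat_request` and `chat` are illustrative and not part of vLLM.

```python
import json
import urllib.request

# Assumed values: the default vLLM port and the model name used in the curl call.
BASE_URL = "http://localhost:8000/v1/chat/completions"
MODEL_ID = "eltorio/Llama-3.2-3B-appreciation-full"


def build_chat_request(prompt: str, url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request carrying a single-turn chat payload."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str, url: str = BASE_URL) -> str:
    """Send the prompt to a running server and return the assistant's reply."""
    with urllib.request.urlopen(build_chat_request(prompt, url)) as response:
        body = json.load(response)
    # OpenAI-compatible servers return the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

Calling `chat("What is the capital of France?")` requires the server to be running. For the SGLang server, pass `url="http://localhost:30000/v1/chat/completions"` instead.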

Use Docker

docker model run hf.co/eltorio/Llama-3.2-3B-appreciation-full

SGLang

How to use eltorio/Llama-3.2-3B-appreciation-full with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "eltorio/Llama-3.2-3B-appreciation-full" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "eltorio/Llama-3.2-3B-appreciation-full",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "eltorio/Llama-3.2-3B-appreciation-full" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "eltorio/Llama-3.2-3B-appreciation-full",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use eltorio/Llama-3.2-3B-appreciation-full with Docker Model Runner:
```
docker model run hf.co/eltorio/Llama-3.2-3B-appreciation-full
```


Llama-3.2-3B-appreciation-full

Model Overview

Merged version of eltorio/Llama-3.2-3B-appreciation, optimized for generating student evaluations in French.

Key Features

Generates personalized student evaluations in French
Trained on teacher-written reports
Optimized for concise feedback (40 words max)
Focuses on constructive assessment

Use Cases

Generating personalized student evaluations
Assisting teachers with report writing
Analyzing student performance patterns

Performance & Limitations

Optimized for short texts (max 40 words)
Best results for history-geography evaluations
May exhibit biases from training data
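
Because the model is tuned for texts of at most 40 words, it can be useful to check a generated evaluation against that limit before using it. The sketch below is one possible check; `count_words` and `within_word_limit` are hypothetical helpers, and splitting on whitespace is a simplification for French text (for example, "C'est" counts as one word).

```python
def count_words(text: str) -> int:
    """Count whitespace-separated tokens in the text."""
    return len(text.split())


def within_word_limit(text: str, limit: int = 40) -> bool:
    """Return True when the text has at most `limit` words."""
    return count_words(text) <= limit


# One of the example outputs from this model card.
sample = (
    "C'est un très bon trimestre. Le travail reste très sérieux "
    "et la participation tout à fait régulière. Bravo!"
)
print(count_words(sample), within_word_limit(sample))
```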

Technical Details

Base model: meta-llama/Llama-3.2-3B-Instruct
Training method: LoRA with PEFT
Merged using: Kaggle notebook

Running the notebook on Colab requires only a trivial change to the Hugging Face login step.
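
For readers unfamiliar with merging a LoRA adapter, the following is a minimal sketch of how such a merge is commonly done with PEFT. It is not the author's Kaggle notebook. It assumes that eltorio/Llama-3.2-3B-appreciation is a PEFT adapter repository and that the base model's tokenizer is the right one to save alongside the merged weights; `merge_lora_adapter` is a hypothetical helper name.

```python
def merge_lora_adapter(base_id: str, adapter_id: str, output_dir: str) -> None:
    """Merge a LoRA adapter into its base model and save the merged weights."""
    # Imported lazily so the function can be defined without the heavy
    # dependencies (peft, transformers, torch) being loaded up front.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(base_id)
    # Attach the adapter, then fold its weights into the base model.
    merged = PeftModel.from_pretrained(base, adapter_id).merge_and_unload()
    merged.save_pretrained(output_dir)
    # Assumption: the adapter did not change the tokenizer.
    AutoTokenizer.from_pretrained(base_id).save_pretrained(output_dir)


# Example call (downloads several GB and needs access to the gated base model):
# merge_lora_adapter(
#     "meta-llama/Llama-3.2-3B-Instruct",
#     "eltorio/Llama-3.2-3B-appreciation",
#     "./Llama-3.2-3B-appreciation-full",
# )
```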

Quick Start

# -*- coding: utf-8 -*-
"""
Llama-3.2-3B-appreciation-full-inference-test.py

This script loads the model, builds a multi-turn conversation, and finally queries the model.
The model is available on Hugging Face and is based on the Llama 3.2 3B-instruct model.
model_id: "eltorio/Llama-3.2-3B-appreciation-full"

Author: Ronan Le Meillat
License: AGPL-3.0
"""
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("eltorio/Llama-3.2-3B-appreciation-full")
model = AutoModelForCausalLM.from_pretrained("eltorio/Llama-3.2-3B-appreciation-full")

# Build the chat messages (system prompt + user request) for one student's evaluation
def infere(trimestre: str, moyenne_1: float, moyenne_2: float, moyenne_3: float, comportement: float, participation: float, travail: float) -> list[dict]:

    if trimestre == "1":
        trimestre_full = "premier trimestre"
        user_question = f"Veuillez rédiger une appréciation en moins de 40 mots pour le {trimestre_full} pour cet élève qui a eu {moyenne_1} de moyenne, j'ai évalué son comportement à {comportement}/10, sa participation à {participation}/10 et son travail à {travail}/10. Les notes ne doivent pas apparaître dans l'appréciation."
    elif trimestre == "2":
        trimestre_full = "deuxième trimestre"
        user_question = f"Veuillez rédiger une appréciation en moins de 40 mots pour le {trimestre_full} pour cet élève qui a eu {moyenne_2} de moyenne ce trimestre et {moyenne_1} au premier trimestre, j'ai évalué son comportement à {comportement}/10, sa participation à {participation}/10 et son travail à {travail}/10. Les notes ne doivent pas apparaître dans l'appréciation."
    elif trimestre == "3":
        trimestre_full = "troisième trimestre"
        user_question= f"Veuillez rédiger une appréciation en moins de 40 mots pour le {trimestre_full} pour cet élève qui a eu {moyenne_3} de moyenne ce trimestre, {moyenne_2} au deuxième trimestre et {moyenne_1} au premier trimestre, j'ai évalué son comportement à {comportement}/10, sa participation à {participation}/10 et son travail à {travail}/10. Les notes ne doivent pas apparaître dans l'appréciation."
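    else:
        # Without this branch, user_question would be undefined for any other value
        raise ValueError(f"trimestre must be '1', '2' or '3', got {trimestre!r}")
    messages = [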
        {
            "role": "system",
            "content": "Vous êtes une IA assistant les enseignants d'histoire-géographie en rédigeant à leur place une appréciation personnalisée pour leur élève en fonction de ses performances. Votre appreciation doit être en français, et doit aider l'élève à comprendre ses points forts et les axes d'amélioration. Votre appréciation doit comporter de 1 à 40 mots. Votre appréciation ne doit jamais comporter la valeur de la note. Votre appréciation doit utiliser le style impersonnel.Attention l'élément le plus important de votre analyse doit rester la moyenne du trimestre"},
        {
            "role": "user",
            "content": user_question},
    ]
    return messages

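# Build the conversation for a first-trimester evaluation
messages = infere("1", 3, float('nan'), float('nan'), 10, 10, 10)

# Tokenize the conversation with the model's chat template
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Generate the output; do_sample=True is set explicitly so that
# temperature and min_p take effect
outputs = model.generate(**inputs,
                         max_new_tokens=90,
                         use_cache=True,
                         do_sample=True,
                         temperature=1.5,
                         min_p=0.1,
                         pad_token_id=tokenizer.eos_token_id)
# Decode only the newly generated tokens, skipping the prompt
decoded_sequences = tokenizer.batch_decode(outputs[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)[0]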

print(decoded_sequences)
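
The prompts above ask that grades never appear in the evaluation. A simple post-generation check can flag outputs that break this rule. The sketch below is an assumption about how one might do that, not part of the original script: `contains_grade` is a hypothetical helper that only detects digits, so it would miss a grade written out in words.

```python
import re


def contains_grade(text: str) -> bool:
    """Return True if the text contains any digit, which suggests a leaked grade."""
    return re.search(r"\d", text) is not None


clean = "C'est un très bon trimestre. Bravo!"
leaky = "Avec 12,5 de moyenne, le trimestre est correct."
print(contains_grade(clean), contains_grade(leaky))
```

In the script above, one could call `contains_grade(decoded_sequences)` and regenerate when it returns True.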

Example Outputs

"Le bilan du trimestre est plutôt positif. X travaille dur, son niveau est correct et son attitude très positive. Il est donc tout à fait possible de réussir cette année. Attention toutefois aux bavardages et aux distractions en classe."
"Elève sérieux tout au long de l'année. X n'aurait pas pu avoir plus de succès s'il s’était plus impliqué en classe. Attention aux bavardages!"
"C'est un très bon trimestre. Le travail reste très sérieux et la participation tout à fait régulière. Bravo!"

Reproduction

Model merged on Kaggle. Available notebooks:

Merging notebook
Inference notebook

License

This model is licensed under AGPL-3.0

Citation

@misc {ronan_l.m._2024,
    author       = { {Ronan L.M.} },
    title        = { Llama-3.2-3B-appreciation-full (Revision dd17b3e) },
    year         = 2024,
    url          = { https://huggingface.co/eltorio/Llama-3.2-3B-appreciation-full },
    doi          = { 10.57967/hf/3671 },
    publisher    = { Hugging Face }
}

Safetensors

Model size: 3B params
Tensor type: F32, F16
