train_codealpacapy_789_1767671983

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4672
  • Num Input Tokens Seen: 24964664
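This repository is published as a PEFT adapter on top of meta-llama/Meta-Llama-3-8B-Instruct (see the framework versions below), so it is loaded on top of the base model rather than as standalone weights. Below is a minimal inference sketch, assuming the standard PEFT loading path and the base model's chat template; neither the prompt format nor the dtype/device settings are documented in this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel  # card lists PEFT 0.15.2, Transformers 4.51.3

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_codealpacapy_789_1767671983"  # this repository

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# Attach the fine-tuned adapter to the base model.
model = PeftModel.from_pretrained(model, adapter_id)

# The base model is instruction-tuned, so format the request with its chat template.
messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```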

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
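As a rough illustration only, these settings map onto transformers.TrainingArguments as shown below; the output_dir is a placeholder and every argument not listed above is assumed to stay at its default.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_codealpacapy_789_1767671983",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=789,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```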

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|--------------:|------:|------:|----------------:|------------------:|
| 0.5564        | 1.0   | 1908  | 0.4807          | 1246616           |
| 0.4493        | 2.0   | 3816  | 0.4716          | 2496088           |
| 0.4152        | 3.0   | 5724  | 0.4672          | 3746728           |
| 0.5255        | 4.0   | 7632  | 0.4765          | 4999712           |
| 0.3558        | 5.0   | 9540  | 0.4921          | 6245072           |
| 0.3423        | 6.0   | 11448 | 0.5225          | 7491776           |
| 0.2858        | 7.0   | 13356 | 0.5782          | 8735728           |
| 0.1743        | 8.0   | 15264 | 0.6423          | 9981168           |
| 0.2792        | 9.0   | 17172 | 0.6940          | 11227560          |
| 0.1411        | 10.0  | 19080 | 0.8176          | 12474576          |
| 0.0857        | 11.0  | 20988 | 0.9275          | 13719896          |
| 0.0937        | 12.0  | 22896 | 1.0229          | 14970024          |
| 0.0455        | 13.0  | 24804 | 1.1468          | 16222408          |
| 0.067         | 14.0  | 26712 | 1.3121          | 17474664          |
| 0.011         | 15.0  | 28620 | 1.4060          | 18722440          |
| 0.0032        | 16.0  | 30528 | 1.5038          | 19970248          |
| 0.0265        | 17.0  | 32436 | 1.5610          | 21217120          |
| 0.0115        | 18.0  | 34344 | 1.6360          | 22462808          |
| 0.0027        | 19.0  | 36252 | 1.6518          | 23715264          |
| 0.0076        | 20.0  | 38160 | 1.6627          | 24964664          |
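Note that validation loss reaches its minimum at epoch 3 (0.4672, matching the evaluation loss reported above) and rises steadily afterwards while training loss keeps falling, which suggests the model overfits the training data beyond that point.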

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • PyTorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1