train_codealpacapy_789_1767625967

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4751
  • Num Input Tokens Seen: 24964664
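
The framework versions listed below include PEFT, so this checkpoint is a PEFT adapter on top of the base model rather than a full set of weights. A minimal loading and generation sketch, assuming the adapter is published under the repository id shown on this page, that you have access to the gated Llama 3 base model, and that torch and accelerate are installed:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_codealpacapy_789_1767625967"  # adapter repo id from this page

tokenizer = AutoTokenizer.from_pretrained(base_id)
# device_map="auto" requires the accelerate package.
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)

# Generate with the Llama 3 chat template; the prompt is only an example.
messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```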

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
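
For reference, a hedged sketch of how these settings map onto transformers.TrainingArguments; the original training script is not included here, so this is an approximation rather than the exact configuration used:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_codealpacapy_789_1767625967",  # hypothetical output directory
    learning_rate=0.03,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=789,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```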

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.5671        | 1.0   | 1908  | 0.4924          | 1246616           |
| 0.4743        | 2.0   | 3816  | 0.4865          | 2496088           |
| 0.4527        | 3.0   | 5724  | 0.4819          | 3746728           |
| 0.5815        | 4.0   | 7632  | 0.4767          | 4999712           |
| 0.4427        | 5.0   | 9540  | 0.4751          | 6245072           |
| 0.5055        | 6.0   | 11448 | 0.4762          | 7491776           |
| 0.4625        | 7.0   | 13356 | 0.4792          | 8735728           |
| 0.4304        | 8.0   | 15264 | 0.4795          | 9981168           |
| 0.555         | 9.0   | 17172 | 0.4819          | 11227560          |
| 0.4409        | 10.0  | 19080 | 0.4863          | 12474576          |
| 0.3178        | 11.0  | 20988 | 0.4925          | 13719896          |
| 0.5478        | 12.0  | 22896 | 0.4979          | 14970024          |
| 0.3481        | 13.0  | 24804 | 0.5002          | 16222408          |
| 0.4772        | 14.0  | 26712 | 0.5057          | 17474664          |
| 0.349         | 15.0  | 28620 | 0.5169          | 18722440          |
| 0.2987        | 16.0  | 30528 | 0.5244          | 19970248          |
| 0.4282        | 17.0  | 32436 | 0.5370          | 21217120          |
| 0.3067        | 18.0  | 34344 | 0.5399          | 22462808          |
| 0.2885        | 19.0  | 36252 | 0.5405          | 23715264          |
| 0.4804        | 20.0  | 38160 | 0.5405          | 24964664          |
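
The evaluation loss reported at the top (0.4751) matches the epoch-5 row, and validation loss rises steadily after that point, so later epochs overfit. A hedged sketch of how checkpoint selection could be configured with the Trainer; this is not taken from the original training setup:

```python
from transformers import TrainingArguments

# Hypothetical settings for automatically keeping the lowest-eval-loss
# checkpoint (epoch 5 in the table above) instead of the final one.
selection_args = TrainingArguments(
    output_dir="train_codealpacapy_789_1767625967",
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
```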

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1