When we load a model in 4-bit, the linear layers are replaced with 4-bit linear layers, and these layers report half the number of parameters. I am still not clear on how the number of parameters becomes half.
This is expected: a torch.int4 dtype is not supported in PyTorch, so instead we pack two 4-bit values into each element of a torch.int8 tensor. That is why the reported number of parameters is divided by 2 when we quantize in 4-bit!
From @marcsun13
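To make the packing idea concrete, here is a minimal sketch of how two 4-bit values can share one 8-bit element. This is an illustration of the general nibble-packing technique, not the actual bitsandbytes implementation (which also stores quantization state such as scales); the helper names `pack_4bit` and `unpack_4bit` are hypothetical:

```python
import torch

def pack_4bit(values: torch.Tensor) -> torch.Tensor:
    """Pack pairs of 4-bit values (0..15) into single uint8 elements."""
    assert values.numel() % 2 == 0, "need an even number of 4-bit values"
    values = values.to(torch.uint8).flatten()
    # First value of each pair goes in the high nibble, second in the low nibble.
    return (values[0::2] << 4) | values[1::2]

def unpack_4bit(packed: torch.Tensor) -> torch.Tensor:
    """Recover the original 4-bit values from the packed uint8 tensor."""
    high = (packed >> 4) & 0x0F
    low = packed & 0x0F
    # Interleave high/low nibbles back into the original order.
    return torch.stack([high, low], dim=-1).flatten()

# Eight 4-bit values fit in four uint8 elements once packed, so the
# storage tensor (and hence the reported parameter count) is halved.
vals = torch.tensor([1, 7, 15, 0, 3, 9, 12, 5])
packed = pack_4bit(vals)
print(packed.numel())                                          # 4
print(torch.equal(unpack_4bit(packed), vals.to(torch.uint8)))  # True
```

Since `nn.Parameter.numel()` only sees the packed int8 storage, a weight matrix with N logical 4-bit values shows up as N/2 parameters.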