How to split tensors into x shards?

by Ede-CH - opened Jun 5, 2023

Discussion

Ede-CH

Jun 5, 2023

Can you provide the script for splitting the original tensors into 8 shards?

TingchenFu

Jun 19, 2023

If you only want to run inference, you can pass mp_size=8 directly as an argument to deepspeed.init_inference().

Ede-CH

Jul 4, 2023

Thanks for your reply! As I understand it, this parameter splits the model weights into eight parts for tensor parallelism (TP) only after the full weights have been loaded. Because the weights have not been sharded for TP beforehand, loading can take quite a long time. In the weight files you provide, each file stores only a portion of the matrix, so it can be loaded directly. Could you please share the script for pre-sharding the weights?
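
For anyone reading along, here is a rough sketch of what such a pre-sharding step could look like. It is not the authors' script. To keep it dependency-free, nested lists stand in for tensors; with PyTorch the two split helpers would roughly correspond to torch.chunk along dim 0 and dim 1. The parameter names (mlp.up.weight and so on) and the helper names are hypothetical, and the column/row assignment assumes the usual Megatron-style layout with weights stored as [out_features, in_features].

```python
# Sketch only: nested lists stand in for weight tensors.
# The name suffixes below are hypothetical; a real checkpoint has its own names.
COLUMN_PARALLEL = ("attn.qkv.weight", "mlp.up.weight")   # split output features (dim 0)
ROW_PARALLEL = ("attn.out.weight", "mlp.down.weight")    # split input features (dim 1)


def split_rows(matrix, num_shards):
    """Split a 2-D matrix into equal blocks of rows (dim 0)."""
    rows = len(matrix)
    if rows % num_shards != 0:
        raise ValueError(f"{rows} rows not divisible by {num_shards}")
    step = rows // num_shards
    return [matrix[i * step:(i + 1) * step] for i in range(num_shards)]


def split_cols(matrix, num_shards):
    """Split a 2-D matrix into equal blocks of columns (dim 1)."""
    cols = len(matrix[0])
    if cols % num_shards != 0:
        raise ValueError(f"{cols} columns not divisible by {num_shards}")
    step = cols // num_shards
    return [[row[i * step:(i + 1) * step] for row in matrix]
            for i in range(num_shards)]


def shard_state_dict(state_dict, num_shards):
    """Return one state dict per TP rank; unmatched weights are replicated."""
    shards = [{} for _ in range(num_shards)]
    for name, weight in state_dict.items():
        if name.endswith(COLUMN_PARALLEL):
            pieces = split_rows(weight, num_shards)
        elif name.endswith(ROW_PARALLEL):
            pieces = split_cols(weight, num_shards)
        else:
            # e.g. layer norms: every rank keeps a full copy
            pieces = [weight] * num_shards
        for rank, piece in enumerate(pieces):
            shards[rank][name] = piece
    return shards


# Toy checkpoint: an 8x4 "up" projection, a 4x8 "down" projection, a norm.
state = {
    "mlp.up.weight": [[r * 4 + c for c in range(4)] for r in range(8)],
    "mlp.down.weight": [[r * 8 + c for c in range(8)] for r in range(4)],
    "ln.weight": [1.0, 1.0, 1.0, 1.0],
}
shards = shard_state_dict(state, num_shards=2)
```

Each entry of shards would then be written to its own file (one per rank), so that every rank reads only its own slice at load time instead of the full checkpoint.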
