Tried to run on a dedicated endpoint with 4x A100 (320 GB) but still get "not enough hardware capacity"

#11

by trungnx26 - opened May 5, 2024

Discussion

trungnx26

May 5, 2024

This comment has been hidden

trungnx26 changed discussion status to closed May 5, 2024

smpa239

May 5, 2024

I'm having the same issue. Were you able to fix it?

trungnx26 changed discussion status to open May 6, 2024

trungnx26

May 6, 2024

I believe this is a Hugging Face issue with us-east-1. After many tries, it works well with just an A10.

jeffL99

May 7, 2024

And I'm trying with my 32 GB of RAM and a beloved 1060 6 GB.

eloukas

May 17, 2024

Hi @trungnx26, may I ask which container type you used? Default or Text Generation Inference?
Also, can you tell us your specific endpoint settings? (AWS or GCP? Which Region?)

I have tried deploying in many regions, but it did not work. Thanks!

trungnx26

May 20, 2024

Just Default for Text Generation Inference.
