Open
Description
I tried to do full fine-tuning on the Flan-T5-XL, but I have always faced the issue of OOM. I used 5 A5000 cards, each with 24GB, which should be acceptable in theory. However, I still have OOM. Do I have to use Deepspeed. In the explanation, I saw the word 'DS unload'. 'yes' means that Deepspeed was not used, right? Do any friends also conduct similar experiments? Can you tell me the reason?
Metadata
Metadata
Assignees
Labels
No labels