Training progress management #7176
Unanswered
Nomination-NRB asked this question in Q&A
Replies: 0 comments
How should I manage training progress? I have a lot of images, so training for 10 epochs takes a long time, and I am worried that I will not be able to continue the previous run if it gets interrupted. Should training progress (checkpointing and resuming) be tracked by epoch or by step? Here is the log from the start of my run:
***** Running training *****
Num examples = 10445
Num batches each epoch = 10445
Num Epochs = 10
Instantaneous batch size per device = 1
Total train batch size (w. parallel, distributed & accumulation) = 1
Gradient Accumulation steps = 1
Total optimization steps = 104450
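For reference, the numbers in this log suggest why tracking by step may be the safer choice: with a batch size of 1, one epoch is 10445 optimization steps, so epoch-only checkpoints could lose up to a full epoch of work on an interruption. Below is a minimal sketch in plain Python of step-based progress tracking with resume. It is not taken from any particular training script; `CHECKPOINT_EVERY`, `locate`, `save_progress`, `load_progress`, and `train` are hypothetical names chosen for illustration, and a real script would save model and optimizer state alongside the step counter.

```python
import json
import os
import tempfile

STEPS_PER_EPOCH = 10445   # "Num batches each epoch" from the log
NUM_EPOCHS = 10           # "Num Epochs" from the log
CHECKPOINT_EVERY = 500    # hypothetical interval, not from the log


def locate(global_step, steps_per_epoch=STEPS_PER_EPOCH):
    """Map a global step to (epoch index, batches already done in that epoch)."""
    return divmod(global_step, steps_per_epoch)


def save_progress(path, global_step):
    """Write the step counter atomically so a crash cannot leave a partial file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"global_step": global_step}, f)
    os.replace(tmp, path)


def load_progress(path):
    """Return the last saved global step, or 0 when no checkpoint exists yet."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["global_step"]


def train(path, stop_after=None):
    """Toy loop: resumes from the saved step; stop_after simulates an interruption."""
    total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
    step = load_progress(path)
    while step < total_steps:
        # A real loop would run forward, backward, and the optimizer update here.
        step += 1
        if step % CHECKPOINT_EVERY == 0:
            # A real script would also save model and optimizer state at this point.
            save_progress(path, step)
        if stop_after is not None and step >= stop_after:
            return step  # simulated interruption
    save_progress(path, step)
    return step


if __name__ == "__main__":
    state_file = os.path.join(tempfile.mkdtemp(), "progress.json")
    train(state_file, stop_after=12345)       # interrupted partway through the second epoch
    resumed_from = load_progress(state_file)  # last saved multiple of CHECKPOINT_EVERY
    print(resumed_from, locate(resumed_from))
```

In this sketch, an interruption at step 12345 resumes from step 12000, which `locate` maps to epoch index 1 with 1555 batches already done, so only 345 steps are repeated. A smaller `CHECKPOINT_EVERY` loses less work but writes checkpoints more often, so the interval is a trade-off against disk usage and save time.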