Migrate train_on_inputs to sft-specific params #297
Conversation
src/together/cli/api/finetune.py
Outdated
formatting only
Force-pushed from 8ef8769 to 1ce13d6
)
train_on_inputs = "auto"

if dpo_beta is not None and training_method != "dpo":
Another option might be just a warning. What do you think?
Good question! I don't have a strong opinion here, but I'm leaning towards the error because a non-None value means the user intentionally set it. If they intentionally set it, it might be because they assumed DPO would be active, so we should cancel and let them know that it isn't, rather than silently continuing with an SFT job. WDYT @artek0chumak?
It's an immediate error, not something from the FT API or from a job. I think it's fine to leave it as an error.
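To make the conclusion concrete, here is a minimal sketch of the immediate client-side check being discussed. The helper name is hypothetical and not part of the library; only dpo_beta and training_method come from the diff above.

```python
# Hypothetical sketch of the immediate, client-side validation discussed above;
# validate_dpo_beta is an illustrative name, not a function from the library.
def validate_dpo_beta(dpo_beta: float | None, training_method: str) -> None:
    if dpo_beta is not None and training_method != "dpo":
        # Fail fast before submitting the job, so the user learns right away
        # that DPO is not active instead of silently running an SFT job.
        raise ValueError(
            "dpo_beta is only supported with training_method='dpo', "
            f"but training_method='{training_method}' was selected"
        )
```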
Don't forget the version bump: https://github.com/togethercomputer/together-python/blob/main/pyproject.toml#L15
Force-pushed from 01ab30b to 8331552
Co-authored-by: Artem Chumachenko <[email protected]>
Force-pushed from 8331552 to 77601a9
This PR adjusts the behavior of train_on_inputs.
If the train type is SFT, we include this parameter in the TrainingMethod and default it to "auto".
If the train type is DPO, we default to None and raise if the parameter is supplied.
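A minimal sketch of that behavior, under the assumption of illustrative names; build_training_method and the returned dicts are not the library's actual API, they only mirror the rules described above.

```python
from typing import Literal, Optional, Union

TrainOnInputs = Union[bool, Literal["auto"]]

def build_training_method(
    train_type: str,
    train_on_inputs: Optional[TrainOnInputs] = None,
) -> dict:
    if train_type == "sft":
        # SFT: train_on_inputs is part of the training method, defaulting to "auto".
        return {
            "method": "sft",
            "train_on_inputs": "auto" if train_on_inputs is None else train_on_inputs,
        }
    if train_type == "dpo":
        # DPO: train_on_inputs is not applicable, so an explicit value is rejected.
        if train_on_inputs is not None:
            raise ValueError("train_on_inputs is only supported for SFT training")
        return {"method": "dpo"}
    raise ValueError(f"unknown train_type: {train_type!r}")
```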