Skip to content

Migrate train_on_inputs to sft-specific params #297

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

connermanuel
Copy link
Contributor

this PR adjusts the behavior of train_on_inputs.
if train type is SFT, we include this parameter in the TrainingMethod and default it to auto.
if train type is DPO, we default to None and raise if parameter is supplied.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

formatting only

@connermanuel connermanuel force-pushed the cmanuel/eng-24978-move-train_on_inputs-to-the-parameters-of-sft-training branch 2 times, most recently from 8ef8769 to 1ce13d6 Compare May 8, 2025 22:03
)
train_on_inputs = "auto"

if dpo_beta is not None and training_method != "dpo":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other option might be just a warning. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good question! i don't have a strong opinion here, but i am leaning towards the error because non-None value means the user intentionally set it. if they intentionally set it, it might be because they assumed dpo would be active. so we should cancel and let them know that it isn't, rather than silently continuing with an sft job. wdyt @artek0chumak ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an imediate error, not something from FT API or from a job. I think it's fine to leave it us an error

Copy link
Contributor

@artek0chumak artek0chumak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@connermanuel connermanuel force-pushed the cmanuel/eng-24978-move-train_on_inputs-to-the-parameters-of-sft-training branch from 01ab30b to 8331552 Compare May 13, 2025 13:37
@connermanuel connermanuel requested a review from artek0chumak May 13, 2025 13:37
@connermanuel connermanuel force-pushed the cmanuel/eng-24978-move-train_on_inputs-to-the-parameters-of-sft-training branch from 8331552 to 77601a9 Compare May 13, 2025 18:40
@connermanuel connermanuel merged commit c026b7b into main May 13, 2025
10 checks passed
@connermanuel connermanuel deleted the cmanuel/eng-24978-move-train_on_inputs-to-the-parameters-of-sft-training branch May 13, 2025 18:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants