You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Wondering if the role must be assistant rather than user in the format_dpo_sample function of the training script run_dpo.py in chosen and rejected messages, e.g., line 124?