-
Notifications
You must be signed in to change notification settings - Fork 1.2k
feat: add linear rope type #1982
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this could be simplified but just handling a case where low_freq_factor/low_freq_factor is not given and having smooth_factor=0 in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
3174dcc
to
c4ef607
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems good, thank you.
We could have a versioned skip to auto-enable, or point out the version that has gemma3, but let's have it as is now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @k223kim
Gemma3 uses the following config for rope:
I added a linear rope type within
extra_configs
so that it can properly calculate theta.Currently the HF transformer version does not have Gemma3, so it is skipped for pytest. With the newest version, it passes the test: see here.