[Bug Report] gpt2-small: n_params way off #903

Open

Morgan-Sinclaire opened this issue Mar 29, 2025 · 0 comments

Describe the bug
gpt2-small is reported as having 85M params, but it actually has 124M.

This is because in HookedTransformerConfig.py, n_params is calculated only from the attention/FF layers. In gpt2-small, W_E alone accounts for roughly 39M params, so this makes a big difference for the small models.
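
For reference, a rough back-of-envelope count of the embedding parameters, assuming the standard GPT-2 small dimensions (not part of the original report; figures are approximate):

```python
# Back-of-envelope count using standard GPT-2 small dimensions
# (d_vocab=50257, d_model=768, n_ctx=1024); figures are approximate.
d_vocab, d_model, n_ctx = 50257, 768, 1024

token_embed = d_vocab * d_model   # W_E:   38,597,376 (~39M params)
pos_embed = n_ctx * d_model       # W_pos:    786,432 (~0.8M params)

print(f"W_E:   {token_embed:,}")
print(f"W_pos: {pos_embed:,}")
```

This is roughly the size of the gap between the reported 85M and the true 124M.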

Code example
HookedTransformer.from_pretrained("gpt2-small").cfg.n_params # 85M
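
A minimal sketch of how the discrepancy can be checked against the actual parameter count (not part of the original report; assumes a standard TransformerLens install, and the exact total may vary slightly with weight-processing options such as fold_ln):

```python
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2-small")

reported = model.cfg.n_params                         # ~85M: attention/FF weights only
actual = sum(p.numel() for p in model.parameters())   # ~124M: includes W_E, W_pos, etc.

print(f"cfg.n_params: {reported:,}")
print(f"actual total: {actual:,}")
```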

Additional context
There also appears to be another, unrelated open issue involving n_params.

Checklist

  • I have checked that there is no similar issue in the repo (required)