Describe the bug
gpt2-small is reported as having 85M params, but it actually has 124M.
This is because in HookedTransformerConfig.py, n_params is calculated only from the attention/feed-forward layers. In gpt2-small, W_E alone accounts for ~39M params, so this makes a big difference for the small models.
Code example
HookedTransformer.from_pretrained("gpt2-small").cfg.n_params  # 85M
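For illustration, here is a minimal sketch (not part of the original report) showing where most of the missing parameters come from, assuming transformer_lens is installed:

from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2-small")

# n_params as computed in HookedTransformerConfig (attention/FF layers only).
print(model.cfg.n_params)  # ~85M

# The token embedding alone: d_vocab * d_model = 50257 * 768 ≈ 39M,
# which accounts for the bulk of the gap to the usual 124M figure.
print(model.cfg.d_vocab * model.cfg.d_model)  # ~39M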
Additional context
There also seems to be another unrelated open issue with n_params.
Checklist
I have checked that there is no similar issue in the repo (required)