Describe the bug
gpt2-small is reported as having 85M params, but it actually has 124M.
This is because in HookedTransformerConfig.py, n_params is calculated only from the attention/feed-forward layers. In gpt2-small, W_E alone accounts for ~39M params, so this makes a big difference for the small models.
Code example
HookedTransformer.from_pretrained("gpt2-small").cfg.n_params  # 85M
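For illustration, here is a minimal sketch (not part of the original report) showing where most of the missing parameters come from, assuming transformer_lens is installed:

from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2-small")

# n_params as computed in HookedTransformerConfig (attention/FF layers only).
print(model.cfg.n_params)  # ~85M

# The token embedding alone: d_vocab * d_model = 50257 * 768 ≈ 39M,
# which accounts for the bulk of the gap to the usual 124M figure.
print(model.cfg.d_vocab * model.cfg.d_model)  # ~39M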
Additional context
There also seems to be another unrelated open issue with n_params.
Checklist
I have checked that there is no similar issue in the repo (required)