An impl of adafactor as per big vision (scaling vit) changes #2320


Merged · 21 commits · Nov 13, 2024
Commits
a4d93cf
An impl of adafactor as per big vision (scaling vit) changes
rwightman Nov 4, 2024
30142b6
Remove unused beta2 fn, make eps grad^2 handling same across factoriz…
rwightman Nov 4, 2024
91f0ea3
Need to init momentum with correct dtype
rwightman Nov 4, 2024
548fdb5
Remove adafactorbv numpy dep, hack fix for loading optimizer state w/…
rwightman Nov 4, 2024
7ea5016
Change adafactor_bv epsilon default
rwightman Nov 5, 2024
42ebe2d
A bit of lars/lamb cleanup, torch.where supports scalars properly now…
rwightman Nov 8, 2024
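
Aside on the torch.where cleanup above: recent PyTorch accepts Python scalars directly as the value arguments, so code like lars/lamb no longer needs to materialize scalar tensors by hand. A minimal sketch of the difference (illustrative, not the actual lars/lamb code):

```python
import torch

ratio = torch.rand(4)

# Older workaround: build a tensor just to supply a scalar fallback value.
old = torch.where(ratio > 0, ratio, torch.ones_like(ratio))

# Newer PyTorch accepts the scalar directly.
new = torch.where(ratio > 0, ratio, 1.0)

assert torch.equal(old, new)
```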
6a08df6
Change eps defaults in adafactor_bv again after some checking
rwightman Nov 8, 2024
2e2687a
Cleanup original adafactor impl, add row/col dim heuristic that works…
rwightman Nov 8, 2024
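
For context on the factored update being cleaned up here, below is a rough sketch of Adafactor's factored second moment (Shazeer & Stern, 2018) for a 2D parameter. timm's actual implementation differs in detail (dtype handling, eps placement, and the new heuristic this commit adds for choosing which dims of >2D tensors act as row/col):

```python
import torch

def factored_second_moment(grad, row_ema, col_ema, beta2=0.999, eps=1e-30):
    # Keep one EMA per row and one per column of grad**2 instead of a full
    # elementwise EMA (Adam's v); memory drops from O(n*m) to O(n+m).
    sq = grad.float() ** 2 + eps                               # eps folded into grad^2
    row_ema.mul_(beta2).add_(sq.mean(dim=1), alpha=1 - beta2)  # shape [rows]
    col_ema.mul_(beta2).add_(sq.mean(dim=0), alpha=1 - beta2)  # shape [cols]
    # Rank-1 reconstruction: outer(r, c) / mean(r) approximates E[grad^2].
    v_hat = torch.outer(row_ema, col_ema) / row_ema.mean()
    return grad / v_hat.sqrt()                                 # preconditioned grad
```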
10d2efd
Improve row/col dim var name
rwightman Nov 8, 2024
ec857fc
Add ADOPT optimizer
rwightman Nov 8, 2024
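
ADOPT (Taniguchi et al., 2024) is a small but consequential tweak to Adam: normalize the gradient by the second moment from the *previous* step before it enters the momentum buffer, and seed the second moment from the first gradient. A hedged sketch of the core update (timm's Adopt class adds clipping and other details not shown here):

```python
import torch

def adopt_step(param, grad, m, v, step, lr=1e-3, beta1=0.9, beta2=0.9999, eps=1e-6):
    if step == 0:
        v.copy_(grad * grad)   # seed v from the first gradient; no param update yet
        return
    # Normalize by the *previous* v, decorrelating grad from its own scale.
    normed = grad / torch.clamp(v.sqrt(), min=eps)
    if step == 1:
        m.copy_(normed)        # first momentum value
    else:
        m.mul_(beta1).add_(normed, alpha=1 - beta1)
    param.add_(m, alpha=-lr)
    v.mul_(beta2).add_(grad * grad, alpha=1 - beta2)
```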
6db2710
Fix ADOPT on older PyTorch (tested back to 1.13)
rwightman Nov 8, 2024
d73e8e7
Remove an indent level in init_group for adopt, update optim tests, a…
rwightman Nov 8, 2024
e6d72ed
Update adafactor comments / attrib
rwightman Nov 12, 2024
c8b4511
A bit of an optimizer overhaul, added an improved factory, list_optim…
rwightman Nov 13, 2024
75d676e
Try to fix documentation build, add better docstrings to public optim…
rwightman Nov 13, 2024
326e5dc
Merge remote-tracking branch 'origin/main' into adafactor_bv
rwightman Nov 13, 2024
5dae918
Post merge fix reference of old param groups helper fn locations
rwightman Nov 13, 2024
0e6da65
More fixes for new factory & tests, add back adahessian
rwightman Nov 13, 2024
3ec2970
Another doc class typo
rwightman Nov 13, 2024
b4503df
Fix adopt descriptions
rwightman Nov 13, 2024
df6171a
Minor changes, has_eps=False missing for bnb lion
rwightman Nov 13, 2024
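
The has_eps flag matters because Lion's update is a pure sign step with no sqrt(v) denominator, so there is nothing for an eps to stabilize. A sketch of the Lion rule (Chen et al., 2023) for illustration only, not timm's or bitsandbytes' actual code:

```python
import torch

def lion_step(param, grad, m, lr=1e-4, beta1=0.9, beta2=0.99, weight_decay=0.0):
    update = (m * beta1 + grad * (1 - beta1)).sign_()   # sign step: no eps needed
    if weight_decay:
        update = update.add(param, alpha=weight_decay)  # decoupled weight decay
    param.add_(update, alpha=-lr)
    m.mul_(beta2).add_(grad, alpha=1 - beta2)           # momentum EMA
```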
11 changes: 9 additions & 2 deletions hfdocs/source/reference/optimizers.mdx
@@ -6,22 +6,29 @@ This page contains the API reference documentation for learning rate optimizers

### Factory functions

[[autodoc]] timm.optim.optim_factory.create_optimizer
[[autodoc]] timm.optim.optim_factory.create_optimizer_v2
[[autodoc]] timm.optim.create_optimizer_v2
[[autodoc]] timm.optim.list_optimizers
[[autodoc]] timm.optim.get_optimizer_class

### Optimizer Classes

[[autodoc]] timm.optim.adabelief.AdaBelief
[[autodoc]] timm.optim.adafactor.Adafactor
[[autodoc]] timm.optim.adafactor_bv.AdafactorBigVision
[[autodoc]] timm.optim.adahessian.Adahessian
[[autodoc]] timm.optim.adamp.AdamP
[[autodoc]] timm.optim.adamw.AdamW
[[autodoc]] timm.optim.adan.Adan
[[autodoc]] timm.optim.adopt.Adopt
[[autodoc]] timm.optim.lamb.Lamb
[[autodoc]] timm.optim.lars.Lars
[[autodoc]] timm.optim.lion.Lion
[[autodoc]] timm.optim.lookahead.Lookahead
[[autodoc]] timm.optim.madgrad.MADGRAD
[[autodoc]] timm.optim.nadam.Nadam
[[autodoc]] timm.optim.nadamw.NAdamW
[[autodoc]] timm.optim.nvnovograd.NvNovoGrad
[[autodoc]] timm.optim.radam.RAdam
[[autodoc]] timm.optim.rmsprop_tf.RMSpropTF
[[autodoc]] timm.optim.sgdp.SGDP
[[autodoc]] timm.optim.sgdw.SGDW
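
A quick usage sketch of the factory API documented above; the registry name strings used here ('adafactorbv', 'adopt') are assumptions — list_optimizers() reports the real ones:

```python
import timm
import timm.optim

model = timm.create_model('resnet18')

# Create an optimizer for a model by registry name.
opt = timm.optim.create_optimizer_v2(model, opt='adafactorbv', lr=1e-3)

# Enumerate registered optimizer names.
print(timm.optim.list_optimizers())

# Fetch the optimizer class itself for manual construction.
adopt_cls = timm.optim.get_optimizer_class('adopt')
```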