Open
Description
🚀 The feature
#6323 mentions that torchvision is looking to implement the LAMB optimizer.
@datumbox I would very much like to take this issue and create a PR.
Motivation, pitch
The LAMB optimizer was created mainly because the LARS optimizer performed poorly on models with attention mechanisms. LAMB has been shown to achieve very good performance gains across various tasks, and I believe it should be implemented in torchvision.
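For context, here is a rough sketch of what a single LAMB step looks like for one layer, based on my reading of the update rule in the LAMB paper: Adam-style moment estimates, then a per-layer trust ratio that rescales the step by the weight norm. The function name `lamb_step`, the flat-list parameter layout, and the default hyperparameters are illustrative assumptions, not an existing torchvision or PyTorch API, and an actual PR would operate on tensors inside a `torch.optim.Optimizer` subclass.

```python
import math


def lamb_step(w, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-6, weight_decay=0.01):
    """One LAMB update for a single layer (hypothetical helper, plain lists).

    w, g, m, v are equal-length lists of floats: weights, gradients, and the
    first and second moment estimates. t is the 1-based step count.
    Returns the updated (w, m, v).
    """
    # Adam-style exponential moving averages of the gradient and its square.
    m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, g)]
    v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, g)]

    # Bias correction for the zero-initialised moments.
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = [vi / (1 - beta2 ** t) for vi in v]

    # Adam direction plus decoupled weight decay.
    update = [mh / (math.sqrt(vh) + eps) + weight_decay * wi
              for mh, vh, wi in zip(m_hat, v_hat, w)]

    # Layer-wise trust ratio: ||w|| / ||update||, falling back to 1.0
    # when either norm is zero.
    w_norm = math.sqrt(sum(wi * wi for wi in w))
    u_norm = math.sqrt(sum(ui * ui for ui in update))
    trust = w_norm / u_norm if w_norm > 0 and u_norm > 0 else 1.0

    new_w = [wi - lr * trust * ui for wi, ui in zip(w, update)]
    return new_w, m, v
```

With the trust ratio applied, the length of each layer's step is `lr * ||w||` whenever both norms are non-zero, which is the layer-wise scaling that LAMB shares with LARS; the difference is that the direction comes from Adam-style moments rather than momentum SGD.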
Alternatives
No response
Additional context
No response