Fix CosineDecay documentation to clarify alpha is a multiplier #21827
Conversation
The documentation incorrectly stated that the learning rate decays "to alpha", when it actually decays to `initial_lr * alpha`. Updated the docstring to make clear that `alpha` is a fraction/multiplier, not an absolute target value.
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Summary of Changes: Hello @yashwantbezawada, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses an inaccuracy in the documentation for the `CosineDecay` learning rate schedule.
Code Review
This is a great documentation fix that clarifies how the `alpha` parameter in `CosineDecay` works. The updated description correctly states that `alpha` is a multiplier for the learning rate at the start of the decay phase. I've added one minor suggestion to format the new expressions as code for consistency. Additionally, for future consideration, the description of `alpha` in the Arguments section could also be updated to reflect its behavior when `warmup_target` is used, to make the documentation fully consistent.
Signed the CLA
Codecov Report: ✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff           @@
##           master   #21827   +/-  ##
=======================================
  Coverage   82.66%   82.66%
=======================================
  Files         577      577
  Lines       59460    59460
  Branches     9322     9322
=======================================
  Hits        49152    49152
  Misses       7905     7905
  Partials     2403     2403
```
fchollet left a comment
LGTM, thanks for the PR
Fixes #21772
The `CosineDecay` documentation was misleading about how the `alpha` parameter works.

Current docs say: learning rate decays "to alpha".

Reality: learning rate decays to `initial_learning_rate * alpha`.

The parameter description correctly states `alpha` is "a fraction of initial_learning_rate", but the explanation text contradicted this. Updated the explanation to match the actual implementation and parameter description.