Learning Rate

A floating-point number that tells the gradient descent algorithm how strongly to adjust weights and biases on each iteration. For example, a learning rate of 0.3 would adjust weights and biases three times more powerfully than a learning rate of 0.1.

Learning rate is a key hyperparameter. If you set the learning rate too low, training will take too long. If you set the learning rate too high, gradient descent often has trouble reaching convergence.

Click the icon for a more mathematical explanation.

See Linear regression: Hyperparameters in Machine Learning Crash Course for more information.

Real-world uses

Created for this library

1.
An ML team tunes learning rate first when bringing up training because it has the largest impact on convergence.
2.
A research team uses a cyclical learning rate schedule during pretraining to balance fast convergence and final accuracy.
3.
An ML platform team standardizes learning-rate schedules per model family so engineers can focus on data and features.

Back to glossary

Learning Rate

Real-world uses

Related terms

Loading…

Learning Rate

Real-world uses

Related terms