Glossary term
Foundations
A training algorithm in which weak models are trained to iteratively improve the quality (reduce the loss) of a strong model. For example, a weak model could be a linear model or a small decision tree. The strong model is the sum of all the previously trained weak models.
In the simplest form of gradient boosting, at each iteration, a weak model is trained to predict the loss gradient of the strong model. Then, the strong model's output is updated by subtracting the predicted gradient, similar to gradient descent.
$$F_0 = 0$$
$$F_{i+1} = F_i - \xi f_i$$
where:
$F_0$ is the starting strong model.
$F_{i+1}$ is the next strong model.
$F_i$ is the current strong model.
$\xi$ is a value between 0.0 and 1.0 called shrinkage, which is analogous to the learning rate in gradient descent.
$f_i$ is the weak model trained to predict the loss gradient of $F_i$.
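The update rule described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation, and it makes several assumptions that are not part of the definition: the loss is the squared error 1/2 * (F(x) - y)^2, whose gradient with respect to the prediction is F(x) - y; the inputs are one-dimensional; and each weak model is a one-split regression stump. The names fit_stump and gradient_boost are hypothetical.

```python
def fit_stump(xs, targets):
    """Fit a one-split regression stump to (xs, targets) by least squares."""
    best = None
    for threshold in sorted(set(xs)):
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean = sum(left) / len(left)
        # The largest threshold leaves the right side empty; reuse the left mean.
        right_mean = sum(right) / len(right) if right else left_mean
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, num_iterations=50, shrinkage=0.3):
    """Return a strong model built from stumps fitted to the loss gradient."""
    weak_models = []

    def strong_model(x):
        # Starts at 0 (no weak models yet); each weak model is subtracted,
        # scaled by the shrinkage.
        return -shrinkage * sum(f(x) for f in weak_models)

    for _ in range(num_iterations):
        # Gradient of the assumed squared loss with respect to the prediction.
        gradients = [strong_model(x) - y for x, y in zip(xs, ys)]
        # The next weak model is trained to predict that gradient.
        weak_models.append(fit_stump(xs, gradients))
    return strong_model


# Toy data: a step function that a sum of stumps can represent.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]
model = gradient_boost(xs, ys)
```

With a shrinkage below 1.0, each iteration removes only part of the remaining error, so the strong model approaches the labels gradually instead of jumping to them in one step.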
Modern variations of gradient boosting also include the second derivative (Hessian) of the loss in their computation.
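As an illustration of how second-order information can enter the computation, the sketch below computes a Newton-style output value for one leaf of a tree: the negated sum of gradients divided by the sum of Hessians plus an optional L2 penalty. This follows the form used by XGBoost-style methods and is an assumption added for illustration; the definition above does not specify a formula. The function name newton_leaf_value and its l2_regularization parameter are hypothetical.

```python
def newton_leaf_value(gradients, hessians, l2_regularization=0.0):
    """Second-order (Newton) estimate of a leaf's output value.

    gradients and hessians hold the first and second derivatives of the loss
    with respect to the current strong model's prediction, one entry per
    example that falls into the leaf.
    """
    return -sum(gradients) / (sum(hessians) + l2_regularization)


# Squared loss 1/2 * (F(x) - y)^2 has gradient F(x) - y and Hessian 1,
# so the Newton value reduces to the mean residual y - F(x).
predictions = [0.0, 0.0]
labels = [1.0, 3.0]
gradients = [p - y for p, y in zip(predictions, labels)]
hessians = [1.0 for _ in labels]
leaf_value = newton_leaf_value(gradients, hessians)
```

Note the sign convention: this leaf value already includes the minus sign, so it would be added to the strong model (scaled by the shrinkage) rather than subtracted.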
Decision trees are commonly used as weak models in gradient boosting. See gradient boosted (decision) trees.
Created for this library
An ad-tech team uses gradient boosting as one of several signals in its click-prediction ensemble.
A marketing analytics team uses gradient boosting to model email-open propensity across hundreds of features per customer.
A finance analytics team uses gradient boosting to predict invoice payment lag using historical payment patterns as features.
Definition source: Google for Developers Machine Learning Glossary | Creative Commons Attribution 4.0 License