Glossary term
Glossary term
Foundations
Converting a single feature into multiple binary features called buckets or bins, typically based on a value range. The chopped feature is typically a continuous feature.
For example, instead of representing temperature as a single continuous floating-point feature, you could chop ranges of temperatures into discrete buckets, such as:
<= 10 degrees Celsius would be the "cold" bucket.
11 - 24 degrees Celsius would be the "temperate" bucket.
>= 25 degrees Celsius would be the "warm" bucket.
The model will treat every value in the same bucket identically. For example, the values 13 and 22 are both in the temperate bucket, so the model treats the two values identically.
Click the icon for additional notes.
See Numerical data: Binning in Machine Learning Crash Course for more information.
C
For example, instead of representing temperature as a single continuous floating-point feature, you could chop ranges of temperatures into discrete buckets, such as:
<= 10 degrees Celsius would be the "cold" bucket.
degrees Celsius would be the "temperate" bucket.
Created for this library
A pricing team uses bucketing to convert continuous lead time into categorical buckets that its decision tree handles more robustly than raw values.
A subscription business buckets engagement minutes per week into low, medium, and high segments before training a churn model.
A digital ad platform buckets advertiser spend into tiers to stabilize predictions when raw spend has heavy outliers.
Definition source: Google for Developers Machine Learning Glossary | Creative Commons Attribution 4.0 License