Glossary term
Foundations
The number of examples in a batch. For instance, if the batch size is 100, then the model processes 100 examples per iteration.
The following are popular batch size strategies:
Stochastic Gradient Descent (SGD), in which the batch size is 1.
Full batch, in which the batch size is the number of examples in the entire training set. For instance, if the training set contains a million examples, then the batch size would be a million examples. Full batch is usually an inefficient strategy.
Mini-batch, in which the batch size is usually between 10 and 1000. Mini-batch is usually the most efficient strategy.
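The three strategies above differ only in how many examples feed each iteration. The sketch below is a minimal illustration, not taken from the source glossary: the helper names `iter_batches` and `iterations_per_epoch` and the 1,000-example toy training set are assumptions made for the example.

```python
import math


def iter_batches(examples, batch_size):
    """Yield consecutive slices of `examples`, each holding at most `batch_size` items."""
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]


def iterations_per_epoch(num_examples, batch_size):
    """Number of iterations needed to see every example once."""
    return math.ceil(num_examples / batch_size)


# Hypothetical toy training set of 1,000 examples.
training_set = list(range(1000))

# Batch size for each strategy named in the list above.
strategies = {
    "SGD": 1,
    "mini-batch": 100,
    "full batch": len(training_set),
}

iterations = {
    name: iterations_per_epoch(len(training_set), size)
    for name, size in strategies.items()
}

for name, count in iterations.items():
    print(f"{name}: {count} iteration(s) per epoch")
```

With a batch size of 100, one pass over the 1,000 examples takes 10 iterations, compared with 1,000 iterations for SGD and a single iteration for full batch.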
See the following for more information:
Production ML systems: Static versus dynamic inference in Machine Learning Crash Course.
Created for this library
A trading-signal team raises its batch size from 256 to 1,024 and finds that the larger batch converges in fewer epochs on its GPU cluster.
A computer vision team experiments with batch sizes from 32 to 512 on a single GPU to find the sweet spot between speed and final accuracy.
A speech vendor uses gradient accumulation to double the effective batch size when GPU memory limits the batch size that fits in a single step.
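The gradient accumulation mentioned in the last example can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not any vendor's implementation: the one-parameter model y = w * x, the mean-squared-error loss, the toy data, and the function names `mse_gradient` and `accumulated_gradient` are all hypothetical. It shows that summing gradients over two micro-batches of 4 examples reproduces the gradient of one batch of 8.

```python
def mse_gradient(w, batch):
    """Gradient of mean squared error for the model y = w * x, averaged over `batch`."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_gradient(w, batch, micro_batch_size):
    """Process `batch` in smaller micro-batches and combine their gradients.

    Each micro-batch gradient is weighted by its size, so the result equals
    the gradient of the whole batch even if the last micro-batch is smaller.
    """
    total = 0.0
    for start in range(0, len(batch), micro_batch_size):
        micro = batch[start:start + micro_batch_size]
        total += mse_gradient(w, micro) * len(micro)
    return total / len(batch)


# Hypothetical toy data: 8 examples generated with a true weight of 3.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
w = 0.0

full = mse_gradient(w, data)                                # one batch of 8
accum = accumulated_gradient(w, data, micro_batch_size=4)   # two micro-batches of 4

print(full, accum)
```

Only one micro-batch needs to be held in memory at a time, which is why the technique helps when the device cannot fit the larger batch directly.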
Definition source: Google for Developers Machine Learning Glossary | Creative Commons Attribution 4.0 License