Batch Inference

The process of inferring predictions on multiple unlabeled examples divided into smaller subsets ("batches").

Batch inference can take advantage of the parallelization features of accelerator chips. That is, multiple accelerators can simultaneously infer predictions on different batches of unlabeled examples, dramatically increasing the number of inferences per second.

See Production ML systems: Static versus dynamic inference in Machine Learning Crash Course for more information.
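
As a rough illustration of the definition above, the following Python sketch splits a list of unlabeled examples into batches and scores the batches concurrently. It is a minimal sketch under stated assumptions, not a production pattern: `toy_model`, `make_batches`, and `batch_infer` are hypothetical names invented for this example, the "model" is a fixed linear scorer, and a thread pool merely stands in for multiple accelerators.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Sequence, Tuple

Example = Tuple[float, float]


def make_batches(examples: Sequence[Example], batch_size: int) -> Iterator[Sequence[Example]]:
    """Yield consecutive slices of at most `batch_size` examples."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]


def toy_model(batch: Sequence[Example]) -> List[float]:
    """Hypothetical stand-in for a trained model: a fixed linear scorer."""
    weights = (0.5, -0.25)  # assumed weights, chosen only for illustration
    return [sum(w * x for w, x in zip(weights, example)) for example in batch]


def batch_infer(
    examples: Sequence[Example],
    predict_batch: Callable[[Sequence[Example]], List[float]],
    batch_size: int,
    num_workers: int = 2,
) -> List[float]:
    """Run `predict_batch` on each batch, with workers standing in for accelerators."""
    batches = list(make_batches(examples, batch_size))
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Executor.map returns results in the same order as the input batches.
        per_batch = list(pool.map(predict_batch, batches))
    # Flatten the per-batch predictions back into one list, one entry per example.
    return [prediction for batch_preds in per_batch for prediction in batch_preds]


unlabeled = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0), (7.0, 8.0), (9.0, 10.0)]
predictions = batch_infer(unlabeled, toy_model, batch_size=2)
print(predictions)
```

With five examples and a batch size of 2, the final batch holds a single example. The predictions come back in the same order as the input examples because the batches are processed with an order-preserving map.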

Real-world uses

Created for this library

1. A retail recommendation team runs batch inference nightly to precompute next-day product suggestions for every active user.
2. An insurance company runs batch inference once per quarter to recompute risk scores for its entire book of business.
3. A subscription business runs batch inference every Monday to refresh the churn-risk scores that feed the retention call list.

Definition source: Google for Developers Machine Learning Glossary | Creative Commons Attribution 4.0 License
