Glossary term
Foundations
The process of a model generating a batch of predictions and then caching (saving) those predictions. Apps can then access the inferred prediction from the cache rather than rerunning the model.
For example, consider a model that generates local weather forecasts (predictions) once every four hours. After each model run, the system caches all the local weather forecasts. Weather apps retrieve the forecasts from the cache.
Offline inference is also called static inference.
Contrast with online inference. See Production ML systems: Static versus dynamic inference in Machine Learning Crash Course for more information.
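The weather example above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: `toy_forecast_model`, `run_batch_inference`, `get_forecast`, and the in-memory dictionary used as the cache are all hypothetical names and choices made for this sketch. A real system would likely use a trained model, a scheduler, and a persistent store.

```python
from typing import Callable, Dict, Iterable, List, Optional

# Records every model invocation so we can see that lookups do not rerun the model.
model_calls: List[str] = []


def toy_forecast_model(location: str) -> str:
    """Hypothetical stand-in for a trained weather model."""
    model_calls.append(location)
    return f"forecast for {location}"


def run_batch_inference(
    model: Callable[[str], str], locations: Iterable[str]
) -> Dict[str, str]:
    """Run the model once per location and return every prediction as a cache."""
    return {location: model(location) for location in locations}


# Offline step: in the example above this would run once every four hours.
forecast_cache = run_batch_inference(toy_forecast_model, ["Oslo", "Lima"])


def get_forecast(location: str) -> Optional[str]:
    """Serving step: apps read from the cache and never call the model."""
    return forecast_cache.get(location)


print(get_forecast("Oslo"))
print(get_forecast("Paris"))  # not in the last batch, so nothing is cached
```

Note the trade-off the sketch makes visible: lookups are cheap because the model is not rerun, but a location that was missing from the last batch has no cached prediction until the next run.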
Created for this library
A retail recommendation team runs offline inference nightly to precompute the next-day homepage carousel for every active user.
An insurance company runs offline inference each quarter to re-score the entire book of business with the latest risk model.
A subscription business runs offline inference weekly to refresh churn-risk scores feeding the retention call list.
Definition source: Google for Developers Machine Learning Glossary | Creative Commons Attribution 4.0 License