Glossary term
Infrastructure and Serving
AI processing close to where data is generated or used.
NVIDIA Jetson AGX Orin runs YOLOv8 object detection at 60 fps on a factory conveyor belt, detecting defects in real time without sending images to the cloud: latency stays under 10 ms, versus a 150 ms cloud round trip.
Apple's Neural Engine runs Apple Intelligence's on-device models locally on iPhone 15 Pro, enabling AI features (summarisation, Smart Reply, photo editing) without sending content to Apple servers.
Qualcomm AI Hub deploys a 4-bit quantised Llama 3.2 3B on Snapdragon X Elite laptops, enabling fully offline copilot features for field engineers in areas with no cellular connectivity.
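The numbers in the examples above can be sanity-checked with a little arithmetic. The sketch below is illustrative only. The function names are hypothetical, the 3e9 parameter count is a nominal reading of "3B", and the footprint estimate covers weights only (it ignores quantisation scales, activations, and the KV cache).

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to process one frame, in milliseconds."""
    return 1000.0 / fps


def frames_in_flight(latency_ms: float, fps: float) -> float:
    """Number of frame intervals that pass while one result is pending."""
    return latency_ms / frame_budget_ms(fps)


def weight_footprint_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (1e9 bytes); weights only."""
    return n_params * bits_per_weight / 8 / 1e9


# Conveyor-belt example: 60 fps camera, 10 ms on-device vs 150 ms cloud.
budget = frame_budget_ms(60)
print(f"per-frame budget: {budget:.1f} ms")
print(f"edge result lags by {frames_in_flight(10, 60):.1f} frames")
print(f"cloud result lags by {frames_in_flight(150, 60):.1f} frames")

# Laptop example: nominal 3e9 parameters (assumption), 16-bit vs 4-bit.
print(f"fp16 weights: {weight_footprint_gb(3e9, 16):.1f} GB")
print(f"int4 weights: {weight_footprint_gb(3e9, 4):.1f} GB")
```

At 60 fps each frame has a budget of about 16.7 ms, so a 10 ms on-device result arrives within the same frame interval, while a 150 ms round trip spans about nine frames. Under the stated assumptions, 4-bit weights take about 1.5 GB against about 6 GB at 16 bits, which is what makes a 3B-parameter model practical on a laptop or phone.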