Glossary term
Security
Unauthorized extraction or copying of a model or its components (weights, architecture, prompts, embeddings, or proprietary behavior) through direct access, API abuse, leakage, or reverse engineering. Mitigations include access control, rate limiting, monitoring for extraction patterns, contractual restrictions, watermarking where appropriate, and careful handling of model artifacts.
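As an illustration of the rate limiting and monitoring mitigations, here is a minimal sketch of a per-key sliding-window limiter that also counts rejected queries as a crude extraction signal. The class name `QueryMonitor`, its parameters, and the idea of passing timestamps in explicitly are assumptions made for this example, not part of any particular API gateway.

```python
from collections import defaultdict, deque


class QueryMonitor:
    """Hypothetical per-API-key sliding-window rate limiter (illustrative only)."""

    def __init__(self, max_queries: int, window_seconds: float):
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # api_key -> timestamps of allowed queries
        self.rejected = defaultdict(int)    # api_key -> count of rejected queries

    def allow(self, api_key: str, now: float) -> bool:
        """Return True if the query fits in the window, False if it is rejected."""
        window = self._history[api_key]
        # Drop timestamps that have aged out of the sliding window.
        while window and now - window[0] >= self.window_seconds:
            window.popleft()
        if len(window) >= self.max_queries:
            # Repeated rejections for one key are a signal worth reviewing.
            self.rejected[api_key] += 1
            return False
        window.append(now)
        return True
```

A caller would invoke `allow(api_key, time.time())` before serving each request. A production system would likely combine this with richer signals, such as unusually systematic prompts, rather than volume alone.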
The March 2023 leak of Meta's LLaMA weights via 4chan, which preceded Meta's later open-weights release, is a documented model theft event.
Carlini et al. (2024) demonstrated model extraction attacks against production language models, recovering the hidden dimension and the final embedding projection layer through API queries alone.
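The idea behind recovering a hidden dimension can be sketched on a toy model. If the final layer maps a hidden state of size h to logits over a larger vocabulary, every logit vector lies in an h-dimensional subspace, so the rank of a matrix of collected logit vectors reveals h. The sketch below assumes the API returns full logit vectors and uses Gaussian elimination as a simple stand-in for the singular value analysis used in published attacks. All names (`make_toy_model`, `estimate_hidden_dim`) are hypothetical.

```python
import random


def make_toy_model(vocab_size, hidden_dim, seed=0):
    """Build a toy 'API' whose logits are W @ h for a secret projection W."""
    rng = random.Random(seed)
    W = [[rng.gauss(0, 1) for _ in range(hidden_dim)] for _ in range(vocab_size)]

    def query(prompt_id):
        # A pseudo-random hidden state stands in for the transformer body.
        h_rng = random.Random(seed * 1_000_003 + prompt_id)
        h = [h_rng.gauss(0, 1) for _ in range(hidden_dim)]
        return [sum(w * x for w, x in zip(row, h)) for row in W]

    return query


def numerical_rank(rows, tol=1e-8):
    """Rank via Gaussian elimination with partial pivoting."""
    m = [list(r) for r in rows]
    n_cols = len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == len(m):
            break
        # Choose the remaining row with the largest entry in this column.
        pivot = max(range(rank, len(m)), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # column is numerically zero below the current rank
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank


def estimate_hidden_dim(query, n_queries):
    """Collect logit vectors for distinct prompts and return their rank."""
    logits = [query(i) for i in range(n_queries)]
    return numerical_rank(logits)
```

The estimate is only meaningful when the number of queries exceeds the true hidden dimension; with fewer queries the rank is capped by the query count. Real APIs rarely expose full logits, so practical attacks need additional steps that this sketch does not cover.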
A 2023 breach of an internal OpenAI employee discussion forum, reported publicly in July 2024, highlighted the risk that model IP can be exposed through internal communication systems.