Glossary term
Safety and Alignment
AI Safety is the practice of designing AI systems to operate securely, ethically, and in alignment with human values. It aims to prevent bias, misuse, and unintended actions through governance, monitoring, and human oversight.
Anthropic's Responsible Scaling Policy defines AI Safety Levels (ASL), tiers of safeguards tied to capability-based risk thresholds.
The UK AI Safety Institute and the US AI Safety Institute, established in 2023 and 2024 respectively, evaluate the safety of frontier models.
OpenAI's Preparedness Framework and Google DeepMind's Frontier Safety Framework set internal AI safety standards.