AI safety studies how to design, build, test, and operate AI systems so they behave as intended and do not cause unintended harm. It covers everything from bugs in a self-driving car's perception pipeline to questions about how a language model might influence public opinion.
Practitioners combine computer science, engineering, psychology, and ethics to build verification methods and control mechanisms. AI systems are moving from research labs into medical diagnosis tools, trading bots, and content recommendation engines. When these systems make mistakes, consequences range from misdiagnoses to market crashes to the spread of harmful misinformation.
A single error can cascade through interconnected services and affect millions of users. Governments, corporations, and civil society are investing in standards, audits, and certification programs. Researchers are exploring formal verification, incentive design, and interpretability techniques aimed at keeping behavior safe even as AI systems grow more capable.
Building these safeguards now reduces the chance of costly corrections later.
Interactive Visualizer
[AI Safety Interactive Lab: select an AI system (e.g., a self-driving car), toggle safety measures, and watch the risk assessment update. The lab shows how comprehensive safety practices reduce the likelihood of harmful incidents.]
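To make the lab's behavior concrete, here is a minimal sketch of the kind of scoring model such a visualizer might use, assuming risk reductions compound multiplicatively. The system names, measure names, and numeric factors below are invented for illustration and are not taken from the actual tool.

```python
# Illustrative (hypothetical) risk model: each enabled safety measure
# multiplies residual risk by an assumed reduction factor. All systems,
# measures, and numbers are invented for demonstration purposes.

BASELINE_RISK = {
    "self_driving_car": 0.30,
    "medical_diagnosis": 0.25,
    "trading_bot": 0.20,
}

# Fraction of risk that remains after each measure is applied (assumed values).
RESIDUAL_FACTOR = {
    "testing": 0.6,
    "monitoring": 0.7,
    "human_oversight": 0.5,
    "formal_verification": 0.4,
}

def residual_risk(system: str, measures: list[str]) -> float:
    """Estimate the risk remaining after applying the given safety measures."""
    risk = BASELINE_RISK[system]
    for measure in measures:
        risk *= RESIDUAL_FACTOR[measure]
    return risk

if __name__ == "__main__":
    # A single measure leaves most of the risk in place...
    print(residual_risk("self_driving_car", ["testing"]))  # 0.18
    # ...while layering several measures compounds the reduction.
    print(residual_risk("self_driving_car",
                        ["testing", "monitoring",
                         "human_oversight", "formal_verification"]))  # ~0.025
```

A multiplicative model like this mirrors the tip below: each measure removes only a fraction of the remaining risk, so meaningful reduction comes from layering several measures rather than relying on any one of them.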
Safety Tip: AI safety requires a multi-layered approach. No single measure is sufficient; comprehensive safety comes from combining testing, monitoring, controls, and verification.