AI/automation ethics glossary
Safety
Safety refers to the protection of users, animals and property from the physical and psychological harms posed by the use or misuse of an AI/automated system.
Safety isn’t just about a machine "breaking"; it’s about how a system behaves when things go wrong. It involves:
Robustness. Ensuring an AI system can handle "noise" or unusual data without crashing or acting erratically.
Control & oversight. The ability for humans to intervene or shut the system down (the "kill switch" problem; a minimal sketch follows this list).
Predictability. Ensuring the system's actions are consistent so humans can safely work alongside it.
Alignment. Making sure an AI system’s goals actually match the user's intent so it doesn't take dangerous shortcuts to finish a task.
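To make ideas like human oversight and the "kill switch" concrete, here is a minimal, purely illustrative Python sketch of a system that holds high-risk actions for human review instead of executing them autonomously. Every name, score and threshold below is invented for the example; this is not a real API or a prescribed design.

```python
# Illustrative human-oversight ("kill switch") gate. All names and
# thresholds are hypothetical, for demonstration only.

from dataclasses import dataclass

@dataclass
class Action:
    name: str
    risk_score: float  # 0.0 (benign) to 1.0 (high risk), as estimated by the system

RISK_THRESHOLD = 0.7  # assumed policy: anything above this needs a human

def requires_human_approval(action: Action) -> bool:
    """Route high-risk actions to a human operator before execution."""
    return action.risk_score >= RISK_THRESHOLD

def execute(action: Action, human_approved: bool = False) -> str:
    if requires_human_approval(action) and not human_approved:
        return f"HELD for human review: {action.name}"
    return f"Executed: {action.name}"

print(execute(Action("adjust thermostat", risk_score=0.1)))         # runs automatically
print(execute(Action("disable safety interlock", risk_score=0.9)))  # blocked pending review
```

In practice, the risk estimate, the threshold and the review workflow would each need careful design, testing and monitoring of their own.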
As AI moves from our screens into the physical world, such as driving cars, performing surgery, or managing power grids, the stakes shift from "annoying software bugs" to "life-or-death risks." Society relies on a "social contract" where we trust that the technology around us won't suddenly become a threat to our physical well-being.
Poor safety can have grave consequences, ranging from physical injury and loss of life to psychological trauma, financial loss, and the erosion of trust in public and other institutions.
Common sources of AI safety failures include:
Inadequate testing. Systems are released before failure scenarios are fully understood.
Distribution shift. AI trained in one context performs poorly in a different real-world context (see the sketch after this list).
Overconfidence in AI. Users and organisations trust AI outputs without sufficient scepticism or oversight.
Pressure to deploy quickly. Commercial and competitive pressures lead to shortcuts in safety processes.
Complexity and opacity. AI systems (especially large neural networks) are difficult to fully understand, making it hard to predict when they will fail.
Misuse. Bad actors deliberately exploit AI systems to cause harm.
Misaligned objectives. AI systems optimise for the wrong thing, causing unintended harm as a side effect.
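One common safeguard against distribution shift, sketched below, is to monitor live input data for statistical drift away from the training data and flag predictions for extra review when drift is detected. This is a minimal illustration assuming a single numeric feature, synthetic data, and a two-sample Kolmogorov-Smirnov test; the feature, the test and the alert threshold are all assumptions, not a prescribed method.

```python
# Illustrative distribution-shift check using a two-sample
# Kolmogorov-Smirnov test. Data and threshold are synthetic assumptions.

import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(seed=0)
training_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)    # what the model saw in training
production_feature = rng.normal(loc=0.8, scale=1.3, size=1_000)  # drifted live inputs

result = ks_2samp(training_feature, production_feature)

ALPHA = 0.01  # assumed alerting threshold
if result.pvalue < ALPHA:
    print(f"Shift detected (KS={result.statistic:.3f}, p={result.pvalue:.2e}): "
          "route affected predictions for human review.")
else:
    print("No significant shift detected.")
```

Monitoring of this kind does not prevent shift, but it gives operators an early signal that a system may be operating outside the conditions it was tested in.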
Safety also raises difficult ethical dilemmas and trade-offs:
Innovation versus safety. Companies may feel pressured to release a technology quickly to stay competitive, potentially cutting corners on safety checks.
The "Trolley Problem". In an unavoidable accident, how should a system prioritise whose safety matters most? (e.g., should a self-driving car protect its passengers or pedestrians?)
Human-in-the-loop. Is it ethical to give an AI system full autonomy if a human is unable to react fast enough to stop a mistake?
Safety versus capability. Making AI systems safer often makes them less powerful or useful. How should that balance be struck?
Transparency versus security. Making AI systems more transparent can help safety researchers identify flaws, but it can also help bad actors exploit them.
Autonomy versus oversight. Greater AI autonomy can improve outcomes, but reduces human oversight — and the ability to intervene when something goes wrong.
Real-world incidents illustrate these risks:
Welshman kills mother with sledgehammer after speaking to Discord AI bot
Google AI search summaries give cancer patients wrong advice
Waymo robotaxi hits and kills San Francisco corner store cat
Unprompted, Grok exposes porn worker's legal name and birthday
Google early warning system fails to alert people during Türkiye earthquake
Author: Charlie Pownall
Published: May 14, 2026
Last updated: May 16, 2026
You are welcome to use, copy, adapt, and redistribute this definition under a CC BY-SA 4.0 licence.
Let us know if you have any comments or suggestions about how to improve this definition, or if you would like to suggest or contribute additional terms.