What do we really consider “violence”?
It is a question that seems intuitive, almost obvious. Yet as soon as we look more closely, we discover just how complex it actually is.
This question lies at the heart of the study “On the Concept of Violence: A Comparative Study of Human and AI Judgments”, published on arXiv and authored by Mariachiara Stellato, Francesco Lancia, Chiara Galeazzi and Nico Curti.
The research compares how human beings and artificial intelligence models interpret potentially violent situations, revealing surprising differences in the way context and moral nuance are evaluated.
An experiment born on the radio
The idea behind the study was first tested in an original way: through a survey launched live on air during the programme “Chiacchiericcio” on Radio Deejay.
During the broadcast, listeners were asked to classify 22 controversial situations — such as online insults, social exclusion, or statements made in ambiguous contexts — choosing among three possible answers:
- violence
- non-violence
- it depends on the context
Within just a few hours, more than 3,000 responses were collected, providing a rich and diverse sample of human perceptions.
Following an insight from the research team working on AI, the same situations were then submitted to 18 different artificial intelligence models, which were asked to provide the same three-way classification.
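The protocol described above can be sketched in a few lines of code. This is a minimal illustration, not the study's actual implementation: the prompt wording, the example situation, and the sample answers below are all hypothetical placeholders.

```python
# Sketch of the three-option classification protocol described in the article.
# Prompt text and example data are illustrative assumptions, not the study's own.

from collections import Counter

LABELS = ("violence", "non-violence", "it depends on the context")

def build_prompt(situation: str) -> str:
    """Compose a three-option classification prompt for one situation."""
    options = "\n".join(f"- {label}" for label in LABELS)
    return (
        f"Classify the following situation:\n{situation}\n"
        f"Answer with exactly one of:\n{options}"
    )

def tally(answers: list[str]) -> Counter:
    """Aggregate the labels returned by the surveyed models (or listeners)."""
    assert all(a in LABELS for a in answers), "unexpected label"
    return Counter(answers)

# Hypothetical answers from a handful of models for one situation.
example = tally(["violence", "violence", "it depends on the context"])
print(example.most_common(1)[0][0])  # prints the majority label: "violence"
```

Each of the 22 situations would be put through the same loop, once per model, yielding a distribution over the three labels that can then be compared with the listeners' votes.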
Where humans and AI agree (and where they do not)
In the most clear-cut cases, the result is reassuring: humans and AI systems tend to agree.
But when situations become more ambiguous, significant differences emerge.
- People show greater variability in their responses
- Humans more often choose “it depends on the context”
- AI models tend instead to provide more categorical and less nuanced judgments
In other words, while humans more readily evaluate intentions, context and social relationships, AI systems tend to simplify decisions into more rigid categories.
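One simple way to make the "variability" contrast above concrete is the normalized Shannon entropy of the answer distribution: a group that spreads its votes across the three labels scores near 1, while a near-unanimous group scores near 0. The counts below are invented for illustration; they are not the study's data.

```python
# Normalized entropy of a three-label answer distribution, as a rough
# variability measure. Example counts are illustrative assumptions.

import math

def normalized_entropy(counts: dict[str, int]) -> float:
    """Shannon entropy of the label distribution, scaled to [0, 1]."""
    total = sum(counts.values())
    probs = [c / total for c in counts.values() if c > 0]
    h = -sum(p * math.log2(p) for p in probs)
    return h / math.log2(3)  # three possible labels

# Hypothetical pattern: humans spread out, models cluster on one label.
humans = {"violence": 40, "non-violence": 25, "it depends on the context": 35}
models = {"violence": 16, "non-violence": 1, "it depends on the context": 1}
print(normalized_entropy(humans) > normalized_entropy(models))  # prints True
```

Under this measure, more categorical judgments show up directly as lower entropy, which matches the qualitative pattern the study reports.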
AI does not simply mirror human opinion
One of the most interesting outcomes of the study is that AI does not merely replicate human thinking.
The analysed models display their own evaluation criteria, often shaped by biases present in training data or by design choices made by developers. This can lead to classifications that diverge from our moral or social intuitions.
Why this matters
At a time when an increasing number of decisions — or assessments — are delegated to AI systems, understanding how and where these systems differ from human judgment becomes crucial.
The study suggests that AI can be a valuable tool, but not a neutral one. The way it interprets complex concepts such as violence may reflect simplifications or distortions that we need to be aware of.
For this reason, analysing limitations, biases and divergences from human judgment is becoming one of the central challenges in contemporary AI research.