Detecting Violence Against Children in Surveillance Video Using Computer Vision
Background
Child maltreatment is a significant public health issue, and substantial underreporting makes it difficult to address effectively. Data from the National Child Abuse and Neglect Data System (NCANDS) reported 678,810 cases of maltreatment in 2012, with 18% involving physical abuse and an estimated 1,640 resulting in fatalities. Many cases, however, remain undetected by authorities. Traditional methods for detecting violence, such as manual review of surveillance footage, are neither scalable nor timely.
Recognizing this need, Omdena collaborated with EyeKnow AI to develop AI-powered violence detection models. These models aim to identify abusive behavior in caregiving scenarios, enhancing surveillance capabilities to ensure children’s safety.
Objective
The goal was to develop an AI-based violence detection model capable of detecting violent interactions in real time from video footage, with a focus on preventing abuse in caregiving contexts. This project sought to:
- Build a robust dataset of caregiver-to-child and caregiver-to-elderly interactions.
- Develop machine learning pipelines for automated video analysis.
- Ensure adaptability and scalability of the system for different surveillance environments.
Approach
- Dataset Creation:
  - Two primary datasets were curated:
    - A caregiver-to-senior violence dataset of 500 clips from YouTube.
    - A caregiver-to-child aggression dataset of 500 clips sourced from YouTube and EyeKnow AI’s partnerships.
- Machine Learning Models:
  - Object Detection and Annotation:
    - Entity annotation at the frame level identified caregivers, children, and elderly individuals.
    - An object detection model was trained to localize these entities.
  - Interaction Analysis:
    - Bounding box overlap analysis flagged potential high-intensity interactions indicative of violence.
  - Video Classification:
    - Deep neural networks were used to classify video sequences, combining pre-trained feature extraction models with temporal relationship modeling.
- AI Pipeline Development:
  - The developed ML pipeline processes video inputs to provide frame-level insights into entity interactions.
  - A modular Python application allows users to configure parameters for training models or analyzing video data.
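The bounding box overlap step in the interaction analysis can be illustrated with a minimal sketch. It assumes axis-aligned boxes in pixel coordinates and uses intersection over union (IoU) as the overlap measure; the project's actual overlap metric and threshold are not stated, so the `Box` class, the `flag_interactions` helper, and the 0.1 threshold are hypothetical choices for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned detection box (hypothetical structure, not the project's)."""
    label: str  # e.g. "caregiver", "child", "elderly"
    x1: float
    y1: float
    x2: float
    y2: float


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; 0.0 when they do not overlap."""
    inter_w = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    inter_h = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = inter_w * inter_h
    area_a = (a.x2 - a.x1) * (a.y2 - a.y1)
    area_b = (b.x2 - b.x1) * (b.y2 - b.y1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def flag_interactions(boxes, threshold=0.1):
    """Return (caregiver, other, overlap) triples whose IoU meets the threshold.

    The threshold value is an assumption; a real system would tune it.
    """
    caregivers = [b for b in boxes if b.label == "caregiver"]
    others = [b for b in boxes if b.label in ("child", "elderly")]
    flagged = []
    for c in caregivers:
        for o in others:
            overlap = iou(c, o)
            if overlap >= threshold:
                flagged.append((c, o, overlap))
    return flagged


frame = [
    Box("caregiver", 0, 0, 10, 10),
    Box("child", 5, 5, 15, 15),    # overlaps the caregiver box
    Box("child", 20, 20, 30, 30),  # far away, no overlap
]
flagged = flag_interactions(frame)
```

Overlap alone only marks a frame as a candidate for closer inspection. In the approach described above, the video classification stage is what judges whether a sequence is violent.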
Results and Impact
- Scalable Detection: A functional model capable of identifying violent interactions in video footage was delivered.
- Modular Framework: The Python-based system allows flexibility in use cases, including training, inference, and real-time video processing.
- Early Intervention: With integration into surveillance systems, the model has the potential to prevent abuse through real-time detection.
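The configurable, modular application mentioned above might expose its training and analysis modes through a command-line interface. The following is a sketch under stated assumptions, not the project's actual interface: the program name, the `train` and `analyze` subcommands, and every flag shown are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI with one subcommand per mode (all names are hypothetical)."""
    parser = argparse.ArgumentParser(
        prog="violence-detect",
        description="Train a detection model or analyze a video.",
    )
    sub = parser.add_subparsers(dest="command", required=True)

    # Training mode: parameters a user might want to configure.
    train = sub.add_parser("train", help="train a model on labeled clips")
    train.add_argument("--data-dir", required=True, help="folder of training clips")
    train.add_argument("--epochs", type=int, default=10)

    # Analysis mode: run a trained model over one video file.
    analyze = sub.add_parser("analyze", help="analyze a video file")
    analyze.add_argument("--video", required=True, help="path to the input video")
    analyze.add_argument("--overlap-threshold", type=float, default=0.1)

    return parser


def main(argv=None) -> argparse.Namespace:
    """Parse arguments and report the selected mode (placeholder behavior)."""
    args = build_parser().parse_args(argv)
    if args.command == "train":
        print(f"Would train on {args.data_dir} for {args.epochs} epochs")
    else:
        print(f"Would analyze {args.video} with threshold {args.overlap_threshold}")
    return args
```

Calling `main(["analyze", "--video", "clip.mp4"])` selects the analysis mode with the default threshold. Keeping each mode in its own subcommand lets training and inference parameters evolve independently.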
Future Implications
This AI-based violence detection model offers significant potential for societal impact, including:
- Policy Development: Supporting policy changes to improve child protection systems.
- Global Adoption: Expanding to various cultural and environmental contexts.
- Advancements in AI Applications: Adapting the model for detecting other forms of abuse, such as neglect or emotional maltreatment.