Chenzhe Shi, Yue Kan
Accurately identifying fall incidents in images or videos can significantly reduce the response time for assisting affected individuals. When manual monitoring cannot cover all video feeds in real time, fall detection algorithms can automatically screen for anomalous events. Existing methods are constrained by limitations such as homogeneous fall scenarios, an insufficient variety of fall-like activities, and low-resolution fall images. This paper introduces Multi-Variate Fall Detection Data (MVFDD), a novel and comprehensive fall detection dataset, along with a lightweight algorithm named FYOLO. An Identity Former block incorporating a Convolutional Gated Linear Unit (CGLU) is introduced into the Cross Stage Partial network of YOLOv10n, enabling channel-wise feature modulation while reducing computational redundancy. The neck network combines pinwheel-shaped convolution with frequency-aware feature fusion in a Feature Pyramid Network (FPN) and Path Aggregation Network (PAN) structure, ensuring effective detection of targets at various distances and sizes and thereby adapting to different camera perspectives and focal lengths. A Fall Distance Intersection over Union (FDIoU) loss is proposed for the first time, which enhances robustness to sample imbalance while ensuring semantic alignment with postural characteristics.
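To make the channel-wise feature modulation concrete, the following is a minimal NumPy sketch of the gating idea behind a Convolutional Gated Linear Unit: a value branch is multiplied element-wise by a gate branch that first passes through a depthwise 3x3 convolution and a GELU nonlinearity. The exact FYOLO block layout (Identity Former wiring, projection widths, kernel sizes) is not specified in the abstract, so all shapes, weights, and function names here are illustrative assumptions.

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def depthwise_conv3x3(x, kernels):
    """Per-channel 3x3 convolution with zero padding.
    x: (C, H, W), kernels: (C, 3, 3) -- one kernel per channel."""
    C, H, W = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for c in range(C):
        for i in range(H):
            for j in range(W):
                out[c, i, j] = np.sum(padded[c, i:i+3, j:j+3] * kernels[c])
    return out

def cglu(x, w_value, w_gate, dw_kernels):
    """Illustrative CGLU: value * GELU(depthwise_conv(gate)).
    w_value / w_gate act as 1x1 channel-mixing projections; the depthwise
    conv injects local spatial context into the gate before activation."""
    C, H, W = x.shape
    flat = x.reshape(C, -1)                      # (C, H*W)
    value = (w_value @ flat).reshape(C, H, W)    # value branch
    gate = (w_gate @ flat).reshape(C, H, W)      # gate branch
    gate = gelu(depthwise_conv3x3(gate, dw_kernels))
    return value * gate                          # channel-wise modulation

rng = np.random.default_rng(0)
C, H, W = 4, 8, 8
x = rng.standard_normal((C, H, W))
y = cglu(x,
         rng.standard_normal((C, C)) / np.sqrt(C),
         rng.standard_normal((C, C)) / np.sqrt(C),
         rng.standard_normal((C, 3, 3)) / 9.0)
print(y.shape)  # (4, 8, 8)
```

Because the gate is computed from the features themselves, the block can suppress uninformative channels without the cost of a full attention mechanism, which is consistent with the lightweight design goal stated above.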
Deep learning, YOLOv10n, Lightweight, Fall detection