Abstract
Existing human action recognition methods divide event data into virtual frames at fixed time intervals and therefore do not fully exploit the asynchronous output of the event camera. To address this shortcoming, a method that processes event data directly is proposed. First, an event camera is used to capture nine kinds of common human actions, and the data are preprocessed by filtering and grid down-sampling, which removes noise and reduces the amount of data fed to the model. Then, shared convolution kernels are applied to the feature events in parallel to extract the spatial features of each action. Finally, the human actions are classified and recognized. Experimental results show that the average recognition accuracy is 91.3% under normal illumination, that varying illumination has little effect on the accuracy, and that the method also offers fast training and a small number of parameters.
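The preprocessing step described above (noise filtering followed by grid down-sampling) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the event layout `(x, y, t, p)`, the sensor resolution, and all parameter values (`dt`, `grid`) are assumptions, and the filter shown is a generic background-activity filter commonly used for event cameras.

```python
import numpy as np

def denoise(events, dt=5000.0, r=1, H=260, W=346):
    """Background-activity filter: keep an event only if some pixel in its
    (2r+1)x(2r+1) neighbourhood fired within the last dt microseconds.
    events: array of shape (N, 4) with columns (x, y, t, p), t ascending.
    The sensor size (H, W) here is an assumption."""
    last = np.full((H, W), -np.inf)  # timestamp of the last event per pixel
    keep = []
    for i, (x, y, t, p) in enumerate(events):
        x, y = int(x), int(y)
        nb = last[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1]
        if (t - nb <= dt).any():     # a recent neighbouring event supports this one
            keep.append(i)
        last[y, x] = t
    return events[keep]

def grid_downsample(events, grid=4):
    """Spatial grid down-sampling: quantize (x, y) to coarse grid cells and
    keep at most one event per cell, reducing the model's input size."""
    cells = (events[:, 0] // grid).astype(int) * 100000 \
          + (events[:, 1] // grid).astype(int)
    _, keep = np.unique(cells, return_index=True)  # first event in each cell
    return events[np.sort(keep)]
```

An isolated event with no recent neighbours is discarded as noise, while events inside a moving body silhouette survive and are then thinned spatially by the grid.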
Key words
metrology /
event camera /
human action recognition method /
spatial down-sampling /
shared convolution
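The shared-convolution step named above applies the same kernel to every preprocessed event in parallel, which can be sketched as a 1×1 convolution over the event list followed by a symmetric pooling and a linear head over the nine action classes. All layer sizes, the 4-dimensional per-event feature, and the random weights below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def shared_conv(features, W, b):
    """Apply one shared kernel to every event feature in parallel:
    equivalent to a 1x1 convolution, i.e. features @ W.T + b, with ReLU."""
    return np.maximum(features @ W.T + b, 0.0)

def classify(event_feats, W1, b1, W2, b2):
    """event_feats: (N, 4) array of per-event features, e.g. (x, y, t, p)."""
    h = shared_conv(event_feats, W1, b1)  # (N, 64) per-event features
    g = h.max(axis=0)                     # order-invariant max-pool -> global descriptor
    logits = g @ W2.T + b2                # linear head over 9 action classes
    return int(logits.argmax())
```

Because the kernel weights are shared across all events, the parameter count stays small and the per-event operations can run in parallel, consistent with the fast-training, low-parameter claims in the abstract.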
Funding
National Natural Science Foundation of China (61903352); Natural Science Foundation of Zhejiang Province (LY19F010007, LQ19F030007); China Postdoctoral Science Foundation (2020M671721)