Gaze Estimation Method Based on Coordinate Attention and Spiking Neural Network
WANG Hongxia, ZHAO Zhiguo
Shenyang Polytechnic University, Shenyang, Liaoning 110158, China
Abstract To address the motion blur and low temporal resolution of traditional cameras when capturing eye movements, an event camera is used for close-range capture and a spiking-eye dataset is constructed. A spiking neural network model with coordinate attention, referred to as CA-SpikingRepVGG, is proposed. The model reads encoded event data, extracts features with the attention-based backbone network, and performs detection with the detection head. Experimental results show that CA-SpikingRepVGG achieves a mean average precision RP of 70.8%. Compared with SpikingVGG-16, the model improves RP by 15.9% and Rr by 14.2%. With only one-third of the training time required by SpikingDensenet, the model improves RP by 1.8% and Rr by 0.9%. These results indicate that the proposed model detects and tracks the eye more reliably during eye movement and can effectively accomplish gaze estimation tasks.
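The abstract does not specify how the event data are encoded or which spiking neuron model is used, so the following is only a minimal sketch under stated assumptions: events are binned into a fixed number of time steps with one count channel per polarity, and each neuron is a generic leaky integrate-and-fire (LIF) unit with a hard reset. The names `encode_events` and `lif_run`, the event tuple layout, and all parameter values are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Assumed event layout: (timestamp, x, y, polarity), polarity in {0, 1}.
Event = Tuple[float, int, int, int]


def encode_events(events: List[Event], t_start: float, t_end: float,
                  steps: int, width: int, height: int):
    """Bin events into `steps` frames indexed as [step][polarity][y][x].

    Each cell holds the number of events of that polarity that fell on
    that pixel during that time slice (an assumed count-based encoding).
    """
    frames = [[[[0.0] * width for _ in range(height)] for _ in range(2)]
              for _ in range(steps)]
    span = (t_end - t_start) / steps
    for t, x, y, p in events:
        if not (t_start <= t < t_end):
            continue  # ignore events outside the window
        idx = min(int((t - t_start) / span), steps - 1)
        frames[idx][p][y][x] += 1.0
    return frames


def lif_run(inputs: List[float], tau: float = 2.0,
            v_threshold: float = 1.0, v_reset: float = 0.0) -> List[int]:
    """Run one leaky integrate-and-fire neuron over a sequence of inputs.

    The membrane potential leaks toward `v_reset`, integrates the input,
    emits a spike (1) when it reaches `v_threshold`, and is then reset.
    """
    v = v_reset
    spikes = []
    for x in inputs:
        v = v + (x - (v - v_reset)) / tau
        if v >= v_threshold:
            spikes.append(1)
            v = v_reset  # hard reset after firing
        else:
            spikes.append(0)
    return spikes


if __name__ == "__main__":
    demo_events = [(0.1, 0, 0, 1), (0.6, 1, 0, 0), (0.7, 1, 0, 0)]
    demo_frames = encode_events(demo_events, 0.0, 1.0, steps=2, width=2, height=1)
    # Feed one pixel's per-step event counts into a single LIF neuron.
    pixel_input = [frame[0][0][1] for frame in demo_frames]
    print(lif_run(pixel_input))
```

In a full model, spike trains like these would be produced by every neuron in the backbone at each time step; the attention module and detection head described in the abstract are not sketched here.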
Received: 04 September 2023
Published: 04 July 2024