1. School of Information Science and Engineering, Yanshan University, Qinhuangdao, Hebei 066004, China
2. The Key Laboratory for Computer Virtual Technology and System Integration of Hebei Province, Qinhuangdao, Hebei 066004, China
3. Beijing Institute of Computer Technology and Application, Beijing 100854, China
Abstract: To improve the accuracy of object detection, a deep learning method based on weighted fusion of feature maps is proposed. First, the idea of fusing the sampled shallow feature maps with the deepest feature map of the convolutional neural network is introduced. Second, a corresponding weighted fusion scheme is developed from this idea and the specific structure of the convolutional neural network, and a new feature map is obtained from the scheme. Third, an improved region proposal network (RPN) is proposed, and the new feature map is fed into it to obtain region proposals. Finally, the new feature map and the region proposals are fed into the subsequent network layers to perform object detection. Experimental results show that the proposed method achieves higher detection precision and better detection results.
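The fusion idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' scheme: the abstract does not specify the sampling operation, the weights, or the layers involved, so the sketch assumes non-overlapping max pooling as the sampling step, scalar weights chosen by hand, and single-channel feature maps stored as nested lists. The function names `max_pool` and `weighted_fusion` are hypothetical.

```python
def max_pool(fmap, stride):
    """Downsample a 2-D feature map (list of rows) with non-overlapping max pooling."""
    rows, cols = len(fmap) // stride, len(fmap[0]) // stride
    return [
        [
            max(
                fmap[r * stride + dr][c * stride + dc]
                for dr in range(stride)
                for dc in range(stride)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def weighted_fusion(shallow_maps, deepest_map, weights):
    """Fuse sampled shallow maps with the deepest map as a weighted sum.

    `weights` holds one entry per shallow map plus a final entry for the
    deepest map. The weighting rule itself is an assumption of this sketch.
    """
    if len(weights) != len(shallow_maps) + 1:
        raise ValueError("need one weight per shallow map plus one for the deepest map")
    height, width = len(deepest_map), len(deepest_map[0])
    # Start from the weighted deepest map.
    fused = [[weights[-1] * v for v in row] for row in deepest_map]
    for fmap, w in zip(shallow_maps, weights):
        stride = len(fmap) // height
        if stride < 1:
            raise ValueError("shallow map must be at least as large as the deepest map")
        # Sample the shallow map down to the resolution of the deepest map.
        sampled = max_pool(fmap, stride)
        if len(sampled) != height or len(sampled[0]) != width:
            raise ValueError("sampled shallow map does not match the deepest map size")
        for r in range(height):
            for c in range(width):
                fused[r][c] += w * sampled[r][c]
    return fused


# A 4x4 shallow map is pooled to 2x2 and fused with a 2x2 deepest map.
shallow = [[1, 2, 0, 1], [3, 4, 1, 0], [0, 0, 5, 6], [1, 0, 7, 8]]
deepest = [[2, 2], [2, 2]]
fused = weighted_fusion([shallow], deepest, weights=[0.5, 0.5])
print(fused)  # [[3.0, 1.5], [1.5, 5.0]]
```

In a real detector the maps would be multi-channel tensors and the weights might be learned or tuned per layer; the sketch only shows how sampled shallow maps and the deepest map can be brought to one resolution and combined into a single new feature map.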