An Improved Multi-Gate Feature Pyramid Network

Abstract

The feature pyramid network (FPN) fuses feature maps of different scales by upsampling and addition. However, an upsampled feature map loses much of its spatial hierarchical information, so simple addition inevitably introduces some error. In addition, deep feature information in the FPN structure propagates forward poorly, and its auxiliary effect on shallower layers largely disappears. To address these problems, this paper improves the FPN structure by drawing on the strength of long short-term memory (LSTM) networks in handling contextual information. A top-down memory link is established between feature layers of different depths, and a multi-gate structure filters and fuses the information on the memory chain to produce high-level semantic feature maps with stronger representational ability. Finally, the improved FPN structure is added to the SSD (Single Shot MultiBox Detector) framework to form a new feature fusion network, MSSD (Memory SSD), which is validated on the Pascal VOC 2007 dataset. Experiments show that the improvement achieves good test results and has some advantage over current advanced detection algorithms.
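The contrast drawn in the abstract, plain upsample-and-add fusion versus a gated fusion along a top-down memory chain, can be illustrated with a minimal sketch. This is not the paper's MSSD architecture: the function names, the single-channel list-of-rows feature maps, and the scalar gate weights `w_f`, `w_i`, and `w_o` are all hypothetical simplifications chosen for illustration, and the gate equations only loosely follow a standard LSTM cell.

```python
import math


def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))  # repeat each row twice
    return out


def fpn_add(shallow, deep):
    """Plain FPN-style fusion: upsample the deeper map, then add element-wise."""
    up = upsample2x(deep)
    return [[s + u for s, u in zip(srow, urow)]
            for srow, urow in zip(shallow, up)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fuse(shallow, memory, w_f=1.0, w_i=1.0, w_o=1.0):
    """LSTM-inspired fusion (illustrative assumption, not the paper's design).

    `memory` is the memory map handed down from the deeper level at half the
    resolution of `shallow`. Returns (fused map, updated memory map), both at
    the resolution of `shallow`.
    """
    up = upsample2x(memory)
    fused, new_memory = [], []
    for srow, mrow in zip(shallow, up):
        frow, nrow = [], []
        for s, m in zip(srow, mrow):
            f = sigmoid(w_f * s)  # forget gate: share of deep memory kept
            i = sigmoid(w_i * s)  # input gate: share of shallow feature written
            c = f * m + i * math.tanh(s)  # updated memory cell
            o = sigmoid(w_o * s)  # output gate
            frow.append(o * math.tanh(c))
            nrow.append(c)
        fused.append(frow)
        new_memory.append(nrow)
    return fused, new_memory


# Usage: a 2x2 deep map fused into a 4x4 shallow map.
deep = [[1.0, 2.0], [3.0, 4.0]]
shallow = [[0.0] * 4 for _ in range(4)]
added = fpn_add(shallow, deep)
fused, memory = gated_fuse(shallow, deep)
```

In the plain version every upsampled value is added unconditionally, whereas in the gated version each position decides how much of the deeper memory to keep and how much to expose. The returned memory map can be passed on to the next shallower level, which is the sense in which the chain carries deep information further down the pyramid.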

Supplementary Information

DOI:10.3788/AOS201939.0815005

Section: Machine Vision

Funding: National Natural Science Foundation of China, Young Scientists Fund (61503392)

Received: 2019-03-13

Revised: 2019-04-22

Published online: 2019-08-01

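Author affiliations

Tong Zhao: College of Missile Engineering, Rocket Force University of Engineering, Xi'an, Shaanxi 710025
Jieyu Liu: College of Missile Engineering, Rocket Force University of Engineering, Xi'an, Shaanxi 710025
Qiang Shen: College of Missile Engineering, Rocket Force University of Engineering, Xi'an, Shaanxi 710025

Corresponding author: Jieyu Liu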

Cite this paper

Tong Zhao, Jieyu Liu, Qiang Shen. An Improved Multi-Gate Feature Pyramid Network[J]. Acta Optica Sinica, 2019, 39(8): 0815005
