Recurrent Scale Approximation for Object Detection in CNN

Liu, Yu; Li, Hongyang; Yan, Junjie; Wei, Fangyin; Wang, Xiaogang; Tang, Xiaoou

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1109/ICCV.2017.69
Scopus: eid_2-s2.0-85041908636
Supplementary

Citations:
- Scopus: 0
Appears in Collections:
- HKU Musketeers Foundation Institute of Data Science: Conference papers

Conference Paper: Recurrent Scale Approximation for Object Detection in CNN

Title	Recurrent Scale Approximation for Object Detection in CNN
Authors	Liu, Yu; Li, Hongyang; Yan, Junjie; Wei, Fangyin; Wang, Xiaogang; Tang, Xiaoou
Issue Date	2017
Citation	Proceedings of the IEEE International Conference on Computer Vision, 2017, v. 2017-October, p. 571-579. DOI: http://dx.doi.org/10.1109/ICCV.2017.69
Abstract	Since convolutional neural networks (CNNs) lack an inherent mechanism to handle large scale variations, feature maps must be computed multiple times for multiscale object detection, which makes computational cost a bottleneck in practice. To address this, we devise a recurrent scale approximation (RSA) that computes the feature map only once and approximates the remaining maps on other levels from this map alone. At the core of RSA is a recursive rolling-out mechanism: given an initial map at a particular scale, it generates the prediction at a smaller scale that is half the size of the input. To further increase efficiency and accuracy, we (a) design a scale-forecast network to globally predict potential scales in the image, since there is no need to compute maps on all levels of the pyramid; and (b) propose a landmark retracing network (LRN) to trace back the locations of the regressed landmarks and generate a confidence score for each landmark; LRN can effectively alleviate false positives caused by the accumulated error in RSA. The whole system can be trained end-to-end in a unified CNN framework. Experiments demonstrate that the proposed algorithm outperforms state-of-the-art methods on face detection benchmarks and achieves comparable results for generic proposal generation. The source code of our system is available.
Persistent Identifier	http://hdl.handle.net/10722/351379
ISSN	1550-5499
2023 SCImago Journal Rankings	12.263
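
The recursive roll-out described in the abstract (one computed feature map, with each further pyramid level approximated at half the size of the previous one) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `halve` and `roll_out` are hypothetical, and 2x2 average pooling is only a stand-in for the learned RSA unit, which the record does not specify.

```python
import numpy as np


def halve(feature_map):
    """Stand-in for a learned approximation unit (assumption): 2x2 average
    pooling over a (channels, height, width) array, halving height and width."""
    c, h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    # Drop a trailing row/column if the size is odd, then average 2x2 blocks.
    trimmed = feature_map[:, : h2 * 2, : w2 * 2]
    return trimmed.reshape(c, h2, 2, w2, 2).mean(axis=(2, 4))


def roll_out(initial_map, num_levels, step=halve):
    """Build a pyramid from one computed map: each level is produced from
    the previous level's output rather than from a rescaled input image."""
    maps = [initial_map]
    for _ in range(num_levels - 1):
        maps.append(step(maps[-1]))
    return maps


# Toy input: 8 channels, 64x48 spatial size.
pyramid = roll_out(np.ones((8, 64, 48)), num_levels=4)
shapes = [m.shape for m in pyramid]
```

In this toy setting only the first map would come from a full forward pass; a scale-forecast step, as the abstract suggests, could further restrict which levels are rolled out at all.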

DC Field	Value	Language
dc.contributor.author	Liu, Yu	-
dc.contributor.author	Li, Hongyang	-
dc.contributor.author	Yan, Junjie	-
dc.contributor.author	Wei, Fangyin	-
dc.contributor.author	Wang, Xiaogang	-
dc.contributor.author	Tang, Xiaoou	-
dc.date.accessioned	2024-11-20T03:55:56Z	-
dc.date.available	2024-11-20T03:55:56Z	-
dc.date.issued	2017	-
dc.identifier.citation	Proceedings of the IEEE International Conference on Computer Vision, 2017, v. 2017-October, p. 571-579	-
dc.identifier.issn	1550-5499	-
dc.identifier.uri	http://hdl.handle.net/10722/351379	-
dc.description.abstract	Since convolutional neural networks (CNNs) lack an inherent mechanism to handle large scale variations, feature maps must be computed multiple times for multiscale object detection, which makes computational cost a bottleneck in practice. To address this, we devise a recurrent scale approximation (RSA) that computes the feature map only once and approximates the remaining maps on other levels from this map alone. At the core of RSA is a recursive rolling-out mechanism: given an initial map at a particular scale, it generates the prediction at a smaller scale that is half the size of the input. To further increase efficiency and accuracy, we (a) design a scale-forecast network to globally predict potential scales in the image, since there is no need to compute maps on all levels of the pyramid; and (b) propose a landmark retracing network (LRN) to trace back the locations of the regressed landmarks and generate a confidence score for each landmark; LRN can effectively alleviate false positives caused by the accumulated error in RSA. The whole system can be trained end-to-end in a unified CNN framework. Experiments demonstrate that the proposed algorithm outperforms state-of-the-art methods on face detection benchmarks and achieves comparable results for generic proposal generation. The source code of our system is available.	-
dc.language	eng	-
dc.relation.ispartof	Proceedings of the IEEE International Conference on Computer Vision	-
dc.title	Recurrent Scale Approximation for Object Detection in CNN	-
dc.type	Conference_Paper	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1109/ICCV.2017.69	-
dc.identifier.scopus	eid_2-s2.0-85041908636	-
dc.identifier.volume	2017-October	-
dc.identifier.spage	571	-
dc.identifier.epage	579	-
