Conference Paper: Do we really need more training data for object localization

Title: Do we really need more training data for object localization
Authors: Li, Hongyang; Liu, Yu; Zhang, Xin; An, Zhecheng; Wang, Jingjing; Chen, Yibo; Tong, Jihong
Keywords: Computer vision; Deep learning; Image recognition; Object localization
Issue Date: 2017
Citation: Proceedings - International Conference on Image Processing, ICIP, 2017, v. 2017-September, p. 775-779
Abstract: The key factors in training a good neural network are model capacity and large-scale training data. With ever more datasets available, one may wonder whether the success of deep learning stems from having more data alone. In this paper, we propose a new dataset, the Extended ImageNet Classification (EIC) dataset, built on the original ILSVRC CLS 2012 set, to investigate whether more training data is crucial. We address the problem of object localization: given an image, boxes (also called anchors) are generated to localize multiple instances. Unlike previous work that places all anchors at the last layer, we distribute boxes of different sizes across resolutions in the network, since small anchors are more easily identified on the larger spatial maps of the shallow layers. Inspired by the hourglass work, we apply a conv-deconv network architecture to generate object proposals. The motivation is to fully leverage high-level summarized semantics and their up-sampled version to guide local details in the low-level maps. Experimental results demonstrate the effectiveness of this design. On the newly proposed dataset, we find that more data can improve average recall, but a more balanced data distribution among categories yields better results even with fewer training samples.
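
The anchor-placement and conv-deconv ideas in the abstract can be sketched in code. The following PyTorch snippet is a hypothetical illustration, not the authors' implementation: the layer widths, anchor counts per cell, and fusion-by-addition are all assumptions. It shows the two ideas the abstract names: a deep, semantic map is up-sampled and fused with a shallow, high-resolution map, and anchors of different sizes are scored at different resolutions (large on the coarse map, small on the fused fine map).

import torch
import torch.nn as nn

class ConvDeconvProposalNet(nn.Module):
    """Minimal conv-deconv (hourglass-style) proposal sketch; all sizes are illustrative."""
    def __init__(self, anchors_small=3, anchors_large=3):
        super().__init__()
        # Encoder: summarize semantics at progressively lower resolution.
        self.enc1 = nn.Sequential(nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
        # Decoder: up-sample high-level semantics back to the shallow resolution.
        self.dec1 = nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1)
        # Per-resolution objectness heads: k anchor scores per spatial cell.
        self.head_large = nn.Conv2d(128, anchors_large, 1)  # coarse map scores large anchors
        self.head_small = nn.Conv2d(64, anchors_small, 1)   # fine map scores small anchors

    def forward(self, x):
        f1 = self.enc1(x)   # shallow, high-resolution features
        f2 = self.enc2(f1)  # deep, low-resolution, semantic features
        up = self.dec1(f2)  # up-sampled semantics...
        fused = f1 + up     # ...guide local details in the low-level map
        return {"large": self.head_large(f2),     # large boxes from the deep map
                "small": self.head_small(fused)}  # small boxes from the fused map

net = ConvDeconvProposalNet()
scores = net(torch.randn(1, 3, 224, 224))  # {'large': (1,3,56,56), 'small': (1,3,112,112)}
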
Persistent Identifier: http://hdl.handle.net/10722/351380
ISSN: 1522-4880
2020 SCImago Journal Rankings: 0.315
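
The abstract's closing finding, that a flatter per-category distribution can beat simply adding data, amounts to class-balanced subsampling. The sketch below is a hypothetical illustration, not the procedure used to build EIC: the (image_path, class_id) sample format and the per-class cap are assumed names.

import random
from collections import defaultdict

def balance_by_class(samples, cap, seed=0):
    """Keep at most `cap` randomly chosen samples per class (assumed sample format)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for path, cls in samples:          # samples: iterable of (image_path, class_id)
        by_class[cls].append((path, cls))
    balanced = []
    for items in by_class.values():
        rng.shuffle(items)
        balanced.extend(items[:cap])   # flatten the distribution at the cost of data
    rng.shuffle(balanced)
    return balanced

Capping each category trades total sample count for a flatter distribution, which is exactly the trade-off the abstract reports: better localization results with fewer training samples.
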


DC Field: Value
dc.contributor.author: Li, Hongyang
dc.contributor.author: Liu, Yu
dc.contributor.author: Zhang, Xin
dc.contributor.author: An, Zhecheng
dc.contributor.author: Wang, Jingjing
dc.contributor.author: Chen, Yibo
dc.contributor.author: Tong, Jihong
dc.date.accessioned: 2024-11-20T03:55:56Z
dc.date.available: 2024-11-20T03:55:56Z
dc.date.issued: 2017
dc.identifier.citation: Proceedings - International Conference on Image Processing, ICIP, 2017, v. 2017-September, p. 775-779
dc.identifier.issn: 1522-4880
dc.identifier.uri: http://hdl.handle.net/10722/351380
dc.description.abstract: The key factors in training a good neural network are model capacity and large-scale training data. With ever more datasets available, one may wonder whether the success of deep learning stems from having more data alone. In this paper, we propose a new dataset, the Extended ImageNet Classification (EIC) dataset, built on the original ILSVRC CLS 2012 set, to investigate whether more training data is crucial. We address the problem of object localization: given an image, boxes (also called anchors) are generated to localize multiple instances. Unlike previous work that places all anchors at the last layer, we distribute boxes of different sizes across resolutions in the network, since small anchors are more easily identified on the larger spatial maps of the shallow layers. Inspired by the hourglass work, we apply a conv-deconv network architecture to generate object proposals. The motivation is to fully leverage high-level summarized semantics and their up-sampled version to guide local details in the low-level maps. Experimental results demonstrate the effectiveness of this design. On the newly proposed dataset, we find that more data can improve average recall, but a more balanced data distribution among categories yields better results even with fewer training samples.
dc.language: eng
dc.relation.ispartof: Proceedings - International Conference on Image Processing, ICIP
dc.subject: Computer vision
dc.subject: Deep learning
dc.subject: Image recognition
dc.subject: Object localization
dc.title: Do we really need more training data for object localization
dc.type: Conference_Paper
dc.description.nature: link_to_subscribed_fulltext
dc.identifier.doi: 10.1109/ICIP.2017.8296386
dc.identifier.scopus: eid_2-s2.0-85045305760
dc.identifier.volume: 2017-September
dc.identifier.spage: 775
dc.identifier.epage: 779

Export via OAI-PMH Interface in XML Formats


OR


Export to Other Non-XML Formats