Conference Paper: Do we really need more training data for object localization

File download: there are no files associated with this item.

Links for fulltext (may require subscription):
- Publisher Website (DOI): 10.1109/ICIP.2017.8296386
- Scopus: eid_2-s2.0-85045305760

Citations:
- Scopus: 0

Appears in Collections: Conference Paper
Field | Value
---|---
Title | Do we really need more training data for object localization
Authors | Li, Hongyang; Liu, Yu; Zhang, Xin; An, Zhecheng; Wang, Jingjing; Chen, Yibo; Tong, Jihong
Keywords | Computer vision; Deep learning; Image recognition; Object localization
Issue Date | 2017
Citation | Proceedings - International Conference on Image Processing, ICIP, 2017, v. 2017-September, p. 775-779
Abstract | The key to training a good neural network lies in both model capacity and large-scale training data. As more datasets become available, one may wonder whether the success of deep learning stems from data augmentation alone. In this paper, we propose a new dataset, the Extended ImageNet Classification (EIC) dataset, built on the original ILSVRC CLS 2012 set, to investigate whether more training data is the crucial factor. We address the problem of object localization: given an image, boxes (also called anchors) are generated to localize multiple instances. Unlike previous work that places all anchors at the last layer, we distribute boxes of different sizes across different resolutions in the network, since small anchors are more easily identified at the larger spatial resolutions of the shallow layers. Inspired by hourglass networks, we apply a conv-deconv architecture to generate object proposals. The motivation is to fully leverage high-level semantics and to use their up-sampled versions to guide local details in the low-level feature maps. Experimental results demonstrate the effectiveness of this design. Based on the newly proposed dataset, we find that more data can improve average recall, but that a more balanced data distribution across categories yields better results even with fewer training samples.
Persistent Identifier | http://hdl.handle.net/10722/351380
ISSN | 1522-4880
SCImago Journal Rankings (2020) | 0.315
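The abstract's central design choice, assigning small anchors to the high-resolution shallow feature maps and large anchors to the coarse deep maps, can be sketched as plain anchor generation. This is a minimal illustration, not the paper's implementation; the image size, strides, and anchor sizes below are hypothetical values chosen for the example.

```python
import numpy as np

def make_anchors(image_size, strides, anchor_sizes):
    """Place one square anchor per feature-map cell, pairing the
    smallest anchor size with the highest-resolution (shallowest) map."""
    all_anchors = []
    for stride, size in zip(strides, anchor_sizes):
        n = image_size // stride          # feature map has n x n cells
        centers = (np.arange(n) + 0.5) * stride  # cell centers in image coords
        cx, cy = np.meshgrid(centers, centers)
        half = size / 2.0
        # boxes as (x1, y1, x2, y2)
        boxes = np.stack([cx - half, cy - half, cx + half, cy + half], axis=-1)
        all_anchors.append(boxes.reshape(-1, 4))
    return all_anchors

anchors = make_anchors(256, strides=[4, 8, 16], anchor_sizes=[32, 64, 128])
# The shallow, stride-4 map carries the most (and smallest) anchors.
print([a.shape[0] for a in anchors])  # [4096, 1024, 256]
```

Because the number of cells falls quadratically with stride, the shallow layers naturally supply the dense coverage that small objects need, which is the intuition the abstract gives for splitting anchor sizes across resolutions.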
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Li, Hongyang | - |
dc.contributor.author | Liu, Yu | - |
dc.contributor.author | Zhang, Xin | - |
dc.contributor.author | An, Zhecheng | - |
dc.contributor.author | Wang, Jingjing | - |
dc.contributor.author | Chen, Yibo | - |
dc.contributor.author | Tong, Jihong | - |
dc.date.accessioned | 2024-11-20T03:55:56Z | - |
dc.date.available | 2024-11-20T03:55:56Z | - |
dc.date.issued | 2017 | - |
dc.identifier.citation | Proceedings - International Conference on Image Processing, ICIP, 2017, v. 2017-September, p. 775-779 | - |
dc.identifier.issn | 1522-4880 | - |
dc.identifier.uri | http://hdl.handle.net/10722/351380 | - |
dc.description.abstract | The key to training a good neural network lies in both model capacity and large-scale training data. As more datasets become available, one may wonder whether the success of deep learning stems from data augmentation alone. In this paper, we propose a new dataset, the Extended ImageNet Classification (EIC) dataset, built on the original ILSVRC CLS 2012 set, to investigate whether more training data is the crucial factor. We address the problem of object localization: given an image, boxes (also called anchors) are generated to localize multiple instances. Unlike previous work that places all anchors at the last layer, we distribute boxes of different sizes across different resolutions in the network, since small anchors are more easily identified at the larger spatial resolutions of the shallow layers. Inspired by hourglass networks, we apply a conv-deconv architecture to generate object proposals. The motivation is to fully leverage high-level semantics and to use their up-sampled versions to guide local details in the low-level feature maps. Experimental results demonstrate the effectiveness of this design. Based on the newly proposed dataset, we find that more data can improve average recall, but that a more balanced data distribution across categories yields better results even with fewer training samples. | -
dc.language | eng | - |
dc.relation.ispartof | Proceedings - International Conference on Image Processing, ICIP | - |
dc.subject | Computer vision | - |
dc.subject | Deep learning | - |
dc.subject | Image recognition | - |
dc.subject | Object localization | - |
dc.title | Do we really need more training data for object localization | - |
dc.type | Conference_Paper | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1109/ICIP.2017.8296386 | - |
dc.identifier.scopus | eid_2-s2.0-85045305760 | - |
dc.identifier.volume | 2017-September | - |
dc.identifier.spage | 775 | - |
dc.identifier.epage | 779 | - |