Conference Paper: Propagating Over Phrase Relations for One-Stage Visual Grounding
Title | Propagating Over Phrase Relations for One-Stage Visual Grounding |
---|---|
Authors | YANG, S; LI, G; Yu, Y |
Keywords | One-Stage Phrase Grounding; Linguistic Graph; Relational Propagation; Visual Grounding |
Issue Date | 2020 |
Citation | The 16th European Conference on Computer Vision (ECCV), Online, 23-28 August 2020 |
Abstract | Phrase-level visual grounding aims to locate in an image the corresponding visual regions referred to by multiple noun phrases in a given sentence. Its challenge comes not only from large variations in visual contents and unrestricted phrase descriptions but also from unambiguous referrals derived from phrase relational reasoning. In this paper, we propose a linguistic structure guided propagation network for one-stage phrase grounding. It explicitly explores the linguistic structure of the sentence and performs relational propagation among noun phrases under the guidance of the linguistic relations between them. Specifically, we first construct a linguistic graph parsed from the sentence and then capture multimodal feature maps for all the phrasal nodes independently. The node features are then propagated over the edges with a tailor-designed relational propagation module and ultimately integrated for final prediction. Experiments on the Flickr30K Entities dataset show that our model outperforms state-of-the-art methods and demonstrate the effectiveness of propagating among phrases with linguistic relations. |
Persistent Identifier | http://hdl.handle.net/10722/286647 |
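
The abstract outlines a pipeline: build a graph over noun phrases, compute a feature per phrase node, then propagate features along the edges. The sketch below is only a toy illustration of that general idea, not the paper's relational propagation module. It assumes small numeric vectors in place of multimodal feature maps, hand-written edges in place of a parsed linguistic graph, and a simple "add the mean of the neighbours" update rule. The function name `propagate`, the example phrases, and all values are hypothetical.

```python
from collections import defaultdict


def propagate(features, edges, steps=1):
    """Toy message passing: each node adds the mean of its neighbours' features.

    This update rule is an assumption made for illustration; it is not the
    relational propagation module described in the paper.
    """
    # Build an undirected adjacency list from the (phrase, phrase) edges.
    neighbours = defaultdict(list)
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)

    current = {node: list(vec) for node, vec in features.items()}
    for _ in range(steps):
        updated = {}
        for node, vec in current.items():
            nbrs = neighbours.get(node, [])
            if not nbrs:
                # Isolated phrases keep their own feature unchanged.
                updated[node] = list(vec)
                continue
            mean = [
                sum(current[m][i] for m in nbrs) / len(nbrs)
                for i in range(len(vec))
            ]
            updated[node] = [x + y for x, y in zip(vec, mean)]
        current = updated
    return current


# Hypothetical phrases and relations, e.g. from "a man in a hat holds a dog".
features = {"a man": [1.0, 0.0], "a hat": [0.0, 1.0], "a dog": [2.0, 2.0]}
edges = [("a man", "a hat"), ("a man", "a dog")]
result = propagate(features, edges)
```

After one step, each phrase's vector mixes in information from the phrases it is related to, which is the intuition behind letting related phrases inform each other before the final prediction.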
DC Field | Value | Language |
---|---|---|
dc.contributor.author | YANG, S | - |
dc.contributor.author | LI, G | - |
dc.contributor.author | Yu, Y | - |
dc.date.accessioned | 2020-09-04T13:28:31Z | - |
dc.date.available | 2020-09-04T13:28:31Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | The 16th European Conference on Computer Vision (ECCV), Online, 23-28 August 2020 | - |
dc.identifier.uri | http://hdl.handle.net/10722/286647 | - |
dc.description.abstract | Phrase-level visual grounding aims to locate in an image the corresponding visual regions referred to by multiple noun phrases in a given sentence. Its challenge comes not only from large variations in visual contents and unrestricted phrase descriptions but also from unambiguous referrals derived from phrase relational reasoning. In this paper, we propose a linguistic structure guided propagation network for one-stage phrase grounding. It explicitly explores the linguistic structure of the sentence and performs relational propagation among noun phrases under the guidance of the linguistic relations between them. Specifically, we first construct a linguistic graph parsed from the sentence and then capture multimodal feature maps for all the phrasal nodes independently. The node features are then propagated over the edges with a tailor-designed relational propagation module and ultimately integrated for final prediction. Experiments on the Flickr30K Entities dataset show that our model outperforms state-of-the-art methods and demonstrate the effectiveness of propagating among phrases with linguistic relations. | - |
dc.language | eng | - |
dc.relation.ispartof | European Conference on Computer Vision (ECCV) | - |
dc.subject | One-Stage Phrase Grounding | - |
dc.subject | Linguistic Graph | - |
dc.subject | Relational Propagation | - |
dc.subject | Visual Grounding | - |
dc.title | Propagating Over Phrase Relations for One-Stage Visual Grounding | - |
dc.type | Conference_Paper | - |
dc.identifier.email | Yu, Y: yzyu@cs.hku.hk | - |
dc.identifier.authority | Yu, Y=rp01415 | - |
dc.identifier.hkuros | 313949 | - |