
Article: Detecting Target Objects by Natural Language Instructions Using an RGB-D Camera

Title: Detecting Target Objects by Natural Language Instructions Using an RGB-D Camera
Authors: Bao, J; Jia, Y; Cheng, Y; Tang, H; Xi, N
Issue Date: 2016
Publisher: Molecular Diversity Preservation International. The Journal's web site is located at http://www.mdpi.net/sensors
Citation: Sensors, 2016, v. 16, p. 2117
Abstract: Controlling robots by natural language (NL) is attracting increasing attention for its versatility and convenience, and because it requires no extensive training of users. Grounding is a crucial challenge in this problem: enabling robots to understand NL instructions from humans. This paper explores the object grounding problem and specifically studies how to detect target objects from NL instructions using an RGB-D camera in robotic manipulation applications. In particular, a simple yet robust vision algorithm is applied to segment objects of interest. Using the metric information of all segmented objects, object attributes and the relations between objects are further extracted. NL instructions that incorporate multiple cues for object specification are parsed into domain-specific annotations. The annotations from NL and the information extracted from the RGB-D camera are matched in a computational state estimation framework that searches all possible object grounding states. The final grounding is accomplished by selecting the states with the maximum probabilities. An RGB-D scene dataset associated with different groups of NL instructions, based on different cognition levels of the robot, is collected. Quantitative evaluations on the dataset illustrate the advantages of the proposed method. Experiments on NL-controlled object manipulation and NL-based task programming using a mobile manipulator show its effectiveness and practicability in robotic applications.
Persistent Identifier: http://hdl.handle.net/10722/262334
ISSN: 1424-8220
2017 Impact Factor: 2.475
2015 SCImago Journal Rankings: 0.546
ISI Accession Number ID: WOS:000391303000136
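The grounding pipeline described in the abstract (match parsed NL annotations against attributes and spatial relations of segmented objects, then select the grounding state with maximum probability) can be sketched as follows. This is a minimal illustration under simplified assumptions, not the paper's actual state-estimation framework: the object fields, the scoring functions, and the `left_of` relation test are all hypothetical stand-ins for the attributes and relations extracted from RGB-D data.

```python
# Hypothetical sketch of NL-based object grounding: score every candidate
# object against the annotation parsed from an instruction, pick the argmax.
from dataclasses import dataclass

@dataclass
class SceneObject:
    name: str        # object category from segmentation
    color: str       # attribute extracted from the RGB image
    size: str        # e.g. "small", "large"
    position: tuple  # (x, y) metric coordinates from the depth data

def attribute_score(obj, annotation):
    """Fraction of annotated attribute cues (color, size) the object matches."""
    cues = [(k, v) for k, v in annotation.items() if k in ("color", "size")]
    if not cues:
        return 1.0
    return sum(1 for k, v in cues if getattr(obj, k) == v) / len(cues)

def relation_score(obj, scene, annotation):
    """Score a spatial-relation cue such as ('left_of', 'box')."""
    rel = annotation.get("relation")
    if rel is None:
        return 1.0
    kind, landmark_name = rel
    landmarks = [o for o in scene if o.name == landmark_name]
    if not landmarks or kind != "left_of":
        return 0.0
    # "left of" here simply means a smaller x coordinate than the landmark.
    return 1.0 if obj.position[0] < landmarks[0].position[0] else 0.0

def ground(scene, annotation):
    """Return the object whose grounding state has the maximum score."""
    return max(scene, key=lambda o: attribute_score(o, annotation)
                                    * relation_score(o, scene, annotation))

scene = [
    SceneObject("cup", "red", "small", (0.2, 0.5)),
    SceneObject("cup", "blue", "small", (0.6, 0.5)),
    SceneObject("box", "green", "large", (0.8, 0.4)),
]
# Annotation parsed from "pick up the red cup to the left of the box"
target = ground(scene, {"color": "red", "relation": ("left_of", "box")})
```

In this toy scene the red cup scores 1.0 (attribute and relation both satisfied) while the other objects score 0.0, so grounding selects it; the paper's framework instead evaluates probabilities over all possible grounding states.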

 

DC Field | Value
dc.contributor.author: Bao, J
dc.contributor.author: Jia, Y
dc.contributor.author: Cheng, Y
dc.contributor.author: Tang, H
dc.contributor.author: Xi, N
dc.date.accessioned: 2018-09-28T04:57:31Z
dc.date.available: 2018-09-28T04:57:31Z
dc.date.issued: 2016
dc.identifier.citation: Sensors, 2016, v. 16, p. 2117
dc.identifier.issn: 1424-8220
dc.identifier.uri: http://hdl.handle.net/10722/262334
dc.description.abstract: Controlling robots by natural language (NL) is attracting increasing attention for its versatility and convenience, and because it requires no extensive training of users. Grounding is a crucial challenge in this problem: enabling robots to understand NL instructions from humans. This paper explores the object grounding problem and specifically studies how to detect target objects from NL instructions using an RGB-D camera in robotic manipulation applications. In particular, a simple yet robust vision algorithm is applied to segment objects of interest. Using the metric information of all segmented objects, object attributes and the relations between objects are further extracted. NL instructions that incorporate multiple cues for object specification are parsed into domain-specific annotations. The annotations from NL and the information extracted from the RGB-D camera are matched in a computational state estimation framework that searches all possible object grounding states. The final grounding is accomplished by selecting the states with the maximum probabilities. An RGB-D scene dataset associated with different groups of NL instructions, based on different cognition levels of the robot, is collected. Quantitative evaluations on the dataset illustrate the advantages of the proposed method. Experiments on NL-controlled object manipulation and NL-based task programming using a mobile manipulator show its effectiveness and practicability in robotic applications.
dc.language: eng
dc.publisher: Molecular Diversity Preservation International. The Journal's web site is located at http://www.mdpi.net/sensors
dc.relation.ispartof: Sensors
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.title: Detecting Target Objects by Natural Language Instructions Using an RGB-D Camera
dc.type: Article
dc.identifier.email: Xi, N: xining@hku.hk
dc.identifier.authority: Xi, N=rp02044
dc.description.nature: published_or_final_version
dc.identifier.doi: 10.3390/s16122117
dc.identifier.hkuros: 292804
dc.identifier.volume: 16
dc.identifier.spage: 2117
dc.identifier.epage: 2117
dc.identifier.isi: WOS:000391303000136
dc.publisher.place: Switzerland
