
Conference Paper: Managing uncertainty of XML schema matching

Title: Managing uncertainty of XML schema matching
Authors: Cheng, R; Gong, J; Cheung, DW
Issue Date: 2010
Publisher: IEEE, Computer Society.
Citation: The IEEE 26th International Conference on Data Engineering (ICDE 2010), Long Beach, CA., 1-6 March 2010. In International Conference on Data Engineering. Proceedings, 2010, p. 297-308
Abstract: Despite advances in machine learning technologies, a schema matching result between two database schemas (e.g., those derived from COMA++) is likely to be imprecise. In particular, numerous instances of "possible mappings" between the schemas may be derived from the matching result. In this paper, we study the problem of managing possible mappings between two heterogeneous XML schemas. We observe that for XML schemas, their possible mappings have a high degree of overlap. We hence propose a novel data structure, called the block tree, to capture the commonalities among possible mappings. The block tree is useful for representing the possible mappings in a compact manner, and can be generated efficiently. Moreover, it supports the evaluation of the probabilistic twig query (PTQ), which returns the probability of portions of an XML document that match the query pattern. For users who are interested only in answers with the k highest probabilities, we also propose the top-k PTQ, and present an efficient solution for it. The second challenge we have tackled is to efficiently generate possible mappings for a given schema matching. While this problem can be solved by existing algorithms, we show how to improve the performance of the solution by using a divide-and-conquer approach. An extensive evaluation on realistic datasets shows that our approaches significantly improve the efficiency of generating, storing, and querying possible mappings. © 2010 IEEE.
Persistent Identifier: http://hdl.handle.net/10722/144828
ISSN: 1084-4627
2023 SCImago Journal Rankings: 1.306
ISI Accession Number ID: WOS:000286933100031
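
The abstract's notion of a top-k query over possible mappings can be illustrated with a small sketch. This is not the paper's block tree or its PTQ algorithm: it is a naive enumeration that assumes each possible mapping carries a probability and that the probability of an answer is the sum of the probabilities of the mappings that produce it. The paths, the probabilities, and the names `possible_mappings`, `answer_probabilities`, and `top_k` are all hypothetical, and a query is simplified to a list of paths in one schema.

```python
from collections import defaultdict
from heapq import nlargest

# Hypothetical toy data (not from the paper): each possible mapping pairs a
# probability with a dict from paths in one schema to paths in the other.
possible_mappings = [
    (0.5, {"order/buyer/name": "po/customer/name", "order/item": "po/line"}),
    (0.3, {"order/buyer/name": "po/contact/name", "order/item": "po/line"}),
    (0.2, {"order/buyer/name": "po/customer/name", "order/item": "po/product"}),
]


def answer_probabilities(mappings, query_paths):
    """Assumed semantics: an answer's probability is the total probability
    of the possible mappings under which that answer is produced."""
    probs = defaultdict(float)
    for probability, mapping in mappings:
        # A mapping contributes only if it covers every path in the query.
        if all(path in mapping for path in query_paths):
            answer = tuple(mapping[path] for path in query_paths)
            probs[answer] += probability
    return dict(probs)


def top_k(mappings, query_paths, k):
    """Return the k answers with the highest probabilities."""
    probs = answer_probabilities(mappings, query_paths)
    return nlargest(k, probs.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for answer, probability in top_k(possible_mappings, ["order/buyer/name"], 2):
        print(answer, round(probability, 3))
```

Enumerating every mapping in this way repeats work wherever mappings overlap, which is the kind of redundancy the abstract says the block tree is designed to capture.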

 

DC Field | Value | Language
dc.contributor.author | Cheng, R | en_HK
dc.contributor.author | Gong, J | en_HK
dc.contributor.author | Cheung, DW | en_HK
dc.date.accessioned | 2012-02-07T08:23:19Z | -
dc.date.available | 2012-02-07T08:23:19Z | -
dc.date.issued | 2010 | en_HK
dc.identifier.citation | The IEEE 26th International Conference on Data Engineering (ICDE 2010), Long Beach, CA., 1-6 March 2010. In International Conference on Data Engineering. Proceedings, 2010, p. 297-308 | en_HK
dc.identifier.issn | 1084-4627 | en_HK
dc.identifier.uri | http://hdl.handle.net/10722/144828 | -
dc.description.abstract | Despite advances in machine learning technologies, a schema matching result between two database schemas (e.g., those derived from COMA++) is likely to be imprecise. In particular, numerous instances of "possible mappings" between the schemas may be derived from the matching result. In this paper, we study the problem of managing possible mappings between two heterogeneous XML schemas. We observe that for XML schemas, their possible mappings have a high degree of overlap. We hence propose a novel data structure, called the block tree, to capture the commonalities among possible mappings. The block tree is useful for representing the possible mappings in a compact manner, and can be generated efficiently. Moreover, it supports the evaluation of the probabilistic twig query (PTQ), which returns the probability of portions of an XML document that match the query pattern. For users who are interested only in answers with the k highest probabilities, we also propose the top-k PTQ, and present an efficient solution for it. The second challenge we have tackled is to efficiently generate possible mappings for a given schema matching. While this problem can be solved by existing algorithms, we show how to improve the performance of the solution by using a divide-and-conquer approach. An extensive evaluation on realistic datasets shows that our approaches significantly improve the efficiency of generating, storing, and querying possible mappings. © 2010 IEEE. | en_HK
dc.language | eng | -
dc.publisher | IEEE, Computer Society. | -
dc.relation.ispartof | International Conference on Data Engineering. Proceedings | en_HK
dc.rights | ©2010 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. | -
dc.title | Managing uncertainty of XML schema matching | en_HK
dc.type | Conference_Paper | en_HK
dc.identifier.email | Cheng, R:ckcheng@cs.hku.hk | en_HK
dc.identifier.email | Cheung, DW:dcheung@cs.hku.hk | en_HK
dc.identifier.authority | Cheng, R=rp00074 | en_HK
dc.identifier.authority | Cheung, DW=rp00101 | en_HK
dc.description.nature | published_or_final_version | -
dc.identifier.doi | 10.1109/ICDE.2010.5447868 | en_HK
dc.identifier.scopus | eid_2-s2.0-77952781219 | en_HK
dc.identifier.hkuros | 176463 | -
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-77952781219&selection=ref&src=s&origin=recordpage | en_HK
dc.identifier.spage | 297 | en_HK
dc.identifier.epage | 308 | en_HK
dc.identifier.isi | WOS:000286933100031 | -
dc.publisher.place | United States | en_HK
dc.description.other | The IEEE 26th International Conference on Data Engineering (ICDE 2010), Long Beach, CA., 1-6 March 2010. In International Conference on Data Engineering. Proceedings, 2010, p. 297-308 | -
dc.identifier.scopusauthorid | Cheng, R=7201955416 | en_HK
dc.identifier.scopusauthorid | Gong, J=47961908400 | en_HK
dc.identifier.scopusauthorid | Cheung, DW=34567902600 | en_HK
dc.identifier.issnl | 1084-4627 | -
