Article: Indexing useful structural patterns for XML query processing

Title: Indexing useful structural patterns for XML query processing
Authors: Lian, W; Mamoulis, N; Cheung, DWL; Yiu, SM
Keywords: Document indexing; Mining methods and algorithms; Query processing; XML/XSL/RDF
Issue Date: 2005
Publisher: IEEE. The journal's web site is located at http://www.computer.org/tkde
Citation: IEEE Transactions on Knowledge and Data Engineering, 2005, v. 17 n. 7, p. 997-1009
Abstract: Queries on semistructured data are hard to process due to the complex nature of the data and call for specialized techniques. Existing path-based indexes and query processing algorithms are not efficient for searching complex structures beyond simple paths, even when the queries are high-selective. We introduce the definition of minimal infrequent structures (MIS), which are structures that 1) exist in the data, 2) are not frequent with respect to a support threshold, and 3) all substructures of them are frequent. By indexing the occurrences of MIS, we can efficiently locate the high-selective substructures of a query, improving search performance significantly. An efficient data mining algorithm is proposed, which finds the minimal infrequent structures. Their occurrences in the XML data are then indexed by a lightweight data structure and used as a fast filter step in query evaluation. We validate the efficiency and applicability of our methods through experimentation on both synthetic and real data. © 2005 IEEE.
Persistent Identifier: http://hdl.handle.net/10722/47084
ISSN: 1041-4347
2023 Impact Factor: 8.9
2023 SCImago Journal Rankings: 2.867
ISI Accession Number ID: WOS:000229074800010
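
The three-part MIS definition in the abstract, and the use of MIS occurrences as a filter, can be illustrated with a small sketch. This is not the mining algorithm or index structure from the article: as a loudly stated simplification, each XML document is reduced to a flat set of element labels, so a "structure" is a label set and a "substructure" is a subset, whereas the article works with tree-shaped patterns. The function names (`support`, `minimal_infrequent_sets`, `build_index`, `candidate_documents`) and the sample data are hypothetical.

```python
from itertools import combinations


def support(itemset, documents):
    """Count the documents that contain every label in `itemset`."""
    return sum(1 for doc in documents if itemset <= doc)


def minimal_infrequent_sets(documents, min_support):
    """Level-wise search for label sets that (1) occur in the data,
    (2) have support below `min_support`, and (3) have only frequent
    proper subsets. Assumes min_support >= 1."""
    docs = [frozenset(d) for d in documents]
    labels = sorted(set().union(*docs)) if docs else []
    level = [frozenset([label]) for label in labels]
    size = 1
    mis = []
    while level:
        frequent = set()
        for cand in level:
            count = support(cand, docs)
            if count == 0:
                continue  # condition 1 fails: never occurs in the data
            if count >= min_support:
                frequent.add(cand)
            else:
                mis.append(cand)  # infrequent, and all its subsets are frequent
        # Next level: only sets whose immediate subsets are all frequent,
        # which guarantees condition 3 for every candidate.
        size += 1
        items = sorted(set().union(*frequent)) if frequent else []
        level = [
            frozenset(c)
            for c in combinations(items, size)
            if all(frozenset(s) in frequent for s in combinations(c, size - 1))
        ]
    return mis


def build_index(documents, mis):
    """Map each minimal infrequent set to the ids of documents containing it."""
    docs = [frozenset(d) for d in documents]
    return {m: [i for i, d in enumerate(docs) if m <= d] for m in mis}


def candidate_documents(query, index, n_docs):
    """Filter step: keep only documents that contain every indexed set
    found inside the query. Survivors still need full verification."""
    candidates = set(range(n_docs))
    for m, postings in index.items():
        if m <= query:
            candidates &= set(postings)
    return sorted(candidates)


# Hypothetical sample data: five documents described by their labels.
DOCS = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "d"}]
MIS = minimal_infrequent_sets(DOCS, min_support=2)
INDEX = build_index(DOCS, MIS)
```

With this sample data and a support threshold of 2, the sketch reports `{d}` and `{a, b, c}` as minimal infrequent: each occurs in only one document, while every proper subset is frequent. A query for `{a, b, c}` is then narrowed to document 0 by a single index lookup, while a query such as `{a, b}` contains no indexed set and is not filtered at all, which mirrors the abstract's point that the index pays off for highly selective substructures.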

 

DC Field | Value | Language
dc.contributor.author | Lian, W | en_HK
dc.contributor.author | Mamoulis, N | en_HK
dc.contributor.author | Cheung, DWL | en_HK
dc.contributor.author | Yiu, SM | en_HK
dc.date.accessioned | 2007-10-30T07:06:45Z | -
dc.date.available | 2007-10-30T07:06:45Z | -
dc.date.issued | 2005 | en_HK
dc.identifier.citation | Ieee Transactions On Knowledge And Data Engineering, 2005, v. 17 n. 7, p. 997-1009 | en_HK
dc.identifier.issn | 1041-4347 | en_HK
dc.identifier.uri | http://hdl.handle.net/10722/47084 | -
dc.description.abstract | Queries on semistructured data are hard to process due to the complex nature of the data and call for specialized techniques. Existing path-based indexes and query processing algorithms are not efficient for searching complex structures beyond simple paths, even when the queries are high-selective. We introduce the definition of minimal infrequent structures (MIS), which are structures that 1) exist in the data, 2) are not frequent with respect to a support threshold, and 3) all substructures of them are frequent. By indexing the occurrences of MIS, we can efficiently locate the high-selective substructures of a query, improving search performance significantly. An efficient data mining algorithm is proposed, which finds the minimal infrequent structures. Their occurrences in the XML data are then indexed by a lightweight data structure and used as a fast filter step in query evaluation. We validate the efficiency and applicability of our methods through experimentation on both synthetic and real data. © 2005 IEEE. | en_HK
dc.format.extent | 1334990 bytes | -
dc.format.extent | 4295 bytes | -
dc.format.extent | 3502 bytes | -
dc.format.extent | 6619 bytes | -
dc.format.mimetype | application/pdf | -
dc.format.mimetype | text/plain | -
dc.format.mimetype | text/plain | -
dc.format.mimetype | text/plain | -
dc.language | eng | en_HK
dc.publisher | I E E E. The Journal's web site is located at http://www.computer.org/tkde | en_HK
dc.relation.ispartof | IEEE Transactions on Knowledge and Data Engineering | en_HK
dc.rights | ©2005 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. | -
dc.subject | Document indexing | en_HK
dc.subject | Mining methods and algorithms | en_HK
dc.subject | Query processing | en_HK
dc.subject | XML/XSL/RDF | en_HK
dc.title | Indexing useful structural patterns for XML query processing | en_HK
dc.type | Article | en_HK
dc.identifier.openurl | http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=1041-4347&volume=17&issue=7&spage=997&epage=1009&date=2005&atitle=Indexing+useful+structural+patterns+for+XML+query+processing | en_HK
dc.identifier.email | Mamoulis, N: nikos@cs.hku.hk | en_HK
dc.identifier.email | Cheung, DWL: dcheung@cs.hku.hk | en_HK
dc.identifier.email | Yiu, SM: smyiu@cs.hku.hk | en_HK
dc.identifier.authority | Mamoulis, N=rp00155 | en_HK
dc.identifier.authority | Cheung, DWL=rp00101 | en_HK
dc.identifier.authority | Yiu, SM=rp00207 | en_HK
dc.description.nature | published_or_final_version | en_HK
dc.identifier.doi | 10.1109/TKDE.2005.110 | en_HK
dc.identifier.scopus | eid_2-s2.0-22944487013 | en_HK
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-22944487013&selection=ref&src=s&origin=recordpage | en_HK
dc.identifier.volume | 17 | en_HK
dc.identifier.issue | 7 | en_HK
dc.identifier.spage | 997 | en_HK
dc.identifier.epage | 1009 | en_HK
dc.identifier.isi | WOS:000229074800010 | -
dc.publisher.place | United States | en_HK
dc.identifier.scopusauthorid | Lian, W=22433603900 | en_HK
dc.identifier.scopusauthorid | Mamoulis, N=6701782749 | en_HK
dc.identifier.scopusauthorid | Cheung, DWL=34567902600 | en_HK
dc.identifier.scopusauthorid | Yiu, SM=7003282240 | en_HK
dc.identifier.issnl | 1041-4347 | -
