Article: Discovering minimal infrequent structures from XML documents

Title: Discovering minimal infrequent structures from XML documents
Authors: Lian, W; Mamoulis, N; Cheung, DW; Yiu, SM
Issue Date: 2004
Publisher: Springer Verlag. The Journal's web site is located at http://springerlink.com/content/105633/
Citation: Lecture Notes In Computer Science (Including Subseries Lecture Notes In Artificial Intelligence And Lecture Notes In Bioinformatics), 2004, v. 3306, p. 291-302
Abstract: More and more data (documents) are wrapped in XML format. Mining these documents involves mining the corresponding XML structures. However, the semi-structured (tree-structured) nature of XML makes it somewhat difficult for traditional data mining algorithms to work properly. Recently, several new algorithms were proposed to mine XML documents. These algorithms mainly focus on mining frequent tree structures from XML documents. However, none of them was designed for mining infrequent structures, which are also important in many applications, such as query processing and identification of exceptional cases. In this paper, we consider the problem of identifying infrequent tree structures from XML documents. Intuitively, if a tree structure is infrequent, all tree structures that contain it as a subtree are also infrequent. So, we propose to consider the minimal infrequent structure (MIS), which is an infrequent structure all of whose proper subtrees are frequent. We also derive a level-wise mining algorithm that makes use of the SG-tree (signature tree) and some effective pruning techniques to efficiently discover all MIS. We validate the efficiency and feasibility of our methods through experiments on both synthetic and real data. © Springer-Verlag 2004.
Persistent Identifier: http://hdl.handle.net/10722/93190
ISSN: 0302-9743
2005 Impact Factor: 0.402
2015 SCImago Journal Rankings: 0.252
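
The MIS definition in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it does not use the SG-tree or the paper's pruning techniques, and it rests on a simplifying assumption that each document is modelled as a set of (parent, child) element-label edges, with "contains as a subtree" approximated by set inclusion. Because connectivity is not checked, some reported edge sets may not be connected trees. The names `support`, `minimal_infrequent`, the sample documents, and the 0.5 threshold are all hypothetical.

```python
from itertools import combinations


def support(structure, documents):
    """Fraction of documents whose edge set contains every edge of `structure`."""
    return sum(structure <= doc for doc in documents) / len(documents)


def minimal_infrequent(documents, min_support):
    """Level-wise search for edge sets that are infrequent while all of
    their proper non-empty subsets are frequent (assumed simplification)."""
    edges = sorted(set().union(*documents))
    candidates = [frozenset([e]) for e in edges]  # level 1: single edges
    size = 1
    result = []
    while candidates:
        frequent = set()
        for cand in candidates:
            if support(cand, documents) >= min_support:
                frequent.add(cand)
            else:
                # Every proper subset was frequent, so this one is minimal.
                result.append(cand)
        # Next level: keep only candidates whose sub-structures of the
        # current size are all frequent.
        size += 1
        items = sorted({e for f in frequent for e in f})
        candidates = [
            frozenset(c)
            for c in combinations(items, size)
            if all(frozenset(s) in frequent for s in combinations(c, size - 1))
        ]
    return result


# Hypothetical documents, each reduced to its set of (parent, child) edges.
docs = [
    {("book", "title"), ("book", "author"), ("book", "price")},
    {("book", "title"), ("book", "author")},
    {("book", "title"), ("book", "price")},
    {("book", "title"), ("book", "author"), ("author", "email")},
]
result = minimal_infrequent(docs, min_support=0.5)
```

With these sample documents, the single edge `("author", "email")` is reported because it appears in only one of four documents, and the pair `book/author` plus `book/price` is reported because each edge is frequent on its own but the two occur together in only one document.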
DC Field | Value | Language
dc.contributor.author | Lian, W | en_HK
dc.contributor.author | Mamoulis, N | en_HK
dc.contributor.author | Cheung, DW | en_HK
dc.contributor.author | Yiu, SM | en_HK
dc.date.accessioned | 2010-09-25T14:53:37Z | -
dc.date.available | 2010-09-25T14:53:37Z | -
dc.date.issued | 2004 | en_HK
dc.identifier.citation | Lecture Notes In Computer Science (Including Subseries Lecture Notes In Artificial Intelligence And Lecture Notes In Bioinformatics), 2004, v. 3306, p. 291-302 | en_HK
dc.identifier.issn | 0302-9743 | en_HK
dc.identifier.uri | http://hdl.handle.net/10722/93190 | -
dc.description.abstract | More and more data (documents) are wrapped in XML format. Mining these documents involves mining the corresponding XML structures. However, the semi-structured (tree-structured) nature of XML makes it somewhat difficult for traditional data mining algorithms to work properly. Recently, several new algorithms were proposed to mine XML documents. These algorithms mainly focus on mining frequent tree structures from XML documents. However, none of them was designed for mining infrequent structures, which are also important in many applications, such as query processing and identification of exceptional cases. In this paper, we consider the problem of identifying infrequent tree structures from XML documents. Intuitively, if a tree structure is infrequent, all tree structures that contain it as a subtree are also infrequent. So, we propose to consider the minimal infrequent structure (MIS), which is an infrequent structure all of whose proper subtrees are frequent. We also derive a level-wise mining algorithm that makes use of the SG-tree (signature tree) and some effective pruning techniques to efficiently discover all MIS. We validate the efficiency and feasibility of our methods through experiments on both synthetic and real data. © Springer-Verlag 2004. | en_HK
dc.language | eng | en_HK
dc.publisher | Springer Verlag. The Journal's web site is located at http://springerlink.com/content/105633/ | en_HK
dc.relation.ispartof | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | en_HK
dc.title | Discovering minimal infrequent structures from XML documents | en_HK
dc.type | Article | en_HK
dc.identifier.email | Mamoulis, N:nikos@cs.hku.hk | en_HK
dc.identifier.email | Cheung, DW:dcheung@cs.hku.hk | en_HK
dc.identifier.email | Yiu, SM:smyiu@cs.hku.hk | en_HK
dc.identifier.authority | Mamoulis, N=rp00155 | en_HK
dc.identifier.authority | Cheung, DW=rp00101 | en_HK
dc.identifier.authority | Yiu, SM=rp00207 | en_HK
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.scopus | eid_2-s2.0-35048823909 | en_HK
dc.identifier.hkuros | 103245 | en_HK
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-35048823909&selection=ref&src=s&origin=recordpage | en_HK
dc.identifier.volume | 3306 | en_HK
dc.identifier.spage | 291 | en_HK
dc.identifier.epage | 302 | en_HK
dc.publisher.place | Germany | en_HK
dc.identifier.scopusauthorid | Lian, W=22433603900 | en_HK
dc.identifier.scopusauthorid | Mamoulis, N=6701782749 | en_HK
dc.identifier.scopusauthorid | Cheung, DW=34567902600 | en_HK
dc.identifier.scopusauthorid | Yiu, SM=7003282240 | en_HK
