File Download

There are no files associated with this item.

  Links for fulltext
     (May Require Subscription)
Supplementary

Article: Efficient algorithms for mining and incremental update of maximal frequent sequences

TitleEfficient algorithms for mining and incremental update of maximal frequent sequences
Authors
KeywordsData mining
Incremental update
Sequence
Issue Date2005
PublisherSpringer New York LLC. The Journal's web site is located at http://springerlink.metapress.com/openurl.asp?genre=journal&issn=1384-5810
Citation
Data Mining And Knowledge Discovery, 2005, v. 10 n. 2, p. 87-116 How to Cite?
AbstractWe study two problems: (1) mining frequent sequences from a transactional database, and (2) incremental update of frequent sequences when the underlying database changes over time. We review existing sequence mining algorithms including GSP, PrefixSpan, SPADE, and ISM. We point out the large memory requirement of Pref ixSpan, SPADE, and ISM, and evaluate the performance of GSP. We discuss the high I/O cost of GSP, particularly when the database contains long frequent sequences. To reduce the I/O requirement, we propose an algorithm MFS, which could be considered as a generalization of GSP. The general strategy of MFS is to first find an approximate solution to the set of frequent sequences and then perform successive refinement until the exact set of frequent sequences is obtained. We show that this successive refinement approach results in a significant improvement in I/O cost. We discuss how MFS can be applied to the incremental update problem. In particular, the result of a previous mining exercise can be used (by MFS) as a good initial approximate solution for the mining of an updated database. This results in an I/O efficient algorithm. To improve processing efficiency, we devise pruning techniques that, when coupled with GSP or MFS, result in algorithms that are both CPU and I/O efficient. © 2005 Springer Science + Business Media, Inc.
Persistent Identifierhttp://hdl.handle.net/10722/88972
ISSN
2023 Impact Factor: 2.8
2023 SCImago Journal Rankings: 1.813
ISI Accession Number ID
References

 

DC FieldValueLanguage
dc.contributor.authorKao, Ben_HK
dc.contributor.authorZhang, Men_HK
dc.contributor.authorYip, CLen_HK
dc.contributor.authorCheung, DWen_HK
dc.date.accessioned2010-09-06T09:50:47Z-
dc.date.available2010-09-06T09:50:47Z-
dc.date.issued2005en_HK
dc.identifier.citationData Mining And Knowledge Discovery, 2005, v. 10 n. 2, p. 87-116en_HK
dc.identifier.issn1384-5810en_HK
dc.identifier.urihttp://hdl.handle.net/10722/88972-
dc.description.abstractWe study two problems: (1) mining frequent sequences from a transactional database, and (2) incremental update of frequent sequences when the underlying database changes over time. We review existing sequence mining algorithms including GSP, PrefixSpan, SPADE, and ISM. We point out the large memory requirement of Pref ixSpan, SPADE, and ISM, and evaluate the performance of GSP. We discuss the high I/O cost of GSP, particularly when the database contains long frequent sequences. To reduce the I/O requirement, we propose an algorithm MFS, which could be considered as a generalization of GSP. The general strategy of MFS is to first find an approximate solution to the set of frequent sequences and then perform successive refinement until the exact set of frequent sequences is obtained. We show that this successive refinement approach results in a significant improvement in I/O cost. We discuss how MFS can be applied to the incremental update problem. In particular, the result of a previous mining exercise can be used (by MFS) as a good initial approximate solution for the mining of an updated database. This results in an I/O efficient algorithm. To improve processing efficiency, we devise pruning techniques that, when coupled with GSP or MFS, result in algorithms that are both CPU and I/O efficient. © 2005 Springer Science + Business Media, Inc.en_HK
dc.languageengen_HK
dc.publisherSpringer New York LLC. The Journal's web site is located at http://springerlink.metapress.com/openurl.asp?genre=journal&issn=1384-5810en_HK
dc.relation.ispartofData Mining and Knowledge Discoveryen_HK
dc.subjectData miningen_HK
dc.subjectIncremental updateen_HK
dc.subjectSequenceen_HK
dc.titleEfficient algorithms for mining and incremental update of maximal frequent sequencesen_HK
dc.typeArticleen_HK
dc.identifier.openurlhttp://library.hku.hk:4550/resserv?sid=HKU:IR&issn=1384-5810&volume=10&spage=87&epage=116&date=2005&atitle=Efficient+Algorithms+for+Mining+and+Incremental+update+of+Maximal+Frequent+Sequencesen_HK
dc.identifier.emailKao, B:kao@cs.hku.hken_HK
dc.identifier.emailYip, CL:clyip@cs.hku.hken_HK
dc.identifier.emailCheung, DW:dcheung@cs.hku.hken_HK
dc.identifier.authorityKao, B=rp00123en_HK
dc.identifier.authorityYip, CL=rp00205en_HK
dc.identifier.authorityCheung, DW=rp00101en_HK
dc.description.naturelink_to_subscribed_fulltext-
dc.identifier.doi10.1007/s10618-005-0268-zen_HK
dc.identifier.scopuseid_2-s2.0-24044460806en_HK
dc.identifier.hkuros129357en_HK
dc.relation.referenceshttp://www.scopus.com/mlt/select.url?eid=2-s2.0-24044460806&selection=ref&src=s&origin=recordpageen_HK
dc.identifier.volume10en_HK
dc.identifier.issue2en_HK
dc.identifier.spage87en_HK
dc.identifier.epage116en_HK
dc.identifier.isiWOS:000228970700001-
dc.publisher.placeUnited Statesen_HK
dc.identifier.scopusauthoridKao, B=35221592600en_HK
dc.identifier.scopusauthoridZhang, M=20434954000en_HK
dc.identifier.scopusauthoridYip, CL=7101665547en_HK
dc.identifier.scopusauthoridCheung, DW=34567902600en_HK
dc.identifier.citeulike196576-
dc.identifier.issnl1384-5810-

Export via OAI-PMH Interface in XML Formats


OR


Export to Other Non-XML Formats