Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1007/s10618-005-0268-z
- Scopus: eid_2-s2.0-24044460806
- WOS: WOS:000228970700001
Article: Efficient algorithms for mining and incremental update of maximal frequent sequences
Title | Efficient algorithms for mining and incremental update of maximal frequent sequences |
---|---|
Authors | Kao, B; Zhang, M; Yip, CL; Cheung, DW |
Keywords | Data mining; Incremental update; Sequence |
Issue Date | 2005 |
Publisher | Springer New York LLC. The Journal's web site is located at http://springerlink.metapress.com/openurl.asp?genre=journal&issn=1384-5810 |
Citation | Data Mining and Knowledge Discovery, 2005, v. 10 n. 2, p. 87-116 |
Abstract | We study two problems: (1) mining frequent sequences from a transactional database, and (2) incremental update of frequent sequences when the underlying database changes over time. We review existing sequence mining algorithms including GSP, PrefixSpan, SPADE, and ISM. We point out the large memory requirement of PrefixSpan, SPADE, and ISM, and evaluate the performance of GSP. We discuss the high I/O cost of GSP, particularly when the database contains long frequent sequences. To reduce the I/O requirement, we propose an algorithm MFS, which could be considered as a generalization of GSP. The general strategy of MFS is to first find an approximate solution to the set of frequent sequences and then perform successive refinement until the exact set of frequent sequences is obtained. We show that this successive refinement approach results in a significant improvement in I/O cost. We discuss how MFS can be applied to the incremental update problem. In particular, the result of a previous mining exercise can be used (by MFS) as a good initial approximate solution for the mining of an updated database. This results in an I/O efficient algorithm. To improve processing efficiency, we devise pruning techniques that, when coupled with GSP or MFS, result in algorithms that are both CPU and I/O efficient. © 2005 Springer Science + Business Media, Inc. |
Persistent Identifier | http://hdl.handle.net/10722/88972 |
ISSN | 1384-5810 (2023 Impact Factor: 2.8; 2023 SCImago Journal Rankings: 1.813) |
ISI Accession Number ID | WOS:000228970700001 |
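
The abstract describes the problem only in words, so a small illustration may help readers unfamiliar with it. The sketch below is not the paper's MFS algorithm, whose details are not given in this record. It is a simplified level-wise (GSP-style) baseline under stated assumptions: each customer sequence is a tuple of single items rather than of itemsets, support is the number of sequences containing a pattern as a subsequence, and all names (`is_subsequence`, `support`, `mine_frequent_sequences`, `maximal`) are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """Return True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    # "item in it" advances the iterator, so order is respected.
    return all(item in it for item in pattern)


def support(pattern, database):
    """Count the sequences in the database that contain pattern."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def mine_frequent_sequences(database, min_support):
    """Level-wise mining: one pass over the database per pattern length."""
    items = sorted({item for seq in database for item in seq})
    level = [(i,) for i in items if support((i,), database) >= min_support]
    frequent = []
    while level:
        frequent.extend(level)
        # Join step: extend a with the last item of b when a minus its
        # first item equals b minus its last item.
        candidates = {a + (b[-1],) for a in level for b in level
                      if a[1:] == b[:-1]}
        level = sorted(c for c in candidates
                       if support(c, database) >= min_support)
    return frequent


def maximal(frequent):
    """Keep only patterns that are not contained in another frequent pattern."""
    return [p for p in frequent
            if not any(p != q and is_subsequence(p, q) for q in frequent)]


if __name__ == "__main__":
    db = [("a", "b", "c"), ("a", "c"), ("a", "b", "c", "d"), ("b", "d")]
    found = mine_frequent_sequences(db, min_support=2)
    print(maximal(found))
```

Each iteration of the `while` loop corresponds to one scan of the database, which is why long frequent sequences make this style of mining I/O-heavy: a frequent sequence of length k needs at least k scans. The abstract's approach of starting from an approximate solution and refining it is aimed at reducing that number of scans.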
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kao, B | en_HK |
dc.contributor.author | Zhang, M | en_HK |
dc.contributor.author | Yip, CL | en_HK |
dc.contributor.author | Cheung, DW | en_HK |
dc.date.accessioned | 2010-09-06T09:50:47Z | - |
dc.date.available | 2010-09-06T09:50:47Z | - |
dc.date.issued | 2005 | en_HK |
dc.identifier.citation | Data Mining and Knowledge Discovery, 2005, v. 10 n. 2, p. 87-116 | en_HK |
dc.identifier.issn | 1384-5810 | en_HK |
dc.identifier.uri | http://hdl.handle.net/10722/88972 | - |
dc.description.abstract | We study two problems: (1) mining frequent sequences from a transactional database, and (2) incremental update of frequent sequences when the underlying database changes over time. We review existing sequence mining algorithms including GSP, PrefixSpan, SPADE, and ISM. We point out the large memory requirement of PrefixSpan, SPADE, and ISM, and evaluate the performance of GSP. We discuss the high I/O cost of GSP, particularly when the database contains long frequent sequences. To reduce the I/O requirement, we propose an algorithm MFS, which could be considered as a generalization of GSP. The general strategy of MFS is to first find an approximate solution to the set of frequent sequences and then perform successive refinement until the exact set of frequent sequences is obtained. We show that this successive refinement approach results in a significant improvement in I/O cost. We discuss how MFS can be applied to the incremental update problem. In particular, the result of a previous mining exercise can be used (by MFS) as a good initial approximate solution for the mining of an updated database. This results in an I/O efficient algorithm. To improve processing efficiency, we devise pruning techniques that, when coupled with GSP or MFS, result in algorithms that are both CPU and I/O efficient. © 2005 Springer Science + Business Media, Inc. | en_HK |
dc.language | eng | en_HK |
dc.publisher | Springer New York LLC. The Journal's web site is located at http://springerlink.metapress.com/openurl.asp?genre=journal&issn=1384-5810 | en_HK |
dc.relation.ispartof | Data Mining and Knowledge Discovery | en_HK |
dc.subject | Data mining | en_HK |
dc.subject | Incremental update | en_HK |
dc.subject | Sequence | en_HK |
dc.title | Efficient algorithms for mining and incremental update of maximal frequent sequences | en_HK |
dc.type | Article | en_HK |
dc.identifier.openurl | http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=1384-5810&volume=10&spage=87&epage=116&date=2005&atitle=Efficient+Algorithms+for+Mining+and+Incremental+update+of+Maximal+Frequent+Sequences | en_HK |
dc.identifier.email | Kao, B:kao@cs.hku.hk | en_HK |
dc.identifier.email | Yip, CL:clyip@cs.hku.hk | en_HK |
dc.identifier.email | Cheung, DW:dcheung@cs.hku.hk | en_HK |
dc.identifier.authority | Kao, B=rp00123 | en_HK |
dc.identifier.authority | Yip, CL=rp00205 | en_HK |
dc.identifier.authority | Cheung, DW=rp00101 | en_HK |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1007/s10618-005-0268-z | en_HK |
dc.identifier.scopus | eid_2-s2.0-24044460806 | en_HK |
dc.identifier.hkuros | 129357 | en_HK |
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-24044460806&selection=ref&src=s&origin=recordpage | en_HK |
dc.identifier.volume | 10 | en_HK |
dc.identifier.issue | 2 | en_HK |
dc.identifier.spage | 87 | en_HK |
dc.identifier.epage | 116 | en_HK |
dc.identifier.isi | WOS:000228970700001 | - |
dc.publisher.place | United States | en_HK |
dc.identifier.scopusauthorid | Kao, B=35221592600 | en_HK |
dc.identifier.scopusauthorid | Zhang, M=20434954000 | en_HK |
dc.identifier.scopusauthorid | Yip, CL=7101665547 | en_HK |
dc.identifier.scopusauthorid | Cheung, DW=34567902600 | en_HK |
dc.identifier.citeulike | 196576 | - |
dc.identifier.issnl | 1384-5810 | - |