File Download

There are no files associated with this item.

  Links for fulltext
     (May Require Subscription)
Supplementary

Article: A space and time efficient algorithm for constructing compressed suffix arrays

TitleA space and time efficient algorithm for constructing compressed suffix arrays
Authors
KeywordsCompression
Construcion
Pattern mathching
Text indexing
Issue Date2007
PublisherSpringer New York LLC. The Journal's web site is located at http://link.springer.de/link/service/journals/00453/index.htm
Citation
Algorithmica (New York), 2007, v. 48 n. 1, p. 23-36 How to Cite?
AbstractWith the first human DNA being decoded into a sequence of about 2.8 billion characters, much biological research has been centered on analyzing this sequence. Theoretically speaking, it is now feasible to accommodate an index for human DNA in the main memory so that any pattern can be located efficiently. This is due to the recent breakthrough on compressed suffix arrays, which reduces the space requirement from O(n log n) bits to O(n) bits. However, constructing compressed suffix arrays is still not an easy task because we still have to compute suffix arrays first and need a working memory of O(n log n) bits (i.e., more than 13 gigabytes for human DNA). This paper initiates the study of constructing compressed suffix arrays directly from the text. The main contribution is a construction algorithm that uses only O(n) bits of working memory, and the time complexity is O(n log n). Our construction algorithm is also time and space efficient for texts with large alphabets such as Chinese or Japanese. Precisely, when the alphabet size is |∑|, the working space is O(n log |∑|) bits, and the time complexity remains O(n log n), which is independent of |∑|. © Springer 2007.
Persistent Identifierhttp://hdl.handle.net/10722/88951
ISSN
2023 Impact Factor: 0.9
2023 SCImago Journal Rankings: 0.905
ISI Accession Number ID
References

 

DC FieldValueLanguage
dc.contributor.authorHon, WKen_HK
dc.contributor.authorLam, TWen_HK
dc.contributor.authorSadakane, Ken_HK
dc.contributor.authorSung, WKen_HK
dc.contributor.authorYiu, SMen_HK
dc.date.accessioned2010-09-06T09:50:31Z-
dc.date.available2010-09-06T09:50:31Z-
dc.date.issued2007en_HK
dc.identifier.citationAlgorithmica (New York), 2007, v. 48 n. 1, p. 23-36en_HK
dc.identifier.issn0178-4617en_HK
dc.identifier.urihttp://hdl.handle.net/10722/88951-
dc.description.abstractWith the first human DNA being decoded into a sequence of about 2.8 billion characters, much biological research has been centered on analyzing this sequence. Theoretically speaking, it is now feasible to accommodate an index for human DNA in the main memory so that any pattern can be located efficiently. This is due to the recent breakthrough on compressed suffix arrays, which reduces the space requirement from O(n log n) bits to O(n) bits. However, constructing compressed suffix arrays is still not an easy task because we still have to compute suffix arrays first and need a working memory of O(n log n) bits (i.e., more than 13 gigabytes for human DNA). This paper initiates the study of constructing compressed suffix arrays directly from the text. The main contribution is a construction algorithm that uses only O(n) bits of working memory, and the time complexity is O(n log n). Our construction algorithm is also time and space efficient for texts with large alphabets such as Chinese or Japanese. Precisely, when the alphabet size is |∑|, the working space is O(n log |∑|) bits, and the time complexity remains O(n log n), which is independent of |∑|. © Springer 2007.en_HK
dc.languageengen_HK
dc.publisherSpringer New York LLC. The Journal's web site is located at http://link.springer.de/link/service/journals/00453/index.htmen_HK
dc.relation.ispartofAlgorithmica (New York)en_HK
dc.subjectCompressionen_HK
dc.subjectConstrucionen_HK
dc.subjectPattern mathchingen_HK
dc.subjectText indexingen_HK
dc.titleA space and time efficient algorithm for constructing compressed suffix arraysen_HK
dc.typeArticleen_HK
dc.identifier.openurlhttp://library.hku.hk:4550/resserv?sid=HKU:IR&issn=0178-4617&volume=48:1&spage=23&epage=36&date=2007&atitle=A+Space+and+Time+Efficient+Algorithm+for+Constructing+Compressed+Suffix+Arrays+en_HK
dc.identifier.emailLam, TW:twlam@cs.hku.hken_HK
dc.identifier.emailYiu, SM:smyiu@cs.hku.hken_HK
dc.identifier.authorityLam, TW=rp00135en_HK
dc.identifier.authorityYiu, SM=rp00207en_HK
dc.description.naturelink_to_subscribed_fulltext-
dc.identifier.doi10.1007/s00453-006-1228-8en_HK
dc.identifier.scopuseid_2-s2.0-34547375123en_HK
dc.identifier.hkuros128171en_HK
dc.relation.referenceshttp://www.scopus.com/mlt/select.url?eid=2-s2.0-34547375123&selection=ref&src=s&origin=recordpageen_HK
dc.identifier.volume48en_HK
dc.identifier.issue1en_HK
dc.identifier.spage23en_HK
dc.identifier.epage36en_HK
dc.identifier.isiWOS:000246152800002-
dc.publisher.placeUnited Statesen_HK
dc.identifier.scopusauthoridHon, WK=7004282818en_HK
dc.identifier.scopusauthoridLam, TW=7202523165en_HK
dc.identifier.scopusauthoridSadakane, K=7005716583en_HK
dc.identifier.scopusauthoridSung, WK=13310059700en_HK
dc.identifier.scopusauthoridYiu, SM=7003282240en_HK
dc.identifier.citeulike4736346-
dc.identifier.issnl0178-4617-

Export via OAI-PMH Interface in XML Formats


OR


Export to Other Non-XML Formats