**Conference Paper:**Compressed index for dynamic text

Title | Compressed index for dynamic text |
---|---|

Authors | Hon, WK2 Lam, TW2 Sadakane, K1 Sung, WK3 Yiu, SM2 |

Issue Date | 2004 |

Citation | Data Compression Conference Proceedings, 2004, p. 102-111 [How to Cite?] |

Abstract | This paper investigates how to index a text which is subject to updates. The best solution in the literature is based on suffix tree using O(n log n) bits of storage, where n is the length of the text. It supports finding all occurrences of a pattern P in O(|P| + occ) time, where occ is the number of occurrences. Each text update consists of inserting or deleting a substring of length y and can be supported in O(y + √n) time. In this paper, we initiate the study of compressed index using only O(n log |Σ|) bits of space, where Σ denotes the alphabet. Our solution supports finding all occurrences of a pattern P in O(|P|log2 n(logε n + log |ΣE|) + occ log1+ε n) time, while insertion or deletion of a substring of length y can be done in O((y + √n) log 2+εn) amortized time, where 0 < ε ≤ 1. The core part of our data structure is based on the recent work on Compressed Suffix Trees (CST) and Compressed Suffix Arrays (CSA). |

ISSN | 1068-0314 2012 SCImago Journal Rankings: 0.489 |

References | References in Scopus |

DC Field | Value |
---|---|

dc.contributor.author | Hon, WK |

dc.contributor.author | Lam, TW |

dc.contributor.author | Sadakane, K |

dc.contributor.author | Sung, WK |

dc.contributor.author | Yiu, SM |

dc.date.accessioned | 2012-06-26T06:30:14Z |

dc.date.available | 2012-06-26T06:30:14Z |

dc.date.issued | 2004 |

dc.description.abstract | This paper investigates how to index a text which is subject to updates. The best solution in the literature is based on suffix tree using O(n log n) bits of storage, where n is the length of the text. It supports finding all occurrences of a pattern P in O(|P| + occ) time, where occ is the number of occurrences. Each text update consists of inserting or deleting a substring of length y and can be supported in O(y + √n) time. In this paper, we initiate the study of compressed index using only O(n log |Σ|) bits of space, where Σ denotes the alphabet. Our solution supports finding all occurrences of a pattern P in O(|P|log2 n(logε n + log |ΣE|) + occ log1+ε n) time, while insertion or deletion of a substring of length y can be done in O((y + √n) log 2+εn) amortized time, where 0 < ε ≤ 1. The core part of our data structure is based on the recent work on Compressed Suffix Trees (CST) and Compressed Suffix Arrays (CSA). |

dc.description.nature | Link_to_subscribed_fulltext |

dc.identifier.citation | Data Compression Conference Proceedings, 2004, p. 102-111 [How to Cite?] |

dc.identifier.epage | 111 |

dc.identifier.issn | 1068-0314 2012 SCImago Journal Rankings: 0.489 |

dc.identifier.scopus | eid_2-s2.0-2642533893 |

dc.identifier.spage | 102 |

dc.identifier.uri | http://hdl.handle.net/10722/151868 |

dc.language | eng |

dc.relation.ispartof | Data Compression Conference Proceedings |

dc.relation.references | References in Scopus |

dc.title | Compressed index for dynamic text |

dc.type | Conference_Paper |

<?xml encoding="utf-8" version="1.0"?> <item><contributor.author>Hon, WK</contributor.author> <contributor.author>Lam, TW</contributor.author> <contributor.author>Sadakane, K</contributor.author> <contributor.author>Sung, WK</contributor.author> <contributor.author>Yiu, SM</contributor.author> <date.accessioned>2012-06-26T06:30:14Z</date.accessioned> <date.available>2012-06-26T06:30:14Z</date.available> <date.issued>2004</date.issued> <identifier.citation>Data Compression Conference Proceedings, 2004, p. 102-111</identifier.citation> <identifier.issn>1068-0314</identifier.issn> <identifier.uri>http://hdl.handle.net/10722/151868</identifier.uri> <description.abstract>This paper investigates how to index a text which is subject to updates. The best solution in the literature is based on suffix tree using O(n log n) bits of storage, where n is the length of the text. It supports finding all occurrences of a pattern P in O(|P| + occ) time, where occ is the number of occurrences. Each text update consists of inserting or deleting a substring of length y and can be supported in O(y + √n) time. In this paper, we initiate the study of compressed index using only O(n log |Σ|) bits of space, where Σ denotes the alphabet. Our solution supports finding all occurrences of a pattern P in O(|P|log2 n(logε n + log |ΣE|) + occ log1+ε n) time, while insertion or deletion of a substring of length y can be done in O((y + √n) log 2+εn) amortized time, where 0 < ε ≤ 1. The core part of our data structure is based on the recent work on Compressed Suffix Trees (CST) and Compressed Suffix Arrays (CSA).</description.abstract> <language>eng</language> <relation.ispartof>Data Compression Conference Proceedings</relation.ispartof> <title>Compressed index for dynamic text</title> <type>Conference_Paper</type> <description.nature>Link_to_subscribed_fulltext</description.nature> <identifier.scopus>eid_2-s2.0-2642533893</identifier.scopus> <relation.references>http://www.scopus.com/mlt/select.url?eid=2-s2.0-2642533893&selection=ref&src=s&origin=recordpage</relation.references> <identifier.spage>102</identifier.spage> <identifier.epage>111</identifier.epage> </item>

Author Affiliations

- Kyushu University
- The University of Hong Kong
- National University of Singapore