Conference Paper: Durable top-k search in document archives

Title: Durable top-k search in document archives
Authors: Hou U, L; Mamoulis, N; Berberich, K; Bedathur, S
Keywords: document archives; temporal queries; top-k search
Issue Date: 2010
Publisher: Association for Computing Machinery, Inc. The Journal's web site is located at http://www.acm.org/sigmod
Citation: The 2010 International Conference on Management of Data (SIGMOD '10), Indianapolis, IN., 6-11 June 2010. In Proceedings of the ACM Conference on Management of Data, 2010, p. 555-566
Abstract: We propose and study a new ranking problem in versioned databases. Consider a database of versioned objects which have different valid instances along a history (e.g., documents in a web archive). Durable top-k search finds the set of objects that are consistently in the top-k results of a query (e.g., a keyword query) throughout a given time interval (e.g., from June 2008 to May 2009). Existing work on temporal top-k queries mainly focuses on finding the most representative top-k elements within a time interval. Such methods are not readily applicable to durable top-k queries. To address this need, we propose two techniques that compute the durable top-k result. The first is adapted from the classic top-k rank aggregation algorithm NRA. The second technique is based on a shared execution paradigm and is more efficient than the first approach. In addition, we propose a special indexing technique for archived data. The index, coupled with a space partitioning technique, improves performance even further. We use data from Wikipedia and the Internet Archive to demonstrate the efficiency and effectiveness of our solutions. © 2010 ACM.
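
To make the problem statement in the abstract concrete, the following is a minimal brute-force sketch of a durable top-k query. It is not the NRA-based or shared-execution technique the paper proposes, and it uses no index. The data model is an assumption made for illustration: each object is a list of (start, end, score) versions valid on half-open intervals, the score stands in for a precomputed query relevance, and the names durable_top_k and score_at are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Assumed model: a version is (start, end, score), valid on [start, end).
Version = Tuple[int, int, float]


def score_at(versions: List[Version], t: int) -> float:
    """Return the score of the version valid at time t, or -inf if none is."""
    for start, end, score in versions:
        if start <= t < end:
            return score
    return float("-inf")


def durable_top_k(archive: Dict[str, List[Version]], k: int,
                  t_begin: int, t_end: int) -> Set[str]:
    """Objects that are in the top-k at every time in [t_begin, t_end)."""
    # Scores change only at version boundaries, so one probe per
    # elementary segment of the query interval is enough.
    points = {t_begin}
    for versions in archive.values():
        for start, end, _ in versions:
            for boundary in (start, end):
                if t_begin < boundary < t_end:
                    points.add(boundary)

    durable = None
    for t in sorted(points):
        scored = [(score_at(v, t), obj) for obj, v in archive.items()]
        # Objects with no valid version at time t cannot be ranked.
        scored = [(s, o) for s, o in scored if s != float("-inf")]
        # Highest score first; ties broken by object id for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        top = {o for _, o in scored[:k]}
        durable = top if durable is None else durable & top
    return durable or set()


if __name__ == "__main__":
    archive = {
        "a": [(0, 10, 0.9)],
        "b": [(0, 5, 0.8), (5, 10, 0.1)],
        "c": [(0, 5, 0.2), (5, 10, 0.7)],
    }
    print(durable_top_k(archive, k=2, t_begin=0, t_end=10))
```

In this toy archive only object "a" stays in the top-2 over the whole interval, because "b" drops out and "c" enters at time 5. The sketch recomputes the ranking at every version boundary, which is the kind of repeated work that the techniques described in the abstract aim to avoid.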
Persistent Identifier: http://hdl.handle.net/10722/129564
ISBN: 978-1-4503-0032-2
ISSN: 0730-8078

 

| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Hou U, L | en_HK |
| dc.contributor.author | Mamoulis, N | en_HK |
| dc.contributor.author | Berberich, K | en_HK |
| dc.contributor.author | Bedathur, S | en_HK |
| dc.date.accessioned | 2010-12-23T08:39:19Z | - |
| dc.date.available | 2010-12-23T08:39:19Z | - |
| dc.date.issued | 2010 | en_HK |
| dc.identifier.citation | The 2010 International Conference on Management of Data (SIGMOD '10), Indianapolis, IN., 6-11 June 2010. In Proceedings of the ACM Conference on Management of Data, 2010, p. 555-566 | en_HK |
| dc.identifier.isbn | 978-1-4503-0032-2 | - |
| dc.identifier.issn | 0730-8078 | en_HK |
| dc.identifier.uri | http://hdl.handle.net/10722/129564 | - |
| dc.description.abstract | We propose and study a new ranking problem in versioned databases. Consider a database of versioned objects which have different valid instances along a history (e.g., documents in a web archive). Durable top-k search finds the set of objects that are consistently in the top-k results of a query (e.g., a keyword query) throughout a given time interval (e.g., from June 2008 to May 2009). Existing work on temporal top-k queries mainly focuses on finding the most representative top-k elements within a time interval. Such methods are not readily applicable to durable top-k queries. To address this need, we propose two techniques that compute the durable top-k result. The first is adapted from the classic top-k rank aggregation algorithm NRA. The second technique is based on a shared execution paradigm and is more efficient than the first approach. In addition, we propose a special indexing technique for archived data. The index, coupled with a space partitioning technique, improves performance even further. We use data from Wikipedia and the Internet Archive to demonstrate the efficiency and effectiveness of our solutions. © 2010 ACM. | en_HK |
| dc.language | eng | en_US |
| dc.publisher | Association for Computing Machinery, Inc. The Journal's web site is located at http://www.acm.org/sigmod | en_HK |
| dc.relation.ispartof | Proceedings of the ACM SIGMOD International Conference on Management of Data | en_HK |
| dc.rights | Proceedings of the ACM Conference on Management of Data. Copyright © Association for Computing Machinery. | - |
| dc.subject | document archives | en_HK |
| dc.subject | temporal queries | en_HK |
| dc.subject | top-k search | en_HK |
| dc.title | Durable top-k search in document archives | en_HK |
| dc.type | Conference_Paper | en_HK |
| dc.identifier.email | Mamoulis, N: nikos@cs.hku.hk | en_HK |
| dc.identifier.authority | Mamoulis, N=rp00155 | en_HK |
| dc.description.nature | link_to_OA_fulltext | - |
| dc.identifier.doi | 10.1145/1807167.1807228 | en_HK |
| dc.identifier.scopus | eid_2-s2.0-77954751022 | en_HK |
| dc.identifier.hkuros | 176423 | en_US |
| dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-77954751022&selection=ref&src=s&origin=recordpage | en_HK |
| dc.identifier.spage | 555 | en_HK |
| dc.identifier.epage | 566 | en_HK |
| dc.publisher.place | United States | en_HK |
| dc.description.other | The 2010 International Conference on Management of Data (SIGMOD '10), Indianapolis, IN., 6-11 June 2010. In Proceedings of the ACM Conference on Management of Data, 2010, p. 555-566 | - |
| dc.identifier.scopusauthorid | Hou U, L=13605267100 | en_HK |
| dc.identifier.scopusauthorid | Mamoulis, N=6701782749 | en_HK |
| dc.identifier.scopusauthorid | Berberich, K=15130456300 | en_HK |
| dc.identifier.scopusauthorid | Bedathur, S=22833788900 | en_HK |
