Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1007/978-3-319-68783-4_23
- Scopus: eid_2-s2.0-85031417029
- WOS: WOS:000739665100023
Conference Paper: Reliable Retrieval of Top-k Tags
Title | Reliable Retrieval of Top-k Tags |
---|---|
Authors | Xu, Y; Cheng, CK; Zheng, Y |
Issue Date | 2017 |
Publisher | Springer International Publishing. |
Citation | The 18th International Conference on Web Information Systems Engineering, Puschino, Russia, 7-11 October 2017. In Bouguettaya, A ... (et al) (Eds.). Web Information Systems Engineering – WISE 2017 (Lecture Notes in Computer Science, v. 10569), p. 330-346. Cham: Springer International Publishing, 2017 |
Abstract | Collaborative tagging systems, such as Flickr and Del.icio.us, allow users to provide keyword labels, or tags, for various Internet resources (e.g., photos, songs, and bookmarks). These tags, which provide a rich source of information, have been used in important applications such as resource searching, webpage clustering, etc. However, tags are provided by casual users, and so their quality cannot be guaranteed. In this paper, we examine a question: given a resource r and a set of user-provided tags associated with r, can r be correctly described by the k most frequent tags? To answer this question, we develop the metric top-k sliding average similarity (top-k SAS) which measures the reliability of k most frequent tags. One threshold is then set to estimate whether the reliability is sufficient for retrieving the top-k tags. Our experiments on real datasets show that the threshold-based evaluation on top-k SAS is effective and efficient to determine whether the k most frequent tags can be considered as high-quality top-k tags for r. Experiments also indicate that setting an appropriate threshold is challenging. The threshold-based strategy is sensitive to a little change of the threshold. To solve this problem, we introduce a parameter-free evaluation strategy that utilizes machine learning models to estimate whether the k most frequent tags are qualified to be the top-k tags. Experiment results demonstrate that the learning-based method achieves comparable performance to the threshold-based method, while overcoming the difficulty of setting a threshold. |
Persistent Identifier | http://hdl.handle.net/10722/243246 |
ISBN | 978-3-319-68782-7 |
ISSN | 0302-9743 (2023 SCImago Journal Rankings: 0.606) |
ISI Accession Number ID | WOS:000739665100023 |
Series/Report no. | Information Systems and Applications, incl. Internet/Web, and HCI; Lecture Notes in Computer Science book series (LNCS) ; v. 10569 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Xu, Y | - |
dc.contributor.author | Cheng, CK | - |
dc.contributor.author | Zheng, Y | - |
dc.date.accessioned | 2017-08-25T02:52:10Z | - |
dc.date.available | 2017-08-25T02:52:10Z | - |
dc.date.issued | 2017 | - |
dc.identifier.citation | The 18th International Conference on Web Information Systems Engineering, Puschino, Russia, 7-11 October 2017. In Bouguettaya, A ... (et al) (Eds.). Web Information Systems Engineering – WISE 2017 (Lecture Notes in Computer Science, v. 10569), p. 330-346. Cham: Springer International Publishing, 2017 | - |
dc.identifier.isbn | 978-3-319-68782-7 | - |
dc.identifier.issn | 0302-9743 | - |
dc.identifier.uri | http://hdl.handle.net/10722/243246 | - |
dc.description.abstract | Collaborative tagging systems, such as Flickr and Del.icio.us, allow users to provide keyword labels, or tags, for various Internet resources (e.g., photos, songs, and bookmarks). These tags, which provide a rich source of information, have been used in important applications such as resource searching, webpage clustering, etc. However, tags are provided by casual users, and so their quality cannot be guaranteed. In this paper, we examine a question: given a resource r and a set of user-provided tags associated with r, can r be correctly described by the k most frequent tags? To answer this question, we develop the metric top-k sliding average similarity (top-k SAS) which measures the reliability of k most frequent tags. One threshold is then set to estimate whether the reliability is sufficient for retrieving the top-k tags. Our experiments on real datasets show that the threshold-based evaluation on top-k SAS is effective and efficient to determine whether the k most frequent tags can be considered as high-quality top-k tags for r. Experiments also indicate that setting an appropriate threshold is challenging. The threshold-based strategy is sensitive to a little change of the threshold. To solve this problem, we introduce a parameter-free evaluation strategy that utilizes machine learning models to estimate whether the k most frequent tags are qualified to be the top-k tags. Experiment results demonstrate that the learning-based method achieves comparable performance to the threshold-based method, while overcoming the difficulty of setting a threshold. | - |
dc.language | eng | - |
dc.publisher | Springer International Publishing. | - |
dc.relation.ispartof | Web Information Systems Engineering – WISE 2017 | - |
dc.relation.ispartofseries | Information Systems and Applications, incl. Internet/Web, and HCI | - |
dc.relation.ispartofseries | Lecture Notes in Computer Science book series (LNCS) ; v. 10569 | - |
dc.title | Reliable Retrieval of Top-k Tags | - |
dc.type | Conference_Paper | - |
dc.identifier.email | Cheng, CK: ckcheng@cs.hku.hk | - |
dc.identifier.authority | Cheng, CK=rp00074 | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1007/978-3-319-68783-4_23 | - |
dc.identifier.scopus | eid_2-s2.0-85031417029 | - |
dc.identifier.hkuros | 275517 | - |
dc.identifier.spage | 330 | - |
dc.identifier.epage | 346 | - |
dc.identifier.eissn | 1611-3349 | - |
dc.identifier.isi | WOS:000739665100023 | - |
dc.publisher.place | Cham | - |
dc.identifier.issnl | 0302-9743 | - |