postgraduate thesis: Keyword and fact identification and annotation : machine learning approaches
Title | Keyword and fact identification and annotation : machine learning approaches |
---|---|
Authors | Liang, Yuzhi (梁予之) |
Advisors | Hui, CK; Yiu, SM |
Issue Date | 2017 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Liang, Y. [梁予之]. (2017). Keyword and fact identification and annotation : machine learning approaches. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | In the big data era, automatically retrieving useful information from various documents is an active topic in artificial intelligence research. The popularity of social media leaves people overwhelmed by the massive amount of information, so it is necessary to develop techniques that identify useful information efficiently. To better analyze this problem, we divide documents into two types: traditional documents and social media documents. Traditional documents, such as news and articles, use formal language structures and strictly obey grammatical rules. Social media documents, such as posts on Facebook or Twitter, are often short and informal. In this dissertation, we investigate keyword and key sentence identification and annotation in various types of documents, using machine learning in the information retrieval process. For social media documents, we first propose two novel frameworks for new word detection in Chinese tweets. We then design a word annotation mechanism that interprets the tweet-born words by automatically tagging them with text labels. In addition, a hierarchical clustering algorithm is introduced to cluster relevant words. For traditional documents, we propose a relevant sentence selection method that can improve the performance of question-answering systems. |
Degree | Doctor of Philosophy |
Subject | Data mining; Machine learning |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/255035 |
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Hui, CK | - |
dc.contributor.advisor | Yiu, SM | - |
dc.contributor.author | Liang, Yuzhi | - |
dc.contributor.author | 梁予之 | - |
dc.date.accessioned | 2018-06-21T03:42:00Z | - |
dc.date.available | 2018-06-21T03:42:00Z | - |
dc.date.issued | 2017 | - |
dc.identifier.citation | Liang, Y. [梁予之]. (2017). Keyword and fact identification and annotation : machine learning approaches. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/255035 | - |
dc.description.abstract | In the big data era, automatically retrieving useful information from various documents is an active topic in artificial intelligence research. The popularity of social media leaves people overwhelmed by the massive amount of information, so it is necessary to develop techniques that identify useful information efficiently. To better analyze this problem, we divide documents into two types: traditional documents and social media documents. Traditional documents, such as news and articles, use formal language structures and strictly obey grammatical rules. Social media documents, such as posts on Facebook or Twitter, are often short and informal. In this dissertation, we investigate keyword and key sentence identification and annotation in various types of documents, using machine learning in the information retrieval process. For social media documents, we first propose two novel frameworks for new word detection in Chinese tweets. We then design a word annotation mechanism that interprets the tweet-born words by automatically tagging them with text labels. In addition, a hierarchical clustering algorithm is introduced to cluster relevant words. For traditional documents, we propose a relevant sentence selection method that can improve the performance of question-answering systems. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Data mining | - |
dc.subject.lcsh | Machine learning | - |
dc.title | Keyword and fact identification and annotation : machine learning approaches | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_991044014362103414 | - |
dc.date.hkucongregation | 2018 | - |
dc.identifier.mmsid | 991044014362103414 | - |