
postgraduate thesis: Keyword and fact identification and annotation : machine learning approaches

Title: Keyword and fact identification and annotation : machine learning approaches
Authors: Liang, Yuzhi (梁予之)
Advisor(s): Hui, CK; Yiu, SM
Issue Date: 2017
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Liang, Y. [梁予之]. (2017). Keyword and fact identification and annotation : machine learning approaches. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: In the big data era, automatically retrieving useful information from documents is an active topic in artificial intelligence research. The popularity of social media leaves people overwhelmed by a massive amount of information, so techniques are needed to identify useful information efficiently. To better analyze this problem, we divide documents into two types: traditional documents and social media documents. Traditional documents, such as news and articles, use formal language structures and strictly obey grammatical rules. Social media documents, such as posts on Facebook or Twitter, are often short and informal. In this dissertation, we investigate keyword and key sentence identification and annotation in various types of documents, using machine learning in the information retrieval process. For social media documents, we first propose two novel frameworks for new word detection in Chinese tweets. We then design a word annotation mechanism that interprets Tweet-born words by automatically tagging them with text labels. In addition, a hierarchical clustering algorithm is introduced to cluster relevant words. For traditional documents, we propose a relevant sentence selection method that can improve the performance of question-answering systems.
Degree: Doctor of Philosophy
Subject: Data mining; Machine learning
Dept/Program: Computer Science
Persistent Identifier: http://hdl.handle.net/10722/255035

 

DC Field | Value | Language
dc.contributor.advisor | Hui, CK | -
dc.contributor.advisor | Yiu, SM | -
dc.contributor.author | Liang, Yuzhi | -
dc.contributor.author | 梁予之 | -
dc.date.accessioned | 2018-06-21T03:42:00Z | -
dc.date.available | 2018-06-21T03:42:00Z | -
dc.date.issued | 2017 | -
dc.identifier.citation | Liang, Y. [梁予之]. (2017). Keyword and fact identification and annotation : machine learning approaches. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | -
dc.identifier.uri | http://hdl.handle.net/10722/255035 | -
dc.description.abstract | In the big data era, automatically retrieving useful information from documents is an active topic in artificial intelligence research. The popularity of social media leaves people overwhelmed by a massive amount of information, so techniques are needed to identify useful information efficiently. To better analyze this problem, we divide documents into two types: traditional documents and social media documents. Traditional documents, such as news and articles, use formal language structures and strictly obey grammatical rules. Social media documents, such as posts on Facebook or Twitter, are often short and informal. In this dissertation, we investigate keyword and key sentence identification and annotation in various types of documents, using machine learning in the information retrieval process. For social media documents, we first propose two novel frameworks for new word detection in Chinese tweets. We then design a word annotation mechanism that interprets Tweet-born words by automatically tagging them with text labels. In addition, a hierarchical clustering algorithm is introduced to cluster relevant words. For traditional documents, we propose a relevant sentence selection method that can improve the performance of question-answering systems. | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject.lcsh | Data mining | -
dc.subject.lcsh | Machine learning | -
dc.title | Keyword and fact identification and annotation : machine learning approaches | -
dc.type | PG_Thesis | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Computer Science | -
dc.description.nature | published_or_final_version | -
dc.identifier.doi | 10.5353/th_991044014362103414 | -
dc.date.hkucongregation | 2018 | -
dc.identifier.mmsid | 991044014362103414 | -
