Conference Paper: Time and Location Topic Model for analyzing Lihkg forum data

Title: Time and Location Topic Model for analyzing Lihkg forum data
Authors: Shen, A; Chow, KP
Keywords: data mining; information retrieval; learning (artificial intelligence); multilayer perceptrons; natural language processing
Issue Date: 2020
Publisher: IEEE, Computer Society. The Journal's web site is located at https://ieeexplore.ieee.org/xpl/conhome/1001100/all-proceedings
Citation: Proceedings of 2020 13th International Conference on Systematic Approaches to Digital Forensic Engineering (SADFE), Virtual Conference, New York, NY, USA, 15 May 2020, p. 32-37
Abstract: Open Source Intelligence (OSINT) is one way to collect information today, allowing law enforcement to monitor illegal activities and allocate police resources effectively. However, massive amounts of public information cannot be analyzed by humans alone, so automatic pre-processing must be performed in advance. In traditional text analysis, common word segmentation tools do not meet the needs of specialized fields and special words (such as proper nouns, dialects, acronyms, and metaphors). In the context of the Chinese language, we consider the problem of automatically determining the time and location of major public gatherings and demonstrations using publicly available information. As an experimental scenario, we use the Lihkg online forum from August 1st to October 10th, 2019 as a corpus, and propose a topic vectorization method based on character embedding and Chinese word segmentation, using an MLP (multi-layer perceptron) neural network as a location topic model. The results show that the method and the model can correctly identify the time and the location of discussed activities by learning from the existing location corpus.
Persistent Identifier: http://hdl.handle.net/10722/289853
ISBN: 9781728188447

 

DC Field: Value
dc.contributor.author: Shen, A
dc.contributor.author: Chow, KP
dc.date.accessioned: 2020-10-22T08:18:24Z
dc.date.available: 2020-10-22T08:18:24Z
dc.date.issued: 2020
dc.identifier.citation: Proceedings of 2020 13th International Conference on Systematic Approaches to Digital Forensic Engineering (SADFE), Virtual Conference, New York, NY, USA, 15 May 2020, p. 32-37
dc.identifier.isbn: 9781728188447
dc.identifier.uri: http://hdl.handle.net/10722/289853
dc.language: eng
dc.publisher: IEEE, Computer Society. The Journal's web site is located at https://ieeexplore.ieee.org/xpl/conhome/1001100/all-proceedings
dc.relation.ispartof: 2020 13th International Conference on Systematic Approaches to Digital Forensic Engineering (SADFE)
dc.rights: International Conference on Systematic Approaches to Digital Forensic Engineering (SADFE). Copyright © IEEE, Computer Society.
dc.rights: ©2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
dc.subject: data mining
dc.subject: information retrieval
dc.subject: learning (artificial intelligence)
dc.subject: multilayer perceptrons
dc.subject: natural language processing
dc.title: Time and Location Topic Model for analyzing Lihkg forum data
dc.type: Conference_Paper
dc.identifier.email: Chow, KP: chow@cs.hku.hk
dc.identifier.authority: Chow, KP=rp00111
dc.description.nature: postprint
dc.identifier.doi: 10.1109/SADFE51007.2020.00009
dc.identifier.scopus: eid_2-s2.0-85092153485
dc.identifier.hkuros: 317163
dc.identifier.spage: 32
dc.identifier.epage: 37
dc.publisher.place: United States
