postgraduate thesis: Insider threat investigation through unsupervised learning
Title | Insider threat investigation through unsupervised learning |
---|---|
Authors | Wei, Yichen (衛易辰) |
Advisors | Chow, KP |
Issue Date | 2020 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Wei, Y. [衛易辰]. (2020). Insider threat investigation through unsupervised learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Insider threat investigation is one of the major challenges in digital forensics. Unlike external attackers, insiders already possess the tokens needed to access an organization's digital assets, so deviations from their normal behavior are hard to capture. The complexity, concealment, and infrequency of malicious internal actions make insider threats difficult to detect. This dissertation employs unsupervised deep learning approaches to investigate insider threats from digital evidence and proposes novel frameworks for insider threat detection, prediction, and investigation. The proposed techniques are based on unsupervised data filtering, joint optimization, and graph representation learning.
First, we propose a truly unsupervised deep learning framework for detecting insider threats from system log files. Autoencoders are widely used to produce nonlinear, low-dimensional codes of input data; in this thesis they are used to detect insider threats through automatic filtering. We design a cascaded autoencoder insider threat detection framework, a truly unsupervised learning model, in which cascaded autoencoder filters (CAFs) automatically filter out insider records. The distribution of the encoded normal data is then estimated with a Gaussian mixture model, and log records with low probability under that model are identified as insider threats.
In traditional reactive forensic investigation, digital evidence is analyzed and interpreted after a crime has been committed. Even if insiders can be detected, they have already caused serious damage to the organization. In this thesis, we propose a novel, general unsupervised anomaly detection scheme based on CAFs and a joint optimization network.
The core idea is to use CAFs to purify an unlabeled, imbalanced dataset and then jointly optimize the dimension reduction and density estimation network. Based on this scheme, we design an end-to-end insider threat prediction framework for proactive forensic investigation, which enables real-time response to prevent the harmful effects of insider threats. Tractable and scalable feature representations are extracted automatically by a data-driven Bidirectional Long Short-Term Memory feature extractor, which eliminates time-consuming feature engineering that customarily depends on experts. A hypergraph correction module is applied to reduce the relatively high false positive rate that is common in insider threat detection.
Additionally, most existing deep learning solutions for insider threat investigation ignore the underlying correlations among the data and work only for data with Euclidean structure. This thesis proposes Log2graph, an unsupervised scheme based on a variational graph autoencoder, to detect insider threat entities in large volumes of data. We construct a graph representing an insider attack case from raw log files and design a novel graph neural network model to detect suspicious, anomalous insiders in the graph. Subsequently, we perform a post-analysis of the anomaly-instructure, which can help investigators attribute potential insiders.
We evaluate the proposed models on public benchmark datasets. The experiments demonstrate that our models outperform state-of-the-art methods.
|
Degree | Doctor of Philosophy |
Subject | Machine learning; Computer security; Computer crimes - Investigation |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/308624 |
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Chow, KP | - |
dc.contributor.author | Wei, Yichen | - |
dc.contributor.author | 衛易辰 | - |
dc.date.accessioned | 2021-12-06T01:04:01Z | - |
dc.date.available | 2021-12-06T01:04:01Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | Wei, Y. [衛易辰]. (2020). Insider threat investigation through unsupervised learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/308624 | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Machine learning | - |
dc.subject.lcsh | Computer security | - |
dc.subject.lcsh | Computer crimes - Investigation | - |
dc.title | Insider threat investigation through unsupervised learning | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2021 | - |
dc.identifier.mmsid | 991044448906703414 | - |