postgraduate thesis: Collaborative privacy-preserving data mining in the cloud
Title | Collaborative privacy-preserving data mining in the cloud |
---|---|
Authors | Zhang, Jun (张君) |
Advisors | Yiu, SM |
Issue Date | 2018 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Zhang, J. [张君]. (2018). Collaborative privacy-preserving data mining in the cloud. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | With the popularity of cloud computing, an increasing number of institutions outsource their data to a third-party cloud system. In such an environment, there is demand for both data privacy and data mining. To protect data privacy, different institutions encrypt their data with different keys. To discover knowledge from existing data, mining techniques are widely used, but they are computationally intensive, especially for large datasets.
Combining data from different institutions into a large and varied training set improves data mining performance, which motivates the study of collaborative privacy-preserving data mining. The integration of cloud computing with collaborative privacy-preserving data mining provides a promising solution to the above needs. In this thesis, we study three important issues: the secure dot product, the privacy-preserving support vector machine, and the privacy-preserving elastic net.
First, the dot product is a fundamental operation in data mining algorithms. We design a secure dot product scheme for vectors encrypted under different keys on one cloud server. We extend the application scenario of an integer vector encryption scheme that supports dot product computation from one institution to multiple institutions. We prove that our extension is secure. To allow the cloud to decrypt an encrypted dot product, we generate a key-switching matrix to transform the encrypted dot product. Our secure dot product scheme does not require the institutions to stay online. The experimental results confirm that our system is efficient.
Second, the support vector machine is one of the most popular classifiers. We present a privacy-preserving support vector machine protocol, which runs on data encrypted under different keys on one cloud server. As training a support vector machine requires knowledge of a Gram matrix that consists of dot products, we base our scheme on the secure dot product protocol. The institutions can stay offline in our system. To enhance efficiency and security, we adopt a reduced support vector machine with a secure kernel matrix. Each element of the secure kernel matrix is a dot product between a training sample and a random vector. We prove the security of our protocol, and the experimental results highlight its practicality.
Finally, the elastic net is an essential regression tool that has important applications. We construct the first privacy-preserving elastic net protocol for data encrypted under different keys on two non-colluding cloud servers. We reduce elastic net regression to a support vector machine. As the reduction transforms the dataset, we propose a cryptosystem that supports one multiplication and multiple addition operations. We employ secret sharing to keep the institutions offline. Our scheme is proven to be secure. The experimental results demonstrate that our system can be applied in practice.
To conclude, we solved three issues in collaborative privacy-preserving data mining in the cloud. In the future, we plan to leverage parallel computing to accelerate our schemes, design privacy-preserving schemes for other data mining algorithms, and use verifiable computing to defend against a malicious cloud. |
Degree | Doctor of Philosophy |
Subject | Data protection; Data mining; Cloud computing |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/276468 |
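
The abstract notes that training a support vector machine needs a Gram matrix built from dot products, and that the reduced variant uses a kernel matrix whose entries are dot products between training samples and random vectors. The following is a minimal plaintext sketch of those two matrices only. It performs no encryption and is not the thesis's protocol; the toy data, the function names, and the choice of a linear kernel are assumptions made for illustration.

```python
import random

def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))

def gram_matrix(samples):
    """Square matrix: entry (i, j) is the dot product of samples i and j."""
    return [[dot(x, y) for y in samples] for x in samples]

def reduced_kernel_matrix(samples, random_vectors):
    """Rectangular matrix: entry (i, j) is dot(sample i, random vector j)."""
    return [[dot(x, r) for r in random_vectors] for x in samples]

# Hypothetical toy data: two institutions each contribute two integer samples.
institution_a = [[1, 2, 3], [0, 1, 0]]
institution_b = [[2, 0, 1], [1, 1, 1]]
samples = institution_a + institution_b

# Hypothetical random vectors; a fixed seed keeps the sketch reproducible.
rng = random.Random(0)
random_vectors = [[rng.randint(-5, 5) for _ in range(3)] for _ in range(2)]

G = gram_matrix(samples)                            # 4 x 4, symmetric
K = reduced_kernel_matrix(samples, random_vectors)  # 4 x 2
```

With n samples and m random vectors, the reduced matrix has n x m entries instead of n x n, which is one way to read the efficiency gain the abstract mentions when m is much smaller than n.
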
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Yiu, SM | - |
dc.contributor.author | Zhang, Jun | - |
dc.contributor.author | 张君 | - |
dc.date.accessioned | 2019-09-17T04:54:58Z | - |
dc.date.available | 2019-09-17T04:54:58Z | - |
dc.date.issued | 2018 | - |
dc.identifier.citation | Zhang, J. [张君]. (2018). Collaborative privacy-preserving data mining in the cloud. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/276468 | - |
dc.description.abstract | With the popularity of cloud computing, an increasing number of institutions outsource their data to a third-party cloud system. In such an environment, there is demand for both data privacy and data mining. To protect data privacy, different institutions encrypt their data with different keys. To discover knowledge from existing data, mining techniques are widely used, but they are computationally intensive, especially for large datasets. Combining data from different institutions into a large and varied training set improves data mining performance, which motivates the study of collaborative privacy-preserving data mining. The integration of cloud computing with collaborative privacy-preserving data mining provides a promising solution to the above needs. In this thesis, we study three important issues: the secure dot product, the privacy-preserving support vector machine, and the privacy-preserving elastic net. First, the dot product is a fundamental operation in data mining algorithms. We design a secure dot product scheme for vectors encrypted under different keys on one cloud server. We extend the application scenario of an integer vector encryption scheme that supports dot product computation from one institution to multiple institutions. We prove that our extension is secure. To allow the cloud to decrypt an encrypted dot product, we generate a key-switching matrix to transform the encrypted dot product. Our secure dot product scheme does not require the institutions to stay online. The experimental results confirm that our system is efficient. Second, the support vector machine is one of the most popular classifiers. We present a privacy-preserving support vector machine protocol, which runs on data encrypted under different keys on one cloud server. As training a support vector machine requires knowledge of a Gram matrix that consists of dot products, we base our scheme on the secure dot product protocol. The institutions can stay offline in our system. To enhance efficiency and security, we adopt a reduced support vector machine with a secure kernel matrix. Each element of the secure kernel matrix is a dot product between a training sample and a random vector. We prove the security of our protocol, and the experimental results highlight its practicality. Finally, the elastic net is an essential regression tool that has important applications. We construct the first privacy-preserving elastic net protocol for data encrypted under different keys on two non-colluding cloud servers. We reduce elastic net regression to a support vector machine. As the reduction transforms the dataset, we propose a cryptosystem that supports one multiplication and multiple addition operations. We employ secret sharing to keep the institutions offline. Our scheme is proven to be secure. The experimental results demonstrate that our system can be applied in practice. To conclude, we solved three issues in collaborative privacy-preserving data mining in the cloud. In the future, we plan to leverage parallel computing to accelerate our schemes, design privacy-preserving schemes for other data mining algorithms, and use verifiable computing to defend against a malicious cloud. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Data protection | - |
dc.subject.lcsh | Data mining | - |
dc.subject.lcsh | Cloud computing | - |
dc.title | Collaborative privacy-preserving data mining in the cloud | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_991044058183903414 | - |
dc.date.hkucongregation | 2018 | - |
dc.identifier.mmsid | 991044058183903414 | - |