Appears in Collections: postgraduate thesis: Efficient training and inference of deep neural networks
Title | Efficient training and inference of deep neural networks |
---|---|
Authors | Wang, Maolin (王茂林) |
Advisors | So, HKH; Lam, EYM |
Issue Date | 2020 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Wang, M. [王茂林]. (2020). Efficient training and inference of deep neural networks. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Deep Neural Networks (DNNs) are widely used in many fields due to their superior performance. However, their computational complexity in both training and inference makes them difficult to deploy. This thesis studies how to address these computational difficulties from three different perspectives.
The first part of the thesis focuses on reducing end-to-end DNN inference latency under hardware resource and model accuracy constraints. A deeply pipelined Convolutional Neural Network (CNN) inference architecture that operates on partial inputs is proposed. A series of designs is implemented on Field Programmable Gate Arrays (FPGAs), providing different trade-offs among inference latency, accuracy, and hardware resource usage.
The second part of the thesis presents an efficient framework that trains deep neural networks with integer-only arithmetic. The framework has three major innovations. First, all model parameters are stored directly as 8-bit signed integers. Second, a pseudo stochastic rounding scheme is proposed, which matches the behavior of commonly used stochastic rounding without requiring external random number generation. Third, a segmented approximation scheme for cross-entropy loss backpropagation with integer-only arithmetic is presented. Combining these contributions, this thesis presents the world's first integer-only arithmetic training framework.
The last part of the thesis uses ultrafast single-cell image classification as a concrete example to demonstrate how the methods proposed in the previous parts can meet DNN deployment requirements. First, integer training is used to find low-precision alternatives to the floating-point model. Then a series of hardware designs for real-time inference is presented to explore the trade-off between hardware resource usage and classification latency. Finally, this thesis presents a real-time image-based single-cell detection and classification system with state-of-the-art inference latency.
|
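The partial-input idea behind the first part's pipelined architecture can be sketched in software: a stride-1 "valid" convolution (correlation-style, as in CNNs) needs only the last few input rows, so each output row can be emitted as soon as enough rows have streamed in, rather than after the whole frame arrives. This is a minimal illustrative sketch, not the thesis's FPGA design; the function name and row-streaming interface are assumptions.

```python
import numpy as np

def streaming_valid_conv(frame_rows, kernel):
    """Emit each output row of a stride-1 'valid' 2-D convolution
    (correlation-style, no kernel flip) as soon as enough input rows
    have streamed in, instead of waiting for the whole frame."""
    kh, kw = kernel.shape
    buf = []
    for row in frame_rows:                          # rows arrive one at a time
        buf.append(row)
        if len(buf) >= kh:
            window = np.stack(buf[-kh:])            # last kh input rows
            out_row = [np.sum(window[:, c:c + kw] * kernel)
                       for c in range(row.size - kw + 1)]
            yield np.array(out_row)                 # ready before the frame ends

kernel = np.ones((3, 3))
frame = np.arange(36.0).reshape(6, 6)
out = np.stack(list(streaming_valid_conv(frame, kernel)))
```

The first output row is available after only three input rows, so downstream pipeline stages can start working while the rest of the frame is still being captured, which is what reduces end-to-end latency.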
Degree | Doctor of Philosophy |
Subject | Neural networks (Computer science) |
Dept/Program | Electrical and Electronic Engineering |
Persistent Identifier | http://hdl.handle.net/10722/286786 |
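The stochastic rounding that the second part's pseudo random scheme emulates can be sketched as follows: each value is rounded up with probability equal to its fractional part, making the rounding unbiased in expectation, which is what lets 8-bit parameters accumulate small gradient updates. This is a minimal sketch of ordinary stochastic rounding, assuming an external random number generator and a made-up quantization scale of 2^5; the thesis's contribution is precisely to avoid the external generator, and that pseudo random variant is not reproduced here.

```python
import numpy as np

def stochastic_round(x, rng):
    """Round each value down, then up with probability equal to its
    fractional part, so the rounding error is zero in expectation."""
    floor = np.floor(x)
    frac = x - floor
    return (floor + (rng.random(x.shape) < frac)).astype(np.int8)

# Quantize floating-point parameters to int8 with an assumed scale of 2**5.
rng = np.random.default_rng(0)
w = np.array([0.11, -0.37, 0.92])
q = stochastic_round(w * 2**5, rng)
```

Averaged over many draws, the quantized values converge to the scaled originals, unlike round-to-nearest, which would silently drop updates smaller than half a quantization step.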
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | So, HKH | - |
dc.contributor.advisor | Lam, EYM | - |
dc.contributor.author | Wang, Maolin | - |
dc.contributor.author | 王茂林 | - |
dc.date.accessioned | 2020-09-05T01:20:56Z | - |
dc.date.available | 2020-09-05T01:20:56Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | Wang, M. [王茂林]. (2020). Efficient training and inference of deep neural networks. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/286786 | - |
dc.description.abstract | Deep Neural Networks (DNNs) are widely used in many fields due to their superior performance. However, their computational complexity in both training and inference makes them difficult to deploy. This thesis studies how to address these computational difficulties from three different perspectives. The first part of the thesis focuses on reducing end-to-end DNN inference latency under hardware resource and model accuracy constraints. A deeply pipelined Convolutional Neural Network (CNN) inference architecture that operates on partial inputs is proposed. A series of designs is implemented on Field Programmable Gate Arrays (FPGAs), providing different trade-offs among inference latency, accuracy, and hardware resource usage. The second part of the thesis presents an efficient framework that trains deep neural networks with integer-only arithmetic. The framework has three major innovations. First, all model parameters are stored directly as 8-bit signed integers. Second, a pseudo stochastic rounding scheme is proposed, which matches the behavior of commonly used stochastic rounding without requiring external random number generation. Third, a segmented approximation scheme for cross-entropy loss backpropagation with integer-only arithmetic is presented. Combining these contributions, this thesis presents the world's first integer-only arithmetic training framework. The last part of the thesis uses ultrafast single-cell image classification as a concrete example to demonstrate how the methods proposed in the previous parts can meet DNN deployment requirements. First, integer training is used to find low-precision alternatives to the floating-point model. Then a series of hardware designs for real-time inference is presented to explore the trade-off between hardware resource usage and classification latency. Finally, this thesis presents a real-time image-based single-cell detection and classification system with state-of-the-art inference latency. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Neural networks (Computer science) | - |
dc.title | Efficient training and inference of deep neural networks | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Electrical and Electronic Engineering | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2020 | - |
dc.identifier.mmsid | 991044268206503414 | - |