Postgraduate thesis: Efficient training and inference of deep neural networks

Title: Efficient training and inference of deep neural networks
Authors: Wang, Maolin (王茂林)
Advisors: So, HKH; Lam, EYM
Issue Date: 2020
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Wang, M. [王茂林]. (2020). Efficient training and inference of deep neural networks. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: Deep Neural Networks (DNNs) are widely used in many fields due to their superior performance. However, their computational complexity in both training and inference makes them difficult to deploy. This thesis studies how to address these computational challenges from three different perspectives. The first part of the thesis focuses on reducing end-to-end DNN inference latency under hardware resource and model accuracy constraints. A deeply pipelined Convolutional Neural Network (CNN) inference architecture that operates on partial inputs is proposed. A series of designs are implemented on Field Programmable Gate Arrays (FPGAs), providing different trade-offs among inference latency, accuracy, and hardware resource usage. The second part of the thesis presents an efficient training framework that trains deep neural networks with integer-only arithmetic. The framework has three major innovations. First, all model parameters are stored directly as 8-bit signed integers. Second, a pseudo stochastic rounding scheme is proposed, which achieves the effect of commonly used stochastic rounding without the need for external random number generation. Third, a segmented approximation scheme for cross-entropy loss backpropagation with integer-only arithmetic is presented. Combining the above contributions, this thesis presents the world's first integer-only arithmetic training framework. The last part of the thesis uses ultrafast single-cell image classification as a concrete example to demonstrate how the methods proposed in the earlier parts can meet DNN deployment requirements. First, integer training is used to find low-precision alternatives to the floating-point model. Then a series of hardware designs for real-time inference are presented to explore the trade-off between hardware resource usage and classification latency. Finally, this thesis presents a real-time image-based single-cell detection and classification system with state-of-the-art inference latency.
Degree: Doctor of Philosophy
Subject: Neural networks (Computer science)
Dept/Program: Electrical and Electronic Engineering
Persistent Identifier: http://hdl.handle.net/10722/286786
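The pseudo stochastic rounding scheme mentioned in the abstract is not detailed in this record, but the conventional stochastic rounding it emulates is easy to illustrate. The sketch below is a minimal NumPy illustration of standard stochastic rounding to 8-bit signed integers, with a hypothetical quantization scale parameter; note it relies on an explicit random number generator, which is exactly what the thesis's pseudo-stochastic variant is designed to avoid. It is not the thesis's implementation.

    import numpy as np

    def stochastic_round_to_int8(x, scale, rng):
        # Map float values onto the integer grid defined by `scale`.
        scaled = x / scale
        floor = np.floor(scaled)
        # Round up with probability equal to the fractional part,
        # so the quantized value is unbiased in expectation.
        round_up = rng.random(x.shape) < (scaled - floor)
        return np.clip(floor + round_up, -128, 127).astype(np.int8)

    # Example: quantize a few random weights with an assumed scale of 1/64.
    rng = np.random.default_rng(0)
    weights = rng.standard_normal(5).astype(np.float32)
    print(stochastic_round_to_int8(weights, scale=1 / 64, rng=rng))

In low-precision training, this unbiasedness is what lets small gradient updates accumulate over many steps rather than being rounded away, which is why rounding schemes of this kind (or a generator-free variant like the thesis proposes) are central to integer-only training.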

 

DC Field: Value

dc.contributor.advisor: So, HKH
dc.contributor.advisor: Lam, EYM
dc.contributor.author: Wang, Maolin
dc.contributor.author: 王茂林
dc.date.accessioned: 2020-09-05T01:20:56Z
dc.date.available: 2020-09-05T01:20:56Z
dc.date.issued: 2020
dc.identifier.citation: Wang, M. [王茂林]. (2020). Efficient training and inference of deep neural networks. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
dc.identifier.uri: http://hdl.handle.net/10722/286786
dc.description.abstract: (see Abstract above)
dc.language: eng
dc.publisher: The University of Hong Kong (Pokfulam, Hong Kong)
dc.relation.ispartof: HKU Theses Online (HKUTO)
dc.rights: The author retains all proprietary rights (such as patent rights) and the right to use in future works.
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.subject.lcsh: Neural networks (Computer science)
dc.title: Efficient training and inference of deep neural networks
dc.type: PG_Thesis
dc.description.thesisname: Doctor of Philosophy
dc.description.thesislevel: Doctoral
dc.description.thesisdiscipline: Electrical and Electronic Engineering
dc.description.nature: published_or_final_version
dc.date.hkucongregation: 2020
dc.identifier.mmsid: 991044268206503414
