Article: DyBit: Dynamic Bit-Precision Numbers for Efficient Quantized Neural Network Inference
| Title | DyBit: Dynamic Bit-Precision Numbers for Efficient Quantized Neural Network Inference |
|---|---|
| Authors | Zhou, Jiajun; Wu, Jiajun; Gao, Yizhao; Ding, Yuhao; Tao, Chaofan; Li, Boyu; Tu, Fengbin; Cheng, Kwang Ting; So, Hayden Kwok Hay; Wong, Ngai |
| Keywords | Accelerator; deep neural networks (DNNs); FPGAs; machine learning; quantization |
| Issue Date | 13-Dec-2023 |
| Publisher | Institute of Electrical and Electronics Engineers |
| Citation | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2024, v. 43, n. 5, p. 1613-1617 |
| Abstract | To accelerate the inference of deep neural networks (DNNs), quantization with low-bitwidth numbers is actively researched. A prominent challenge is to quantize DNN models into low-bitwidth numbers without significant accuracy degradation, especially at very low bitwidths (< 8 bits). This work targets an adaptive data representation with variable-length encoding called DyBit. DyBit can dynamically adjust the precision and range of separate bit-fields to adapt to the distribution of DNN weights/activations. We also propose a hardware-aware quantization framework with a mixed-precision accelerator to trade off inference accuracy and speedup. Experimental results demonstrate that the ImageNet inference accuracy via DyBit is 1.97% higher than the state-of-the-art at 4-bit quantization, and the proposed framework can achieve up to 8.1× speedup compared with the original ResNet-50 model. |
| Persistent Identifier | http://hdl.handle.net/10722/366081 |
| ISSN | 0278-0070 (2023 Impact Factor: 2.7; 2023 SCImago Journal Rankings: 0.957) |
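
As a rough illustration of the idea described in the abstract, the sketch below implements a generic dynamic-precision quantizer in Python: for a fixed total bitwidth, it searches over splits between range (integer) bits and precision (fraction) bits and keeps the split with the lowest error on the given tensor. This is an approximation of the concept only, not the authors' DyBit encoding or variable-length format; the function name, 4-bit setting, and mean-squared-error criterion are assumptions made for the example.

```python
import numpy as np

def quantize_dynamic(x, total_bits=4):
    """Illustrative dynamic-precision quantizer (NOT the authors' DyBit
    format): for a given total bitwidth, pick the split between range
    (integer) bits and precision (fraction) bits that minimizes the
    quantization error on this tensor."""
    best = None
    for frac_bits in range(total_bits):           # candidate precision bit-fields
        scale = 2.0 ** frac_bits
        qmax = 2 ** (total_bits - 1) - 1          # signed representable range
        q = np.clip(np.round(x * scale), -qmax - 1, qmax) / scale
        err = np.mean((x - q) ** 2)               # fit to the data distribution
        if best is None or err < best[0]:
            best = (err, frac_bits, q)
    return best[2], best[1]

# Toy usage with synthetic, bell-shaped "weights".
w = np.random.normal(0.0, 0.2, size=1024).astype(np.float32)
wq, frac = quantize_dynamic(w, total_bits=4)
print(f"chosen fraction bits: {frac}, MSE: {np.mean((w - wq) ** 2):.6f}")
```

Because DNN weight distributions are typically concentrated near zero, such a search tends to spend bits on precision rather than range, the same kind of adaptation the abstract attributes to DyBit's separate bit-fields.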
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Zhou, Jiajun | - |
| dc.contributor.author | Wu, Jiajun | - |
| dc.contributor.author | Gao, Yizhao | - |
| dc.contributor.author | Ding, Yuhao | - |
| dc.contributor.author | Tao, Chaofan | - |
| dc.contributor.author | Li, Boyu | - |
| dc.contributor.author | Tu, Fengbin | - |
| dc.contributor.author | Cheng, Kwang Ting | - |
| dc.contributor.author | So, Hayden Kwok Hay | - |
| dc.contributor.author | Wong, Ngai | - |
| dc.date.accessioned | 2025-11-15T00:35:25Z | - |
| dc.date.available | 2025-11-15T00:35:25Z | - |
| dc.date.issued | 2023-12-13 | - |
| dc.identifier.citation | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2024, v. 43, n. 5, p. 1613-1617 | - |
| dc.identifier.issn | 0278-0070 | - |
| dc.identifier.uri | http://hdl.handle.net/10722/366081 | - |
| dc.description.abstract | To accelerate the inference of deep neural networks (DNNs), quantization with low-bitwidth numbers is actively researched. A prominent challenge is to quantize DNN models into low-bitwidth numbers without significant accuracy degradation, especially at very low bitwidths (< 8 bits). This work targets an adaptive data representation with variable-length encoding called DyBit. DyBit can dynamically adjust the precision and range of separate bit-fields to adapt to the distribution of DNN weights/activations. We also propose a hardware-aware quantization framework with a mixed-precision accelerator to trade off inference accuracy and speedup. Experimental results demonstrate that the ImageNet inference accuracy via DyBit is 1.97% higher than the state-of-the-art at 4-bit quantization, and the proposed framework can achieve up to 8.1× speedup compared with the original ResNet-50 model. | - |
| dc.language | eng | - |
| dc.publisher | Institute of Electrical and Electronics Engineers | - |
| dc.relation.ispartof | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | - |
| dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
| dc.subject | Accelerator | - |
| dc.subject | deep neural networks (DNNs) | - |
| dc.subject | FPGAs | - |
| dc.subject | machine learning | - |
| dc.subject | quantization | - |
| dc.title | DyBit: Dynamic Bit-Precision Numbers for Efficient Quantized Neural Network Inference | - |
| dc.type | Article | - |
| dc.identifier.doi | 10.1109/TCAD.2023.3342730 | - |
| dc.identifier.scopus | eid_2-s2.0-85179783284 | - |
| dc.identifier.volume | 43 | - |
| dc.identifier.issue | 5 | - |
| dc.identifier.spage | 1613 | - |
| dc.identifier.epage | 1617 | - |
| dc.identifier.eissn | 1937-4151 | - |
| dc.identifier.issnl | 0278-0070 | - |
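
The abstract also describes a hardware-aware framework that trades off inference accuracy against speedup with a mixed-precision accelerator. The toy sketch below shows one generic way such a trade-off can be posed, greedily demoting the least-sensitive layers to a lower bitwidth under a budget. The layer names, sensitivity values, and latency figures are hypothetical placeholders, and this is not the paper's actual algorithm; a real hardware-aware search would use measured accuracy drops and accelerator latencies rather than scalar proxies.

```python
# Hypothetical per-layer data: (name, sensitivity proxy, relative latency by bitwidth).
# All numbers are placeholders for illustration only.
layers = [
    ("conv1", 0.9, {8: 1.00, 4: 0.55}),
    ("conv2", 0.2, {8: 1.00, 4: 0.50}),
    ("fc",    0.1, {8: 1.00, 4: 0.45}),
]

def assign_bitwidths(layers, sensitivity_budget=0.5):
    """Greedy sketch: demote the least-sensitive layers to 4 bits until the
    accumulated sensitivity proxy would exceed the budget."""
    plan = {name: 8 for name, _, _ in layers}
    spent = 0.0
    for name, sens, _ in sorted(layers, key=lambda t: t[1]):
        if spent + sens <= sensitivity_budget:
            plan[name] = 4
            spent += sens
    return plan

plan = assign_bitwidths(layers)
baseline = sum(cost[8] for _, _, cost in layers)          # all-8-bit latency
mixed = sum(cost[plan[name]] for name, _, cost in layers)  # mixed-precision latency
print(plan, f"estimated speedup: {baseline / mixed:.2f}x")
```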
