Conference Paper: FTDL: An FPGA-tailored Architecture for Deep Learning Systems
Title | FTDL: An FPGA-tailored Architecture for Deep Learning Systems |
---|---|
Authors | Shi, R; Ding, Y; Wei, X; Liu, H; So, HKH; Ding, C |
Issue Date | 2020 |
Publisher | Association for Computing Machinery (ACM) |
Citation | Proceedings of the 28th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2020), Seaside, CA, USA, 23-25 February 2020, p. 320 |
Abstract | Hardware acceleration of deep learning (DL) systems has been increasingly studied to achieve desirable performance and energy efficiency. The FPGA strikes a balance between high energy efficiency and a fast development cycle and is therefore widely used as a DNN accelerator. However, there exists an architecture-layout mismatch in current designs, which introduces scalability and flexibility issues, leading to irregular routing and resource-imbalance problems. To address these limitations, in this work we propose FTDL, an FPGA-tailored architecture with parameterized and hierarchical hardware that adapts to different FPGA devices. FTDL has the following novelties: (i) At the architecture level, FTDL consists of Tiled Processing Elements (TPEs) and super blocks, achieving a near-theoretical digital signal processing (DSP) operating frequency of 650 MHz. More importantly, FTDL is configurable and delivers good scalability, i.e., timing remains stable even when the design is scaled up to 100% resource utilization for different deep learning systems. (ii) For workload compilation, FTDL provides a compiler that maps DL workloads to the architecture in an optimal manner. Experimental results show that FTDL achieves over 80% hardware efficiency for most benchmark layers in MLPerf. |
Description | Poster Session II |
Persistent Identifier | http://hdl.handle.net/10722/287980 |
ISBN | 9781450370998 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Shi, R | - |
dc.contributor.author | Ding, Y | - |
dc.contributor.author | Wei, X | - |
dc.contributor.author | Liu, H | - |
dc.contributor.author | So, HKH | - |
dc.contributor.author | Ding, C | - |
dc.date.accessioned | 2020-10-05T12:06:04Z | - |
dc.date.available | 2020-10-05T12:06:04Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | Proceedings of the 28th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2020), Seaside, CA, USA, 23-25 February 2020, p. 320 | - |
dc.identifier.isbn | 9781450370998 | - |
dc.identifier.uri | http://hdl.handle.net/10722/287980 | - |
dc.description | Poster Session II | - |
dc.description.abstract | Hardware acceleration of deep learning (DL) systems has been increasingly studied to achieve desirable performance and energy efficiency. The FPGA strikes a balance between high energy efficiency and a fast development cycle and is therefore widely used as a DNN accelerator. However, there exists an architecture-layout mismatch in current designs, which introduces scalability and flexibility issues, leading to irregular routing and resource-imbalance problems. To address these limitations, in this work we propose FTDL, an FPGA-tailored architecture with parameterized and hierarchical hardware that adapts to different FPGA devices. FTDL has the following novelties: (i) At the architecture level, FTDL consists of Tiled Processing Elements (TPEs) and super blocks, achieving a near-theoretical digital signal processing (DSP) operating frequency of 650 MHz. More importantly, FTDL is configurable and delivers good scalability, i.e., timing remains stable even when the design is scaled up to 100% resource utilization for different deep learning systems. (ii) For workload compilation, FTDL provides a compiler that maps DL workloads to the architecture in an optimal manner. Experimental results show that FTDL achieves over 80% hardware efficiency for most benchmark layers in MLPerf. | - |
dc.language | eng | - |
dc.publisher | Association for Computing Machinery (ACM). | - |
dc.relation.ispartof | The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays | - |
dc.title | FTDL: An FPGA-tailored Architecture for Deep Learning Systems | - |
dc.type | Conference_Paper | - |
dc.identifier.email | So, HKH: hso@eee.hku.hk | - |
dc.identifier.authority | So, HKH=rp00169 | - |
dc.description.nature | abstract | - |
dc.identifier.doi | 10.1145/3373087.3375384 | - |
dc.identifier.hkuros | 315346 | - |
dc.identifier.spage | 320 | - |
dc.identifier.epage | 320 | - |
dc.publisher.place | New York, NY | - |