Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1109/DAC18072.2020.9218581
- Scopus: eid_2-s2.0-85093973629
- WOS: WOS:000628528400089
Conference Paper: FTDL: A Tailored FPGA-Overlay for Deep Learning with High Scalability
Title | FTDL: A Tailored FPGA-Overlay for Deep Learning with High Scalability |
---|---|
Authors | Shi, R; Ding, Y; Wei, X; Li, H; Liu, H; So, HKH; Ding, C |
Keywords | Field programmable gate arrays; Computer architecture; Random access memory; Machine learning; System-on-chip |
Issue Date | 2020 |
Publisher | IEEE, Computer Society. The Journal's web site is located at https://ieeexplore.ieee.org/xpl/conhome/1000196/all-proceedings |
Citation | Proceedings of 2020 57th ACM/IEEE Design Automation Conference (DAC), San Francisco, CA, USA, 20-24 July 2020, p. 1-6 |
Abstract | Fast inference is of paramount value to a wide range of deep learning applications. This work presents FTDL, a highly scalable FPGA overlay framework for deep learning applications, to address the architecture and hardware mismatch faced by traditional efforts. The FTDL overlay is specifically optimized for the tiled structure of FPGAs, thereby achieving post-place-and-route operating frequencies exceeding 88% of the theoretical maximum across different devices and design scales. A flexible compilation framework efficiently schedules matrix multiply and convolution operations of large neural network inference on the overlay and achieves over 80% hardware efficiency on average. Taking advantage of both high operating frequency and hardware efficiency, FTDL achieves 402.6 and 151.2 FPS with GoogLeNet and ResNet50 on ImageNet, respectively, while operating at a power efficiency of 27.6 GOPS/W, giving it up to 7.7× higher performance and 1.9× better power efficiency than the state of the art. |
Persistent Identifier | http://hdl.handle.net/10722/289185 |
ISSN | 0738-100X (2020 SCImago Journal Rankings: 0.518) |
ISI Accession Number ID | WOS:000628528400089 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Shi, R | - |
dc.contributor.author | Ding, Y | - |
dc.contributor.author | Wei, X | - |
dc.contributor.author | Li, H | - |
dc.contributor.author | Liu, H | - |
dc.contributor.author | So, HKH | - |
dc.contributor.author | Ding, C | - |
dc.date.accessioned | 2020-10-22T08:09:03Z | - |
dc.date.available | 2020-10-22T08:09:03Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | Proceedings of 2020 57th ACM/IEEE Design Automation Conference (DAC), San Francisco, CA, USA, 20-24 July 2020, p. 1-6 | - |
dc.identifier.issn | 0738-100X | - |
dc.identifier.uri | http://hdl.handle.net/10722/289185 | - |
dc.description.abstract | Fast inference is of paramount value to a wide range of deep learning applications. This work presents FTDL, a highly scalable FPGA overlay framework for deep learning applications, to address the architecture and hardware mismatch faced by traditional efforts. The FTDL overlay is specifically optimized for the tiled structure of FPGAs, thereby achieving post-place-and-route operating frequencies exceeding 88% of the theoretical maximum across different devices and design scales. A flexible compilation framework efficiently schedules matrix multiply and convolution operations of large neural network inference on the overlay and achieves over 80% hardware efficiency on average. Taking advantage of both high operating frequency and hardware efficiency, FTDL achieves 402.6 and 151.2 FPS with GoogLeNet and ResNet50 on ImageNet, respectively, while operating at a power efficiency of 27.6 GOPS/W, giving it up to 7.7× higher performance and 1.9× better power efficiency than the state of the art. | - |
dc.language | eng | - |
dc.publisher | IEEE, Computer Society. The Journal's web site is located at https://ieeexplore.ieee.org/xpl/conhome/1000196/all-proceedings | - |
dc.relation.ispartof | ACM/IEEE Design Automation Conference Proceedings | - |
dc.rights | ACM/IEEE Design Automation Conference Proceedings. Copyright © IEEE, Computer Society. | - |
dc.rights | ©2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | - |
dc.subject | Field programmable gate arrays | - |
dc.subject | Computer architecture | - |
dc.subject | Random access memory | - |
dc.subject | Machine learning | - |
dc.subject | System-on-chip | - |
dc.title | FTDL: A Tailored FPGA-Overlay for Deep Learning with High Scalability | - |
dc.type | Conference_Paper | - |
dc.identifier.email | So, HKH: hso@eee.hku.hk | - |
dc.identifier.authority | So, HKH=rp00169 | - |
dc.description.nature | postprint | - |
dc.identifier.doi | 10.1109/DAC18072.2020.9218581 | - |
dc.identifier.scopus | eid_2-s2.0-85093973629 | - |
dc.identifier.hkuros | 316791 | - |
dc.identifier.spage | 1 | - |
dc.identifier.epage | 6 | - |
dc.identifier.isi | WOS:000628528400089 | - |
dc.publisher.place | United States | - |
dc.identifier.issnl | 0738-100X | - |