File Download
Supplementary

postgraduate thesis: A soft processor overlay with tightly-coupled FPGA accelerator

TitleA soft processor overlay with tightly-coupled FPGA accelerator
Authors
Issue Date2015
PublisherThe University of Hong Kong (Pokfulam, Hong Kong)
Citation
Ng, H. [吳浩彰]. (2015). A soft processor overlay with tightly-coupled FPGA accelerator. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5760970
AbstractFPGA overlays have shown the potential to improve designers’ productivity through balancing flexibility and ease of configuration of the underlying fabric while maintaining considerable overall performance promised by FPGAs. To truly facilitate full application acceleration, it is often necessary to also include a highly efficient processor that integrates and collaborates with the accelerators while maintaining the benefits of being implemented within the same overlay framework. This thesis presents an open-source soft processor that is tightly-coupled with FPGA accelerator as part of an overlay framework. RISC-V is chosen as the instruction set for its openness and simplicity, and the soft processor is designed as a 4-stage pipeline to balance resource consumption and performance when implemented on FPGAs. The processor is generically implemented so as to promote design portability and compatibility across different FPGA platforms. Experiment shows that the integrated software-hardware applications using the proposed tightly-coupled architecture achieve comparable performance as hardware-only accelerators while the proposed architecture provides additional run-time flexibility. The processor can be synthesized to both low-end and high-performance FPGA families from different vendors, achieving the highest frequency of 268:67MHz on Virtex-7 device. Synthesized results of the soft processor also display improvement on FPGA resource consumption and efficiency when compared to existing RISC-V design. In addition, this thesis also presents an FPGA-centric approach that allows gateware to directly access the virtual memory space as part of the executing process without involving the CPU. It allows efficient access to memory in heterogeneous systems and complements traditional software-centric approach by providing a simplified memory access model to improve designers’ productivity and high-level compilation tools portability. In this approach, a caching address translation buffer was implemented alongside the user FPGA gateware to provide runtime mapping between virtual and physical memory addresses. It coordinates with the OS running on the CPU to update address translations and to maintain memory consistency. The system was implemented on a commercial off-the-shelf FPGA add-on card to demonstrate the viability of such approach in low-cost systems. Experiment with a 2D stencil computing application implemented with this FPGA-centric approach results in reasonable performance improvement when compared to a typical software-centric implementation; while the number of context switches between FPGA and CPU in both kernel and user mode was significantly reduced, freeing the CPU for other concurrent user tasks.
DegreeMaster of Philosophy
SubjectField programmable gate arrays
Dept/ProgramElectrical and Electronic Engineering
Persistent Identifierhttp://hdl.handle.net/10722/226791

 

DC FieldValueLanguage
dc.contributor.authorNg, Ho-cheung-
dc.contributor.author吳浩彰-
dc.date.accessioned2016-06-30T04:24:11Z-
dc.date.available2016-06-30T04:24:11Z-
dc.date.issued2015-
dc.identifier.citationNg, H. [吳浩彰]. (2015). A soft processor overlay with tightly-coupled FPGA accelerator. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5760970-
dc.identifier.urihttp://hdl.handle.net/10722/226791-
dc.description.abstractFPGA overlays have shown the potential to improve designers’ productivity through balancing flexibility and ease of configuration of the underlying fabric while maintaining considerable overall performance promised by FPGAs. To truly facilitate full application acceleration, it is often necessary to also include a highly efficient processor that integrates and collaborates with the accelerators while maintaining the benefits of being implemented within the same overlay framework. This thesis presents an open-source soft processor that is tightly-coupled with FPGA accelerator as part of an overlay framework. RISC-V is chosen as the instruction set for its openness and simplicity, and the soft processor is designed as a 4-stage pipeline to balance resource consumption and performance when implemented on FPGAs. The processor is generically implemented so as to promote design portability and compatibility across different FPGA platforms. Experiment shows that the integrated software-hardware applications using the proposed tightly-coupled architecture achieve comparable performance as hardware-only accelerators while the proposed architecture provides additional run-time flexibility. The processor can be synthesized to both low-end and high-performance FPGA families from different vendors, achieving the highest frequency of 268:67MHz on Virtex-7 device. Synthesized results of the soft processor also display improvement on FPGA resource consumption and efficiency when compared to existing RISC-V design. In addition, this thesis also presents an FPGA-centric approach that allows gateware to directly access the virtual memory space as part of the executing process without involving the CPU. It allows efficient access to memory in heterogeneous systems and complements traditional software-centric approach by providing a simplified memory access model to improve designers’ productivity and high-level compilation tools portability. In this approach, a caching address translation buffer was implemented alongside the user FPGA gateware to provide runtime mapping between virtual and physical memory addresses. It coordinates with the OS running on the CPU to update address translations and to maintain memory consistency. The system was implemented on a commercial off-the-shelf FPGA add-on card to demonstrate the viability of such approach in low-cost systems. Experiment with a 2D stencil computing application implemented with this FPGA-centric approach results in reasonable performance improvement when compared to a typical software-centric implementation; while the number of context switches between FPGA and CPU in both kernel and user mode was significantly reduced, freeing the CPU for other concurrent user tasks.-
dc.languageeng-
dc.publisherThe University of Hong Kong (Pokfulam, Hong Kong)-
dc.relation.ispartofHKU Theses Online (HKUTO)-
dc.rightsThe author retains all proprietary rights, (such as patent rights) and the right to use in future works.-
dc.rightsCreative Commons: Attribution 3.0 Hong Kong License-
dc.subject.lcshField programmable gate arrays-
dc.titleA soft processor overlay with tightly-coupled FPGA accelerator-
dc.typePG_Thesis-
dc.identifier.hkulb5760970-
dc.description.thesisnameMaster of Philosophy-
dc.description.thesislevelMaster-
dc.description.thesisdisciplineElectrical and Electronic Engineering-
dc.description.naturepublished_or_final_version-

Export via OAI-PMH Interface in XML Formats


OR


Export to Other Non-XML Formats