File Download
Supplementary
-
Citations:
- Appears in Collections:
postgraduate thesis: A software shared virtual memory system with three way coherence protocols on the intel single-chip cloud computer
Title | A software shared virtual memory system with three way coherence protocols on the intel single-chip cloud computer |
---|---|
Authors | |
Issue Date | 2015 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Hung, C. D. [熊哲皓]. (2015). A software shared virtual memory system with three way coherence protocols on the intel single-chip cloud computer. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5570812 |
Abstract | With the advancement of design and fabrication of high-performance integrated circuits technology, it is foreseeable that processors with more than 1,000 cores per die will appear in the near future. However, these many-core architectures have introduced a lot of challenges at the memory system level, such as complicated cache coherence and limited memory access speed, to name a few.
This thesis focuses on one prominent many-core prototype — the Intel’s Single-chip Cloud Computer (SCC). The SCC architecture does not provide hardware cache coherency. Instead, it relies on on-chip programmable memory.
The baseline coherence protocol for the SCC is the Software Managed Coherence (SMC) layer. To achieve memory consistency, it accesses shared memory without part of the typical cache hierarchy for efficient invalidation and flushing. We found that performance provided by this coherence layer in this manner is sub-optimal because accesses of shared memory would all turn into data update messages within the network mesh. As cache locality could not be exploited to its full potential, the execution pipelines stall much often for memory fetches from outside the chip.
This research is to address the performance problem of shared virtual memory consistency for this cache in-coherent architecture. Oriented at sitting data on-chip as much as possible to reduce memory accesses external to the chip, we propose two techniques to leverage the cache hierarchy to full and reside data in the on-chip scratchpad memory. First, targeted at the architectural specificity of the hardware, we redesigned traditional software distributed shared memory (SDSM) to allow shared data be treated transparently like private memory so the cache hierarchy can be fully utilised without sacrificing memory consistency. Second, we propose a distance-aware page allocation scheme that samples access frequencies and select the most frequently-recently used pages to be stored on the on-chip scratchpad memory.
Our experimental results show that our first technique, the ordinary SDSM outperforms the current SMC approach by 5 times. Moreover, in some cases, with the second technique that is based on scratchpad memory, our proposed system outperforms further by an additional 1.57 times. Our experiments also demonstrated that the SMC approach is not scalable due to congestion of the network mesh by coherence traffic generated while the two new approaches continued to scale well.
The main contribution of this research is the implementation of a cache coherence software library system built for an architecture that comes with non-coherent cache hardware and just relies on software-defined cache. This new cache hierarchy has evidently opened the door for smarter and faster inter-processor-core data sharing without the need of complicated cache coherence hardware. |
Degree | Master of Philosophy |
Subject | Distributed shared memory Cloud computing |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/219988 |
HKU Library Item ID | b5570812 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Hung, Chit-ho, Dominic | - |
dc.contributor.author | 熊哲皓 | - |
dc.date.accessioned | 2015-10-08T23:12:17Z | - |
dc.date.available | 2015-10-08T23:12:17Z | - |
dc.date.issued | 2015 | - |
dc.identifier.citation | Hung, C. D. [熊哲皓]. (2015). A software shared virtual memory system with three way coherence protocols on the intel single-chip cloud computer. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5570812 | - |
dc.identifier.uri | http://hdl.handle.net/10722/219988 | - |
dc.description.abstract | With the advancement of design and fabrication of high-performance integrated circuits technology, it is foreseeable that processors with more than 1,000 cores per die will appear in the near future. However, these many-core architectures have introduced a lot of challenges at the memory system level, such as complicated cache coherence and limited memory access speed, to name a few. This thesis focuses on one prominent many-core prototype — the Intel’s Single-chip Cloud Computer (SCC). The SCC architecture does not provide hardware cache coherency. Instead, it relies on on-chip programmable memory. The baseline coherence protocol for the SCC is the Software Managed Coherence (SMC) layer. To achieve memory consistency, it accesses shared memory without part of the typical cache hierarchy for efficient invalidation and flushing. We found that performance provided by this coherence layer in this manner is sub-optimal because accesses of shared memory would all turn into data update messages within the network mesh. As cache locality could not be exploited to its full potential, the execution pipelines stall much often for memory fetches from outside the chip. This research is to address the performance problem of shared virtual memory consistency for this cache in-coherent architecture. Oriented at sitting data on-chip as much as possible to reduce memory accesses external to the chip, we propose two techniques to leverage the cache hierarchy to full and reside data in the on-chip scratchpad memory. First, targeted at the architectural specificity of the hardware, we redesigned traditional software distributed shared memory (SDSM) to allow shared data be treated transparently like private memory so the cache hierarchy can be fully utilised without sacrificing memory consistency. Second, we propose a distance-aware page allocation scheme that samples access frequencies and select the most frequently-recently used pages to be stored on the on-chip scratchpad memory. Our experimental results show that our first technique, the ordinary SDSM outperforms the current SMC approach by 5 times. Moreover, in some cases, with the second technique that is based on scratchpad memory, our proposed system outperforms further by an additional 1.57 times. Our experiments also demonstrated that the SMC approach is not scalable due to congestion of the network mesh by coherence traffic generated while the two new approaches continued to scale well. The main contribution of this research is the implementation of a cache coherence software library system built for an architecture that comes with non-coherent cache hardware and just relies on software-defined cache. This new cache hierarchy has evidently opened the door for smarter and faster inter-processor-core data sharing without the need of complicated cache coherence hardware. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | - |
dc.subject.lcsh | Distributed shared memory | - |
dc.subject.lcsh | Cloud computing | - |
dc.title | A software shared virtual memory system with three way coherence protocols on the intel single-chip cloud computer | - |
dc.type | PG_Thesis | - |
dc.identifier.hkul | b5570812 | - |
dc.description.thesisname | Master of Philosophy | - |
dc.description.thesislevel | Master | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_b5570812 | - |
dc.identifier.mmsid | 991011109529703414 | - |