Conference Paper: LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization
Title | LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization |
---|---|
Authors | Zhao, Juntao; Wan, Borui; Wu, Chuan; Peng, Yanghua; Lin, Haibin |
Issue Date | 2-Mar-2024 |
Persistent Identifier | http://hdl.handle.net/10722/340643 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Zhao, Juntao | - |
dc.contributor.author | Wan, Borui | - |
dc.contributor.author | Wu, Chuan | - |
dc.contributor.author | Peng, Yanghua | - |
dc.contributor.author | Lin, Haibin | - |
dc.date.accessioned | 2024-03-11T10:46:05Z | - |
dc.date.available | 2024-03-11T10:46:05Z | - |
dc.date.issued | 2024-03-02 | - |
dc.identifier.uri | http://hdl.handle.net/10722/340643 | - |
dc.language | eng | - |
dc.relation.ispartof | the 29th ACM SIGPLAN Annual Symposium Principles and Practice of Parallel Programming (PPoPP’24) (02/03/2024-06/03/2024, Edinburgh) | - |
dc.title | LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization | - |
dc.type | Conference_Paper | - |