Optimizing Distributed GNN Training on Large Graphs


Grant Data
Project Title
Optimizing Distributed GNN Training on Large Graphs
Principal Investigator
Professor Chuan Wu
Duration
36 months
Start Date
2021-09-01
Amount
HK$1,093,580
Keywords
Distributed ML System, Flow Scheduling, GNN Training, Graph Sampling, Placement
Discipline
Network, Others - Computing Science and Information Technology
Panel
Engineering (E)
HKU Project Code
17207621
Grant Type
General Research Fund (GRF)
Funding Year
2021
Status
Ongoing
Objectives
1. [Algorithms for Communication and Computation Scheduling in Distributed GNN Training]: Design efficient, near-optimal scheduling algorithms for graph sampling/GNN gradient communication and for graph store/sampler/trainer execution, given a graph store/sampler/trainer placement.
2. [Algorithms for Graph Store, Sampler and Trainer Placement]: Design efficient, near-optimal placement strategies for graph stores, samplers and trainers that, jointly with communication/computation scheduling, minimize GNN training time.
3. [Joint Graph Partition, Sampling and Caching Design]: Design joint, efficient approaches to graph partitioning, sampling and graph data caching on training machines, to further reduce inter-machine traffic and speed up GNN training convergence.
4. [Implementation and Evaluation]: Implement a distributed GNN training system using our algorithms and strategies, and evaluate it with real-world GNN training workloads in AI clouds.
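To illustrate the graph-sampling step that the objectives above schedule and distribute, here is a minimal, illustrative sketch of layer-wise neighbor sampling for mini-batch GNN training. It is not the project's actual system; the function name, fixed-fanout strategy, and adjacency-list representation are assumptions made for this example.

```python
import random

def sample_blocks(adj, seeds, fanouts, rng=None):
    """Layer-wise neighbor sampling for one mini-batch.

    adj:     dict mapping each node to a list of its in-neighbors
    seeds:   output nodes whose embeddings the batch will compute
    fanouts: max neighbors sampled per node, one entry per GNN layer
             (input layer first)
    Returns one (nodes, edges) block per layer, input layer first:
    exactly the subgraph a trainer needs to compute the seeds.
    """
    rng = rng or random.Random(0)
    blocks = []
    frontier = list(seeds)
    for fanout in reversed(fanouts):  # sample from the output layer down
        edges = []
        next_frontier = set(frontier)
        for dst in frontier:
            nbrs = adj.get(dst, [])
            # keep all neighbors if few, else a uniform sample of `fanout`
            picked = nbrs if len(nbrs) <= fanout else rng.sample(nbrs, fanout)
            for src in picked:
                edges.append((src, dst))
                next_frontier.add(src)
        blocks.append((sorted(next_frontier), edges))
        frontier = list(next_frontier)
    blocks.reverse()  # reorder to input layer first
    return blocks

# Usage: a 2-layer batch on a toy graph; each block's edge set is what
# would be shipped from graph stores/samplers to a trainer.
adj = {0: [1, 2, 3], 1: [2], 2: [], 3: [0]}
blocks = sample_blocks(adj, seeds=[0], fanouts=[2, 2])
```

Bounding the fanout per layer caps the size of each sampled block, which is what keeps the sampler-to-trainer traffic that Objectives 1-3 optimize predictable and small relative to the full graph.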