GraphGym & PyG Integration
GraphGym is a platform for designing and evaluating Graph Neural Networks (GNNs), as originally proposed in the “Design Space for Graph Neural Networks” paper. We now officially support GraphGym as part of PyG.
We are continuously working on better and deeper GraphGym integration with PyG. We warmly welcome any contributions or feedback!
- Highly modularized pipeline for GNNs:
  - Data: data loading and data splitting
  - Model: modularized GNN implementations
  - Tasks: node-level, edge-level and graph-level tasks
  - Evaluation: accuracy, ROC AUC, …
- Reproducible experiment configuration:
  - Each experiment is fully described by a configuration file
- Scalable experiment management:
  - Easily launch thousands of GNN experiments in parallel
  - Auto-generate experiment analyses and figures across random seeds and experiments
- Flexible user customization:
  - Easily register your own modules, such as data loaders, GNN layers, loss functions, etc.
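Under the hood, this customization follows a simple registration pattern: user-defined modules are stored in a registry keyed by name, and the configuration file later selects them by that key. The standalone sketch below mimics that pattern in plain Python; `layer_dict`, `register_layer`, and `ExampleConv` are illustrative stand-ins, not GraphGym's actual internals.

```python
# Minimal sketch of the registration pattern used for customization:
# modules live in a dict keyed by name, and the experiment config later
# selects them by that name. All names here are illustrative.

layer_dict = {}

def register_layer(key):
    """Decorator that registers a GNN layer class under a config key."""
    def wrapper(cls):
        if key in layer_dict:
            raise KeyError(f"Layer '{key}' is already registered")
        layer_dict[key] = cls
        return cls
    return wrapper

@register_layer('exampleconv')
class ExampleConv:
    # A real layer would subclass torch.nn.Module; omitted to keep the
    # sketch dependency-free.
    def __init__(self, dim_in, dim_out):
        self.dim_in, self.dim_out = dim_in, dim_out

# A configuration file can now refer to the layer purely by name:
layer_cls = layer_dict['exampleconv']
layer = layer_cls(dim_in=64, dim_out=32)
```

The same registry idea extends to data loaders, loss functions, and the other pluggable pieces listed above.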
TL;DR: GraphGym is great for GNN beginners, domain experts, and GNN researchers.
Scenario 1: You are a beginner to graph representation learning and want to understand how GNNs work:
You have probably read many exciting papers on GNNs and want to try writing your own GNN implementation. Even when using raw PyG, you still have to code up the essential pipeline on your own. GraphGym is a perfect place for you to start learning about standardized GNN implementation and evaluation.
Figure 1: Modularized GNN implementation.
Scenario 2: You want to apply GNNs to your exciting application:
You probably know that there are hundreds of possible GNN models, and selecting the best model is notoriously hard. Even worse, the GraphGym paper shows that the best GNN designs for different tasks differ drastically. GraphGym provides a simple interface to try out thousands of GNNs in parallel and understand the best designs for your specific task. GraphGym also recommends a “go-to” GNN design space, after investigating 10 million GNN model-task combinations.
Figure 2: A guideline for desirable GNN design choices.
(Sampling from 10 million GNN model-task combinations.)
Scenario 3: You are a GNN researcher, who wants to innovate new GNN models or propose new GNN tasks:
Say you have proposed a new GNN layer ExampleConv. GraphGym can help you convincingly argue that ExampleConv is better than, e.g., GCNConv: when randomly sampling from 10 million possible model-task combinations, how often will ExampleConv outperform GCNConv when everything else is fixed (including computational costs)? Moreover, GraphGym can help you easily run hyper-parameter searches and visualize which design choices are better. In sum, GraphGym can greatly facilitate your GNN research.
Figure 3: Evaluation of a given GNN design dimension, e.g., BatchNorm.
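The controlled comparison described above boils down to a win rate: sample many model-task configurations, change only the design dimension under study within each pair, and count how often one choice beats the other. The sketch below illustrates the idea; `evaluate`, `win_rate`, and the config keys are purely illustrative stand-ins (a real GraphGym run would train and score the configured model).

```python
import random

# Sketch of the controlled comparison GraphGym automates: sample many
# configurations, vary ONLY the layer type within each pair, and report
# how often one layer beats the other. `evaluate` is an illustrative
# stand-in for a full training run.

def evaluate(config, layer_type, rng):
    # Hypothetical score; pretend 'exampleconv' tends to score higher.
    score = rng.random()
    return score + (0.1 if layer_type == 'exampleconv' else 0.0)

def win_rate(num_samples=1000, seed=0):
    rng = random.Random(seed)
    wins = 0
    for _ in range(num_samples):
        # Sample the rest of the design space; it is held fixed within
        # each pair, so the comparison is controlled.
        config = {
            'layers_mp': rng.choice([2, 4, 6]),
            'dim_inner': rng.choice([64, 128, 256]),
            'batchnorm': rng.choice([True, False]),
        }
        if evaluate(config, 'exampleconv', rng) > evaluate(config, 'gcnconv', rng):
            wins += 1
    return wins / num_samples

print(f"ExampleConv win rate: {win_rate():.2f}")
```

Reporting a win rate over sampled configurations, rather than a single best-vs-best score, is what makes the claim robust across the design space.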
GraphGym Quick start
After properly installing PyG, you can try out our GraphGym API to easily manage and launch GNN experiments.
```bash
conda install pyg -c pyg -c conda-forge                      # Install PyG
git clone https://github.com/pyg-team/pytorch_geometric.git  # Get GraphGym pipeline
cd pytorch_geometric/graphgym
bash run_single.sh  # run a single GNN experiment (node/edge/graph-level)
bash run_batch.sh   # run a batch of GNN experiments, with different GNN designs/datasets/tasks
```
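Each experiment launched by these scripts is fully described by a YAML configuration file. The fragment below only illustrates the general shape of such a file; treat the specific keys and values as an example rather than a reference, and consult the example configs shipped with GraphGym for the authoritative options.

```yaml
# Illustrative GraphGym-style experiment configuration (example values).
out_dir: results
dataset:
  format: PyG
  name: Cora
  task: node
train:
  batch_size: 128
gnn:
  layers_mp: 2     # number of message-passing layers
  dim_inner: 64    # hidden dimension
optim:
  max_epoch: 100
```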
Please find detailed documentation in our PyG documentation.