Introduction
What is GSNN?
Graph Structured Neural Networks (GSNNs) incorporate prior knowledge of latent variable interactions directly into the neural network architecture. Unlike traditional neural networks, which learn relationships from data alone, a GSNN uses domain-specific knowledge to guide the learning process. This makes the approach particularly useful in perturbation biology and other domains where prior knowledge about variable relationships is available.
The GSNN method was introduced in the paper “Graph Structured Neural Networks for Perturbation Biology” (Evans et al., 2024) and provides a framework for building interpretable, structured neural networks that respect known biological or domain relationships.
Key Concepts
Prior Knowledge Integration
GSNN allows you to encode domain-specific knowledge about how variables interact through a graph structure. This graph defines which variables can directly influence each other, constraining the neural network to learn only biologically or physically plausible relationships.
Three-Node Architecture
GSNN uses an architecture built from three types of nodes:
Input nodes: Represent observed variables
Function nodes: Represent latent variables
Output nodes: Represent target variables
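The three node types can be pictured as a small typed graph. The sketch below is illustrative only: the node names, the assumed set of allowed edge types (input to function, function to function, function to output), and the `validate_edges` helper are all hypothetical and are not taken from the GSNN library's API.

```python
# Hypothetical toy graph; names and allowed edge directions are assumptions.
NODE_TYPES = {
    "drug_A": "input",       # observed variable (e.g. a perturbation)
    "kinase_1": "function",  # latent variable
    "kinase_2": "function",  # latent variable
    "viability": "output",   # target variable
}

# Assumed edge types: signal flows from inputs, through latent nodes, to outputs.
ALLOWED = {("input", "function"), ("function", "function"), ("function", "output")}

EDGES = [
    ("drug_A", "kinase_1"),
    ("kinase_1", "kinase_2"),
    ("kinase_2", "viability"),
]


def validate_edges(edges, node_types, allowed=ALLOWED):
    """Return the edges whose (source type, target type) pair is not allowed."""
    return [
        (src, dst)
        for src, dst in edges
        if (node_types[src], node_types[dst]) not in allowed
    ]


print("invalid edges:", validate_edges(EDGES, NODE_TYPES))
```

A check like this is a cheap way to catch a mis-typed node or a reversed edge before any model is built.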
Sparse Connectivity
Connections between nodes are sparse and determined by your prior knowledge graph. This sparsity makes models easier to interpret and can improve generalization, especially in data-limited scenarios.
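One generic way to enforce graph-determined sparsity is to multiply a weight matrix element-wise by a binary mask built from the edge list. The following is a minimal pure-Python sketch of that idea, not the GSNN implementation: the three-node graph, the weights, and the `build_mask` and `masked_linear` helpers are made up for illustration.

```python
# Toy example with three variables; the edge list is a made-up prior graph.
NODES = ["a", "b", "c"]
EDGES = [("a", "b"), ("b", "c")]  # (source, target) pairs allowed by prior knowledge


def build_mask(nodes, edges):
    """mask[i][j] is 1 when an edge nodes[j] -> nodes[i] exists, else 0."""
    index = {name: i for i, name in enumerate(nodes)}
    mask = [[0] * len(nodes) for _ in nodes]
    for src, dst in edges:
        mask[index[dst]][index[src]] = 1
    return mask


def masked_linear(x, weights, mask):
    """Compute y = (W * mask) x, so weights on non-edges never contribute."""
    return [
        sum(w * m * xj for w, m, xj in zip(w_row, m_row, x))
        for w_row, m_row in zip(weights, mask)
    ]


MASK = build_mask(NODES, EDGES)
# Dense weights, including entries for pairs that are NOT edges.
W = [[0.5, 0.5, 0.5],
     [2.0, 0.5, 0.5],
     [0.5, 3.0, 0.5]]

y = masked_linear([1.0, 1.0, 1.0], W, MASK)
print(y)  # [0.0, 2.0, 3.0]: only the a->b and b->c weights survive
```

Because masked entries are multiplied by zero, they also receive zero gradient in a trainable version of this layer, so the model cannot learn a dependency the graph does not allow.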
Perturbation Biology Applications
GSNN is particularly well-suited for perturbation biology studies where you want to understand how interventions (perturbations) affect biological systems. The graph structure can encode known biological pathways, protein-protein interactions, or regulatory networks.
Why Use GSNN?
Interpretability
Unlike black-box neural networks, GSNN models are interpretable because the learned weights correspond to specific relationships in your prior knowledge graph. You can directly examine which connections are important for predictions.
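As a rough illustration of examining connections, the sketch below ranks edges by the absolute value of a per-edge weight. The edge names, the weight values, and the `rank_edges` helper are hypothetical, and absolute magnitude is only a crude proxy for importance; this does not use or describe the library's GSNNExplainer.

```python
# Hypothetical learned weights keyed by (source, target) edge; values are made up.
edge_weights = {
    ("drug_A", "kinase_1"): 1.8,
    ("kinase_1", "kinase_2"): -0.1,
    ("kinase_2", "viability"): -2.4,
}


def rank_edges(weights):
    """Sort edges by absolute weight, largest first."""
    return sorted(weights.items(), key=lambda item: abs(item[1]), reverse=True)


for (src, dst), w in rank_edges(edge_weights):
    print(f"{src} -> {dst}: {w:+.2f}")
```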
Data Efficiency
By incorporating prior knowledge, GSNN can learn meaningful relationships from smaller datasets than would be required for traditional neural networks.
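One plausible reason for this is a smaller number of free parameters. The arithmetic below compares a single fully connected layer with a layer that has one weight per prior-knowledge edge. The sizes are made up, and the one-scalar-per-edge assumption is a simplification for illustration; actual GSNN parameter counts depend on the architecture.

```python
def dense_param_count(n_nodes):
    """Weights in a single fully connected n-by-n layer (biases ignored)."""
    return n_nodes * n_nodes


def sparse_param_count(edges):
    """One weight per unique prior-knowledge edge (biases ignored)."""
    return len(set(edges))


# Made-up sizes: 1,000 variables, each with exactly 5 incoming edges.
n_nodes = 1000
edges = [((dst - k) % n_nodes, dst) for dst in range(n_nodes) for k in range(1, 6)]

print(dense_param_count(n_nodes))  # 1000000
print(sparse_param_count(edges))   # 5000
```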
Domain Knowledge Constraints
GSNN ensures that your model respects known biological or physical constraints, preventing it from learning spurious correlations that violate domain knowledge.
Flexible Architecture
GSNN supports various normalization strategies and activation functions, and can be combined with advanced techniques like uncertainty quantification, reinforcement learning, and Bayesian optimization.
Core Features
Graph-Based Architecture
Define custom graph structures representing your domain knowledge
Automatic handling of sparse connectivity patterns
Training Options
Gradient checkpointing for memory efficiency
Multiple normalization strategies (Layer, Batch)
Residual connections to improve training stability
Support for various activation functions and weight initialization strategies
Optimization and Inference
Reinforcement learning for graph structure optimization
Uncertainty quantification through hypernetworks
Interpretation Tools
Model explanation and visualization (GSNNExplainer)
Entity function extraction
How Are GSNNs Different from Graph Neural Networks?
While both GSNNs and Graph Neural Networks (GNNs) use graphs, they serve fundamentally different purposes and operate in distinct ways:
GNNs: Learning From Graph Structure
Traditional GNNs treat the graph as data to learn from. They use permutation-invariant aggregation functions to learn local patterns and node representations, and can often generalize to new, unseen graphs.
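Permutation invariance means the aggregate over a node's neighbors does not depend on the order in which the neighbors are listed. The sketch below checks this for a sum aggregator, one common choice. The feature values and the `sum_aggregate` helper are made up for the example and are not tied to any particular GNN library.

```python
from itertools import permutations


def sum_aggregate(neighbor_features):
    """Sum feature vectors element-wise; the result ignores neighbor order."""
    return [sum(values) for values in zip(*neighbor_features)]


# Made-up 2-dimensional features for three neighbors of one node.
# Values are exactly representable as floats, so rounding cannot blur the check.
neighbors = [[1.0, 0.0], [0.5, 2.0], [3.0, 1.0]]

results = {tuple(sum_aggregate(list(order))) for order in permutations(neighbors)}
print(results)  # {(4.5, 3.0)}: every ordering gives the same aggregate
```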
GSNNs: Constraining With Graph Structure
GSNNs use the graph structure as a constraint mechanism rather than a learning target. The graph defines which variables can directly influence each other, applying inductive biases through feature constraints. GSNNs are transductive: they are trained on a single graph and cannot be applied to new graphs.
Example
This distinction is critical when choosing between approaches. In biological signaling, for example, similar local network structures may produce drastically different signaling patterns. While the network structure is useful for understanding causal interactions, the graph patterns themselves are not necessarily predictive of signaling behaviors. GSNNs leverage this domain knowledge to constrain the model architecture, whereas GNNs will tend to map similar local graph structures to similar signaling patterns.
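The toy comparison below makes this contrast concrete. It is a deliberately simplified caricature: the node names, activities, and weights are invented, the shared-weight update stands in for a single GNN-style layer, and the per-edge update stands in for a graph-constrained model. Real GNNs use node features and multiple layers, which can separate nodes in ways this sketch does not.

```python
# Two target nodes with the same local structure: each has two parents.
PARENTS = {"t1": ["p1", "p2"], "t2": ["p3", "p4"]}
# Identical made-up activity for every parent.
ACTIVITY = {"p1": 1.0, "p2": 1.0, "p3": 1.0, "p4": 1.0}


def shared_weight_update(target, w=0.5):
    """GNN-style: one weight shared by all edges, applied to the summed parents."""
    return w * sum(ACTIVITY[p] for p in PARENTS[target])


# Hypothetical per-edge weights, as a graph-constrained model could learn.
EDGE_W = {("p1", "t1"): 1.0, ("p2", "t1"): 1.0,   # both parents activate t1
          ("p3", "t2"): 1.0, ("p4", "t2"): -1.0}  # p4 inhibits t2


def per_edge_update(target):
    """Edge-specific: each (parent, target) pair has its own weight."""
    return sum(EDGE_W[(p, target)] * ACTIVITY[p] for p in PARENTS[target])


print(shared_weight_update("t1"), shared_weight_update("t2"))  # 1.0 1.0
print(per_edge_update("t1"), per_edge_update("t2"))            # 2.0 0.0
```

With shared weights, the two structurally identical targets receive the same value. With per-edge weights, the same local structure can encode activation in one place and inhibition in another.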
Getting Started
The GSNN library provides comprehensive tutorials to help you get started:
Basic Usage: Learn to build and train your first GSNN model
Simulation: Use Bayesian networks to generate synthetic data
Comparison: Compare GSNN performance against baseline methods
Advanced Features: Explore reinforcement learning, Bayesian optimization, and uncertainty quantification
Installation
Create the conda/mamba environment and install GSNN:
mamba env create -f environment.yml
conda activate gsnn
pip install -e .
Citation
If you use GSNN in your research, please cite:
@article{Evans2024.02.28.582164,
author = {Nathaniel J. Evans and Gordon B. Mills and Guanming Wu and Xubo Song and Shannon McWeeney},
title = {Graph Structured Neural Networks for Perturbation Biology},
elocation-id = {2024.02.28.582164},
year = {2024},
doi = {10.1101/2024.02.28.582164},
publisher = {Cold Spring Harbor Laboratory},
URL = {https://www.biorxiv.org/content/early/2024/02/29/2024.02.28.582164},
journal = {bioRxiv}
}
Next Steps
Explore the Tutorials for hands-on examples
Check out the API Reference for detailed documentation
Visit the GitHub repository for the latest updates and issues