gsnn.gsnn.models.NodeAttention
- class gsnn.gsnn.models.NodeAttention(*args: Any, **kwargs: Any)
Bases: Module
Node-wise channel attention.
The layer learns a single scalar attention coefficient \(\alpha_{b,n}\) per node n for every sample b in the batch. The coefficient is obtained by first aggregating the (optionally weighted) hidden channels that belong to the node and then passing the aggregated score through a per-node sigmoid gate. There is no cross-node normalization, so the coefficients of one sample need not sum to 1. The resulting attention weights can be:
Interpreted - \(\alpha_{b,n}\) indicates how important node n was for sample b in the current forward pass.
Applied - the coefficients are broadcast back to the individual channels that belong to each node and multiplied with the original activations, producing an attention-modulated output.
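The aggregate, gate, and broadcast steps described above can be sketched in plain PyTorch. This is a minimal illustration, not the library's implementation: the function name node_attention_sketch is hypothetical, the aggregation is assumed to be an unweighted mean, and the temperature is assumed to divide the scores before the sigmoid.

```python
import torch


def node_attention_sketch(x, channel_groups, temperature=1.0):
    """Illustrative only: mean-pool channels per node, gate, broadcast back."""
    groups = torch.as_tensor(channel_groups, dtype=torch.long)  # (C,)
    n_nodes = int(groups.max().item()) + 1

    # Aggregate: sum the channels of each node, (B, C) -> (B, n_nodes).
    scores = torch.zeros(x.shape[0], n_nodes, dtype=x.dtype, device=x.device)
    scores.index_add_(1, groups, x)

    # Assumption: unweighted mean over the channels of each node.
    counts = torch.bincount(groups, minlength=n_nodes).clamp(min=1).to(x.dtype)
    scores = scores / counts

    # Gate: independent sigmoid per node, no normalization across nodes.
    alpha = torch.sigmoid(scores / temperature)  # (B, n_nodes)

    # Broadcast: every channel is scaled by the coefficient of its node.
    out = x * alpha[:, groups]  # (B, C)
    return out, alpha


torch.manual_seed(0)
x = torch.randn(8, 6)
out, alpha = node_attention_sketch(x, [0, 0, 0, 1, 1, 1])
print(out.shape, alpha.shape)
```

Because each gate is computed independently, every coefficient lies strictly between 0 and 1, and several nodes can receive high attention at the same time.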
- Parameters:
channel_groups (Sequence[int] or Tensor) – A 1-D list/array mapping global channel index → node index. Length equals the total number of hidden channels across all nodes.
dropout (float, optional (default=0.0)) – Dropout probability applied to the node-level attention weights.
temperature (float, optional (default=1.0)) – Temperature applied to the attention scores. Lower values produce sharper (more saturated) attention weights.
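When each node owns a contiguous block of channels, the channel_groups mapping can be built from per-node channel counts. The following is a small sketch under that assumption; the variable channels_per_node and its values are made up for illustration.

```python
import torch

# Hypothetical layout: node 0 has 3 channels, node 1 has 3, node 2 has 2.
channels_per_node = torch.tensor([3, 3, 2])

# Repeat each node index once per channel it owns.
node_ids = torch.arange(len(channels_per_node))
channel_groups = torch.repeat_interleave(node_ids, channels_per_node)

print(channel_groups.tolist())  # [0, 0, 0, 1, 1, 1, 2, 2]
```

The length of the result equals the total number of hidden channels, as the parameter description requires.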
Examples
>>> import torch
>>> # Suppose we have 2 nodes with 3 channels each (total 6 channels)
>>> ch_groups = [0, 0, 0, 1, 1, 1]
>>> attn = NodeAttention(ch_groups, dropout=0.1)
>>> x = torch.randn(8, 6)  # (batch=8, channels=6)
>>> out, alpha = attn(x, return_alpha=True)
>>> out.shape  # same shape as input
torch.Size([8, 6])
>>> alpha.shape  # one scalar per node
torch.Size([8, 2])
- __init__(channel_groups, dropout: float = 0.0, temperature: float = 1.0, channels=16, edge_index=None, edge_weight=None)
Methods
__init__(channel_groups[, dropout, ...])
forward(x, *[, return_alpha]) – Apply node attention.
- forward(x: torch.Tensor, *, return_alpha: bool = False)
Apply node attention.
- Parameters:
x (Tensor of shape (B, C)) – Input activations, where C equals the length of channel_groups and channel c belongs to node channel_groups[c].
return_alpha (bool, optional (default=False)) – If True, the method returns a tuple (out, alpha), where alpha is the attention matrix of shape (B, n_nodes).
- Returns:
The attention-modulated activations (and, optionally, the node coefficients).
- Return type:
Tensor or Tuple[Tensor, Tensor]
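One way to use the returned coefficients for interpretation is to average them over the batch and rank the nodes. The sketch below uses a hand-written stand-in for alpha (two samples, three nodes) rather than the output of a real layer, so the values are purely illustrative.

```python
import torch

# Stand-in for the alpha returned by forward(x, return_alpha=True): (B=2, n_nodes=3).
alpha = torch.tensor([[0.9, 0.2, 0.5],
                      [0.7, 0.1, 0.6]])

# Average importance of each node across the batch.
mean_alpha = alpha.mean(dim=0)

# Node indices sorted from most to least important.
ranking = torch.argsort(mean_alpha, descending=True)
print(ranking.tolist())  # [0, 2, 1]
```

If dropout is active during training, the coefficients of a single pass may be noisy, so it is probably more reliable to collect them in evaluation mode.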