gsnn.interpret

Classes

ContrastiveIGExplainer(model, data[, ...])

Edge-level Integrated-Gradients explainer for contrastive questions.

ContrastiveOcclusionExplainer(model, data[, ...])

Simple batched edge occlusion explainer for contrastive questions.

CounterfactualExplainer(model[, data, ...])

Feature-level counterfactual explainer using gradient descent.

GSNNExplainer(model, data[, ignore_cuda, ...])

Edge/node mask optimiser that produces sparse explanations.

IGExplainer(model, data[, ignore_cuda, ...])

Integrated-Gradients explainer for GSNN models (non-contrastive).

NoiseTunnel(explainer[, n_samples, ...])

Edge-level NoiseTunnel wrapper for IGExplainer and ContrastiveIGExplainer.

OcclusionExplainer(model, data[, ...])

Edge/node occlusion explainer for single observations.

class gsnn.interpret.ContrastiveIGExplainer(model: torch.nn.Module, data, n_steps: int = 50, ignore_cuda: bool = False)[source]

Bases: object

Edge-level Integrated-Gradients explainer for contrastive questions.

This module attributes the prediction difference:

Δf = f(x1)[target_idx] - f(x2)[target_idx]

to every edge e in the graph by integrating along a straight-line mask path m(α)=α·1, α∈[0,1] while keeping the two inputs x1 and x2 fixed. The attribution for an edge equals

\[\mathrm{IG}_e = \int_0^1 \frac{\partial}{\partial m_e} |f(x_1;m(α)) - f(x_2;m(α))|\,dα.\]
  • IG_e > 0: the presence of edge e increases |Δf|.

  • IG_e < 0: the presence of edge e decreases |Δf|.

  • IG_e ≈ 0: edge e is irrelevant to the gap.

By construction \(\sum_e \mathrm{IG}_e = |Δf|\) (completeness).
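The completeness property can be checked numerically on a toy problem. The sketch below is not the library's implementation: toy_model, the edge weights, and the finite-difference gradient are illustrative assumptions standing in for a trained GSNN and autograd.

```python
import math

def toy_model(x, mask, weights):
    # Hypothetical stand-in for a GSNN: edge e carries feature x[e], scaled by mask[e].
    return math.tanh(sum(m * w * xi for m, w, xi in zip(mask, weights, x)))

def abs_gap(x1, x2, mask, weights):
    # |f(x1; m) - f(x2; m)| for a shared edge mask m.
    return abs(toy_model(x1, mask, weights) - toy_model(x2, mask, weights))

def contrastive_edge_ig(x1, x2, weights, n_steps=200, eps=1e-6):
    n_edges = len(weights)
    scores = [0.0] * n_edges
    for k in range(n_steps):
        alpha = (k + 0.5) / n_steps          # midpoint rule on [0, 1]
        mask = [alpha] * n_edges             # m(alpha) = alpha * 1
        for e in range(n_edges):
            up = mask[:e] + [alpha + eps] + mask[e + 1:]
            down = mask[:e] + [alpha - eps] + mask[e + 1:]
            # Central finite difference approximates d|gap| / d m_e.
            grad = (abs_gap(x1, x2, up, weights) - abs_gap(x1, x2, down, weights)) / (2 * eps)
            scores[e] += grad / n_steps
    return scores

weights = [0.8, -0.5, 0.3]
x1 = [1.0, 0.2, -0.4]
x2 = [0.1, 0.9, 0.5]
scores = contrastive_edge_ig(x1, x2, weights)
full_gap = abs_gap(x1, x2, [1.0] * len(weights), weights)
print(sum(scores), full_gap)  # the two values should agree closely
```

In this toy setting the gap vanishes at m = 0, so the attributions sum to the gap at m = 1 up to discretisation error.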

Parameters:
  • model (torch.nn.Module) – Trained GSNN model (evaluation mode is enforced internally).

  • data (torch_geometric.data.Data) – Graph data object; only used for human-readable edge names.

  • n_steps (int, optional (default=50)) – Number of interpolation points along the mask path (baseline included).

  • ignore_cuda (bool, optional (default=False)) – Force the explainer to run on CPU even if CUDA is available.

Example

>>> explainer = ContrastiveIGExplainer(model, data, n_steps=64)
>>> df = explainer.explain(x1, x2, target_idx=0)
>>> df.sort_values('score', ascending=False).head()
source target   score
in0    func0    0.42
func0  func3    0.40
func3  out0     0.38
>>> # Compute IG for only a subset of edges
>>> edge_mask = np.array([True, False, True, False, True])  # Only integrate edges 0, 2, 4
>>> df = explainer.explain(x1, x2, target_idx=0, element_mask=edge_mask)
>>> # Edges 1 and 3 will have None scores; edges 0, 2, 4 have IG attributions
>>> # Note: Completeness axiom won't hold when using element_mask
explain(x1: torch.Tensor, x2: torch.Tensor, target_idx: Union[int, List[int]], *, jitter: Optional[torch.Tensor] = None, element_mask=None, target: str = 'edge', reduction: str = 'mean', model_kwargs1=None, model_kwargs2=None) pandas.DataFrame[source]

Compute attributions for f(x₁) − f(x₂).

Parameters:
  • x1 (torch.Tensor (shape: [N_in], [1, N_in], or [B, N_in] for batch)) – First input feature tensor.

  • x2 (torch.Tensor (shape: [N_in], [1, N_in], or [B, N_in] for batch)) – Second input feature tensor. x1 and x2 must have identical batch size and node ordering; each pair (x1[i], x2[i]) is explained.

  • target_idx (int or list[int]) – Output dimension(s) to explain. If a list is provided the attributions refer to the sum of those outputs.

  • jitter (torch.Tensor, optional) – Optional noise to perturb the mask path.

  • element_mask (torch.Tensor or np.ndarray, optional (shape: [E] or [N])) –

    Boolean mask indicating which elements to compute IG attributions for. If None, all elements are integrated. If provided:

      - True/nonzero elements: integrate from 0 to 1 (normal IG path)
      - False/zero elements: fixed at 1 throughout the path (no integration)

    Elements not in the mask will have None scores in the output.

    Note: When using element_mask, the completeness axiom (attributions sum to |Δf|) will not hold since only a subset of elements are integrated. The attributions measure “contribution to |Δf| while holding other elements fixed at full strength”.

  • target (str, optional (default='edge')) – Whether to return ‘edge’ or ‘node’ level attributions.

  • reduction (str, optional (default='mean')) – How to aggregate attributions across batch samples:

      - ‘mean’: average attributions across samples (default)
      - ‘sum’: sum attributions across samples
      - ‘none’: return all per-sample attributions (adds ‘sample_idx’ column)

Returns:

  • If target='edge': columns [‘source’, ‘target’, ‘score’] for edge attributions.

  • If target='node': columns [‘node’, ‘score’] for node attributions.

  • If reduction='none': additional ‘sample_idx’ column for the batch dimension.

Elements not in element_mask will have None scores.

Return type:

pd.DataFrame
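The three reduction modes can be pictured with a small stand-alone sketch. It assumes per-sample scores are available as a B x E nested list; reduce_scores and the row layout are hypothetical helpers, not part of gsnn.

```python
def reduce_scores(per_sample, reduction="mean"):
    """Aggregate a B x E nested list of scores over the batch axis."""
    n_samples = len(per_sample)
    n_elements = len(per_sample[0])
    if reduction == "none":
        # One row per (sample, element) pair, mirroring the extra 'sample_idx' column.
        return [
            {"sample_idx": b, "element": e, "score": per_sample[b][e]}
            for b in range(n_samples)
            for e in range(n_elements)
        ]
    totals = [sum(row[e] for row in per_sample) for e in range(n_elements)]
    if reduction == "sum":
        return totals
    if reduction == "mean":
        return [t / n_samples for t in totals]
    raise ValueError(f"unknown reduction: {reduction!r}")

per_sample = [[0.5, -0.25, 0.0], [0.25, 0.25, 1.0]]
mean_scores = reduce_scores(per_sample, "mean")
sum_scores = reduce_scores(per_sample, "sum")
rows = reduce_scores(per_sample, "none")
```

With ‘mean’ or ‘sum’ the result has one score per element; with ‘none’ it has one row per sample and element.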

class gsnn.interpret.ContrastiveOcclusionExplainer(model: torch.nn.Module, data, batch_size: int = 32, ignore_cuda: bool = False, verbose: bool = False)[source]

Bases: object

Simple batched edge occlusion explainer for contrastive questions.

This module attributes the prediction difference:

Δf = f(x1)[target_idx] - f(x2)[target_idx]

to every edge e by systematically removing each edge and measuring the change in the absolute prediction difference:

\[\mathrm{Occ}_e = |Δf_{\text{baseline}}| - |Δf_{\text{without } e}|\]

where the baseline uses all edges present and the occluded version removes edge e completely (edge_mask = 0).

  • Occ_e > 0: removing edge e decreases |Δf| (edge contributes to difference).

  • Occ_e < 0: removing edge e increases |Δf| (edge reduces difference).

  • Occ_e ≈ 0: edge e has no impact on the prediction difference.
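The occlusion score can be illustrated on a toy function. In this minimal sketch, toy_model and the edge weights are assumptions made for illustration; the real explainer evaluates a trained GSNN and batches the occlusions.

```python
import math

def toy_model(x, mask, weights):
    # Hypothetical stand-in for a GSNN: edge e carries feature x[e], scaled by mask[e].
    return math.tanh(sum(m * w * xi for m, w, xi in zip(mask, weights, x)))

def abs_gap(x1, x2, mask, weights):
    return abs(toy_model(x1, mask, weights) - toy_model(x2, mask, weights))

def contrastive_occlusion(x1, x2, weights):
    n_edges = len(weights)
    full = [1.0] * n_edges
    baseline = abs_gap(x1, x2, full, weights)       # |Δf| with all edges present
    scores = []
    for e in range(n_edges):
        occluded = full[:e] + [0.0] + full[e + 1:]  # edge_mask[e] = 0
        scores.append(baseline - abs_gap(x1, x2, occluded, weights))
    return scores

weights = [0.8, -0.5, 0.0]   # the last edge has zero weight, so it cannot matter
x1 = [1.0, 0.2, -0.4]
x2 = [0.1, 0.9, 0.5]
scores = contrastive_occlusion(x1, x2, weights)
```

Here the first two edges receive positive scores because removing either one shrinks the gap, while the zero-weight edge scores exactly 0.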

Parameters:
  • model (torch.nn.Module) – Trained GSNN model (evaluation mode is enforced internally).

  • data (torch_geometric.data.Data) – Graph data object; only used for human-readable edge names.

  • batch_size (int, optional (default=32)) – Number of edge occlusions to process in parallel.

  • ignore_cuda (bool, optional (default=False)) – Force the explainer to run on CPU even if CUDA is available.

  • verbose (bool, optional (default=False)) – Print progress information during explanation computation.

Example

>>> explainer = ContrastiveOcclusionExplainer(model, data, batch_size=64)
>>> df = explainer.explain(x1, x2, target_idx=0)
>>> df.sort_values('score', ascending=False).head()
source target   score
in0    func0    0.42
func0  func3    0.40
func3  out0     0.38
explain(x1: torch.Tensor, x2: torch.Tensor, target_idx: Union[int, List[int]], *, element_mask=None, target: str = 'edge', reduction: str = 'mean', model_kwargs1=None, model_kwargs2=None) pandas.DataFrame[source]

Compute occlusion attributions for f(x₁) − f(x₂).

Parameters:
  • x1 (torch.Tensor (shape: [N_in], [1, N_in], or [B, N_in] for batch)) – First input feature tensor.

  • x2 (torch.Tensor (shape: [N_in], [1, N_in], or [B, N_in] for batch)) – Second input feature tensor. x1 and x2 must have identical batch size and node ordering; each pair (x1[i], x2[i]) is explained.

  • target_idx (int or list[int]) – Output dimension(s) to explain. If a list is provided the attributions refer to the sum of those outputs.

  • element_mask (torch.Tensor or np.ndarray, optional (shape: [E] or [N])) – Boolean mask indicating which elements to compute occlusion for. If None, all elements are considered. If provided, only elements where element_mask[i] is True will have occlusion scores computed.

  • target (str, optional (default='edge')) – Whether to return ‘edge’ or ‘node’ level attributions.

  • reduction (str, optional (default='mean')) – How to aggregate attributions across batch samples:

      - ‘mean’: average attributions across samples (default)
      - ‘sum’: sum attributions across samples
      - ‘none’: return all per-sample attributions (adds ‘sample_idx’ column)

Returns:

  • If target='edge': columns [‘source’, ‘target’, ‘score’] for edge attributions.

  • If target='node': columns [‘node’, ‘score’] for node attributions.

  • If reduction='none': additional ‘sample_idx’ column for the batch dimension.

Elements not in element_mask will have None scores.

Return type:

pd.DataFrame

class gsnn.interpret.CounterfactualExplainer(model: torch.nn.Module, data=None, ignore_cuda: bool = False)[source]

Bases: object

Feature-level counterfactual explainer using gradient descent.

This module learns a minimal perturbation δ to an input x such that:

f(x + δ) ≈ target_value

The perturbation is learned via gradient descent with L2 regularization to enforce minimality. The optimization objective is:

\[\min_δ \|f(x + δ) - \text{target}\|^2 + λ\|δ\|^2\]

where λ is the weight decay parameter controlling the trade-off between achieving the target and minimizing the perturbation.

  • δ_i > 0: feature i needs to be increased to reach the target.

  • δ_i < 0: feature i needs to be decreased to reach the target.

  • δ_i ≈ 0: feature i is irrelevant for the counterfactual.
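For a linear model the objective above has a closed-form minimiser, which makes it a convenient sanity check. The sketch below assumes a hypothetical linear f(x) = w·x and plain gradient descent with an explicit L2 term; the explainer itself optimises through a GSNN and may apply the penalty via the optimiser's weight decay.

```python
def counterfactual_linear(x, weights, target, lam=0.01, lr=0.05, n_iter=500):
    """Gradient descent on (w.(x + delta) - target)^2 + lam * ||delta||^2."""
    delta = [0.0] * len(x)
    for _ in range(n_iter):
        pred = sum(w * (xi + d) for w, xi, d in zip(weights, x, delta))
        residual = pred - target
        # Gradient of the objective with respect to each delta_i.
        grads = [2.0 * residual * w + 2.0 * lam * d for w, d in zip(weights, delta)]
        delta = [d - lr * g for d, g in zip(delta, grads)]
    return delta

weights = [1.0, -2.0, 0.5]
x = [0.2, 0.1, 0.4]
target, lam = 1.0, 0.01
delta = counterfactual_linear(x, weights, target, lam=lam)

# Closed form for the linear case: delta = k * w with k = (target - w.x) / (lam + ||w||^2).
k = (target - sum(w * xi for w, xi in zip(weights, x))) / (lam + sum(w * w for w in weights))
closed_form = [k * w for w in weights]
```

The learned perturbation raises features with positive weight and lowers those with negative weight, and a larger lam shrinks all components toward zero.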

Parameters:
  • model (torch.nn.Module) – Trained GSNN model (evaluation mode is enforced internally).

  • data (torch_geometric.data.Data, optional) – Graph data object; used for human-readable feature names.

  • ignore_cuda (bool, optional (default=False)) – Force the explainer to run on CPU even if CUDA is available.

Example

>>> explainer = CounterfactualExplainer(model, data)
>>> # Single observation
>>> df = explainer.explain(x, target_value=0.8, target_idx=0, max_iter=500)
>>> # Multiple observations (same perturbation applied to all)
>>> df = explainer.explain(x_batch, target_value=0.8, target_idx=0, max_iter=500)
>>> df.sort_values('perturbation', key=abs, ascending=False).head()
feature    original  perturbation  counterfactual
in0        0.12      0.45          0.57
in1        0.89     -0.23          0.66
in2        0.34      0.11          0.45
explain(x: torch.Tensor, target_value: Union[float, torch.Tensor], target_idx: Optional[Union[int, List[int]]] = None, trainable_mask: Optional[torch.Tensor] = None, lr: float = 0.01, weight_decay: float = 0.01, dropout: float = 0.0, min_iter: int = 25, max_iter: int = 1000, tolerance: float = 1e-05, verbose: bool = True, transform: Optional[Callable] = torch.nn.Identity) pandas.DataFrame[source]

Learn minimal perturbation to achieve target model output.

Parameters:
  • x (torch.Tensor (shape: [N_in] or [B, N_in])) – Input feature tensor. If 1D, it will be unsqueezed to batch size 1. For multiple observations, the same perturbation will be applied to all.

  • target_value (float or torch.Tensor) – Desired model output. If target_idx is specified, this should be a scalar or tensor matching the number of target indices. If target_idx is None, this should match the full output dimension. The same target value is used for all observations in the batch.

  • target_idx (int, list[int], or None) – Output dimension(s) to target. If None, targets all outputs.

  • trainable_mask (torch.Tensor, optional (shape: [N_in])) – Boolean mask specifying which features can be perturbed. If None, all features are trainable.

  • lr (float, optional (default=0.01)) – Learning rate for gradient descent.

  • weight_decay (float, optional (default=0.01)) – L2 regularization coefficient for minimizing perturbation magnitude.

  • dropout (float, optional (default=0.0)) – Dropout rate for the model.

  • min_iter (int, optional (default=25)) – Minimum number of optimization iterations.

  • max_iter (int, optional (default=1000)) – Maximum number of optimization iterations.

  • tolerance (float, optional (default=1e-5)) – Convergence tolerance for loss change between iterations.

  • verbose (bool, optional (default=True)) – Print optimization progress.

  • transform (Callable, optional) – Differentiable transform applied to the perturbation, e.g. relu() or tanh().

Returns:

DataFrame with columns ‘feature’, ‘original’, ‘perturbation’, ‘counterfactual’ showing the learned perturbations for each input feature.

Return type:

pd.DataFrame

class gsnn.interpret.GSNNExplainer(model, data, ignore_cuda=False, gumbel_softmax=True, hard=False, tau0=3, min_tau=0.5, prior=1, iters=250, lr=0.01, weight_decay=1e-05, free_edges=0, grad_norm_clip=0, beta=1, verbose=True, optimizer=torch.optim.Adam, entropy=0, scale_mse_by_variance=True)[source]

Bases: object

Edge/node mask optimiser that produces sparse explanations.

The explainer learns a binary mask m∈{0,1}^{E|N} that maximises fidelity between the model’s prediction on the masked graph and the prediction on the full graph while simultaneously penalising mask size:

\[L = \mathrm{MSE}\bigl(f(x; m),\, f(x; 1)\bigr) + β \max\bigl(0,\, \|m\|_1 - \text{free\_elements}\bigr) - λ H(m)\]

The entropy term λ H(m) is optional.

Here m is obtained via a differentiable Gumbel-Softmax relaxation so the optimisation can be performed with vanilla back-prop. After convergence the importance score is the softmax probability p_i = P(m_i=1).

  • score_i ≈ 1: element i is essential for reproducing the original prediction.

  • score_i ≈ 0: element i can be removed with little impact.
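The relaxation and the objective can be sketched without any deep-learning dependency. Everything below is an illustrative assumption rather than the library's code: a two-class Gumbel-Softmax written as a noisy sigmoid, a keep-probability derived from the logits, and the loss without the optional entropy term.

```python
import math
import random

def _gumbel(rng):
    # Clamp the uniform draw away from 0 and 1 so both logarithms stay finite.
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))

def sample_relaxed_mask(logit_on, logit_off, tau, rng):
    """Draw one soft mask; a two-class Gumbel-Softmax reduces to a noisy sigmoid."""
    mask = []
    for on, off in zip(logit_on, logit_off):
        z = ((on + _gumbel(rng)) - (off + _gumbel(rng))) / tau
        mask.append(1.0 / (1.0 + math.exp(-z)))
    return mask

def keep_probability(logit_on, logit_off):
    """Noise-free softmax probability p_i = P(m_i = 1), used as the score."""
    return [1.0 / (1.0 + math.exp(off - on)) for on, off in zip(logit_on, logit_off)]

def objective(pred_masked, pred_full, mask, beta=1.0, free_elements=0):
    """Fidelity (MSE) plus the hinge sparsity penalty; entropy term omitted."""
    mse = sum((a - b) ** 2 for a, b in zip(pred_masked, pred_full)) / len(pred_full)
    return mse + beta * max(0.0, sum(mask) - free_elements)

logit_on = [2.0, 0.0, -2.0]
logit_off = [0.0, 0.0, 0.0]
warm = sample_relaxed_mask(logit_on, logit_off, tau=3.0, rng=random.Random(0))
cool = sample_relaxed_mask(logit_on, logit_off, tau=0.5, rng=random.Random(0))
scores = keep_probability(logit_on, logit_off)
loss = objective([0.9], [1.0], [1.0, 1.0, 0.0], beta=2.0, free_elements=1)
```

With identical noise, the lower temperature pushes every mask entry further from 0.5, which is why the temperature is annealed from tau0 toward min_tau.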

Parameters:
  • model (torch.nn.Module) – Trained GSNN model (its parameters are frozen during explanation).

  • data (torch_geometric.data.Data) – Graph data object (only metadata are used).

  • ignore_cuda (bool, optional (default=False)) – Force CPU even if CUDA is available.

  • gumbel_softmax (bool, optional (default=True)) – Use the Gumbel-Softmax re-parameterisation; otherwise plain Softmax.

  • hard (bool, optional (default=False)) – Use the straight-through estimator to obtain discrete masks at test time while keeping gradients continuous.

  • tau0 (float, optional (default=3.0)) – Initial temperature for the (hard) Gumbel-Softmax.

  • min_tau (float, optional (default=0.5)) – Minimum temperature reached after exponential decay.

  • prior (float, optional (default=1.0)) – Initial bias added to the positive/negative logits.

  • iters (int, optional (default=250)) – Number of optimisation steps.

  • lr (float, optional (default=1e-2)) – Learning rate for the optimiser.

  • weight_decay (float, optional (default=1e-5)) – Weight decay applied to the mask logits.

  • free_edges (int, optional (default=0)) – Number of elements allowed before the sparsity penalty activates.

  • beta (float, optional (default=1.0)) – Coefficient of the sparsity term.

  • entropy (float, optional (default=0.0)) – Strength of the entropy bonus (encourages exploration).

  • scale_mse_by_variance (bool, optional (default=True)) – If True, normalise the MSE term by Var(target_preds) so that the fidelity loss is scale-invariant across samples (a 1 − R² style objective). This makes beta interpretable across samples with very different prediction magnitudes. Falls back to plain MSE when the target has fewer than 2 elements.

Example

>>> explainer = GSNNExplainer(model, data, iters=400, beta=5)
>>> # Edge-level attributions
>>> edge_df = explainer.explain(x, target_idx=[0], target='edge')
>>> edge_df.sort_values('score', ascending=False).head()
>>> # Node-level attributions
>>> node_df = explainer.explain(x, target_idx=[0], target='node')
>>> node_df.sort_values('score', ascending=False).head()
explain(x, target_idx=None, return_weights=False, target='edge', model_kwargs=None)[source]

Initializes and runs gradient descent to select a minimal subset of edges or nodes that produce comparable predictions to the full graph.

Parameters:
  • x (torch.Tensor) – Input features to explain, of shape (B, I).

  • target_idx (list, optional) – Target output indices to explain.

  • return_weights (bool, optional (default=False)) – Whether to return raw weights along with the DataFrame.

  • target (str, optional (default='edge')) – Whether to return ‘edge’ or ‘node’ level attributions.

  • model_kwargs (dict, optional (default=None)) – Extra keyword arguments forwarded to every self.model(...) call (e.g. {'x_fn': x_fn} for models trained with node_activity=True). edge_mask / node_mask are reserved and should not be included.

Returns:

  • If target='edge': columns [‘source’, ‘target’, ‘score’] for edge attributions.

  • If target='node': columns [‘node’, ‘score’] for node attributions.

Return type:

pd.DataFrame

tune(x, target_ixs=None, min_r2=0.7, beta_step=1.5, max_trials=20, tolerance=0.001, verbose=True, target='edge', **explain_kwargs)[source]

Tune the beta parameter, starting from its current value, to find the maximum sparsity that still maintains a minimum level of performance.

The search starts from the user’s initial beta and adjusts it up or down based on performance:

  • If R² >= min_r2: increase beta (more sparsity) until performance drops.

  • If R² < min_r2: decrease beta (less sparsity) until performance recovers.

This is much more efficient than a wide search, provided the user supplies a good starting point.

Parameters:
  • x (torch.Tensor) – Input data for explanation.

  • target_ixs (list, optional) – Target output indices to explain.

  • min_r2 (float, optional (default=0.7)) – Minimum R² threshold to maintain.

  • beta_step (float, optional (default=1.5)) – Multiplicative step size for beta adjustment (1.5 = 50% increase/decrease).

  • max_trials (int, optional (default=20)) – Maximum number of beta adjustments to try.

  • tolerance (float, optional (default=1e-3)) – Convergence tolerance for fine search.

  • verbose (bool, optional (default=True)) – Whether to print search progress.

  • target (str, optional (default='edge')) – Whether to tune for ‘edge’ or ‘node’ level attributions.

  • **explain_kwargs (dict, optional) – Override any explainer parameters during tuning:

      - iters: number of optimization steps
      - lr: learning rate
      - weight_decay: weight decay
      - free_edges: elements allowed before penalty
      - prior: initial bias for element selection
      - tau0: initial temperature
      - min_tau: minimum temperature
      - hard: use straight-through estimator
      - entropy: entropy bonus strength

Returns:

Results containing optimal beta, achieved R², number of elements, and final DataFrame

Return type:

dict
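The up/down search described for tune can be sketched as a simple multiplicative walk. This is a simplified illustration under stated assumptions: evaluate_r2 is a hypothetical callback that would run an explanation at a given beta and return its R², and the fine search governed by tolerance is omitted.

```python
def tune_beta(evaluate_r2, beta, min_r2=0.7, beta_step=1.5, max_trials=20):
    """Walk beta up while R² stays above min_r2, or down until it recovers."""
    best = None
    r2 = evaluate_r2(beta)
    going_up = r2 >= min_r2              # the direction is fixed by the first evaluation
    for _ in range(max_trials):
        if r2 >= min_r2:
            best = (beta, r2)            # sparsest acceptable setting seen so far
            if not going_up:
                break                    # performance recovered while decreasing
            beta *= beta_step
        else:
            if going_up:
                break                    # performance dropped while increasing
            beta /= beta_step
        r2 = evaluate_r2(beta)
    return best

def fake_r2(beta):
    # Hypothetical monotone trade-off: more sparsity pressure, lower fidelity.
    return 1.0 - 0.1 * beta

from_low = tune_beta(fake_r2, beta=1.0)
from_high = tune_beta(fake_r2, beta=10.0)
```

Starting from a beta that already meets the threshold, the walk stops at the last value that still satisfies it; starting too high, it stops at the first value that does.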

class gsnn.interpret.IGExplainer(model, data, ignore_cuda=False, n_steps=50, baseline=None)[source]

Bases: object

Integrated-Gradients explainer for GSNN models (non-contrastive).

Computes per-edge or per-node attributions for a prediction f(x)[target_idx] by integrating the gradient along a straight-line path in feature space from a baseline input x′ (default all zeros) to the observation x.

For edge-level attributions:

IG_e = (x - x′) · \int_0^1 ∂f(x′ + α(x-x′))/∂m_e dα.

For node-level attributions:

IG_n = (x - x′) · \int_0^1 ∂f(x′ + α(x-x′))/∂n_n dα.

When the baseline masks are zero this reduces to the EdgeIG/NodeIG variants. The attributions satisfy the completeness axiom for their respective domains.

Node-level and edge-level attributions are computed independently using separate masking mechanisms in the GSNN model.

Parameters:
  • model (torch.nn.Module) – Trained GSNN model (copied and frozen internally).

  • data (torch_geometric.data.Data) – Graph data object; only used for edge names.

  • ignore_cuda (bool, optional (default=False)) – Force the explainer to run on CPU even if CUDA is available.

  • n_steps (int, optional (default=50)) – Number of points on the IG path (baseline included).

  • baseline (torch.Tensor or None, optional) – Custom baseline edge-mask of shape (1,E). None defaults to an all-zeros mask.

Example

>>> explainer = IGExplainer(model, data, n_steps=64)
>>> # Edge-level attributions
>>> df_edge = explainer.explain(x, target_idx=0, target='edge')
>>> df_edge.nlargest(5, 'score')
>>> # Node-level attributions
>>> df_node = explainer.explain(x, target_idx=0, target='node')
>>> df_node.nlargest(5, 'score')
>>> # Compute IG for only a subset of edges
>>> edge_mask = np.array([True, False, True, False, True])  # Only integrate edges 0, 2, 4
>>> df_edge = explainer.explain(x, target_idx=0, target='edge', element_mask=edge_mask)
>>> # Edges 1 and 3 will have None scores; edges 0, 2, 4 have IG attributions
>>> # Note: Completeness axiom won't hold when using element_mask
explain(x, target_idx, *, jitter: Optional[torch.Tensor] = None, element_mask=None, target='edge', reduction='mean', model_kwargs=None)[source]

Compute integrated gradients attributions for GSNN predictions.

Parameters:
  • x (torch.Tensor) – Input features of shape (N_in,), (1, N_in), or (B, N_in) for batch.

  • target_idx (int) – Index of the target output node to explain.

  • jitter (torch.Tensor, optional) – Optional noise to add to baseline, shape (E,) or (1, E) for edge target, shape (N,) or (1, N) for node target.

  • element_mask (torch.Tensor or np.ndarray, optional (shape: [E] or [N])) –

    Boolean mask indicating which elements to compute IG attributions for. If None, all elements are integrated. If provided:

      - True/nonzero elements: integrate from baseline to 1 (normal IG)
      - False/zero elements: fixed at 1 throughout the path (no integration)

    Elements not in the mask will have None scores in the output.

    Note: When using element_mask, the completeness axiom (attributions sum to f(x) - f(baseline)) will not hold since only a subset of elements are integrated. The attributions measure “contribution while holding other elements fixed at full strength”.

  • target (str, optional (default='edge')) – Whether to return ‘edge’ or ‘node’ level attributions.

  • reduction (str, optional (default='mean')) – How to aggregate attributions across batch samples:

      - ‘mean’: average attributions across samples (default)
      - ‘sum’: sum attributions across samples
      - ‘none’: return all per-sample attributions (adds ‘sample_idx’ column)

  • model_kwargs (dict, optional (default=None)) – Extra keyword arguments forwarded to every self.model(...) call (e.g. {'x_fn': x_fn} for models trained with node_activity=True). Tensor values must have leading dim equal to x.shape[0] (or 1 to broadcast); they will be sliced per sample and replicated to n_steps+1 along the IG path. edge_mask / node_mask are reserved and should not be included.

Returns:

  • If target='edge': columns [‘source’, ‘target’, ‘score’] for edge attributions.

  • If target='node': columns [‘node’, ‘score’] for node attributions.

  • If reduction='none': additional ‘sample_idx’ column for the batch dimension.

Elements not in element_mask will have None scores.

Return type:

pd.DataFrame

Notes

The implementation follows the approach and style of https://github.com/ankurtaly/Integrated-Gradients/blob/master/IntegratedGradients/integrated_gradients.py

Reference

Mukund Sundararajan, Ankur Taly, and Qiqi Yan. Axiomatic Attribution for Deep Networks. CoRR, abs/1703.01365, 2017. http://arxiv.org/abs/1703.01365
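The effect of element_mask on the integration path can be sketched as follows. The helpers build_mask_path and scatter_scores are hypothetical, and the sketch assumes n_steps + 1 path points with the baseline included.

```python
def build_mask_path(element_mask, n_steps, baseline=0.0):
    """Masks along the IG path: selected elements ramp up, the others stay at 1."""
    path = []
    for k in range(n_steps + 1):
        alpha = k / n_steps
        path.append([
            baseline + alpha * (1.0 - baseline) if selected else 1.0
            for selected in element_mask
        ])
    return path

def scatter_scores(element_mask, integrated_scores):
    """Place scores at the selected positions and None everywhere else."""
    remaining = iter(integrated_scores)
    return [next(remaining) if selected else None for selected in element_mask]

element_mask = [True, False, True, False, True]   # integrate only elements 0, 2, 4
path = build_mask_path(element_mask, n_steps=4)
scores = scatter_scores(element_mask, [0.3, 0.1, 0.2])
```

Because the unselected elements never leave 1, the selected scores no longer sum to f(x) − f(baseline), which is the loss of completeness noted above.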

class gsnn.interpret.NoiseTunnel(explainer, n_samples: int = 20, noise_std: float = 0.05, agg: str = 'mean')[source]

Bases: object

Edge-level NoiseTunnel wrapper for IGExplainer and ContrastiveIGExplainer.

This module runs the wrapped explainer multiple times while injecting Gaussian noise in the edge-mask space and finally aggregates the obtained attributions. The procedure is inspired by SmoothGrad / NoiseTunnel (Smilkov et al. 2017) but adapted to GSNNs where the inputs are the edge weights rather than node features.

Parameters:
  • explainer (IGExplainer or ContrastiveIGExplainer) – A configured explainer instance whose explain method will be executed repeatedly. The explainer must expose the underlying GSNN model via the attribute model.

  • n_samples (int, optional (default=20)) – Number of noisy repetitions.

  • noise_std (float, optional (default=0.05)) – Standard deviation of the Gaussian noise added to the edge weights.

  • agg ({'mean', 'median'}, optional (default='mean')) – Aggregation statistic used to combine the per-sample attributions.

Notes

  1. For IGExplainer we add noise to its baseline edge-mask (explainer.baseline). This is equivalent to sampling different straight-line paths m(α) = α·(1 + ε) where ε ~ 𝓝(0, σ²).

  2. ContrastiveIGExplainer does not expose a baseline. Therefore we perturb the terminal mask m=1 only, which yields a noisy path m(α)=α·(1+ε). The implementation copies the internal logic of the contrastive explainer because the original method does not accept external masks.

  3. The injected noise is clipped to the valid range [0, 1].
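The repeat-and-aggregate procedure can be sketched in a few lines. This is an illustration only: explain_fn is a hypothetical callable mapping a terminal edge mask to per-edge scores, and the sketch assumes the clipping applies to the noisy mask 1 + ε.

```python
import random
import statistics

def noise_tunnel(explain_fn, n_elements, n_samples=20, noise_std=0.05, agg="mean", seed=0):
    """Run explain_fn on noisy terminal masks and aggregate the scores per element."""
    rng = random.Random(seed)
    runs = []
    for _ in range(n_samples):
        # Terminal mask 1 + eps with eps ~ N(0, noise_std^2), clipped to [0, 1].
        mask = [min(1.0, max(0.0, 1.0 + rng.gauss(0.0, noise_std)))
                for _ in range(n_elements)]
        runs.append(explain_fn(mask))
    combine = statistics.mean if agg == "mean" else statistics.median
    return [combine([run[e] for run in runs]) for e in range(n_elements)]

weights = [0.4, 0.1, 0.25]

def toy_explain(mask):
    # Hypothetical explainer: the score is proportional to the terminal mask value.
    return [w * m for w, m in zip(weights, mask)]

clean = noise_tunnel(toy_explain, 3, noise_std=0.0)
smoothed_mean = noise_tunnel(toy_explain, 3, n_samples=30, noise_std=0.1)
smoothed_median = noise_tunnel(toy_explain, 3, n_samples=30, noise_std=0.1, agg="median")
```

With noise_std=0 the wrapper reproduces the plain explanation; the median is the more robust choice when a few noisy runs produce outlying scores.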

Example

>>> ig = ContrastiveIGExplainer(model, data, n_steps=64)
>>> nt = NoiseTunnel(ig, n_samples=30, noise_std=0.1)
>>> df = nt.explain(x1, x2, target_idx=0)
>>> df.sort_values('score', ascending=False).head()
explain(*args, **kwargs) pandas.DataFrame[source]

Compute noise-tunnel edge attributions.

The positional / keyword arguments are forwarded verbatim to the wrapped explainer’s explain method.

class gsnn.interpret.OcclusionExplainer(model, data, ignore_cuda=False, batch_size=32)[source]

Bases: object

Edge/node occlusion explainer for single observations.

Computes per-edge or per-node attributions for a prediction f(x)[target_idx] by systematically removing each element and measuring the change in prediction.

For edge-level attributions:

Occ_e = f(x; mask_baseline) - f(x; mask_e_removed)

For node-level attributions:

Occ_n = f(x; mask_baseline) - f(x; mask_n_removed)

where mask_baseline uses all elements present and mask_element_removed removes only the specified element (sets mask[element] = 0).

  • Occ > 0: element contributes positively to the prediction

  • Occ < 0: element inhibits the prediction (removing it increases output)

  • Occ ≈ 0: element has no impact on the prediction

The occlusion approach provides a direct, model-agnostic measure of element importance: it quantifies the effect of completely removing each element.
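How batching and element_mask interact can be sketched as below. The predict callable and both helpers are illustrative assumptions; the library evaluates a GSNN on a whole batch of masks at once rather than looping over them in Python.

```python
def occlusion_batches(element_mask, batch_size):
    """Yield (indices, masks); each mask zeroes exactly one selected element."""
    n = len(element_mask)
    selected = [i for i, keep in enumerate(element_mask) if keep]
    for start in range(0, len(selected), batch_size):
        idx = selected[start:start + batch_size]
        masks = [[0.0 if j == i else 1.0 for j in range(n)] for i in idx]
        yield idx, masks

def occlusion_scores(predict, element_mask, batch_size=2):
    n = len(element_mask)
    baseline = predict([1.0] * n)        # all elements present
    scores = [None] * n                  # unselected elements keep a None score
    for idx, masks in occlusion_batches(element_mask, batch_size):
        for i, mask in zip(idx, masks):
            scores[i] = baseline - predict(mask)
    return scores

weights = [0.5, -0.25, 2.0, 1.0, 0.0]

def toy_predict(mask):
    # Hypothetical linear model: each element contributes weight * mask value.
    return sum(w * m for w, m in zip(weights, mask))

element_mask = [True, False, True, False, True]
batches = list(occlusion_batches(element_mask, batch_size=2))
scores = occlusion_scores(toy_predict, element_mask)
```

For this linear toy model the occlusion score of a selected element equals its weight, and three selected elements with batch_size=2 are processed in two batches.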

Parameters:
  • model (torch.nn.Module) – Trained GSNN model (copied and frozen internally).

  • data (torch_geometric.data.Data) – Graph data object; only used for element names.

  • ignore_cuda (bool, optional (default=False)) – Force the explainer to run on CPU even if CUDA is available.

  • batch_size (int, optional (default=32)) – Number of element occlusions to process in parallel.

Example

>>> explainer = OcclusionExplainer(model, data, batch_size=64)
>>> # Edge-level attributions
>>> edge_df = explainer.explain(x, target_idx=0, target='edge')
>>> edge_df.nlargest(5, 'score')
source target   score
in0    func0    0.42
func0  func3    0.40
func3  out0     0.38
>>> # Node-level attributions
>>> node_df = explainer.explain(x, target_idx=0, target='node')
>>> node_df.nlargest(5, 'score')
>>> # Occlude only a subset of edges
>>> edge_mask = np.array([True, False, True, False, True])  # Only occlude edges 0, 2, 4
>>> edge_df = explainer.explain(x, target_idx=0, target='edge', element_mask=edge_mask)
>>> # Edges 1 and 3 will have None scores
explain(x, target_idx, element_mask=None, target='edge', reduction='mean', model_kwargs=None)[source]

Compute edge or node occlusion attributions for f(x)[target_idx].

Parameters:
  • x (torch.Tensor (shape: [N_in], [1, N_in], or [B, N_in] for batch)) – Input feature tensor. Will be moved to appropriate device.

  • target_idx (int) – Output dimension to explain.

  • element_mask (torch.Tensor or np.ndarray, optional (shape: [E] or [N])) – Boolean mask indicating which elements to compute occlusion for. If None, all elements are considered. If provided, only elements where element_mask[i] is True will have occlusion scores computed.

  • target (str, optional (default='edge')) – Whether to return ‘edge’ or ‘node’ level attributions.

  • reduction (str, optional (default='mean')) – How to aggregate attributions across batch samples:

      - ‘mean’: average attributions across samples (default)
      - ‘sum’: sum attributions across samples
      - ‘none’: return all per-sample attributions (adds ‘sample_idx’ column)

  • model_kwargs (dict, optional (default=None)) – Extra keyword arguments forwarded to every self.model(...) call (e.g. {'x_fn': x_fn} for models trained with node_activity=True). Tensor values must have leading dim equal to x.shape[0] (or 1 to broadcast); they will be tiled to match the per-element-occluded grid. edge_mask / node_mask are reserved.

Returns:

  • If target='edge': columns [‘source’, ‘target’, ‘score’] for edge attributions.

  • If target='node': columns [‘node’, ‘score’] for node attributions.

  • If reduction='none': additional ‘sample_idx’ column for the batch dimension.

Elements not in element_mask will have None scores.

Return type:

pd.DataFrame

Modules

gsnn.interpret.ContrastiveGSNNExplainer

gsnn.interpret.ContrastiveIGExplainer(model, ...)

Edge-level Integrated-Gradients explainer for contrastive questions.

gsnn.interpret.ContrastiveOcclusionExplainer(...)

Simple batched edge occlusion explainer for contrastive questions.

gsnn.interpret.CounterfactualExplainer(model)

Feature-level counterfactual explainer using gradient descent.

gsnn.interpret.GSNNExplainer(model, data[, ...])

Edge/node mask optimiser that produces sparse explanations.

gsnn.interpret.IGExplainer(model, data[, ...])

Integrated-Gradients explainer for GSNN models (non-contrastive).

gsnn.interpret.NoiseTunnel(explainer[, ...])

Edge-level NoiseTunnel wrapper for IGExplainer and ContrastiveIGExplainer.

gsnn.interpret.OcclusionExplainer(model, data)

Edge/node occlusion explainer for single observations.

gsnn.interpret.extract_entity_function

gsnn.interpret.plot_explanation_graph

gsnn.interpret.utils