gsnn.gsnn.interpret.ContrastiveIGExplainer

class gsnn.gsnn.interpret.ContrastiveIGExplainer(model: torch.nn.Module, data, n_steps: int = 50, ignore_cuda: bool = False)

Bases: object

Edge-level Integrated-Gradients explainer for contrastive questions.

This module attributes the magnitude |Δf| of the prediction difference

Δf = f(x1)[target_idx] - f(x2)[target_idx]

to every edge e in the graph by integrating along a straight-line mask path m(α)=α·1, α∈[0,1] while keeping the two inputs x1 and x2 fixed. The attribution for an edge equals

\[\mathrm{IG}_e = \int_0^1 \frac{\partial}{\partial m_e} \left| f(x_1; m(\alpha)) - f(x_2; m(\alpha)) \right| \, d\alpha.\]

  • IG_e > 0: the presence of edge e increases |Δf|.

  • IG_e < 0: the presence of edge e decreases |Δf|.

  • IG_e ≈ 0: edge e is irrelevant to the gap.

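By construction \(\sum_e \mathrm{IG}_e = |Δf|\) (completeness), assuming the gap is zero at the baseline mask m(0) = 0. With a finite n_steps the sum matches |Δf| only up to the numerical integration error.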
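The integral and the completeness property above can be checked numerically on a toy problem. The following is a minimal sketch, not the library's implementation: `toy_model`, `gap`, and `contrastive_ig` are hypothetical names, the gradient is taken by central finite differences rather than autograd, and the midpoint rule is an assumed choice of interpolation points.

```python
import math

def toy_model(x, mask, weights):
    # Hypothetical stand-in for f(x; m): edge e carries w_e * x_e, gated by m_e.
    return math.tanh(sum(m * w * xi for m, w, xi in zip(mask, weights, x)))

def gap(mask, x1, x2, weights):
    # |f(x1; m) - f(x2; m)|, the quantity being attributed.
    return abs(toy_model(x1, mask, weights) - toy_model(x2, mask, weights))

def contrastive_ig(x1, x2, weights, n_steps=50, eps=1e-5):
    # Riemann (midpoint) approximation of IG_e along the path m(alpha) = alpha * 1.
    n_edges = len(weights)
    scores = [0.0] * n_edges
    for k in range(n_steps):
        alpha = (k + 0.5) / n_steps  # midpoint of the k-th sub-interval
        for e in range(n_edges):
            up = [alpha] * n_edges
            down = [alpha] * n_edges
            up[e] += eps
            down[e] -= eps
            # Central finite difference for d gap / d m_e at m(alpha).
            grad = (gap(up, x1, x2, weights) - gap(down, x1, x2, weights)) / (2 * eps)
            scores[e] += grad / n_steps
    return scores

weights = [0.5, -1.0, 2.0]
x1 = [1.0, 0.5, 0.2]
x2 = [0.2, 0.1, 0.9]

scores = contrastive_ig(x1, x2, weights)
full_gap = gap([1.0] * 3, x1, x2, weights)   # |Δf| at the full mask
residual = abs(sum(scores) - full_gap)       # completeness error
print(scores, full_gap, residual)
```

In this toy model the gap is zero at the all-zero mask, so the scores sum to |Δf| up to discretisation error; a model whose gap is nonzero at m = 0 would leave that baseline gap as a constant offset.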

Parameters:
  • model (torch.nn.Module) – Trained GSNN model (evaluation mode is enforced internally).

  • data (torch_geometric.data.Data) – Graph data object; only used for human-readable edge names.

  • n_steps (int, optional (default=50)) – Number of interpolation points along the mask path (baseline included).

  • ignore_cuda (bool, optional (default=False)) – Force the explainer to run on CPU even if CUDA is available.

Example

>>> explainer = ContrastiveIGExplainer(model, data, n_steps=64)
>>> df = explainer.explain(x1, x2, target_idx=0)
>>> df.sort_values('score', ascending=False).head()
source target   score
in0    func0    0.42
func0  func3    0.40
func3  out0     0.38
>>> # Compute IG for only a subset of edges
>>> import numpy as np
>>> edge_mask = np.array([True, False, True, False, True])  # Only integrate edges 0, 2, 4
>>> df = explainer.explain(x1, x2, target_idx=0, element_mask=edge_mask)
>>> # Edges 1 and 3 will have None scores; edges 0, 2, 4 have IG attributions
>>> # Note: Completeness axiom won't hold when using element_mask

__init__(model: torch.nn.Module, data, n_steps: int = 50, ignore_cuda: bool = False) → None

Methods

__init__(model, data[, n_steps, ignore_cuda])

explain(x1, x2, target_idx, *[, jitter, ...])

Compute attributions for f(x₁) − f(x₂).

explain(x1: torch.Tensor, x2: torch.Tensor, target_idx: Union[int, List[int]], *, jitter: Optional[torch.Tensor] = None, element_mask=None, target: str = 'edge', reduction: str = 'mean', model_kwargs1=None, model_kwargs2=None) → pandas.DataFrame

Compute attributions for f(x₁) − f(x₂).

Parameters:
  • x1, x2 (torch.Tensor (shape: [N_in], [1, N_in], or [B, N_in] for batch)) – Two input feature tensors. They must have identical batch size and ordering of nodes. Each pair (x1[i], x2[i]) is explained.

  • target_idx (int or list[int]) – Output dimension(s) to explain. If a list is provided, the attributions refer to the sum of those outputs.

  • jitter (torch.Tensor, optional) – Optional noise to perturb the mask path.

  • element_mask (torch.Tensor or np.ndarray, optional (shape: [E] or [N])) –

    Boolean mask indicating which elements to compute IG attributions for. If None, all elements are integrated. If provided:

      • True/nonzero elements: integrated from 0 to 1 (normal IG path).

      • False/zero elements: fixed at 1 throughout the path (no integration).

    Elements whose mask entry is False/zero receive a score of None in the output.

    Note: When using element_mask, the completeness axiom (attributions sum to |Δf|) will not hold since only a subset of elements are integrated. The attributions measure “contribution to |Δf| while holding other elements fixed at full strength”.

  • target (str, optional (default='edge')) – Whether to return ‘edge’ or ‘node’ level attributions.

  • reduction (str, optional (default='mean')) – How to aggregate attributions across batch samples:

      • ‘mean’: average attributions across samples (default).

      • ‘sum’: sum attributions across samples.

      • ‘none’: return all per-sample attributions (adds a ‘sample_idx’ column).
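The effect of element_mask on the integration path, as described above, can be sketched in a few lines. This is an illustration under assumptions: `mask_at` is a hypothetical helper, not part of the library, and it only models the documented rule that integrated elements follow α while the others stay at 1.

```python
def mask_at(alpha, element_mask=None, n_elements=None):
    """Per-element mask value at path position alpha (hypothetical helper)."""
    if element_mask is None:
        # No element_mask: every element follows the straight-line path m(alpha) = alpha.
        return [alpha] * n_elements
    # Integrated elements follow alpha; the others are held at full strength.
    return [alpha if keep else 1.0 for keep in element_mask]

element_mask = [True, False, True, False, True]
start = mask_at(0.0, element_mask)   # path start: held elements are already at 1
end = mask_at(1.0, element_mask)     # path end: the full mask
print(start, end)
```

Because the path now starts at a partially active mask rather than at the all-zero mask, the integrated scores account only for the change in the gap between those two endpoints, which is why they need not sum to |Δf|.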

Returns:

  • If target='edge': columns ['source', 'target', 'score'] for edge attributions.

  • If target='node': columns ['node', 'score'] for node attributions.

  • If reduction='none': an additional 'sample_idx' column for the batch dimension.

  • Elements excluded by element_mask have a score of None.

Return type:

pd.DataFrame
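To illustrate how the per-sample output relates to the aggregated forms, the sketch below builds a hand-made DataFrame with the documented columns for reduction='none' and aggregates it with pandas. The values are invented for illustration, and the assumption that 'mean' and 'sum' correspond to a plain per-edge groupby is ours, not something the source confirms.

```python
import pandas as pd

# Hand-made stand-in for explain(..., reduction='none'); numbers are illustrative only.
per_sample = pd.DataFrame({
    "source": ["in0", "func0", "in0", "func0"],
    "target": ["func0", "out0", "func0", "out0"],
    "score": [0.5, 0.25, 0.25, 0.75],
    "sample_idx": [0, 0, 1, 1],
})

# Aggregate over the batch dimension, one row per edge.
by_edge = per_sample.groupby(["source", "target"], as_index=False)["score"]
mean_scores = by_edge.mean()
sum_scores = by_edge.sum()

# Rank edges by attribution, largest first.
ranked = mean_scores.sort_values("score", ascending=False)
print(ranked)
```

Requesting reduction='none' and aggregating afterwards keeps the per-sample spread available, for example to check whether an edge's score is consistent across the batch.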