gsnn.optim.MagnitudeEdgeRegressor
Online Tier-0 edge inference via auxiliary linear regression during GSNN training.
For each adjacent layer pair (n-1, n) and source aggregator k, fit a shared (N, N) weight matrix W so that activation magnitudes at layer n-1 predict gradient magnitudes at layer n:
Y_hat[:, j] = sum_i W[i, j] * Xtilde[:, i]
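As a rough illustration of the prediction rule above, the following pure-Python sketch evaluates Y_hat for a tiny batch. The function name predict_gradient_magnitudes and the nested-list layout are assumptions made for this example; the library itself presumably operates on tensors.

```python
def predict_gradient_magnitudes(x_tilde, w):
    """Return Y_hat with Y_hat[b][j] = sum_i w[i][j] * x_tilde[b][i].

    x_tilde: list of B rows, each holding N source features (layer n-1).
    w:       N x N nested list; w[i][j] links source i to target j.
    """
    n = len(w)
    return [
        [sum(w[i][j] * row[i] for i in range(n)) for j in range(n)]
        for row in x_tilde
    ]


# Two samples, N = 2 functions (illustrative numbers only).
x_tilde = [[1.0, 2.0], [0.0, 3.0]]
w = [[1.0, 0.0],
     [0.5, 2.0]]
y_hat = predict_gradient_magnitudes(x_tilde, w)
print(y_hat)
```

Column j of W collects the weights of every candidate source i for target j, which is why entries of W can later be read as edge scores.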
Magnitudes are taken from ResBlock._last_pre_norm_activation (post-lin_in,
pre-norm) and corresponding activation gradients, matching
MagnitudeEdgeInferer information flow.
The regressor trains jointly with the GSNN (detached features, separate optimizer). Held-out validation edges drive best-checkpoint selection, mitigating gradient absorption at equilibrium.
See docs/notes/edge_inference_notes.md section 4 and tutorial 14.
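The best-checkpoint selection mentioned above can be sketched as follows. This is a minimal illustration, not the library's mechanism: the class BestCheckpointTracker, the use of a single scalar validation metric, and the deep-copy snapshot are all assumptions, and the validation values are made up.

```python
import copy


class BestCheckpointTracker:
    """Keep the snapshot of W with the highest held-out validation metric."""

    def __init__(self):
        self.best_metric = float("-inf")
        self.best_step = None
        self.best_w = None

    def update(self, step, metric, w):
        """Store (step, w) if the metric improved; return True on improvement."""
        if metric > self.best_metric:
            self.best_metric = metric
            self.best_step = step
            # Deep copy so later in-place updates to w do not alter the snapshot.
            self.best_w = copy.deepcopy(w)
            return True
        return False


tracker = BestCheckpointTracker()
w = [[0.0, 0.0], [0.0, 0.0]]
# Hypothetical validation scores: the metric peaks at step 1, then degrades.
for step, val_metric in enumerate([0.61, 0.78, 0.70, 0.55]):
    w[0][1] = float(step)  # stand-in for an optimizer update to W
    tracker.update(step, val_metric, w)

print(tracker.best_step, tracker.best_metric)
```

Selecting on held-out edges rather than on the final step means a late decline in the validation score does not affect the weights that are ultimately reported.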
Classes
MagnitudeEdgeInferer | Post-hoc inferrer for function -> function edges via activation/gradient magnitude correlation across adjacent layers.
MagnitudeEdgeRegressor | Online auxiliary linear regressor for function -> function edge inference.
- class gsnn.optim.MagnitudeEdgeRegressor.MagnitudeEdgeRegressor(*args: Any, **kwargs: Any)
Bases: Module
Online auxiliary linear regressor for function -> function edge inference.
Learns a single shared weight matrix W of shape (N, N) during GSNN training. Source activations (layer n-1) predict target gradient magnitudes (layer n) across adjacent ResBlock pairs and multiple source aggregators.
- Parameters:
  - model (GSNN) – Model being trained. Must have checkpoint=False.
  - data (HeteroData-like) – Graph container with node_names_dict and edge_index_dict.
  - aggregators (sequence of str) – Source-side channel reductions: 'sum', 'max', 'mean', 'l2'. Target gradients always use L1 (sum of absolute values).
  - use_pre_norm (bool) – If True (default), use post-lin_in, pre-norm activations.
  - standardize (bool) – If True (default), z-score features per (pair, aggregator) using EMA running statistics.
  - lr (float) – AdamW learning rate, applied to W only.
  - weight_decay (float) – AdamW weight decay, applied to W only.
  - ridge (float) – Additional L2 penalty on W beyond weight_decay.
  - score_mode ({'abs', 'relu', 'signed'}) – How to convert W entries into edge scores for ranking.
  - ema_momentum (float) – Momentum for running mean/variance updates during standardization.
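To make the aggregators, standardize, and score_mode options more concrete, here is a small pure-Python sketch of what each could compute. Every name below (reduce_channels, EmaStandardizer, edge_score) is hypothetical, and several details are assumptions rather than documented behavior: that absolute values are taken before 'sum', 'max', and 'mean'; that the EMA update follows the convention new = (1 - momentum) * old + momentum * batch; and that 'abs', 'relu', and 'signed' map to |w|, max(w, 0), and w.

```python
import math


def reduce_channels(values, aggregator):
    """Reduce one function's channel activations to a single magnitude.

    Assumption: absolute values are taken before 'sum', 'max' and 'mean'.
    """
    mags = [abs(v) for v in values]
    if aggregator == "sum":
        return sum(mags)
    if aggregator == "max":
        return max(mags)
    if aggregator == "mean":
        return sum(mags) / len(mags)
    if aggregator == "l2":
        return math.sqrt(sum(v * v for v in values))
    raise ValueError(f"unknown aggregator: {aggregator!r}")


class EmaStandardizer:
    """Running z-score for one (pair, aggregator) feature stream."""

    def __init__(self, momentum=0.1, eps=1e-8):
        self.momentum = momentum
        self.eps = eps
        self.mean = 0.0
        self.var = 1.0

    def update(self, batch):
        """Blend the batch mean and variance into the running statistics."""
        m = sum(batch) / len(batch)
        v = sum((x - m) ** 2 for x in batch) / len(batch)
        # Assumed convention: new = (1 - momentum) * old + momentum * batch.
        self.mean = (1 - self.momentum) * self.mean + self.momentum * m
        self.var = (1 - self.momentum) * self.var + self.momentum * v

    def transform(self, batch):
        """Z-score a batch with the current running statistics."""
        std = math.sqrt(self.var + self.eps)
        return [(x - self.mean) / std for x in batch]


def edge_score(w_ij, score_mode="abs"):
    """Convert one entry of W into a ranking score (assumed semantics)."""
    if score_mode == "abs":
        return abs(w_ij)
    if score_mode == "relu":
        return max(w_ij, 0.0)
    if score_mode == "signed":
        return w_ij
    raise ValueError(f"unknown score_mode: {score_mode!r}")


channels = [3.0, -4.0]
print({a: reduce_channels(channels, a) for a in ("sum", "max", "mean", "l2")})
print([edge_score(-2.0, m) for m in ("abs", "relu", "signed")])
```

Under these assumptions, 'relu' ranks only positive associations, while 'abs' treats strongly negative weights as evidence of an edge as well.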
- aux_step() → dict[str, float]
Build features from cached activations/grads, update W, and return metrics.
Must be called after loss.backward() so that activation gradients exist. Features are detached, so no gradient flows into the GSNN from this step.
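The kind of update that aux_step performs on W can be sketched in pure Python as below. This is not the library's implementation: the function regression_step is hypothetical, a mean-squared-error objective is assumed (the source does not name the loss), and plain gradient descent stands in for AdamW. The inputs x and y are treated as fixed numbers, mirroring the fact that the real features are detached.

```python
def regression_step(w, x, y, lr=0.1, ridge=0.0):
    """One in-place gradient-descent step on W; returns the pre-update loss.

    w: N x N weights, x: B x N source features, y: B x N target magnitudes.
    Assumed loss: mean over samples of squared error, plus ridge * sum(W**2).
    """
    n, b = len(w), len(x)
    # Residuals of the linear prediction Y_hat = X @ W.
    resid = [
        [sum(w[i][j] * x[s][i] for i in range(n)) - y[s][j] for j in range(n)]
        for s in range(b)
    ]
    mse = sum(r * r for row in resid for r in row) / b
    penalty = ridge * sum(v * v for row in w for v in row)
    for i in range(n):
        for j in range(n):
            grad = 2.0 / b * sum(x[s][i] * resid[s][j] for s in range(b))
            grad += 2.0 * ridge * w[i][j]
            w[i][j] -= lr * grad
    return {"loss": mse + penalty}


# Toy data in which only source 0 drives target 1 (illustrative numbers).
x = [[1.0, 0.0], [0.0, 1.0]]
y = [[0.0, 1.0], [0.0, 0.0]]
w = [[0.0, 0.0], [0.0, 0.0]]
losses = [regression_step(w, x, y)["loss"] for _ in range(200)]
print(losses[0], losses[-1], w)
```

On this toy problem the weight w[0][1] approaches 1 while the other entries stay at 0, which is the pattern an edge from function 0 to function 1 would be expected to produce.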
- evaluate(*, exclude_self: bool = True) → pandas.DataFrame
Build an edge-score DataFrame from the current W.
Returns columns compatible with MagnitudeEdgeInferer.evaluate: src_func, dst_func, src_idx, dst_idx, score, has_edge, p_value, q_value.
- evaluate_against(positive_edges: set[tuple[str, str]] | list[tuple[str, str]], *, top_k: tuple[int, ...] = (1, 3, 5)) → dict[str, float]
Score held-out edges against non-edges using the current W.
Returns global ROC-AUC plus within-target MRR and top@k rates.
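The three metrics named above can be illustrated with a short pure-Python sketch. The helper names (roc_auc, target_ranking), the dictionary keys, and the tie handling are assumptions for this example, as is the choice to rank each held-out edge among all candidate sources of its target; the library's exact conventions may differ.

```python
def roc_auc(pos_scores, neg_scores):
    """Probability that a positive outranks a negative (ties count one half)."""
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            wins += 1.0 if p > q else 0.5 if p == q else 0.0
    return wins / (len(pos_scores) * len(neg_scores))


def target_ranking(scores, positive_edges, top_k=(1, 3, 5)):
    """Within-target MRR and top@k rates for held-out positive edges.

    scores: dict mapping (src, dst) -> score for every candidate edge.
    A positive's rank is 1 + the number of candidates for the same target
    with a strictly higher score (an optimistic tie rule, assumed here).
    """
    by_dst = {}
    for (src, dst), s in scores.items():
        by_dst.setdefault(dst, []).append(s)
    ranks = []
    for src, dst in set(positive_edges):
        own = scores[(src, dst)]
        ranks.append(1 + sum(s > own for s in by_dst[dst]))
    out = {"mrr": sum(1.0 / r for r in ranks) / len(ranks)}
    for k in top_k:
        out[f"top@{k}"] = sum(r <= k for r in ranks) / len(ranks)
    return out


# Illustrative scores for five candidate edges, two of them held-out positives.
scores = {("a", "c"): 0.9, ("b", "c"): 0.2,
          ("a", "d"): 0.1, ("b", "d"): 0.4, ("c", "d"): 0.7}
positives = {("a", "c"), ("b", "d")}
pos = [s for e, s in scores.items() if e in positives]
neg = [s for e, s in scores.items() if e not in positives]
metrics = {"auc": roc_auc(pos, neg), **target_ranking(scores, positives)}
print(metrics)
```

ROC-AUC compares all positives with all non-edges at once, whereas MRR and top@k ask a per-target question: among the candidate sources for a given target, how high does the true source rank?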
- static evaluate_target_ranking(res: pandas.DataFrame, positive_edges: set[tuple[str, str]] | list[tuple[str, str]], score_col: str = 'score', top_k: tuple[int, ...] = (1, 3, 5)) → tuple[pandas.DataFrame, dict[str, float]]
Delegate to MagnitudeEdgeInferer.evaluate_target_ranking.