gsnn.models.GroupRMSNorm

Group-wise Root Mean Square Layer Normalization.

RMSNorm is a simpler alternative to layer normalization: it divides the activations by their root mean square (RMS) and skips mean centering. Dropping the mean computation makes it computationally cheaper than layer norm, and it is particularly stable for small batch sizes.
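To make the difference from layer normalization concrete, the following dependency-free sketch applies both to a single feature vector. It is an illustration only, not this module's implementation: the helper names `rms_norm` and `layer_norm` are hypothetical, and placing `eps` inside the square root is an assumption.

```python
import math


def rms_norm(x, eps=1e-6):
    """Divide x by its root mean square; no mean is subtracted."""
    # Assumption: eps is added to the mean square before the square root.
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms for v in x]


def layer_norm(x, eps=1e-6):
    """Subtract the mean, then divide by the standard deviation."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    std = math.sqrt(var + eps)
    return [(v - mean) / std for v in x]


x = [1.0, 2.0, 3.0, 4.0]
rms_out = rms_norm(x)    # rescaled only: signs and ratios of x are preserved
ln_out = layer_norm(x)   # shifted to zero mean, then rescaled
```

After RMS normalization the mean square of the output is approximately 1, but its mean is generally nonzero. Layer normalization additionally forces the mean to zero, which costs an extra reduction over the features.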

Classes

GroupRMSNorm(*args, **kwargs)

Applies Root Mean Square normalization within each channel group.

class gsnn.models.GroupRMSNorm.GroupRMSNorm(*args: Any, **kwargs: Any)

Bases: Module

Applies Root Mean Square normalization within each channel group.

RMSNorm normalizes using only the RMS (root mean square), without mean centering. This makes it simpler than layer normalization and more stable, especially for small batch sizes.

Parameters:
  • channel_groups (list or tensor) – Group index of each channel, with one entry per channel. For example, [0,0,1,1,2,2] specifies 3 groups with 2 channels each.

  • eps (float) – Small value to avoid division by zero. Default: 1e-6

  • affine (bool) – If True, applies a learnable scale parameter. Default: True
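The parameters above can be read as the following arithmetic, sketched here in plain Python for one sample. This is a minimal illustration under stated assumptions, not the library's code: the function name `group_rms_norm` is hypothetical, `eps` is assumed to be added inside the square root, and `scale` stands in for the learnable parameter enabled by `affine`, assumed here to hold one value per channel.

```python
import math


def group_rms_norm(x, channel_groups, eps=1e-6, scale=None):
    """Normalize each channel by the RMS of the group it belongs to.

    x              -- list of channel values for one sample
    channel_groups -- group index of each channel, same length as x
    scale          -- optional per-channel multipliers (hypothetical stand-in
                      for the learnable scale used when affine=True)
    """
    if len(x) != len(channel_groups):
        raise ValueError("x and channel_groups must have the same length")

    # Accumulate the sum of squares and the channel count for every group.
    sq_sum, count = {}, {}
    for v, g in zip(x, channel_groups):
        sq_sum[g] = sq_sum.get(g, 0.0) + v * v
        count[g] = count.get(g, 0) + 1

    # Assumption: eps is added to the mean square before the square root.
    rms = {g: math.sqrt(sq_sum[g] / count[g] + eps) for g in sq_sum}

    out = [v / rms[g] for v, g in zip(x, channel_groups)]
    if scale is not None:
        out = [o * s for o, s in zip(out, scale)]
    return out


channel_groups = [0, 0, 1, 1, 2, 2]      # 3 groups with 2 channels each
x = [3.0, 4.0, 0.5, 0.5, -10.0, 10.0]    # groups differ widely in magnitude
y = group_rms_norm(x, channel_groups)
y_scaled = group_rms_norm(x, channel_groups, scale=[2.0] * 6)
```

Each group is rescaled independently, so a group with large activations (here group 2) does not shrink the normalized values of a group with small ones (group 1). Signs are preserved because no mean is subtracted.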

forward(x)