Skip to content

igl.spd

Riemannian extension for SPD-valued data. Not flattened into the top-level igl namespace — import explicitly::

from igl.spd import AIRMLoss, IGLReconSPDClassifier, LogEigVectorizer

Linear algebra

igl.spd.linalg.matrix_log_sym(c, *, eps=1e-08)

Batched matrix logarithm of SPD matrices via eigendecomposition.

Parameters:

Name Type Description Default
c Tensor

[B, d, d] SPD matrices.

required
eps float

Eigenvalue clamp for numerical safety (default 1e-8).

1e-08

Returns:

Type Description
Tensor

[B, d, d] symmetric matrices (log of input).

igl.spd.linalg.matrix_exp_sym(s)

Batched matrix exponential of symmetric matrices via eigendecomposition.

For a real symmetric S = U Λ U^T, exp(S) = U diag(exp Λ) U^T.

Parameters:

Name Type Description Default
s Tensor

[B, d, d] symmetric matrices.

required

Returns:

Type Description
Tensor

[B, d, d] symmetric positive-definite matrices.

igl.spd.linalg.matrix_pow_sym(c, p, *, eps=1e-08)

Batched real-power of SPD matrices via eigendecomposition.

For SPD C and real p, C^p = U diag(Λ^p) U^T.

Parameters:

Name Type Description Default
c Tensor

[B, d, d] SPD matrices.

required
p float

Real exponent (e.g. -0.5 for matrix-inverse-square-root).

required
eps float

Eigenvalue clamp.

1e-08

Returns:

Type Description
Tensor

[B, d, d] SPD matrices.

igl.spd.linalg.unpack_sym_vec(vec, d)

Invert :class:igl.spd.LogEigVectorizer's upper-triangle packing.

Parameters:

Name Type Description Default
vec Tensor

[B, D] where D = d * (d + 1) / 2 — the upper triangle of a symmetric matrix in row-major order, with off-diagonal entries scaled by √2 (so the Frobenius norm in vector form matches the matrix Frobenius norm).

required
d int

Side length of the resulting symmetric matrices.

required

Returns:

Type Description
Tensor

[B, d, d] symmetric matrices.

Raises:

Type Description
IGLConfigError

If vec.shape[-1] != d * (d + 1) / 2.

Log-Eig vectorizer

igl.spd.log_eig.LogEigVectorizer

Bases: BaseEstimator, TransformerMixin

Vectorize SPD matrices via the log-Euclidean embedding.

fit(X) records the matrix size from X.shape[1] and caches the upper-triangle indices + √2 scaling. transform(X) returns one row per input SPD matrix.

Parameters:

Name Type Description Default
eps float

Eigenvalue clamp for the matrix-log step (default 1e-8).

1e-08

fit(x, y=None)

Cache the matrix size and packing indices from X.

transform(x)

Apply S → vec(log(S)) with √2 off-diagonal scaling.

Parameters:

Name Type Description Default
x NDArray[floating]

[B, n, n] SPD matrices.

required

Returns:

Type Description
NDArray[floating]

[B, n * (n + 1) / 2] log-Eig vectors.

AIRM loss

igl.spd.airm.airm_loss(c, c_hat, *, eps=1e-08, reduction='mean')

Affine-Invariant Riemannian Metric² between batched SPD matrices.

Parameters:

Name Type Description Default
c Tensor

[B, d, d] ground-truth SPD matrices.

required
c_hat Tensor

[B, d, d] predicted SPD matrices.

required
eps float

Eigenvalue clamp forwarded to matrix log / power routines.

1e-08
reduction str

One of "mean" (default), "sum", "none". "none" returns the per-sample AIRM² as a [B] tensor.

'mean'

Returns:

Type Description
Tensor

Scalar (mean/sum) or per-sample AIRM² values.

Raises:

Type Description
IGLConfigError

For an unknown reduction value.

igl.spd.airm.AIRMLoss

:class:igl.LossStrategy for SPD reconstruction via the AIRM metric.

The trainer feeds the strategy:

  • y: log-Eig vectors of shape [B, D] where D = d (d + 1) / 2.
  • pred: lstsq-fit predictions of the same shape — interpreted as log-Eig vectors of predicted SPD matrices.

:meth:target returns y unchanged (the lstsq operates in log-Euclidean tangent space, where Euclidean lstsq is geometry-respecting). :meth:loss lifts both vectors back to SPD via unpack_sym_vec → matrix_exp_sym and computes AIRM². :meth:metric is the same AIRM² value.

Parameters:

Name Type Description Default
latent_dim int

Side length d of the underlying SPD matrices.

required
eps float

Eigenvalue clamp for matrix log/exp/power.

1e-08
jitter float

Tikhonov-style ridge added on both sides of the AIRM call (jitter * I to the predicted SPD after matrix_exp_sym and to the target SPD). Real EEG covariances often span 6+ orders of magnitude on their eigenvalue spectrum; without a small ridge the eigh backward pass produces NaN gradients on ill-conditioned batches. Default 1e-5 matches the reference trainer's discipline. Set 0.0 to disable.

1e-05
covs Tensor | None

Optional [N, d, d] raw SPD targets aligned with the trainer's training tensor (x_train). When set together with a trainer reference, :meth:loss slices covs by the trainer's :attr:current_batch_indices and uses the raw covariance as the AIRM target instead of round-tripping the log-Eig vector through unpack_sym_vec → matrix_exp_sym. This avoids ~1e-6 round-trip noise per element that compounds to measurable AUC drift over 1000 epochs.

None
trainer MatryoshkaTrainer | None

The :class:MatryoshkaTrainer instance the loss is attached to. Required when covs is set. Stored as a back-reference so the loss can read trainer.current_batch_indices per batch.

None

Attributes:

Name Type Description
higher_is_better bool

Always False — AIRM² is a distance.

curve_score(pred, target)

Dimension-curve score = AIRM² (already lower-is-better and not saturating).

loss(pred, target)

AIRM² between the predicted SPD and the target SPD.

When :attr:covs and :attr:trainer are set AND the trainer has published a current_batch_indices tensor (i.e. we are inside the trainer's training batch loop), the AIRM target is covs[indices] + jitter * I (avoiding the log-Eig → matrix-exp round-trip). Otherwise — validation pass, dimension-curve eval, or any external caller — the target is reconstructed from target via unpack_sym_vec. The training-time path is the bit-exact one; the val/eval path is the original round-trip (its small numerical noise only affects monitoring, not gradients).

On the predicted side, jitter is added to the symmetric matrix before matrix_exp_sym (see :meth:_pred_to_spd).

target(y)

Orthogonality penalty

igl.spd.orthogonality.OrthogonalityPenalty

:class:igl.types.ExtraLoss driving the pullback metric toward diagonality.

The contribution is

weight × ‖off-diag(g)‖² / (‖diag(g)‖² + ε)

where g = J J^T and J is the encoder Jacobian at the truncated output. Skipped for k < 2 (the metric is trivially diagonal for a single dimension).

Parameters:

Name Type Description Default
weight float

Multiplier applied by the trainer. 0.1 is the value validated on EEG data.

0.1
every int

Call frequency in batches. 20 is a reasonable default — the Jacobian computation is O(d · D) so applying it on every batch is expensive.

20
eps float

Diagonal-squared floor for numerical stability.

1e-06

igl.spd.orthogonality.jacobian(encoder, x, *, output_dim)

Compute J[b, j, i] = ∂encoder(x_b)_j / ∂x_b_i keeping the autograd graph.

Used to drive the orthogonality penalty: the resulting Jacobian gradient flows back to the encoder parameters so the penalty actually updates them.

Parameters:

Name Type Description Default
encoder Module

The encoder module.

required
x Tensor

Input batch [B, D].

required
output_dim int

Number of output dimensions d of the encoder.

required

Returns:

Type Description
Tensor

Jacobian tensor of shape [B, d, D].

igl.spd.orthogonality.pullback_metric(j)

Pullback metric g = J J^T of shape [B, d, d].

Parameters:

Name Type Description Default
j Tensor

Jacobian tensor [B, d, D].

required

igl.spd.orthogonality.orthogonality_loss(g, *, eps=1e-06)

Sum of squared off-diagonals divided by sum of squared diagonals.

Parameters:

Name Type Description Default
g Tensor

Pullback metric [B, d, d].

required
eps float

Numerical floor on the diagonal-squared denominator.

1e-06

Returns:

Type Description
Tensor

Scalar penalty (mean across the batch).

igl.spd.orthogonality.init_encoder_orthogonal_(encoder)

Initialize every nn.Linear weight in encoder with QR-orthogonal init.

Operates in-place. Returns the number of linear layers initialised.

Preconditioning

igl.spd.preconditioning.precondition(c, mode=PreconditionMode.TIKHONOV, epsilon=1e-06)

Apply SPD-side preconditioning to a batch of covariance matrices.

Parameters:

Name Type Description Default
c Tensor

[..., d, d] batched SPD matrices. The last two dims must be the matrix dimensions; any leading dims are batched.

required
mode PreconditionModeLike

Which preconditioner to apply. Strings are accepted via the :class:PreconditionModeLike alias and coerced through the enum at entry.

TIKHONOV
epsilon float

Ridge magnitude for the Tikhonov branches. Ignored for NONE / TRACE. Default 1e-6 is the value characterised in the memo as universally safe.

1e-06

Returns:

Type Description
Tensor

[..., d, d] preconditioned SPD matrices, on the same device

Tensor

and dtype as c.

Raises:

Type Description
IGLConfigError

For an unknown mode.

Reconstruction classifier

igl.spd.reconstruction.IGLReconSPDClassifier

Bases: BaseEstimator, ClassifierMixin

Two-stage SPD classifier: AIRM-reconstruction encoder + sklearn readout.

Inputs are flat log-Eig vectors (use :class:igl.spd.LogEigVectorizer to obtain them from SPD matrices). The encoder reconstructs the input via AIRM, then a logistic-regression head learns y_label from the Green-kernel design matrix Φ.

Parameters:

Name Type Description Default
latent_dim int

Side length d of the underlying SPD matrices — controls the dimensionality of the reconstruction target D = d * (d + 1) / 2.

required
max_dim int

Matryoshka maximum latent dimension for the IGL encoder.

16
n_anchors int

Anchor count for the Green kernel.

64
n_scales int

Scale count for the Green kernel.

4
orthogonality_weight float

When > 0, an :class:igl.spd.OrthogonalityPenalty is added to stage-A training.

0.0
orthogonality_every int

Frequency (in batches) of the orthogonality penalty.

20
encoder_hidden int | tuple[int, ...] | None

Encoder hidden shorthand.

None
encoder_depth int | None

Encoder depth shorthand.

None
normalize_input bool

Whether the encoder wraps its input in a BatchNorm1d(affine=False). Default False — log-Eig tangent vectors have already been centred upstream, and a second input-BN would strip the per-feature scale the Green's kernel and the AIRM reconstruction depend on.

False
normalize NormalizeModeLike | None

Φ-normalisation mode. None (default) defers to config.kernel.normalize if a config is supplied, else the package default :data:igl.NormalizeMode.NW.

None
validation_fraction float

Fraction of the training tensor used for validation. The 80/20 split (the default) is performed via torch.randperm(N) so the RNG-consumption profile matches the EEG reference trainer's split bit-for-bit.

0.2
fork_rng bool

When True (default), :meth:fit runs inside torch.random.fork_rng() so the caller's global torch / numpy RNG state is preserved. Set False to mutate the global RNG instead — required for bit-exact reproduction of EEG headline numbers that were produced with the reference trainer's bare torch.manual_seed discipline.

True
precondition PreconditionModeLike

SPD-side preconditioning applied to every input covariance before AIRM. Default "tikhonov": bit-identical to "none" at d ≤ 64 (the encoder BatchNorm absorbs the constant offset) and rescues torch.linalg.eigh from LAPACK error 8481 at d ≥ 128. See :class:igl.PreconditionMode for the full mode catalogue and alex-eeg-igl/MAINTAINER_MEMO_lwf_tikh_rules.md for the empirical justification.

TIKHONOV
precondition_epsilon float

Ridge magnitude for the Tikhonov branches of precondition. Default 1e-6 is the value characterised in the memo as universally safe across the tested datasets.

1e-06
config IGLConfig | None

Optional :class:igl.IGLConfig. Explicit kwargs above override its values.

None
random_state int | None

Optional integer seed.

None

Attributes:

Name Type Description
classes_

Sorted unique training labels.

n_features_in_

Length of the flat log-Eig input vector.

module_

The IGLModule used by the encoder.

history_

:class:igl.TrainingHistory from stage A.

readout_

The fitted :class:sklearn.linear_model.LogisticRegression.

dimension_curve_, effective_dimension_

Same as the Euclidean wrappers — measured in AIRM-curve-score space.

precondition_mode_ effective_dimension_

The resolved :class:PreconditionMode actually applied at fit time (round-trips through pickle).

precondition_epsilon_ effective_dimension_

The ridge value actually used at fit time.

fit(x, y, *, covs=None)

Fit both stages.

Parameters:

Name Type Description Default
x NDArray[floating]

Either log-Eig vectors [N, d * (d + 1) / 2] (the original API) or raw SPD matrices [N, d, d]. When x is SPD-shaped, it is vectorized internally via :class:igl.spd.LogEigVectorizer and also reused as the AIRM target (covs defaults to x in that case). This makes the wrapper drop-in compatible with sklearn pipelines that emit SPDs upstream (e.g. :func:igl.make_igl_airm).

required
y NDArray[generic]

Integer class labels [N].

required
covs Tensor | None

Optional raw SPD targets [N, d, d] aligned with x. When provided, AIRM uses covs[batch_indices] + jitter * I directly as the reconstruction target, bypassing the log-Eig → matrix-exp round-trip and the ~1e-6 element-wise noise it introduces. Required for bit-exact reproduction of AIRM-recon results from the reference trainer. Ignored when x is SPD-shaped (then x itself is the target).

None

predict(x)

Predict class labels. Accepts log-Eig vectors or raw SPDs.

predict_proba(x)

Class probabilities via the readout's :meth:predict_proba.