`igl.spd`

Riemannian extension for SPD-valued data. Not flattened into the top-level igl namespace — import explicitly::

from igl.spd import AIRMLoss, IGLReconSPDClassifier, LogEigVectorizer

Linear algebra

`igl.spd.linalg.matrix_log_sym(c, *, eps=1e-08)`

Batched matrix logarithm of SPD matrices via eigendecomposition.

Parameters:

Name	Type	Description	Default
`c`	`Tensor`	`[B, d, d]` SPD matrices.	required
`eps`	`float`	Eigenvalue clamp for numerical safety (default `1e-8`).	`1e-08`

Returns:

Type	Description
`Tensor`	`[B, d, d]` symmetric matrices (log of input).

`igl.spd.linalg.matrix_exp_sym(s)`

Batched matrix exponential of symmetric matrices via eigendecomposition.

For a real symmetric S = U Λ U^T, exp(S) = U diag(exp Λ) U^T.

Parameters:

Name	Type	Description	Default
`s`	`Tensor`	`[B, d, d]` symmetric matrices.	required

Returns:

Type	Description
`Tensor`	`[B, d, d]` symmetric positive-definite matrices.

`igl.spd.linalg.matrix_pow_sym(c, p, *, eps=1e-08)`

Batched real-power of SPD matrices via eigendecomposition.

For SPD C and real p, C^p = U diag(Λ^p) U^T.

Parameters:

Name	Type	Description	Default
`c`	`Tensor`	`[B, d, d]` SPD matrices.	required
`p`	`float`	Real exponent (e.g. `-0.5` for matrix-inverse-square-root).	required
`eps`	`float`	Eigenvalue clamp.	`1e-08`

Returns:

Type	Description
`Tensor`	`[B, d, d]` SPD matrices.

`igl.spd.linalg.unpack_sym_vec(vec, d)`

Invert :class:igl.spd.LogEigVectorizer's upper-triangle packing.

Parameters:

Name	Type	Description	Default
`vec`	`Tensor`	`[B, D]` where `D = d * (d + 1) / 2` — the upper triangle of a symmetric matrix in row-major order, with off-diagonal entries scaled by √2 (so the Frobenius norm in vector form matches the matrix Frobenius norm).	required
`d`	`int`	Side length of the resulting symmetric matrices.	required

Returns:

Type	Description
`Tensor`	`[B, d, d]` symmetric matrices.

Raises:

Type	Description
`IGLConfigError`	If `vec.shape[-1] != d * (d + 1) / 2`.

Log-Eig vectorizer

`igl.spd.log_eig.LogEigVectorizer`

Bases: BaseEstimator, TransformerMixin

Vectorize SPD matrices via the log-Euclidean embedding.

fit(X) records the matrix size from X.shape[1] and caches the upper-triangle indices + √2 scaling. transform(X) returns one row per input SPD matrix.

Parameters:

Name	Type	Description	Default
`eps`	`float`	Eigenvalue clamp for the matrix-log step (default `1e-8`).	`1e-08`

`fit(x, y=None)`

Cache the matrix size and packing indices from X.

`transform(x)`

Apply S → vec(log(S)) with √2 off-diagonal scaling.

Parameters:

Name	Type	Description	Default
`x`	`NDArray[floating]`	`[B, n, n]` SPD matrices.	required

Returns:

Type	Description
`NDArray[floating]`	`[B, n * (n + 1) / 2]` log-Eig vectors.

AIRM loss

`igl.spd.airm.airm_loss(c, c_hat, *, eps=1e-08, reduction='mean', c_inv_half=None)`

Affine-Invariant Riemannian Metric² between batched SPD matrices.

Parameters:

Name	Type	Description	Default
`c`	`Tensor`	`[B, d, d]` ground-truth SPD matrices.	required
`c_hat`	`Tensor`	`[B, d, d]` predicted SPD matrices.	required
`eps`	`float`	Eigenvalue clamp forwarded to matrix log / power routines.	`1e-08`
`reduction`	`str`	One of `"mean"` (default), `"sum"`, `"none"`. `"none"` returns the per-sample AIRM² as a `[B]` tensor.	`'mean'`
`c_inv_half`	`Tensor \| None`	Optional precomputed `C^{-1/2}` aligned with `c`. When `c` is a constant (the data covariance), `C^{-1/2}` is constant too, so a caller may hoist this eigendecomposition out of the training loop and pass it here — bit-identical to computing it inline (per-matrix `eigh` is independent of batching), but skips one of the three per-batch `eigh` calls.	`None`

Returns:

Type	Description
`Tensor`	Scalar (mean/sum) or per-sample AIRM² values.

Raises:

Type	Description
`IGLConfigError`	For an unknown `reduction` value.

`igl.spd.airm.AIRMLoss`

:class:igl.LossStrategy for SPD reconstruction via the AIRM metric.

The trainer feeds the strategy:

y: log-Eig vectors of shape [B, D] where D = d (d + 1) / 2.
pred: lstsq-fit predictions of the same shape — interpreted as log-Eig vectors of predicted SPD matrices.

:meth:target returns y unchanged (the lstsq operates in log-Euclidean tangent space, where Euclidean lstsq is geometry-respecting). :meth:loss lifts both vectors back to SPD via unpack_sym_vec → matrix_exp_sym and computes AIRM². :meth:metric is the same AIRM² value.

Parameters:

Name	Type	Description	Default
`latent_dim`	`int`	Side length `d` of the underlying SPD matrices.	required
`eps`	`float`	Eigenvalue clamp for matrix log/exp/power.	`1e-08`
`jitter`	`float`	Tikhonov-style ridge added on both sides of the AIRM call (`jitter * I` to the predicted SPD after `matrix_exp_sym` and to the target SPD). Real EEG covariances often span 6+ orders of magnitude on their eigenvalue spectrum; without a small ridge the `eigh` backward pass produces NaN gradients on ill-conditioned batches. Default `1e-5` matches the reference trainer's discipline. Set `0.0` to disable.	`1e-05`
`covs`	`Tensor \| None`	Optional `[N, d, d]` raw SPD targets aligned with the trainer's training tensor (`x_train`). When set together with a `trainer` reference, :meth:`loss` slices `covs` by the trainer's :attr:`current_batch_indices` and uses the raw covariance as the AIRM target instead of round-tripping the log-Eig vector through `unpack_sym_vec → matrix_exp_sym`. This avoids ~1e-6 round-trip noise per element that compounds to measurable AUC drift over 1000 epochs.	`None`
`trainer`	`MatryoshkaTrainer \| None`	The :class:`MatryoshkaTrainer` instance the loss is attached to. Required when `covs` is set. Stored as a back-reference so the loss can read `trainer.current_batch_indices` per batch.	`None`

Attributes:

Name	Type	Description
`higher_is_better`	`bool`	Always `False` — AIRM² is a distance.

`curve_score(pred, target)`

Dimension-curve score = AIRM² (already lower-is-better and not saturating).

`loss(pred, target)`

AIRM² between the predicted SPD and the target SPD.

When :attr:covs and :attr:trainer are set AND the trainer has published a current_batch_indices tensor (i.e. we are inside the trainer's training batch loop), the AIRM target is covs[indices] + jitter * I (avoiding the log-Eig → matrix-exp round-trip). Otherwise — validation pass, dimension-curve eval, or any external caller — the target is reconstructed from target via unpack_sym_vec. The training-time path is the bit-exact one; the val/eval path is the original round-trip (its small numerical noise only affects monitoring, not gradients).

On the predicted side, jitter is added to the symmetric matrix before matrix_exp_sym (see :meth:_pred_to_spd).

`target(y)`

Orthogonality penalty

`igl.spd.orthogonality.OrthogonalityPenalty`

:class:igl.types.ExtraLoss driving the pullback metric toward diagonality.

The contribution is

weight × ‖off-diag(g)‖² / (‖diag(g)‖² + ε)

where g = J J^T and J is the encoder Jacobian at the truncated output. Skipped for k < 2 (the metric is trivially diagonal for a single dimension).

Parameters:

Name	Type	Description	Default
`weight`	`float`	Multiplier applied by the trainer. `0.1` is the value validated on EEG data.	`0.1`
`every`	`int`	Call frequency in batches. `20` is a reasonable default — the Jacobian computation is O(d · D) so applying it on every batch is expensive.	`20`
`eps`	`float`	Diagonal-squared floor for numerical stability.	`1e-06`

`igl.spd.orthogonality.jacobian(encoder, x, *, output_dim)`

Compute J[b, j, i] = ∂encoder(x_b)_j / ∂x_b_i keeping the autograd graph.

Used to drive the orthogonality penalty: the resulting Jacobian gradient flows back to the encoder parameters so the penalty actually updates them.

Parameters:

Name	Type	Description	Default
`encoder`	`Module`	The encoder module.	required
`x`	`Tensor`	Input batch `[B, D]`.	required
`output_dim`	`int`	Number of output dimensions `d` of the encoder.	required

Returns:

Type	Description
`Tensor`	Jacobian tensor of shape `[B, d, D]`.

`igl.spd.orthogonality.pullback_metric(j)`

Pullback metric g = J J^T of shape [B, d, d].

Parameters:

Name	Type	Description	Default
`j`	`Tensor`	Jacobian tensor `[B, d, D]`.	required

`igl.spd.orthogonality.orthogonality_loss(g, *, eps=1e-06)`

Sum of squared off-diagonals divided by sum of squared diagonals.

Parameters:

Name	Type	Description	Default
`g`	`Tensor`	Pullback metric `[B, d, d]`.	required
`eps`	`float`	Numerical floor on the diagonal-squared denominator.	`1e-06`

Returns:

Type	Description
`Tensor`	Scalar penalty (mean across the batch).

`igl.spd.orthogonality.init_encoder_orthogonal_(encoder)`

Initialize every nn.Linear weight in encoder with QR-orthogonal init.

Operates in-place. Returns the number of linear layers initialised.

Preconditioning

`igl.spd.preconditioning.precondition(c, mode=PreconditionMode.TIKHONOV, epsilon=1e-06)`

Apply SPD-side preconditioning to a batch of covariance matrices.

Parameters:

Name	Type	Description	Default
`c`	`Tensor`	`[..., d, d]` batched SPD matrices. The last two dims must be the matrix dimensions; any leading dims are batched.	required
`mode`	`PreconditionModeLike`	Which preconditioner to apply. Strings are accepted via the :class:`PreconditionModeLike` alias and coerced through the enum at entry.	`TIKHONOV`
`epsilon`	`float`	Ridge magnitude for the Tikhonov branches. Ignored for `NONE` / `TRACE`. Default `1e-6` is the value characterised in the memo as universally safe.	`1e-06`

Returns:

Type	Description
`Tensor`	`[..., d, d]` preconditioned SPD matrices, on the same device
`Tensor`	and dtype as `c`.

Raises:

Type	Description
`IGLConfigError`	For an unknown `mode`.

Reconstruction classifier

`igl.spd.reconstruction.IGLReconSPDClassifier`

Bases: BaseEstimator, ClassifierMixin

Two-stage SPD classifier: AIRM-reconstruction encoder + sklearn readout.

Inputs are flat log-Eig vectors (use :class:igl.spd.LogEigVectorizer to obtain them from SPD matrices). The encoder reconstructs the input via AIRM, then a logistic-regression head learns y_label from the Green-kernel design matrix Φ.

Parameters:

Name	Type	Description	Default
`latent_dim`	`int`	Side length `d` of the underlying SPD matrices — controls the dimensionality of the reconstruction target `D = d * (d + 1) / 2`.	required
`max_dim`	`int`	Matryoshka maximum latent dimension for the IGL encoder. When a `config` is supplied, its `max_dim` must either match this value or be left at its default (which then inherits it).	`16`
`n_anchors`	`int \| None`	Anchor count for the Green kernel. `None` (default) defers to `config.kernel.n_anchors` (default `64`).	`None`
`n_scales`	`int \| None`	Scale count for the Green kernel. `None` (default) defers to `config.kernel.n_scales` (default `4`).	`None`
`orthogonality_weight`	`float`	When `> 0`, an :class:`igl.spd.OrthogonalityPenalty` is added to stage-A training.	`0.0`
`orthogonality_every`	`int`	Frequency (in batches) of the orthogonality penalty.	`20`
`encoder_hidden`	`int \| tuple[int, ...] \| None`	Encoder `hidden` shorthand.	`None`
`encoder_depth`	`int \| None`	Encoder depth shorthand.	`None`
`normalize_input`	`bool`	Whether the encoder wraps its input in a `BatchNorm1d(affine=False)`. Default `False` — log-Eig tangent vectors have already been centred upstream, and a second input-BN would strip the per-feature scale the Green's kernel and the AIRM reconstruction depend on.	`False`
`normalize`	`NormalizeModeLike \| None`	Φ-normalisation mode. `None` (default) defers to `config.kernel.normalize` if a `config` is supplied, else the package default :data:`igl.NormalizeMode.NW`.	`None`
`validation_fraction`	`float`	Fraction of the training tensor used for validation. The 80/20 split (the default) is performed via `torch.randperm(N)` so the RNG-consumption profile matches the EEG reference trainer's split bit-for-bit.	`0.2`
`fork_rng`	`bool`	When `True` (default), :meth:`fit` runs inside `torch.random.fork_rng()` so the caller's global torch / numpy RNG state is preserved. Set `False` to mutate the global RNG instead — required for bit-exact reproduction of EEG headline numbers that were produced with the reference trainer's bare `torch.manual_seed` discipline.	`True`
`precondition`	`PreconditionModeLike`	SPD-side preconditioning applied to every input covariance before AIRM. Default `"tikhonov"`: bit-identical to `"none"` at `d ≤ 64` (the encoder BatchNorm absorbs the constant offset) and rescues `torch.linalg.eigh` from LAPACK error 8481 at `d ≥ 128`. See :class:`igl.PreconditionMode` for the full mode catalogue and `alex-eeg-igl/MAINTAINER_MEMO_lwf_tikh_rules.md` for the empirical justification.	`TIKHONOV`
`precondition_epsilon`	`float`	Ridge magnitude for the Tikhonov branches of `precondition`. Default `1e-6` is the value characterised in the memo as universally safe across the tested datasets.	`1e-06`
`config`	`IGLConfig \| None`	Optional :class:`igl.IGLConfig`. Its `kernel` block is forwarded to the Green kernel, so `null_space`, `polynomial_degree`, `sigma_log_range` and `anchor_init_std` all take effect here. Explicit kwargs above override its values. `config.max_dim` must either match `max_dim` or be left at its default.	`None`
`random_state`	`int \| None`	Optional integer seed.	`None`

Attributes:

Name	Type	Description
`classes_`		Sorted unique training labels.
`n_features_in_`		Length of the flat log-Eig input vector.
`module_`		The IGLModule used by the encoder.
`history_`		:class:`igl.TrainingHistory` from stage A.
`readout_`		The fitted :class:`sklearn.linear_model.LogisticRegression`.
`dimension_curve_,`	`effective_dimension_`	Same as the Euclidean wrappers — measured in AIRM-curve-score space.
`precondition_mode_`	`effective_dimension_`	The resolved :class:`PreconditionMode` actually applied at fit time (round-trips through pickle).
`precondition_epsilon_`	`effective_dimension_`	The ridge value actually used at fit time.

`fit(x, y, *, covs=None)`

Fit both stages.

Parameters:

Name	Type	Description	Default
`x`	`NDArray[floating]`	Either log-Eig vectors `[N, d * (d + 1) / 2]` (the original API) or raw SPD matrices `[N, d, d]`. When `x` is SPD-shaped, it is vectorized internally via :class:`igl.spd.LogEigVectorizer` and also reused as the AIRM target (`covs` defaults to `x` in that case). This makes the wrapper drop-in compatible with sklearn pipelines that emit SPDs upstream (e.g. :func:`igl.make_igl_airm`).	required
`y`	`NDArray[generic]`	Integer class labels `[N]`.	required
`covs`	`Tensor \| None`	Optional raw SPD targets `[N, d, d]` aligned with `x`. When provided, AIRM uses `covs[batch_indices] + jitter * I` directly as the reconstruction target, bypassing the log-Eig → matrix-exp round-trip and the ~1e-6 element-wise noise it introduces. Required for bit-exact reproduction of AIRM-recon results from the reference trainer. Ignored when `x` is SPD-shaped (then `x` itself is the target).	`None`

`predict(x)`

Predict class labels. Accepts log-Eig vectors or raw SPDs.

`predict_proba(x)`

Class probabilities via the readout's :meth:predict_proba.