
igl.data

Synthetic data generators for tests and examples.

igl.data.synthetic.make_flat_torus(n_samples, *, noise=0.0, seed=None)

Flat torus embedded in R⁴ via (cos θ₁, sin θ₁, cos θ₂, sin θ₂).

Intrinsic dimension: 2 (the two angles).

Parameters:

Name Type Description Default
n_samples int

Number of points to sample.

required
noise float

Additive Gaussian noise std in the ambient R⁴.

0.0
seed int | None

Optional RNG seed (does not affect the global RNG state).

None

Returns:

Type Description
Tensor

X, the first element of the returned (X, theta) pair: [n_samples, 4] ambient coordinates.

Tensor

theta, the second element: [n_samples, 2] intrinsic angles in [0, 2π).
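The embedding described above can be sketched in a few lines. This is a minimal illustration that uses only the Python standard library and returns plain lists rather than tensors; the name `flat_torus_sketch` is hypothetical, and uniform sampling of the angles is an assumption, not something the reference states.

```python
import math
import random


def flat_torus_sketch(n_samples, noise=0.0, seed=None):
    """Illustrative stand-in for a flat-torus generator (not the library code)."""
    rng = random.Random(seed)  # local RNG, so the global random state is untouched
    points, angles = [], []
    for _ in range(n_samples):
        # Assumption: both angles are drawn uniformly from [0, 2*pi).
        t1 = 2.0 * math.pi * rng.random()
        t2 = 2.0 * math.pi * rng.random()
        x = [math.cos(t1), math.sin(t1), math.cos(t2), math.sin(t2)]
        # Additive Gaussian noise in the ambient R^4 coordinates.
        x = [c + rng.gauss(0.0, noise) for c in x]
        points.append(x)
        angles.append((t1, t2))
    return points, angles


X, theta = flat_torus_sketch(5, seed=0)
```

With `noise=0.0`, each pair of coordinates lies on a unit circle, so every point has squared norm 2 in R⁴.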

igl.data.synthetic.make_flat_torus_labels(theta, *, task='regression_smooth')

Build labels for a flat-torus task from the intrinsic angles.

Parameters:

Name Type Description Default
theta Tensor

[N, 2] intrinsic coordinates from make_flat_torus.

required
task str

One of:

  • "regression_smooth" (default): sin/cos of both angles, shape [N, 4]. No 0/2π seam discontinuity.
  • "hemisphere": binary classification on θ₁ > π.
  • "xor": binary XOR of the two quadrant indicators.
'regression_smooth'

Returns:

Type Description
Tensor

Tensor of labels (float for regression, long for classification).

Raises:

Type Description
IGLConfigError

For an unknown task.
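The three tasks can be illustrated with a short standard-library sketch. The function name `torus_labels_sketch` is hypothetical, the column order of the regression targets is an assumption, and the "xor" branch assumes the two indicators are θ₁ > π and θ₂ > π, which the reference does not spell out. The sketch raises `ValueError` where the library raises `IGLConfigError`.

```python
import math


def torus_labels_sketch(theta, task="regression_smooth"):
    """Illustrative label builder for (theta1, theta2) pairs (not the library code)."""
    if task == "regression_smooth":
        # sin/cos targets are continuous across the 0 / 2*pi seam.
        # Assumption: columns are ordered (sin t1, cos t1, sin t2, cos t2).
        return [[math.sin(t1), math.cos(t1), math.sin(t2), math.cos(t2)]
                for t1, t2 in theta]
    if task == "hemisphere":
        return [int(t1 > math.pi) for t1, _ in theta]
    if task == "xor":
        # Assumption: the indicators are (t1 > pi) and (t2 > pi).
        return [int((t1 > math.pi) != (t2 > math.pi)) for t1, t2 in theta]
    raise ValueError(f"unknown task: {task!r}")


theta_demo = [(0.5, 4.0), (4.0, 4.0)]
hemisphere = torus_labels_sketch(theta_demo, task="hemisphere")
xor = torus_labels_sketch(theta_demo, task="xor")
```

Two angles on either side of the seam, such as 0 and 2π − 10⁻⁹, get nearly identical regression targets, which is the point of the sin/cos encoding.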

igl.data.synthetic.make_swiss_roll(n_samples, *, noise=0.0, seed=None)

Swiss roll in R³: x(t, h) = (t cos t, h, t sin t).

Intrinsic dimension: 2 (t and h).

Parameters:

Name Type Description Default
n_samples int

Number of points to sample.

required
noise float

Additive Gaussian noise std in ambient space.

0.0
seed int | None

Optional RNG seed.

None

Returns:

Type Description
Tensor

X, the first element of the returned (X, params) pair: [n_samples, 3] ambient coordinates.

Tensor

params, the second element: [n_samples, 2] intrinsic (t, h) coordinates.
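A minimal standard-library sketch of the parameterisation x(t, h) = (t cos t, h, t sin t) follows. The name `swiss_roll_sketch` is hypothetical, and the sampling ranges for t and h are assumptions borrowed from the common scikit-learn convention; the reference does not state which ranges the library uses.

```python
import math
import random


def swiss_roll_sketch(n_samples, noise=0.0, seed=None):
    """Illustrative swiss-roll sampler (not the library code)."""
    rng = random.Random(seed)
    points, params = [], []
    for _ in range(n_samples):
        # Assumed ranges: t in [1.5*pi, 4.5*pi), h in [0, 21).
        t = 1.5 * math.pi * (1.0 + 2.0 * rng.random())
        h = 21.0 * rng.random()
        x = [t * math.cos(t), h, t * math.sin(t)]
        # Additive Gaussian noise in the ambient R^3 coordinates.
        x = [c + rng.gauss(0.0, noise) for c in x]
        points.append(x)
        params.append((t, h))
    return points, params


X_roll, params_roll = swiss_roll_sketch(8, seed=1)
```

Without noise, the first and third coordinates satisfy x₁² + x₃² = t², so the radius in that plane recovers the intrinsic coordinate t.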

igl.data.synthetic.make_moons(n_samples, *, noise=0.1, seed=None)

Two interleaving half-moons in R².

Parameters:

Name Type Description Default
n_samples int

Total number of points (split evenly between moons).

required
noise float

Additive Gaussian noise std.

0.1
seed int | None

Optional RNG seed.

None

Returns:

Type Description
Tensor

X, the first element of the returned (X, y) pair: [n_samples, 2] points.

Tensor

y, the second element: [n_samples] binary class labels (long dtype).
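One common way to build two interleaving half-moons is sketched below with the standard library only. The name `moons_sketch` is hypothetical, and the exact geometry (unit half-circles, the second shifted by (1, 0.5) and flipped) is an assumption modelled on the widely used scikit-learn construction, not a statement about this library's internals.

```python
import math
import random


def moons_sketch(n_samples, noise=0.1, seed=None):
    """Illustrative two-moons sampler (not the library code)."""
    rng = random.Random(seed)
    n_upper = n_samples // 2  # the remaining points go to the lower moon
    points, labels = [], []
    for i in range(n_samples):
        phi = math.pi * rng.random()  # angle along a half-circle
        if i < n_upper:
            x, y, label = math.cos(phi), math.sin(phi), 0
        else:
            # Lower moon: flipped half-circle centred at (1, 0.5).
            x, y, label = 1.0 - math.cos(phi), 0.5 - math.sin(phi), 1
        points.append([x + rng.gauss(0.0, noise), y + rng.gauss(0.0, noise)])
        labels.append(label)
    return points, labels


X_moons, y_moons = moons_sketch(10, noise=0.0, seed=2)
```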

igl.data.synthetic.embed_in_high_dim(x, *, target_dim, seed=None)

Embed low-dimensional points in a target_dim-dimensional ambient space by padding them to target_dim columns and then applying a random orthogonal rotation.

Parameters:

Name Type Description Default
x Tensor

Low-D data [N, d].

required
target_dim int

Ambient dimension D. Must satisfy D >= d.

required
seed int | None

Optional RNG seed.

None

Returns:

Type Description
Tensor

[N, D] embedded points.

Raises:

Type Description
IGLConfigError

If target_dim < x.shape[1].
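The padding-plus-rotation idea can be sketched without any dependencies. The names `random_orthogonal` and `embed_sketch` are hypothetical, zero-padding is an assumption, and Gram-Schmidt on Gaussian vectors stands in for whatever method the library uses to draw its orthogonal matrix. The sketch raises `ValueError` where the library raises `IGLConfigError`.

```python
import math
import random


def random_orthogonal(dim, rng):
    """Orthonormalise Gaussian vectors with Gram-Schmidt; rows are orthonormal."""
    basis = []
    while len(basis) < dim:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for b in basis:
            dot = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - dot * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > 1e-8:  # resample in the very unlikely degenerate case
            basis.append([vi / norm for vi in v])
    return basis


def embed_sketch(x, target_dim, seed=None):
    """Illustrative embedding: zero-pad to target_dim, then apply Q (not the library code)."""
    d = len(x[0])
    if target_dim < d:
        raise ValueError("target_dim must be >= the input dimension")
    q = random_orthogonal(target_dim, random.Random(seed))
    out = []
    for row in x:
        padded = list(row) + [0.0] * (target_dim - d)  # assumption: zero padding
        out.append([sum(qi[j] * padded[j] for j in range(target_dim)) for qi in q])
    return out


low = [[1.0, 2.0], [3.0, -1.0]]
high = embed_sketch(low, target_dim=6, seed=0)
```

Because an orthogonal map preserves norms and pairwise distances, the embedded data keeps the geometry of the low-dimensional input while no longer being axis-aligned.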

igl.data.synthetic.make_spd_dataset(n_samples, *, d=4, n_classes=2, class_separation=1.0, seed=None)

Generate batches of SPD matrices labelled by class.

Each sample combines two ingredients:

  1. A class-specific mean SPD matrix C_k = exp(S_k), where S_k is a symmetric matrix sampled once per class (its scale is controlled by class_separation).
  2. Per-sample noise added in the log-Euclidean tangent space, then mapped back to the manifold via igl.spd.matrix_exp_sym.

Useful as a labelled toy dataset for igl.spd.IGLReconSPDClassifier smoke tests and synthetic experiments, with no MOABB or mne dependency.

Parameters:

Name Type Description Default
n_samples int

Total number of SPD matrices.

required
d int

Side length of each SPD matrix.

4
n_classes int

Number of distinct class means.

2
class_separation float

Std of the per-class log-Euclidean mean S_k. 0 → all class means collapse to the identity I; 1.0 → well-separated classes.

1.0
seed int | None

Optional RNG seed.

None

Returns:

Type Description
Tensor

X, the first element of the returned (X, y) pair: [n_samples, d, d] SPD tensors.

Tensor

y, the second element: [n_samples] long tensor of class labels.
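The log-Euclidean construction can be sketched with the standard library alone: draw a symmetric S_k per class, add symmetric noise, and apply the matrix exponential. All names here (`expm_sym`, `random_symmetric`, `spd_dataset_sketch`) and the `noise_std` parameter are hypothetical, the round-robin class assignment is an assumption, and the truncated-Taylor exponential is a simple stand-in for igl.spd.matrix_exp_sym.

```python
import random


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm_sym(s, squarings=8, terms=12):
    """exp(S) via scaling and squaring with a truncated Taylor series."""
    n = len(s)
    scaled = [[v / (2 ** squarings) for v in row] for row in s]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in matmul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result


def random_symmetric(n, std, rng):
    m = [[rng.gauss(0.0, std) for _ in range(n)] for _ in range(n)]
    return [[0.5 * (m[i][j] + m[j][i]) for j in range(n)] for i in range(n)]


def spd_dataset_sketch(n_samples, d=4, n_classes=2, class_separation=1.0,
                       noise_std=0.1, seed=None):
    """Illustrative SPD generator (not the library code)."""
    rng = random.Random(seed)
    # One symmetric log-mean S_k per class, sampled once.
    class_logs = [random_symmetric(d, class_separation, rng) for _ in range(n_classes)]
    X, y = [], []
    for i in range(n_samples):
        k = i % n_classes  # assumption: round-robin class assignment
        eps = random_symmetric(d, noise_std, rng)  # per-sample tangent-space noise
        log_sample = [[class_logs[k][r][c] + eps[r][c] for c in range(d)]
                      for r in range(d)]
        X.append(expm_sym(log_sample))  # map back to the SPD manifold
        y.append(k)
    return X, y


X_spd, y_spd = spd_dataset_sketch(6, d=3, n_classes=3, seed=0)
```

Since the exponential of any symmetric matrix is symmetric positive definite, every sample is SPD by construction, and setting both `class_separation` and `noise_std` to 0 yields the identity matrix for every sample.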