BUG: Fix bug with tolerances #13342
Conversation
---
I replicated it locally as well. Both cov - cov.T and np.max(cov - cov.T) = 0.5 don't look pretty, but given np.max(cov) ~ 2e17 and np.min(cov) ~ -4e12, your atol fix makes sense to me. In general I think this kind of thing should be fixed/verified at the level of covariance estimation, but I won't dare go down there myself.
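To make those numbers concrete, here is a minimal sketch of a symmetry check whose absolute tolerance scales with the data; the helper name _is_symmetric and the exact 1e-7 factor are illustrative assumptions, not MNE's actual code:

```python
import numpy as np

# Hypothetical sketch, not MNE's helper: a symmetry check whose absolute
# tolerance scales with the magnitude of the covariance entries.
def _is_symmetric(cov, rtol=1e-7, atol=None):
    if atol is None:
        # With entries around 1e17, this tolerates asymmetries up to ~1e10,
        # so an asymmetry of 0.5 is treated as numerical noise.
        atol = 1e-7 * np.max(np.abs(cov))
    return np.allclose(cov, cov.T, rtol=rtol, atol=atol)

rng = np.random.default_rng(0)
x = rng.standard_normal((1000, 5)) * 1e7  # covariance entries end up ~1e17
cov = x.T @ x
cov[0, 1] += 0.5                          # asymmetry like np.max(cov - cov.T) = 0.5
print(_is_symmetric(cov))                 # True: 0.5 is negligible at this scale
```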
is_pos_semidef = np.all(scipy.linalg.eigvalsh(cov) >= -eval_tol)
evals = scipy.linalg.eigvalsh(cov)
if eval_tol is None:
    eval_tol = 1e-7 * np.max(np.abs(evals))
---
eval_tol shouldn't be relative; I used it here to work around weird cov estimators that return very small, near-zero negative eigenvalues that should've been zeros. Making it relative like that will pass big negative evals as zeros when maxabs is bigger than 1e7. I'd suggest reverting it to 1e-15, or some other very-close-to-zero value. It's the symmetry check that breaks this BIDS test.
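As an illustration of that concern (toy numbers, not taken from MNE), a relative tolerance lets a clearly negative eigenvalue pass once the largest eigenvalue is big enough:

```python
import numpy as np

# Toy illustration: a relative eval_tol accepts a genuinely negative
# eigenvalue as "zero" when the spectrum is large.
evals = np.array([-1e2, 1.0, 2e9])      # -100 is not numerical noise here

abs_tol = 1e-15
print(np.all(evals >= -abs_tol))        # False: -100 is correctly rejected

rel_tol = 1e-7 * np.max(np.abs(evals))  # = 200
print(np.all(evals >= -rel_tol))        # True: -100 slips through as "zero"
```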
---
As of now this eval_tol for positive semi-definiteness isn't even used anywhere, as it breaks CSP tests. Here, for example, are the eigenvalues of one of the covariances from CSP's own tests:
array([-2.26552620e-11, -8.03291570e-12, 1.91338985e-11, 3.35130268e+02, 3.71512488e+02, 4.06907570e+02, 5.12625554e+02, 5.31962633e+02, 5.73633394e+02, 6.16912893e+02, 6.50660956e+02, 6.90394012e+02, 7.58521774e+02, 7.89443189e+02, 8.61412270e+02, 9.75834757e+02, 1.09661655e+03, 1.12261246e+03, 1.20060347e+03, 1.52876843e+03, 1.53530751e+03, 1.56971614e+03, 1.90257161e+03, 2.11024084e+03, 2.19654123e+03, 2.46030208e+03, 2.74133710e+03, 2.88006187e+03, 3.15945830e+03, 3.73536293e+03, 4.00254104e+03, 4.33736444e+03, 4.95140195e+03, 6.73984573e+03, 7.38063829e+03, 7.99412226e+03, 9.55654794e+03, 1.21489569e+04, 1.34064310e+04, 1.46991932e+04, 2.54027572e+04, 2.90281782e+04, 3.60708863e+04, 5.87350846e+04, 6.38125549e+04, 6.83752532e+04, 9.33762844e+04, 1.15836350e+05, 1.53455748e+05, 5.43566324e+05, 8.57232749e+05])
Not sure whether we can pass the first three as zeros...
---
eval_tol shouldn't be relative,

I think we have to; IIRC "near zero" is relative to the largest singular value in these computations. To me those first three all appear to be zero-like / in the numerical noise for the computation. Given that the magnitude of the data is ~1e3, I'd expect anything with an absolute value less than ~1e-5 to be zero-like. Just to take a concrete, simple example, by setting:
+ is_cov = _is_cov_symm_pos_semidef(cov, check_pos_semidef=True)
We see:
mne/decoding/tests/test_csp.py:327: in test_regularized_csp
X = csp.fit_transform(epochs_data, epochs.events[:, -1])
...
mne/decoding/base.py:135: in fit
self._validate_covariances(covs)
mne/decoding/base.py:263: in _validate_covariances
raise ValueError(
E ValueError: One of covariances is not symmetric (or positive semidefinite), check your cov_callable
(Pdb) p np.linalg.eigvalsh(covs[0])
array([-4.63053370e-12, 7.02204761e-12, 4.77606729e-11, 3.35130268e+02,
3.71512488e+02, 4.06907570e+02, 5.12625554e+02, 5.31962633e+02,
5.73633394e+02, 6.16912893e+02, 6.50660956e+02, 6.90394012e+02,
7.58521774e+02, 7.89443189e+02, 8.61412270e+02, 9.75834757e+02,
1.09661655e+03, 1.12261246e+03, 1.20060347e+03, 1.52876843e+03,
1.53530751e+03, 1.56971614e+03, 1.90257161e+03, 2.11024084e+03,
2.19654123e+03, 2.46030208e+03, 2.74133710e+03, 2.88006187e+03,
3.15945830e+03, 3.73536293e+03, 4.00254104e+03, 4.33736444e+03,
4.95140195e+03, 6.73984573e+03, 7.38063829e+03, 7.99412226e+03,
9.55654794e+03, 1.21489569e+04, 1.34064310e+04, 1.46991932e+04,
2.54027572e+04, 2.90281782e+04, 3.60708863e+04, 5.87350846e+04,
6.38125549e+04, 6.83752532e+04, 9.33762844e+04, 1.15836350e+05,
1.53455748e+05, 5.43566324e+05, 8.57232749e+05])
(Pdb) p np.linalg.eigvalsh(covs[0] * 1e10)
array([-2.88491603e-01, 4.17431701e-02, 2.01562850e-01, 3.35130268e+12,
3.71512488e+12, 4.06907570e+12, 5.12625554e+12, 5.31962633e+12,
5.73633394e+12, 6.16912893e+12, 6.50660956e+12, 6.90394012e+12,
7.58521774e+12, 7.89443189e+12, 8.61412270e+12, 9.75834757e+12,
1.09661655e+13, 1.12261246e+13, 1.20060347e+13, 1.52876843e+13,
1.53530751e+13, 1.56971614e+13, 1.90257161e+13, 2.11024084e+13,
2.19654123e+13, 2.46030208e+13, 2.74133710e+13, 2.88006187e+13,
3.15945830e+13, 3.73536293e+13, 4.00254104e+13, 4.33736444e+13,
4.95140195e+13, 6.73984573e+13, 7.38063829e+13, 7.99412226e+13,
9.55654794e+13, 1.21489569e+14, 1.34064310e+14, 1.46991932e+14,
2.54027572e+14, 2.90281782e+14, 3.60708863e+14, 5.87350846e+14,
6.38125549e+14, 6.83752532e+14, 9.33762844e+14, 1.15836350e+15,
1.53455748e+15, 5.43566324e+15, 8.57232749e+15])
i.e., if you scale the matrix, everything scales up. But in reality these data should have three zero eigenvalues because they come from data with:
(Pdb) p epochs.info
...
projs: PCA-v1: on, PCA-v2: on, PCA-v3: on
sfreq: 120.1 Hz
>
(Pdb) p mne.compute_rank(epochs)
{'mag': 48}
(Pdb) len(epochs.ch_names)
51
i.e., data with 51 channels and 3 projectors, so the data should have rank 48, and all computations should be "zero-like eigenvalue" aware to compute things properly. Does that make sense?
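A toy version of this (synthetic data, not the CSP test data; the helper is made up for illustration) shows that a threshold relative to the largest eigenvalue finds the same three zero-like eigenvalues regardless of how the matrix is scaled:

```python
import numpy as np
import scipy.linalg

# Synthetic rank-48 covariance from 51 "channels", mimicking data with
# 3 projectors applied (illustrative only).
rng = np.random.default_rng(42)
sources = rng.standard_normal((48, 10000))   # 48 independent sources
mixing = rng.standard_normal((51, 48))       # mixed into 51 channels -> rank 48
cov = np.cov(mixing @ sources)

def n_zero_like(cov, rel=1e-7):
    evals = scipy.linalg.eigvalsh(cov)
    tol = rel * np.max(np.abs(evals))        # threshold scales with the spectrum
    return int(np.sum(np.abs(evals) < tol))

print(n_zero_like(cov))          # 3
print(n_zero_like(cov * 1e10))   # still 3 after rescaling
```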
---
Makes a lot of sense, thanks for explaining!
I'm pretty much ignorant about covariance estimation theory and its numerical heuristics (seeing negative evals was already a culture shock), but what you're saying about this eval_tol is convincing.

But for the is_pos_cov_def check, is it also reasonable to expect that all the small positive evals will be no more than a factor of 1e7 below the largest eval? E.g., if the largest one is 1e15 and there are any evals smaller than 1e8, the covariance will not be considered positive definite.
---
But for the is_pos_cov_def check, is it also reasonable to expect that all the small positive evals will be no more than a factor of 1e7 below the largest eval? E.g., if the largest one is 1e15 and there are any evals smaller than 1e8, the covariance will not be considered positive definite.
So for the case where the largest is 1e15, we can say:
- Anything greater than 1e8 is a non-zero positive eigenvalue
- Anything between 1e8 and -1e8 is a "zero" eigenvalue (i.e., if such values exist, the matrix is not positive definite anymore, just at best semidefinite)
- Anything less than -1e8 is a non-zero negative eigenvalue (i.e., if such values exist, then the matrix is not positive semidefinite)
If you play around with some examples of high-dimensional (e.g., 52x52) matrices with different scale factors -- like the one I messed with above -- and different degrees of rank deficiency, you should see that this works reasonably well.
Incidentally, this is also why when working with covariances, we do pre-scaling (or handle independently entirely) channels of different units -- MEG on the scale of ~1e-13 is in the numerical noise compared to EEG data on the scale of ~1e-6.
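Spelled out as code, the three buckets above look something like this (a hypothetical helper for illustration, not MNE's API):

```python
import numpy as np
import scipy.linalg

# Sketch of the classification described above: bucket eigenvalues
# relative to the largest one (hypothetical helper, not MNE's API).
def classify_cov(cov, rel=1e-7):
    evals = scipy.linalg.eigvalsh(cov)
    tol = rel * np.max(np.abs(evals))          # e.g. 1e8 when the largest is 1e15
    if np.any(evals < -tol):
        return "not positive semidefinite"     # truly negative eigenvalues
    if np.any(np.abs(evals) <= tol):
        return "positive semidefinite"         # has "zero" eigenvalues
    return "positive definite"                 # everything clearly positive

rng = np.random.default_rng(0)
a = rng.standard_normal((5, 100))
print(classify_cov(a @ a.T))                    # positive definite
b = np.vstack([a[:-1], a[:1]])                  # duplicated channel -> rank deficient
print(classify_cov(b @ b.T))                    # positive semidefinite
print(classify_cov(np.diag([1.0, 1.0, -1.0])))  # not positive semidefinite
```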
---
Anything between 1e8 and -1e8 is a "zero" eigenvalue
😭😭
Ok! Then it all looks convincing to me!
---
@Genuster do you want to take over and try adding the proper checks? I started to, but actually got a failure, for example in SPoC, where the cov matrix being checked was clearly not even positive semidefinite (the eigenvalues went just as far negative as they did positive!). So I think it might be better for you to try adapting the code to check whenever it can...
---
Ah, I see, will open another PR today/tomorrow then!
is_pos_def = np.all(scipy.linalg.eigvalsh(cov) > eval_tol)
evals = scipy.linalg.eigvalsh(cov)
if eval_tol is None:
    eval_tol = 1e-7 * np.max(np.abs(evals))
---
Same here, I would put it back to 1e-15 to ensure the matrix really is positive definite and not just appearing so because of numerical imprecision around zero eigenvalues, without making the tolerance too large when maxabs is very big.
@Genuster see this failure:

locally I can reproduce it by running

pytest mne_bids_pipeline/ -k ERN

(you might want to add a --download if you want to try). This shouldn't fail the check, but it does. The values are generally around 1e11, so the atol here becomes meaningless. A rtol=1e-7 or 1e-6 is generally safer because our data are often stored in float32, so we end up operating near float32 tolerances (especially after a save-load round trip of a covariance, for example).

Not 100% sure what to do about atol. A safer value for atol is zero, since in general we don't know the scale of the data. Really it should probably auto-scale based on the abs max of the data -- the current value of 1e-11 will pass for data that are tiny, e.g., if the values are all 1e-13 scale then any matrix will pass the test, which is bad. Here I've made atol=None to mean "use 1e-7 times the largest absolute value in the data". The same argument holds for eval_tol, which I've changed correspondingly.

WDYT? If you agree with these changes, can you take over? You can cherry-pick my commit or start over and open another PR with any additional changes. Or you can review and I can make changes, etc.
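For reference, a rough sketch of what that None-means-relative behavior looks like; the _check_cov helper here is made up for illustration and is not the actual implementation in the PR:

```python
import numpy as np
import scipy.linalg

# Illustrative sketch only: atol=None / eval_tol=None mean "1e-7 times the
# largest absolute value", so the check adapts to the scale of the data
# instead of relying on a fixed atol like 1e-11.
def _check_cov(cov, atol=None, eval_tol=None):
    if atol is None:
        atol = 1e-7 * np.max(np.abs(cov))
    symmetric = np.allclose(cov, cov.T, atol=atol)
    evals = scipy.linalg.eigvalsh(cov)
    if eval_tol is None:
        eval_tol = 1e-7 * np.max(np.abs(evals))
    return symmetric and bool(np.all(evals >= -eval_tol))

# A clearly asymmetric "covariance" at a tiny (e.g., MEG-like ~1e-13) scale:
bad = np.array([[2e-13, 1e-13], [0.0, 2e-13]])
print(np.allclose(bad, bad.T, atol=1e-11))  # True: a fixed atol=1e-11 passes anything this small
print(_check_cov(bad))                      # False: the scaled atol catches the asymmetry
```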