-
Notifications
You must be signed in to change notification settings - Fork 1.4k
BUG: Fix bug with tolerances #13342
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
BUG: Fix bug with tolerances #13342
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -55,26 +55,34 @@ def _smart_ged(S, R, restr_mat=None, R_func=None): | |
|
||
|
||
def _is_cov_symm_pos_semidef( | ||
cov, rtol=1e-10, atol=1e-11, eval_tol=1e-15, check_pos_semidef=True | ||
cov, rtol=1e-7, atol=None, eval_tol=None, check_pos_semidef=True | ||
): | ||
if atol is None: | ||
atol = 1e-7 * np.max(np.abs(cov)) | ||
is_symm = scipy.linalg.issymmetric(cov, rtol=rtol, atol=atol) | ||
if not is_symm: | ||
return False | ||
|
||
if check_pos_semidef: | ||
# numerically slightly negative evals are considered 0 | ||
is_pos_semidef = np.all(scipy.linalg.eigvalsh(cov) >= -eval_tol) | ||
evals = scipy.linalg.eigvalsh(cov) | ||
if eval_tol is None: | ||
eval_tol = 1e-7 * np.max(np.abs(evals)) | ||
is_pos_semidef = np.all(evals >= -eval_tol) | ||
return is_pos_semidef | ||
|
||
return True | ||
|
||
|
||
def _is_cov_pos_def(cov, eval_tol=1e-15): | ||
def _is_cov_pos_def(cov, eval_tol=None): | ||
is_symm = _is_cov_symm_pos_semidef(cov, check_pos_semidef=False) | ||
if not is_symm: | ||
return False | ||
# numerically slightly positive evals are considered 0 | ||
is_pos_def = np.all(scipy.linalg.eigvalsh(cov) > eval_tol) | ||
evals = scipy.linalg.eigvalsh(cov) | ||
if eval_tol is None: | ||
eval_tol = 1e-7 * np.max(np.abs(evals)) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same here, I would put it back to 1e-15 to ensure that matrix is positive definite not because of numerical imprecision with zero eigenvalues, without making the tolerance too large when |
||
is_pos_def = np.all(evals > eval_tol) | ||
return is_pos_def | ||
|
||
|
||
|
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eval_tol
shouldn't be relative, I used it here to work around weird cov estimators that return very small near-zero negative eigenvalues that should've been zeros. Making it relative like that will pass big negative evals as zeros whenmaxabs
is bigger than 1e7. I'd suggest returning it to 1e-15, or other very-close-to-zero value. It's the symmetricity check that break this BIDS test.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As of now this
eval_tol
for positive semi-definiteness isn't even used anywhere as it breaks CSP tests. Here, for example the eigenvalues of some covariance from CSP's own tests:array([-2.26552620e-11, -8.03291570e-12, 1.91338985e-11, 3.35130268e+02, 3.71512488e+02, 4.06907570e+02, 5.12625554e+02, 5.31962633e+02, 5.73633394e+02, 6.16912893e+02, 6.50660956e+02, 6.90394012e+02, 7.58521774e+02, 7.89443189e+02, 8.61412270e+02, 9.75834757e+02, 1.09661655e+03, 1.12261246e+03, 1.20060347e+03, 1.52876843e+03, 1.53530751e+03, 1.56971614e+03, 1.90257161e+03, 2.11024084e+03, 2.19654123e+03, 2.46030208e+03, 2.74133710e+03, 2.88006187e+03, 3.15945830e+03, 3.73536293e+03, 4.00254104e+03, 4.33736444e+03, 4.95140195e+03, 6.73984573e+03, 7.38063829e+03, 7.99412226e+03, 9.55654794e+03, 1.21489569e+04, 1.34064310e+04, 1.46991932e+04, 2.54027572e+04, 2.90281782e+04, 3.60708863e+04, 5.87350846e+04, 6.38125549e+04, 6.83752532e+04, 9.33762844e+04, 1.15836350e+05, 1.53455748e+05, 5.43566324e+05, 8.57232749e+05])
Not sure whether we can pass the first three as zeros...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have to, IIRC "near zero" is relative to the largest singular value in these computations. To me those first all appear to be zero-like / in the numerical noise for the computation. Given that the magnitude of the data is ~1e3, I'd expect anything with an absolute value less than ~1e-5 to be zero-like. Just to take a concrete simple example by setting `:
We see:
i.e., if you scale the matrix, everything scales up. But in reality these data should have three zero eigenvalues because they come from data with:
i.e., data with 51 channels and 3 projectors, i.e., the data should have rank 48 and all computations should be "zero-like eigenvalue" aware to compute things properly. Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes a lot of sense, thanks for explaining!
I'm pretty much ignorant about covariance estimation theory and its numerical heuristics (seeing negative evals was already a culture shock), but what you're saying is convincing about this
eval_tol
.But in
is_pos_cov_def
check, is it also reasonable to expect that all the small positive evals will be no further than 1e7 from the largest eval? E.g., if the largest one is 1e15, and there are any evals smaller than 1e8, the covariance will not be considered positive definite.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So for the case where the largest is 1e15, we can say:
If you play around with some examples of high-dim (e.g., 52x52) matrices of different scale factors -- like the one I messed with above -- and different rank-reduced-ness, you should see that this works reasonably well.
Incidentally, this is also why when working with covariances, we do pre-scaling (or handle independently entirely) channels of different units -- MEG on the scale of ~1e-13 is in the numerical noise compared to EEG data on the scale of ~1e-6.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😭😭
Ok! Then it all looks convincing to me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Genuster do you want to take over and try adding the proper checks? I started to but actually got a failure for example in
SPoC
where the cov matrix being checked was clearly not even positive-semidefinite (eigenvalues went just as far negative as they did positive!). So I think it might be better for you to try adapting the code to check whenever it can...There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see, will open another PR today/tomorrow then!