-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
ENH: add fast toeplitz matrix multiplication using FFT #11346
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@InfProbSciX this sounds cool. Can you also post some speed test results that demonstrate the speedup? |
Yes! Forgot about that :) Some simple code to run these benchmarks: import numpy as np
from scipy.linalg import toeplitz, matmul_toeplitz
from numpy.testing import assert_allclose
def fast_sim(n):
c = np.random.normal(size=n)
r = np.random.normal(size=n)
b = np.random.normal(size=n)
result = matmul_toeplitz((c, r), b)
def naive_sim(n):
c = np.random.normal(size=n)
r = np.random.normal(size=n)
b = np.random.normal(size=n)
expected = toeplitz(c, r) @ b
# %timeit fast_sim(10000)
# %timeit naive_sim(10000) Results on my laptop:
|
Co-Authored-By: Matt Haberland <mdhaber@mit.edu>
I verified the performance improvement and accuracy on my machine. Impressive! |
r, c, b, dtype, b_shape = _validate_args_for_toeplitz_ops( | ||
c_or_cr, b, check_finite, keep_b_shape=False) | ||
n, m = b.shape | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't check the math here, but it looks pretty straightforward, and it does appear to work.
from numpy.testing import assert_allclose | ||
np.random.seed(42) | ||
|
||
class ToeplitzTestCase: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests look appropriate.
@InfProbSciX Thanks a lot for such a thorough pull request. I wish we get more of such PRs. I am a bit pessimistic about adding this to the public API since to people who need this, they are quite well-known (in particular the netlib links you gave are famous). The use case is for people who are aware that they have a toeplitz matrix and not aware that there are more performant ways to multiply these matrices. That number of people to me doesn't appear to be too many to add another function to the public API. For those who are aware of these facts, it is basically a couple of lines. This applies to circulant and hankel matrices too. Don't get me wrong; I think this is great to have at one's bag of tricks but the curse of SciPy is that once something goes in it takes years to take it out if we decide to have a different way of bundling these together. So I tend to be on the conservative side. But I'll wait for others to chime in too. |
It's true that this is just a few lines, but scipy has functionality for Moreover, there are some implementations of this in the public domain (referenced), that perhaps wouldn't be necessary if scipy had the functionality to begin with. The special matrices in scipy are mentioned but I feel like not a lot can be done with them as the most important operations that make use of their special structure are missing. I actually want to add the other functions as well, like the |
That's certainly true and makes a good segway to our mantra "Try to avoid small tricks as public functions". Note that solve_toeplitz uses a LAPACK routine behind the scenes and in my opinion still doesn't fit our mantra but historically these additions came in a very heterogeneous way and now we don't dare to decommission them. That is also why BLAS/LAPACK maintainers don't add every function to the catalogue because they are thinking along the same manner. There are many things I wish we had, for example, one of the longest discussions we had is to have a I have a PR waiting for All these to say, even though the functionality is nice, we might not want to add because it pollutes the namespace and also makes it very difficult a few years later to maintain if we figure out a better way to do this. I actually expect this to be handled by the matmul operator of |
I definitely see what you're saying - and I completely agree that we don't want to pollute the public namespace. The main reason I wrote this the way I did was consistency - as I don't see this as a small trick, as I also want to write the determinant functions which wouldn't be such a trivial implementation. I'm trying to do this so that the scipy toeplitz/special matrices section isn't so incomplete. If the others are unhappy with the public function as well, here are some proposals to make this hidden from the user:
I want the special matrices section to be more complete, and I completely understand that the implementation looks trivial, but I don't see the other functions that I'd like to contribute in the future being just tricks, as these are great routines to use. There's a good use case for it - toeplitz matrices are common in statistics, particularly stationary processes - perhaps more people would come to use the routines if they were more easily accessible. I see numpy as the more minimalistic linalg package; scipy's design states that "A good rule of thumb is that if it’s covered in a general textbook on numerical computing ... it’s probably implemented in SciPy." I would like to really complete the special matrices part of scipy to live up to that, because at the moment, there's lots of stuff missing. |
There are other obvious implementations that currently SciPy doesn't support for matrix multiplication say, diagonal matrices, triangular matrices etc. even when they have a corresponding BLAS routines for them say I would actually put the actual lines into the documentation of The main issue here for a possible generalization is that discovering a Toeplitz matrix takes more time than just straightforward matrix multiplication hence the benefit of the whole thing is reduced down to the user who is aware that they have a toeplitz matrix at hand and would likely to look at |
As another approach for Toeplitz matrix-vector products, there's an implementation for symmetric Toeplitz matrices in the It sounds like one solution to not wanting all the special matrix implementations in scipy proper would be to make a package/repository of I hope at least some of this was useful to someone. |
@ilayn The log determinant calculation seems to be a straightforward implementation that utilises the levinson algorithm. Would you be okay if I modified the existing hidden cython implementation by adding an additional flag like There wouldn't be any additional public functions and if everyone else agrees, I'll put the matrix multiplication into the |
Hi @DWesl, thanks for mentioning fa PyLops. I second you here, and this is the main reason I started developing this library some years ago: I didn’t feel that scipy would ever allow/be the best place to host many specialized ‘matrix’ implementations. It is true that geophysics is currently the main target (given that it is a very data hungry field) but there is a very extensive set of basic functionalities and we have been even discussing about the possibility of moving all geophysics specific components out to a new library and keep PyLops for just general purpose operators. If people are interested we could speed this up. Regarding this PR, @InfProbSciX, I am glad to see that more and more people feel the need to optimizing matrix-vector multiplications for special matrices. My main question is: why not subclassing a LinearOperator already (as you also mention eventually users would do that...), this may seem a break in the current trend in scipy but it may pay off in the long run. For example if you want to add log_det* as you say, you will need another public method and raise the same concerns as in this thread. If instead you went for a Toeplitz class this could be just a method of the class, so you limit what it is effectively exposed in the API? Just a suggestion :) |
@mrava87 I'd love to add a Toeplitz class to SciPy but this adds more publicly accessible stuff in SciPy, so the decision is up to the devs really. This is the cleanest way to add the functionality imo. My suggestion for adding the |
I see your point. I was just thinking that in your case you still add a new public function (while in the case of making Toeplitz class you would add a class). Of course this is a bit different from current scipy structure so it’s obviously up to the devs. Regarding the det computation, my personal opinion as user is that I prefer more well defined methods with ‘atomic’ scope than fewer with of additional parameters that lead to many slightly different behaviors, but again I’m not a core dev of scipy so it’s just a comment from user point of view. |
Sorry for the delay, I will come back to this as soon as possible. In the meantime, perhaps, can whoever is available from @scipy/scipy-core-team also chime in so we get a different point of view ? |
This seems to be in good shape and from the history it looks like we're leaning towards merging this in its current form. Does anyone have objections to that? |
Don't merge yet, I'm working on a more detailed review now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice addition to SciPy. I requested a bunch of changes inline, and have one more requested change, as follows.
The PR assumes that the Toeplitz matrix is square, but this is not necessary for the fast Toeplitz multiplication algorithm. The code can be made to work with an array with shape (nrows, ncols)
with just a few small changes.
First, the validation function will have to be modified to allow an input that is not square. This could be an additional boolean flag added to the arguments.
Then, adjust the sizes in the main code as follows.
In the code, there is already n, m = b.shape()
. Define
# The shape of the Toeplitiz matrix is (nrows, ncols)
nrows = len(c)
ncols = len(r)
p = nrows + ncols - 1 # equivalent to len(embedded_col)
In the complex case of the if
statement, change the second two lines to:
fft_b = fft(b, n=p, axis=0, workers=workers)
mat_times_b = ifft(fft_mat*fft_b, axis=0, workers=workers)[:nrows, :]
and in the real case, change the two lines to
fft_b = rfft(b, n=p, axis=0, workers=workers)
mat_times_b = irfft(fft_mat*fft_b, axis=0, workers=workers,
n=p)[:nrows, :]
Finally, change the return
statement to
return mat_times_b.reshape(nrows, m)
I think that covers all the needed changes to the code, but double-check to be sure!
Tests with nonsquare inputs will have to be added, too.
Thanks for the comments, I'll make these changes. |
@WarrenWeckesser I've made the suggested changes - thank you for such a detailed comment on the case of non-square matrices :) (I've also added a few tests for non-square matrices.) |
All the CI checks now pass. |
Hi @rgommers @WarrenWeckesser I'm just following up to see if there's any further changes that you'd suggest for this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I requested some nitpicky changes, but otherwise, this looks good.
Co-authored-by: Warren Weckesser <warren.weckesser@gmail.com>
@WarrenWeckesser All done! |
I've given this the green check mark, but I'll wait a few days before merging, in case anyone else want to comment first. When it comes time to merge, the commits should be squashed to a single commit. |
@WarrenWeckesser Just wanted to check if you'd like me to squash these commits (or if GitHub's squash and merge is fine). |
@InfProbSciX, squashed and merged, thanks! I undid the change to |
I added a "thanks" to https://github.com/scipy/scipy/wiki/*THANKS.txt-additions-modifications-for-1.6.0*, and a release note in https://github.com/scipy/scipy/wiki/Release-note-entries-for-SciPy-1.6.0. Feel free to edit those as you see fit. |
This commit adds a function,
matmul_toeplitz
, for efficient (loglinear time) matrix-matrix products involving a Toeplitz matrix. The Toeplitz matrix is embedded in a circulant matrix, which is diagonalisable using its discrete Fourier transform. It follows that matrix-vector products and hence matrix-matrix products can be computed using the FFT.This is described in this book, this MIT page and in this math.stackexchange post.
Further, I split some code from the
solve_toeplitz
function into a cleaning and validation function_validate_args_for_toeplitz_ops
because the cleaning needed for both functions is very similar. In the future, if someone implements other fast methods such asmatmul_circulant
,log_det_circulant
,log_det_toeplitz
and so on, they could rely on the same validation function.There are three reference implementations (that I could find) available online which are written in python, these are:
Available online
Available online
Implementation available online
My implementation seems to be consistent with all of them (the code isn't a direct 'copy-paste' of any of them - the github implementations have MIT licenses and I've referenced these in the function), and I've tried to remain consistent with existing Toeplitz functions in SciPy.
The possible use-cases for this function include fast matrix-matrix multiplications, fast solving using preconditioned conjugate gradient methods (which can readily be built using the
scipy.sparse.linalg.LinearOperator
class and thescipy.sparse.linalg.cg
function).Tests show that the implementation is correct, and the tests corresponding to
solve_toeplitz
also all pass. The code passes PEP8 checks locally.