On Sunday 14 February 2010, Bogdan Opanchuk wrote:
The project I am working on relies heavily on batched 3D FFTs. You all
know about the situation with CUFFT and PyCuda, and I decided that I
must put some effort into it. So, I ported Apple's OpenCL implementation
of FFT to PyCuda. The result you can see on
. It is currently in beta stage,
but I will work on it, in case somebody needs it. It works with the
experimental PyCuda branch, the one with complex number support.
In addition, the package contains a CUFFT wrapper by Ying Wai (Daniel)
Fan (it appeared on this mailing list; I just added a plan class and
batch support), in case you prefer nVidia's implementation. I used it
only to test my code.
Main problems at the moment:
- On some problem sizes it is much slower than CUFFT (see table on PyPi
  page).
- The library requires heavy testing for different problem sizes and
  videocards.
For other plans, see TODO.txt in package.
To sum it all up: the development of this library depends mainly on
your reaction. Any comments/bug reports/propositions are appreciated.
For my personal purposes it works fine already :)
This looks like awesome work. Thanks for making it available. To make
your life easier, I've just merged complex number support into PyCUDA
proper. If you'd like pycudafft to be part of PyCUDA itself, we can
discuss how that could happen. Otherwise, please let me know if there's
anything you need from PyCUDA.
Thanks again for your work,