Hi Andreas, (Hi Bryan),
Last December I was asking you about CodePy. See how far I went with
it with your help: http://dev.math.canterbury.ac.nz/home/pub/17/
Note that there is no CUDA or Thrust code in the CodePy example; there
seemed to be no easy way to do it. I'll paste some excerpts from our
emails from Dec 16 and 17:
I: "My next question, suppose MODULE_CODE contains some thrust code and
would have to be compiled through nvcc (and g++). Simply using

    nvcc_toolchain = codepy.toolchain.guess_nvcc_toolchain()
    cmod = extension_from_string(nvcc_toolchain, "module", MODULE_CODE)

didn't work, of course. Do you have a similar function that takes a
STRING, both host_toolchain, and nvcc_toolchain, and compiles it? If
not, what is the right way?"
B: "NVCC can't parse Boost well, so I have to segregate the host code
which binds it to Python from the CUDA code compiled by NVCC.
The way I do this is to create a codepy.bpl.BoostPythonModule which
has the host entry point (and will be compiled by g++). Then I create
a codepy.cuda.CudaModule which references the BoostPythonModule
(making this link explicit lets codepy compile them together into a
single binary). Then I call compile on the CudaModule, which should
do the right thing. You can see code that does this here:
A: "I'd just like to add that I recently split out the code generation bits
of codepy and called them cgen.
(but compatibility wrappers that wrap cgen into codepy will stay in
place for a while)"
Has something changed since then?
ctypes works fine, and it has the advantage of not requiring Boost.
The source is just unaltered C++/CUDA/Thrust code. Invoking the
system's nvcc was as easy as invoking gcc. As for caching, I check the
hash of the source string: if it has changed, I build and load a new
(small!) .so module with the hash value attached to its name. The
pointers into the old .so get garbage collected and the library is
unloaded; if the .so files are stored in a tmp folder, they get
deleted eventually.
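
A minimal sketch of that hash-keyed scheme, for illustration only. The
names cached_library_path, load_cuda_source and argmax_float are
hypothetical and not part of CodePy or PyCUDA, and the nvcc flags
assume Linux with nvcc on the PATH:

```python
import ctypes
import hashlib
import os
import subprocess
import tempfile

# Example source: plain CUDA/Thrust behind a C entry point, no Boost.
# argmax_float is a made-up name used only for this sketch.
SOURCE = r"""
#include <thrust/device_ptr.h>
#include <thrust/extrema.h>

extern "C" int argmax_float(float *dev_data, int n)
{
    thrust::device_ptr<float> first = thrust::device_pointer_cast(dev_data);
    return static_cast<int>(thrust::max_element(first, first + n) - first);
}
"""


def cached_library_path(source, build_dir, stem="module"):
    """Return the .so path whose name embeds a hash of the source string."""
    digest = hashlib.sha1(source.encode("utf-8")).hexdigest()[:16]
    return os.path.join(build_dir, "%s_%s.so" % (stem, digest))


def load_cuda_source(source, build_dir=None, nvcc="nvcc"):
    """Build the source with nvcc only if its hash is new, then load it."""
    build_dir = build_dir or tempfile.gettempdir()
    lib_path = cached_library_path(source, build_dir)
    if not os.path.exists(lib_path):
        cu_path = lib_path[:-3] + ".cu"
        with open(cu_path, "w") as f:
            f.write(source)
        # Assumed Linux flags: build a position-independent shared object.
        subprocess.check_call(
            [nvcc, "-shared", "-Xcompiler", "-fPIC", "-o", lib_path, cu_path])
    return ctypes.CDLL(lib_path)
```

With PyCUDA, one would presumably call lib = load_cuda_source(SOURCE),
set lib.argmax_float.argtypes = [ctypes.c_void_p, ctypes.c_int], and
pass the device pointer of a float32 GPUArray as int(a_gpu.gpudata).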
I remember you preferred Boost::Python to ctypes in general for its
better performance; but if we call into the ctypes library only
rarely, any small additional overhead isn't important.
A better programme would be to port all the algorithms and interfaces
of Thrust to PyCUDA. The only reason I need Thrust, for example, is
that it can find the _location_ of the extremum element, which I still
don't know how to do in PyCUDA.
On Thu, May 24, 2012 at 11:58 AM, Andreas Kloeckner wrote:
On Thu, 24 May 2012 10:51:55 +1200, Igor <rychphd(a)gmail.com> wrote:
Andreas, thanks, but it currently implies Linux; I'll see if I can
make it work on Windows. Or maybe I'll submit it and someone will try
it on Windows. I just need to extract it from Sage into a plain Python
script. Give me a couple of days.
I would actually suggest you use the codepy machinery to let nvcc do the
compilation--this has the advantage that a) there is code out there that
makes this work on Windows (Bryan?) and b) you get compiler caching for
All you'd need to do is build an analog of extension_from_string, say
ctypes_dll_from_string. Just imitate this code here, where
compile_from_string does all the hard work:
In any case, even if you can make something that's Linux-only, it would
likely help a big bunch of people. Windows support can always be added