I've been trying to use PyCUDA to pass data to and from a GPU for
processing with the device functions in the free version of the CULA
toolkit. Based upon past postings to the list and the ctypes
approach used in parret, I gather that this is possible with CUDA
3.0. However, the CULA device functions I invoke appear to have
no effect on the data copied back to the CPU. (See attached.)
Has anyone managed to do the above successfully? I'm using PyCUDA
0.94rc with CUDA 3.0 on 64-bit Linux.
Sorry for the delay; we found a workaround and proceeded past this problem,
but have since decided to revisit this error. Below is the error and the
relevant bit of code we are using. Our workaround (commented out below) is to
manually pass the allocation as an 'int'.
garrett@alienbox:~/Dropbox/CULA$ python cula_debugging.py
cula_gpuarray: 1 2 4
allocated at: 1048832 1048832 with pitch 64
Traceback (most recent call last):
File "cula_debugging.py", line 2383, in <module>
File "cula_debugging.py", line 2352, in gpu_devsyev
File "/usr/local/lib/python2.6/dist-packages/pycuda/gpuarray.py", line
115, in get
Boost.Python.ArgumentError: Python argument types in
did not match C++ signature:
memcpy_dtoh(boost::python::api::object dest, unsigned int src)
Relevant python code:
_culaDeviceMalloc = libcula.culaDeviceMalloc
_culaDeviceMalloc.restype = int
_culaDeviceMalloc.argtypes = [ctypes.POINTER(ctypes.c_int),
_culaDeviceFree = libcula.culaDeviceFree
_culaDeviceFree.restype = int
_culaDeviceFree.argtypes = [ctypes.c_int]
# CULA allocation object (class structure reconstructed from the
# flattened email fragment; the body lines below match the original)
class CULAAllocation:
    def __init__(self, data, pitch):
        self.datapointer = data
        self.pitch = pitch
print 'cula_gpuarray: %d %d %d' % (m,n,dtype.itemsize)
mem = ctypes.c_int()
pitch = ctypes.c_int()
print '%s %s' % (mem, pitch)
alloc = CULAAllocation(mem,pitch)
print 'allocated at: %d %d with pitch %d' % (alloc.datapointer.value,
# A = GPUArray(n,dtype,gpudata=int(alloc))
A = GPUArray(n,dtype,gpudata=alloc)
# A = GPUArray((m,n),dtype,gpudata=int(alloc))
A = GPUArray((m,n),dtype,gpudata=alloc)
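For reference, PyCUDA converts the gpudata argument with int() before handing it to the driver, so the wrapper object must supply `__int__` (and `free` for cleanup). A minimal sketch of such a wrapper, with all names hypothetical and no CULA API involved, which is essentially what the commented-out `int(alloc)` workaround does by hand:

```python
# Minimal sketch of an allocation wrapper usable as GPUArray's gpudata.
# All names here are hypothetical illustrations, not the CULA API.
class DeviceAllocation:
    def __init__(self, ptr, free_func):
        self.ptr = ptr            # integer device address
        self._free = free_func    # callable that releases the memory
        self.freed = False

    def __int__(self):
        # PyCUDA calls int() on gpudata before passing it to the driver,
        # so this must return the raw device address as an integer.
        return self.ptr

    def free(self):
        if not self.freed:
            self._free(self.ptr)
            self.freed = True
```

If the Boost.Python signature mismatch above persists, wrapping the raw pointer this way (or passing `int(alloc)` directly) ensures memcpy_dtoh receives a plain unsigned integer address rather than an arbitrary Python object.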
On Fri, May 7, 2010 at 2:00 PM, <pycuda-request(a)tiker.net> wrote:
> Today's Topics:
> 1. Re: How to manually free GPUarray to avoid leak?
> (Andreas Klöckner)
> Message: 1
> Date: Thu, 6 May 2010 19:26:28 -0400
> From: Andreas Klöckner <lists(a)informa.tiker.net>
> To: pycuda(a)tiker.net
> Subject: Re: [PyCUDA] How to manually free GPUarray to avoid leak?
> Message-ID: <201005061926.29645.lists(a)informa.tiker.net>
> Content-Type: text/plain; charset="iso-8859-15"
> On Mittwoch 05 Mai 2010, Louis Theran wrote:
> > I have a question about the .gpudata contract that I couldn't figure out
> > by experimentation. If I construct gpuarrays with a call like
> > GPUArray(...,gpudata=xxx)
> > is it sufficient that
> > xxx.__int__()
> > and
> > xxx.free()
> > be defined for things to work out correctly? That's what I gleaned from
> > the documentation, but it didn't work when I tried it with a Python class
> > that had those two methods. (I wanted to allocate memory on the device
> > some other way.)
> Can you post the error message you got? I'm thinking it should work, but
> I'm wondering what I'm missing.
I used the binary installer to install PyCUDA on Windows 7. Everything was fine; I could import all the modules, etc. BUT, when I tried the PyCUDA examples, specifically the compiler, it raised RuntimeError: "Could not find PyCUDA's C header files". So I went to compiler.py and modified it so that it returns only the correct path to the headers (i.e. I removed the linux2 condition). But then I get a bunch of other errors.
Traceback (most recent call last):
File "<pyshell#9>", line 1, in <module>
a_d = (2*a_gpu).get()
File "C:\Enthought Python\lib\site-packages\pycuda\gpuarray.py", line 279, in __rmul__
return self._axpbz(scalar, 0, result)
File "C:\Enthought Python\lib\site-packages\pycuda\gpuarray.py", line 160, in _axpbz
func = elementwise.get_axpbz_kernel(self.dtype)
File "<string>", line 1, in <lambda>
File "C:\Enthought Python\lib\site-packages\pycuda\tools.py", line 484, in context_dependent_memoize
result = func(*args)
File "C:\Enthought Python\lib\site-packages\pycuda\elementwise.py", line 323, in get_axpbz_kernel
File "C:\Enthought Python\lib\site-packages\pycuda\elementwise.py", line 99, in get_elwise_kernel
arguments, operation, name, keep, options, **kwargs)
File "C:\Enthought Python\lib\site-packages\pycuda\elementwise.py", line 85, in get_elwise_kernel_and_types
keep, options, **kwargs)
File "C:\Enthought Python\lib\site-packages\pycuda\elementwise.py", line 74, in get_elwise_module
File "C:\Enthought Python\lib\site-packages\pycuda\compiler.py", line 228, in __init__
arch, code, cache_dir, include_dirs)
File "C:\Enthought Python\lib\site-packages\pycuda\compiler.py", line 218, in compile
return compile_plain(source, options, keep, nvcc, cache_dir)
File "C:\Enthought Python\lib\site-packages\pycuda\compiler.py", line 89, in compile_plain
result, stdout, stderr = call_capture_output(cmdline, cwd=file_dir)
File "C:\Enthought Python\lib\site-packages\pytools\prefork.py", line 196, in call_capture_output
return forker.call_capture_output(cmdline, cwd)
File "C:\Enthought Python\lib\site-packages\pytools\prefork.py", line 53, in call_capture_output
% ( " ".join(cmdline), e))
ExecError: error invoking 'nvcc --cubin -arch sm_11 -IC:\Enthought Python\include\pycuda kernel.cu': status -1 invoking 'nvcc --cubin -arch sm_11 -IC:\Enthought Python\include\pycuda kernel.cu': nvcc fatal : nvcc cannot find a supported cl version. Only MSVC 8.0 and MSVC 9.0 are supported
For the last one, I guess I should use VS2008 instead of 2010.
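As a possible workaround without switching compilers system-wide, nvcc accepts a --compiler-bindir flag pointing at a specific cl.exe directory, and PyCUDA's SourceModule passes an options list through to nvcc. A sketch (the VS2008 path below is a hypothetical example, adjust it to your install):

```python
# Build the extra nvcc options to point it at a supported MSVC cl.exe.
# The path below is a hypothetical VS2008 location; adjust to your install.
msvc_bin = r"C:\Program Files\Microsoft Visual Studio 9.0\VC\bin"
nvcc_options = ["--compiler-bindir", msvc_bin]

# With PyCUDA this would be used as (not executed here, needs a GPU/toolchain):
#   from pycuda.compiler import SourceModule
#   mod = SourceModule(kernel_source, options=nvcc_options)
```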
I also get the error 'Keyword: 'Home'' when trying to run demo.py or hello_gpu.py, but I read somewhere that this is a setuptools problem; I will try distribute instead.
I'm using the Enthought Python distribution, and am looking to try Python(x,y) if I stay stuck on this.
Thanks for reading.
I could not find a bug tracker, so I'm posting this patch on the mailing list.
PyCUDA had problems finding its header files using Python 2.6 on Windows. This
is because it is installed in Lib\site-packages\pycuda, while the header files
reside in Include\pycuda. To solve this, I replaced _find_pycuda_include_path()
in compiler.py with the attached code:
from imp import find_module
from os.path import join, exists
import sys

file, pathname, descr = find_module("pycuda")

installed_path = join(pathname, "..", "include", "pycuda")
installed_path2 = join(pathname, "..", "..", "..", "include", "pycuda")
development_path = join(pathname, "..", "src", "cuda")
development_path2 = join(pathname, "..", "..", "..", "src", "cuda")
usr_path = "/usr/include/pycuda"
usr_local_path = "/usr/local/include/pycuda"
prefix_path = join(sys.prefix, "include", "pycuda")

# Return the first candidate directory that actually exists,
# instead of restricting the search by sys.platform.
for path in (installed_path, installed_path2,
             development_path, development_path2,
             usr_path, usr_local_path, prefix_path):
    if exists(path):
        return path

raise RuntimeError("could not find path to PyCUDA's C header files")
Pardon the somewhat off-topic question, but I am looking for feedback on
a suitable laptop that can serve as a good development system for a
project using PyCUDA. Can anyone recommend anything that is known to
work well? I'm planning to use Linux, possibly dual-booted with the
pre-installed OS. A MacBook Pro would probably be the upper limit in
terms of cost.
I do have a home desktop PC with an Nvidia G98 graphics card (8 CUDA
cores @ 540 MHz), but I think some laptops provide much more than that
(then again, I doubt I would want a 100 W card in a laptop). Thanks for
any helpful info!
I recently upgraded CUDA on an Ubuntu machine, and, well, everything broke.
I thought I had it working, but now scripts that I remember being fine are
crashing (actually, most scripts are crashing with 'launch failure'). It's
likely my own **** fault, but I went ahead and tried to upgrade PyCUDA to
see if that would fix anything. I went back and tried the usual steps, but got
*** CUDA_ROOT not set, and nvcc not in path. Giving up.
when I tried to do "sudo make install".
So, I apologize for the spam, but at the moment I'm just too tired to be able
to figure out what to do myself. Has anyone got any links, or solved this
problem before?
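A common fix for that "CUDA_ROOT not set" message is to export CUDA_ROOT and put nvcc on the PATH before building. A sketch, assuming the toolkit landed in the default /usr/local/cuda location (adjust the path to wherever your upgrade actually installed it):

```shell
# Hypothetical default install location; adjust to where your CUDA toolkit lives.
export CUDA_ROOT=/usr/local/cuda
export PATH="$CUDA_ROOT/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_ROOT/lib64:$LD_LIBRARY_PATH"
# Then re-run the PyCUDA build/install, e.g.:
#   sudo -E make install   # -E preserves the exported environment for root
```

Note that plain `sudo` strips the environment, which is one way the build can still fail to see CUDA_ROOT even after exporting it.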
On Fri, 21 May 2010 13:24:51 +0100, "Samuel Powell" <spowell(a)medphys.ucl.ac.uk> wrote:
> Hi Andreas,
> You are correct in your assertion.
> I followed your advice by packaging some of the pyCuda example scripts using
> py2exe before executing them. In the majority of cases, I found no problem
> executing the packaged code.
> If, as you suggest, the launch failure is likely caused by a segfault on the
> device, I think the best course of action is for me to dissect the CUDA
> kernel until it runs correctly, thus identifying where the bad addressing
> comes into play.
> Would you agree with this analysis?
That sounds plausible.
Good luck with your bug hunting,
PS: Please keep the list cc'd when replying. Thx!
On Thursday, 2010-05-20, at 04:40 -0400, Yaroslav Halchenko wrote:
> lintian is your friend -- make sure you address all the warnings one
> way or another -- I looked at pytools briefly:
> $> lintian pytools_10_amd64.changes
> W: pytools source: non-native-package-with-native-version
> W: pytools source: build-depends-on-1-revision build-depends: python (>= 2.5-1)
> W: pytools source: build-depends-on-1-revision build-depends: python-dev (>= 2.5-1)
> W: python-pytools: extended-description-line-too-long
> W: python-pytools: possible-unindented-list-in-extended-description
> W: python-pytools: wrong-section-according-to-package-name python-pytools => python
> W: python-pytools: copyright-without-copyright-notice
> W: python-pytools: binary-without-manpage usr/bin/logtool
> W: python-pytools: binary-without-manpage usr/bin/runalyzer
> W: python-pytools: binary-without-manpage usr/bin/runalyzer-gather
Thanks for the advice - I have fixed all the problems pointed out by lintian.
> also I bet they would work fine with python >= 2.5 so why
> XS-Python-Version: 2.5
I also managed to create a private repository.
Anyone using Debian on the amd64 architecture can now add the
following lines to /etc/apt/sources.list:
deb http://www.bogomips.w.tkb.pl . .
deb-src http://www.bogomips.w.tkb.pl . .
(there are two dots separated by a space),
then run `apt-get update && apt-get install python-pycuda`
and PyCUDA should be installed.
On Ubuntu there can be a problem with the missing package
libcuda1, but I would like someone to try it
nonetheless (just download the *.deb files and run dpkg -i *.deb)
and report whether the packages work.
Tomasz Rybak <bogomips(a)post.pl> GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A 488E C654 FB33 2AD5 9860