Hi,
first of all: nice piece of work, Andreas!
Now to my problem: I installed CUDA 2.1 on my T61 laptop with an
NVIDIA Quadro NVS 140M graphics card with 128 MB of memory.
The git version of PyCUDA (0.93 beta, I assume) installs fine, but I get
some errors when I try to run the tests. First, test_driver:
/opt/PyCUDA/pycuda-git/test$python test_driver.py
...E....Eterminate called after throwing an instance of 'cuda::error'
what(): cuMemFree failed: invalid context
Aborted
Or running the gpuarray speed test:
/opt/PyCUDA/pycuda-git/test$python undistributed/test_gpuarray_speed.py
1024
2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
2097152
Traceback (most recent call last):
  File "undistributed/test_gpuarray_speed.py", line 83, in <module>
    main()
  File "undistributed/test_gpuarray_speed.py", line 27, in main
    b = gpuarray.zeros((size,), dtype=numpy.float32)
  File "/usr/lib/python2.5/site-packages/pycuda-0.93beta-py2.5-linux-x86_64.egg/pycuda/gpuarray.py", line 409, in zeros
    result = GPUArray(shape, dtype, stream, allocator)
  File "/usr/lib/python2.5/site-packages/pycuda-0.93beta-py2.5-linux-x86_64.egg/pycuda/gpuarray.py", line 75, in __init__
    self.gpudata = self.allocator(self.size * self.dtype.itemsize)
pycuda._driver.MemoryError: cuMemAlloc failed: out of memory
-----------------------------------------------------------
PyCUDA WARNING: I'm being asked to destroy a
context that's part of the current context stack.
-----------------------------------------------------------
I will pick the next lower active context from the
context stack. Since this choice is happening
at an unspecified point in time, your code
may be making false assumptions about which
context is active at what point.
Call Context.pop() to avoid this warning.
-----------------------------------------------------------
If Python is terminating abnormally (eg. exiting upon an
unhandled exception), you may ignore this.
-----------------------------------------------------------
test_gpuarray only threw some warnings:
python test_gpuarray.py
/usr/lib/python2.5/site-packages/pycuda-0.93beta-py2.5-linux-x86_64.egg/pycuda/gpuarray.py:498:
UserWarning: behavior change: arange guessed dtype other than float32.
suggest specifying explicit dtype.
warn("behavior change: arange guessed dtype other than float32. "
..................
----------------------------------------------------------------------
Ran 18 tests in 14.756s
OK
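(For what it's worth, the arange warning above is about dtype inference: gpuarray.arange mirrors numpy.arange, which guesses a dtype from its arguments when none is given. A NumPy-only illustration of why an explicit dtype removes the ambiguity; the variable names are mine:)

```python
import numpy

# Without an explicit dtype, arange infers one from its arguments
# (an integer dtype here); the old gpuarray behavior silently assumed
# float32. Passing dtype explicitly avoids the guessing entirely.
a = numpy.arange(0, 8)                       # inferred integer dtype
b = numpy.arange(0, 8, dtype=numpy.float32)  # explicit, no warning

assert a.dtype != numpy.float32
assert b.dtype == numpy.float32
```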
Here is some debug information you might need:
In [61]: pycuda.autoinit.device.get_attributes()
Out[61]:
{pycuda._driver.device_attribute.MAX_THREADS_PER_BLOCK: 512,
pycuda._driver.device_attribute.MAX_BLOCK_DIM_X: 512,
pycuda._driver.device_attribute.MAX_BLOCK_DIM_Y: 512,
pycuda._driver.device_attribute.MAX_BLOCK_DIM_Z: 64,
pycuda._driver.device_attribute.MAX_GRID_DIM_X: 65535,
pycuda._driver.device_attribute.MAX_GRID_DIM_Y: 65535,
pycuda._driver.device_attribute.MAX_GRID_DIM_Z: 1,
pycuda._driver.device_attribute.MAX_SHARED_MEMORY_PER_BLOCK: 16384,
pycuda._driver.device_attribute.TOTAL_CONSTANT_MEMORY: 65536,
pycuda._driver.device_attribute.WARP_SIZE: 32,
pycuda._driver.device_attribute.MAX_PITCH: 262144,
pycuda._driver.device_attribute.MAX_REGISTERS_PER_BLOCK: 8192,
pycuda._driver.device_attribute.CLOCK_RATE: 800000,
pycuda._driver.device_attribute.TEXTURE_ALIGNMENT: 256,
pycuda._driver.device_attribute.GPU_OVERLAP: 1,
pycuda._driver.device_attribute.MULTIPROCESSOR_COUNT: 2}
In [63]: pycuda.autoinit.device.compute_capability()
Out[63]: (1, 1)
In [65]: pycuda.autoinit.device.total_memory()
Out[65]: 133496832
In [66]: pycuda.autoinit.device.make_context()
terminate called after throwing an instance of 'cuda::error'
what(): cuMemFree failed: invalid context
Aborted
All the other tests were fine, so PyCUDA works nicely.
Could you please check where the "invalid context" error comes from?
And maybe add a memory check to your tests, so we low-mem GPU users
aren't discriminated against ;-).
Keep up the good work!
Philipp
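(Editorial note on the memory-check idea: the driver API does expose free and total device memory via cuMemGetInfo, wrapped in PyCUDA as pycuda.driver.mem_get_info() once a context exists. A test could use that to cap the largest array it allocates. A minimal, GPU-free sketch of such a guard; the function name and the 0.8 safety factor are my own invention:)

```python
def max_test_size(free_bytes, itemsize, safety=0.8):
    """Largest power-of-two element count whose buffer fits in a
    safety-scaled fraction of the free device memory.

    free_bytes would come from pycuda.driver.mem_get_info()[0]."""
    budget = int(free_bytes * safety)
    size = 1
    while size * 2 * itemsize <= budget:
        size *= 2
    return size

# On the 128 MB NVS 140M above (total_memory() == 133496832), float32:
limit = max_test_size(133496832, 4)  # 2**24 elements, a 64 MB buffer
```

The speed test could then stop its doubling loop at `limit` instead of running until cuMemAlloc fails.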
Hi Andreas,
Is there any support for functions like the CUDA profiler, which
currently will output the number of non-coalesced loads, etc.? Are these
functions available through the driver API?
Thanks,
Nicholas
I'm not sure; I saw similar behavior with empty headers when I did a
sys.exit() in some places, though in others it worked fine. I'll tell you if
I find out more. I did a tail -f, and it looks like the file is only zeroed
once, so it might be waiting for a context pop call before writing
everything; I'm not sure.
BTW, /usr/local/cuda/doc/CUDA_Profiler_*.txt has more options and details
than what Andreas listed.
regards,
Nicholas
I took care of that fact when I did it :P. I was able to fix the
> problem, though. I don't know why, but the error occurred when I replaced
>
> import pycuda.driver as cuda
> import pycuda.autoinit
>
> with
>
> import pycuda.driver as cuda
> import pycuda.autoinit
>
> cuda.init()
> assert cuda.Device.count() >= 1
>
> dev = cuda.Device(0)
> ctx = dev.make_context()
>
> Is this a bug or did I miss something in the above?
>
> Siddhardh
>
The test failed because math.ldexp expects an int or long as its second
argument; otherwise a TypeError is raised. Changing the line in
test_math.py from
self.assert_(math.ldexp(a[i], a2[i]) == b[i])
to
self.assert_(math.ldexp(a[i], int(a2[i])) == b[i])
gets it to pass.
PyCUDA: git version
Python: 2.6.1
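(The failure can be reproduced with nothing but the standard library: math.ldexp insists on an integral exponent, while the array elements come back from the GPU as floats. The literal 3.0 below is a stand-in for a2[i]:)

```python
import math

x = 0.5
exp = 3.0  # stand-in for a2[i], which .get() returns as a float

# math.ldexp requires an integral second argument; a float raises
# TypeError, which is exactly what made the test fail.
try:
    math.ldexp(x, exp)
    raised = False
except TypeError:
    raised = True

assert raised
assert math.ldexp(x, int(exp)) == 4.0  # 0.5 * 2**3
```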
Hi,
I tried the following code because I was getting very weird errors in my
main program. I create a variable ed which is initialized to zero on the
host and then copied over to the device and back to the host without any
manipulation. For some reason unknown to me, the value of the variable
changes. Also, is there a better way to copy the variable back other
than creating a numpy array for it? If I try copying it into a plain
variable, I get "TypeError: expected a writeable buffer object".
import pycuda.driver as cuda
import pycuda.autoinit

cuda.init()
assert cuda.Device.count() >= 1

dev = cuda.Device(0)
ctx = dev.make_context()

import numpy

ed = numpy.float32(0.0)

edd = cuda.mem_alloc(ed.nbytes)

cuda.memcpy_htod(edd, ed)

result = numpy.array([0.0])
cuda.memcpy_dtoh(result, edd)
print ed
print result
On running the program, I get the following output:
msid ~/python $ python cudaTest.py
0.0
[-0.00476747]
-----------------------------------------------------------
PyCuda WARNING: I'm being asked to destroy a
context that's part of the current context stack.
-----------------------------------------------------------
I will pick the next lower active context from the
context stack. Since this choice is happening
at an unspecified point in time, your code
may be making false assumptions about which
context is active at what point.
Call Context.pop() to avoid this warning.
-----------------------------------------------------------
If Python is terminating abnormally (eg. exiting upon an
unhandled exception), you may ignore this.
-----------------------------------------------------------
terminate called after throwing an instance of 'cuda::error'
what(): cuCtxPushCurrent failed: invalid value
Aborted
Could someone please tell me why this is happening? (Pardon me if this is
common knowledge.)
Thank you.
Siddhardh
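(Editorial note: two separate issues seem to be at play here, if I'm reading the code right. The garbage value is most likely a dtype mismatch: ed is a 4-byte float32, but result = numpy.array([0.0]) defaults to 8-byte float64, so the copy size and the destination size disagree and part of result's bytes are meaningless; allocating result = numpy.zeros(1, dtype=numpy.float32) should fix it. The abort at exit is probably the second context: pycuda.autoinit already creates one, so the extra dev.make_context() leaves two on the stack. A stdlib-only sketch of the size-mismatch effect, using struct as a stand-in for the device copy; no GPU needed:)

```python
import struct

# ed is a float32: 4 bytes. result = numpy.array([0.0]) is float64:
# 8 bytes. Only 4 meaningful bytes arrive in the 8-byte buffer, so
# reinterpreting it as a float64 yields garbage, not 0.25.
ed_bytes = struct.pack("<f", 0.25)  # the 4 bytes held on the device
buf = bytearray(8)                  # the float64 host destination
buf[:4] = ed_bytes                  # the mismatched copy

(garbled,) = struct.unpack("<d", bytes(buf))   # read back as float64
(roundtrip,) = struct.unpack("<f", ed_bytes)   # read back as float32

assert roundtrip == 0.25
assert garbled != 0.25
```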
Hi,
> Well, looking at the code, a priori this only happens if the header
> doesn't contain a size that is real...
Which, it seems to me, would be a simple and effective attack!!! It
absolutely needs patching.
> No, that's the compaction for the case where there was residual data
> (the next packet) left over at the end of processing the current
> packet. I'm not saying there is no bug. You'll have to send me an
> executable that crashes it systematically.
Not really an executable; it's 100% Python code, but I can send you
something if you want.
See you tomorrow
I keep getting warnings for the deprecated md5 library in Python 2.6.
/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/pycuda-0.93beta-py2.6-macosx-10.5-i386.egg/pycuda/compiler.py:23:
DeprecationWarning: the md5 module is deprecated; use hashlib instead
import md5
I lifted a patch off another site. If you change line 23 and following in
compiler.py as below, you can get rid of the warnings (sorry for not
generating a diff). I compiled and tested; it seems to work fine.
# import md5
# checksum = md5.new()
try:
    import hashlib
    checksum = hashlib.md5()
except ImportError:
    # for Python < 2.5
    import md5
    checksum = md5.new()
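(For anyone verifying the patch: hashlib.md5() really is a drop-in for md5.new(); both expose the same update()/hexdigest() interface and produce identical digests. The kernel-source string below is just an example input:)

```python
import hashlib

# Same usage pattern as compiler.py's checksum of the kernel source.
checksum = hashlib.md5()
checksum.update(b"__global__ void f() {}")
digest = checksum.hexdigest()  # 32 hex characters, as with md5.new()
```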
PyCuda 0.93 (~ 1 week old from git),
boost 1.38
python 2.6.1
OSX 10.5.6
--
Dr. Michael A. Freitas
Research Assistant Professor
Ohio State University Medical Center
906 Biomedical Research Tower
460 West 12th Avenue
Columbus, OH 43210
email (OSU): freitas.5(a)osu.edu
phone: 614-688-8432
fax: 614-688-8675
Ok, for someone new to Python (but very familiar with Perl, Java, and PHP)
and new to CUDA: I have been able to set up my test environment and get
test_driver.py and the rest to run properly.
I would like to help document (while I learn) how to use PyCUDA to build
libraries of algorithms to run in a GPU environment. My eventual goal is
to be able to integrate PyCUDA into a distributed computing management
system such as BOINC.
I'm willing to document my steps if someone can help figure out the
CUDA/PyCUDA/Python side of things.
-William
Hi, I'm invoking a kernel using pycuda.driver.Out and passing the
time_kernel=True argument to return the time in seconds. Does this
time include the device-to-host copy time?
Thanks! I'm happy to report a preliminary 165x speedup over our
existing radar imaging implementation thanks to PyCUDA, developed in 5
days.