Hi,
first of all: nice piece of work Andreas!
Now to my problem: I installed CUDA 2.1 on my T61 laptop with an
NVIDIA Quadro NVS 140M graphics card with 128MB of memory.
The git version of PyCUDA (0.93 beta, I assume) installs fine, but I get
some errors when I try to run the tests. First, test_driver:
/opt/PyCUDA/pycuda-git/test$python test_driver.py
...E....Eterminate called after throwing an instance of 'cuda::error'
what(): cuMemFree failed: invalid context
Aborted
Or running the gpuarray speed test:
/opt/PyCUDA/pycuda-git/test$python undistributed/test_gpuarray_speed.py
1024
2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
2097152
Traceback (most recent call last):
File "undistributed/test_gpuarray_speed.py", line 83, in <module>
main()
File "undistributed/test_gpuarray_speed.py", line 27, in main
b = gpuarray.zeros((size,), dtype=numpy.float32)
File
"/usr/lib/python2.5/site-packages/pycuda-0.93beta-py2.5-linux-x86_64.egg/pycuda/gpuarray.py",
line 409, in zeros
result = GPUArray(shape, dtype, stream, allocator)
File
"/usr/lib/python2.5/site-packages/pycuda-0.93beta-py2.5-linux-x86_64.egg/pycuda/gpuarray.py",
line 75, in __init__
self.gpudata = self.allocator(self.size * self.dtype.itemsize)
pycuda._driver.MemoryError: cuMemAlloc failed: out of memory
-----------------------------------------------------------
PyCUDA WARNING: I'm being asked to destroy a
context that's part of the current context stack.
-----------------------------------------------------------
I will pick the next lower active context from the
context stack. Since this choice is happening
at an unspecified point in time, your code
may be making false assumptions about which
context is active at what point.
Call Context.pop() to avoid this warning.
-----------------------------------------------------------
If Python is terminating abnormally (eg. exiting upon an
unhandled exception), you may ignore this.
-----------------------------------------------------------
test_gpuarray only threw some warnings:
python test_gpuarray.py
/usr/lib/python2.5/site-packages/pycuda-0.93beta-py2.5-linux-x86_64.egg/pycuda/gpuarray.py:498:
UserWarning: behavior change: arange guessed dtype other than float32.
suggest specifying explicit dtype.
warn("behavior change: arange guessed dtype other than float32. "
..................
----------------------------------------------------------------------
Ran 18 tests in 14.756s
OK
Here is some debug information you might need:
In [61]: pycuda.autoinit.device.get_attributes()
Out[61]:
{pycuda._driver.device_attribute.MAX_THREADS_PER_BLOCK: 512,
pycuda._driver.device_attribute.MAX_BLOCK_DIM_X: 512,
pycuda._driver.device_attribute.MAX_BLOCK_DIM_Y: 512,
pycuda._driver.device_attribute.MAX_BLOCK_DIM_Z: 64,
pycuda._driver.device_attribute.MAX_GRID_DIM_X: 65535,
pycuda._driver.device_attribute.MAX_GRID_DIM_Y: 65535,
pycuda._driver.device_attribute.MAX_GRID_DIM_Z: 1,
pycuda._driver.device_attribute.MAX_SHARED_MEMORY_PER_BLOCK: 16384,
pycuda._driver.device_attribute.TOTAL_CONSTANT_MEMORY: 65536,
pycuda._driver.device_attribute.WARP_SIZE: 32,
pycuda._driver.device_attribute.MAX_PITCH: 262144,
pycuda._driver.device_attribute.MAX_REGISTERS_PER_BLOCK: 8192,
pycuda._driver.device_attribute.CLOCK_RATE: 800000,
pycuda._driver.device_attribute.TEXTURE_ALIGNMENT: 256,
pycuda._driver.device_attribute.GPU_OVERLAP: 1,
pycuda._driver.device_attribute.MULTIPROCESSOR_COUNT: 2}
In [63]: pycuda.autoinit.device.compute_capability()
Out[63]: (1, 1)
In [65]: pycuda.autoinit.device.total_memory()
Out[65]: 133496832
In [66]: pycuda.autoinit.device.make_context()
terminate called after throwing an instance of 'cuda::error'
what(): cuMemFree failed: invalid context
Aborted
All the other tests were fine, so PyCUDA works nicely.
Could you please check where the "invalid context" error comes from?
And maybe add a memory check to your tests, so we low-mem GPU users
aren't discriminated against ;-).
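Such a check could be as simple as comparing the requested allocation
against the free memory reported by the driver (pycuda.driver.mem_get_info()
returns a (free, total) pair, if I remember the API right). A minimal
sketch, with a made-up helper name and safety factor:

```python
def fits_in_memory(needed_bytes, free_bytes, safety=0.9):
    """Return True if an allocation of needed_bytes should be attempted,
    keeping a safety margin below the reported free memory."""
    return needed_bytes <= safety * free_bytes

# Example: 200MB requested against ~100MB free on a 128MB card:
# fits_in_memory(200 * 2**20, 100 * 2**20) -> False
```

A test harness could call this before each allocation and skip (rather
than abort) the sizes that cannot fit.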
Keep up the good work!
Philipp
Hello,
I went through the PyCUDA documentation, and it seems the equivalent
function is pycuda.driver.Context.synchronize().
Is this accurate?
I need this to block on kernel calls in Python, in order to measure kernel
execution time.
thanks
Siddharth
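For reference, Context.synchronize() does block until all preceding device
work has finished, so bracketing a launch between two synchronize() calls
gives its wall-clock time. A generic sketch (the helper name is made up;
in real use one would pass pycuda.driver.Context.synchronize as the
synchronize argument):

```python
import time

def timed_call(launch, synchronize):
    """Time an asynchronous call: drain pending work, launch,
    then block until the device is done before reading the clock."""
    synchronize()            # make sure nothing else is still in flight
    tic = time.time()
    launch()                 # e.g. an asynchronous kernel launch
    synchronize()            # block until the launch has completed
    return time.time() - tic
```

pycuda.driver.Event pairs (record / synchronize / time_till) give more
precise device-side timings, but synchronize() as above is enough to keep
the measurement honest.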
Dear Andreas-
First, thank you for an extraordinary and ground breaking effort to
bring CUDA to the masses.
I have an ongoing installation issue closely related to the
problem/solution Paul Rigor generously contributed a few weeks back.
Running FC10 x86_64 with driver 180.51, CUDA 2.2, re-built Boost 1.39
after removing 1.37 (from Paul's instructions of ) and the just
released 0.93RC3. I can get through configure.py, manually fix the
siteconf.py file, and do a make install, but then when running the first
test, test_driver.py, I get the error: "ImportError:
/usr/lib64/python2.5/site-packages/pycuda-0.93rc3-py2.5-linux-x86_64.egg/pycuda/_driver.so:
undefined symbol: cuMemHostAlloc".
This was addressed in the post -- Message: 3, Date: Sun, 10 May 2009
16:46:59 -0400, From: Andreas Klöckner <lists(a)informa.tiker.net>,
Subject: Re: [PyCuda] Call for Testers: 0.93rc1 -- but I am still
somewhat unclear as to how to resolve it. As I'm not facile with C code, I
hope the list can help.
Here is the modified siteconf.py
BOOST_INC_DIR = ['/usr/local/include/boost-1_39']
BOOST_LIB_DIR = ['/usr/local/lib']
BOOST_COMPILER = 'gcc43'
BOOST_PYTHON_LIBNAME = ['boost_python-mt']
BOOST_THREAD_LIBNAME = ['boost_thread-mt']
CUDA_TRACE = False
CUDA_ENABLE_GL = False
CUDADRV_LIB_DIR = ['/usr/local/cuda/lib']
CUDADRV_LIBNAME = ['cuda']
CXXFLAGS = []
LDFLAGS = []
CUDA_ROOT = '/usr/local/cuda'
I've also tried combinations eliminating the entries for BOOST_INC_DIR
and BOOST_LIB_DIR with no luck.
TIA for any trailheads.
Regards,
V.
--
Vince Fulco, CFA, CAIA
612.424.5477 (universal)
vfulco1(a)gmail.com
Hi,
> I have an ongoing installation issue closely related to the
> problem/solution Paul Rigor generously contributed a few weeks back.
> Running FC10 x86_64 with driver 180.51, CUDA 2.2, re-built Boost 1.39
> after removing 1.37 (from Paul's instructions of ) and the just
> released 0.93RC3. I can get thru configure.py, manually fix the
> siteconf.py file, do a make install and then when running the first
> test, test_driver.py, get the error, "ImportError:
> /usr/lib64/python2.5/site-packages/pycuda-0.93rc3-py2.5-linux-x86_64.egg/py
>cuda/_driver.so: undefined symbol: cuMemHostAlloc".
That's the same issue I had ~2 weeks ago - you are using driver 180.51, but
CUDA 2.2 requires 185.18.08.
cuMemHostAlloc is one of the additional symbols introduced with CUDA 2.2
(and PyCUDA therefore expects it).
So the solution seems to be either to stick with CUDA 2.1, or to install
the NVIDIA driver recommended for CUDA 2.2.
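You can verify this without reinstalling anything: the driver library
either exports cuMemHostAlloc or it doesn't. A quick ctypes probe (a rough
sketch; library name resolution differs between platforms, hence the
None fallback):

```python
import ctypes
import ctypes.util

def driver_has_symbol(symbol, libname="cuda"):
    """Return True/False depending on whether the shared library exports
    `symbol`, or None if the library cannot be found or loaded."""
    path = ctypes.util.find_library(libname)
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None
    return hasattr(lib, symbol)

# With driver 180.51 this should report False for "cuMemHostAlloc";
# with 185.18.08 it should report True.
```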
Vincent
--
Vincent Favre-Nicolin
CEA/Grenoble Institut Nanosciences & Cryogénie -
http://inac.cea.fr/
Université Joseph Fourier - http://physique-eea.ujf-grenoble.fr/
tél: (+33) 4 38 78 95 40 fax: (+33) 4 38 78 51 38
Would it be possible to change both the Device.get_attribute() and the
Function.get_attribute() calls to make them Python attributes instead
of using function-and-flag? It seems like this could avoid the
inconsistency between the two.
I agree with Ian - aesthetically it doesn't seem optimal to echo the C
API exactly in PyCUDA, when Python gives us much richer possibilities.
- bryan
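To make the contrast concrete, the two shapes being debated look roughly
like this in plain Python (hypothetical classes, just to illustrate the
API styles, not actual PyCUDA code):

```python
class FlagStyleDevice:
    """Mirrors the C API: a method taking a magic flag constant."""
    MAX_THREADS_PER_BLOCK = 1

    def __init__(self, attributes):
        self._attributes = attributes

    def get_attribute(self, flag):
        return self._attributes[flag]


class AttrStyleDevice:
    """The Pythonic alternative: the same data as plain attributes."""
    def __init__(self, attributes):
        # e.g. MAX_THREADS_PER_BLOCK -> self.max_threads_per_block
        for name, value in attributes.items():
            setattr(self, name.lower(), value)


flag_dev = FlagStyleDevice({FlagStyleDevice.MAX_THREADS_PER_BLOCK: 512})
attr_dev = AttrStyleDevice({"MAX_THREADS_PER_BLOCK": 512})
```

The attribute style reads as `attr_dev.max_threads_per_block` instead of
`flag_dev.get_attribute(FlagStyleDevice.MAX_THREADS_PER_BLOCK)`.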
On Tue, May 19, 2009 at 7:30 PM, Ian Cullinan <Ian.Cullinan(a)nicta.com.au> wrote:
> Whoops, forgot to send to the list. I'll remember one day.
>
> Ian
> ________________________________________
> From: Andreas Klöckner [lists(a)informa.tiker.net]
> Sent: Wednesday, 20 May 2009 11:43 AM
> To: Ian Cullinan
> Subject: Re: [PyCuda] Why are Function.lmem, Function.smem, and Function.registers deprecated?
>
> I'm torn. Doing it the attribute way here would create an inconsistency with
> Device.get_attribute(), too. Any more opinions?
>
> Andreas
>
> On Tuesday, 19 May 2009, you wrote:
>> IMHO, it seems somewhat un-Pythonic to have to call a method (with a
>> magic-number-flag argument, no less) to access an attribute of an object,
>> when it could be just an attribute. I don't see any advantage to the
>> function-call way of doing things except that it more closely mirrors the
>> CUDA API - but the API looks like that because of a limitation of C, not
>> because it's a good way to do things.
>>
>> If you want to have only one way to do it, drop the ugly function-and-flags
>> way.
>>
>> Just my 2c,
>>
>> Ian Cullinan
>> ________________________________________
>> From: pycuda-bounces(a)tiker.net [pycuda-bounces(a)tiker.net] On Behalf Of
>> Andreas Klöckner [lists(a)informa.tiker.net] Sent: Wednesday, 20 May 2009
>> 9:10 AM
>> To: pycuda(a)tiker.net
>> Subject: Re: [PyCuda] Why are Function.lmem, Function.smem, and
>> Function.registers deprecated?
>>
>> On Tuesday, 19 May 2009, Bryan Catanzaro wrote:
>> > I was browsing the documentation and saw the note that
>> > pycuda.driver.Function.registers, etc. are deprecated and will be
>> > removed in PyCuda 0.94. That makes me a little sad, as that
>> > information is very useful to one of the projects I'm working on. The
>> > implication in the documentation is that this is Cuda 2.2's fault.
>> > But I'm a little confused as to why - the .cubin files produced by my
>> > nvcc 2.2 compiler still have that information, so I must be missing
>> > something important here... What changed to make these useful
>> > attributes deprecated?
>>
>> No, it's different--things are getting better, not worse! :)
>>
>> CUDA 2.2 introduces an official API to find these values:
>> http://is.gd/Bscw
>>
>> So, if you're running 0.93 against 2.2, you'll get a deprecation warning
>> for using .registers etc. The goal is to keep with the Zen of Python:
>>
>> There should be one-- and preferably only one --obvious way to do it.
>>
>> Since CUDA 2.2 brought us a second way, the first one gets deprecated. If
>> this meets enough resistance, I guess I could be convinced to keep
>> .registers et al around. I'd rather not though. On 2.2, all .registers does
>> now is call the new API anyway.
>>
>> I've added a note to the docs about what the "new way" is.
>>
>> Andreas
>
>
>
> _______________________________________________
> PyCuda mailing list
> PyCuda(a)tiker.net
> http://tiker.net/mailman/listinfo/pycuda_tiker.net
>
Hi all,
Since there were a few issues reported with 0.93rc2, I've just rolled 0.93rc3.
Here's hoping this version will also become 0.93. To make sure we get a solid
0.93 out, please test:
http://pypi.python.org/pypi/pycuda/0.93rc3
Thanks for all your dedication in reporting issues in these release
candidates. Rock on. :)
Andreas
Hello,
I was able to successfully compile and install pycuda-0.93rc2.
However, when I try to run the test driver, I get the following (I
apologize in advance for all the output):
python pycuda-0.93rc2/test/test_driver.py
(Sat,May30)
EE..EE....EEEEEE
======================================================================
ERROR: test_2d_texture (__main__.TestCuda)
----------------------------------------------------------------------
Traceback (most recent call last):
File "pycuda-0.93rc2/test/test_driver.py", line 145, in
test_2d_texture
""")
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 180, in __init__
arch, code, cache_dir, include_dirs)
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 165, in compile
include_dirs = include_dirs + [_find_pycuda_include_path()]
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 126, in _find_pycuda_include_path
raise RuntimeError("could not find path PyCUDA's C header files")
RuntimeError: could not find path PyCUDA's C header files
======================================================================
ERROR: test_3d_texture (__main__.TestCuda)
----------------------------------------------------------------------
Traceback (most recent call last):
File "pycuda-0.93rc2/test/test_driver.py", line 349, in test_3d_texture
""")
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 180, in __init__
arch, code, cache_dir, include_dirs)
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 165, in compile
include_dirs = include_dirs + [_find_pycuda_include_path()]
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 126, in _find_pycuda_include_path
raise RuntimeError("could not find path PyCUDA's C header files")
RuntimeError: could not find path PyCUDA's C header files
======================================================================
ERROR: test_gpuarray (__main__.TestCuda)
----------------------------------------------------------------------
Traceback (most recent call last):
File "pycuda-0.93rc2/test/test_driver.py", line 117, in test_gpuarray
diff = (a_g-3*b_g+(-a_g)).get() - (a-3*b+(-a))
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/gpuarray.py",
line 281, in __rmul__
return self._axpbz(scalar, 0, result)
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/gpuarray.py",
line 164, in _axpbz
func = elementwise.get_axpbz_kernel(self.dtype)
File "<string>", line 1, in <lambda>
File
"/usr/local/lib/python2.6/dist-packages/pytools-9-py2.6.egg/pytools/__init__.py",
line 131, in memoize
result = func(*args)
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/elementwise.py",
line 234, in get_axpbz_kernel
"axpb")
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/elementwise.py",
line 85, in get_elwise_kernel
arguments, operation, name, keep, options)
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/elementwise.py",
line 70, in get_elwise_kernel_and_types
keep, options)
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/elementwise.py",
line 64, in get_elwise_module
options=options, keep=keep)
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 180, in __init__
arch, code, cache_dir, include_dirs)
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 165, in compile
include_dirs = include_dirs + [_find_pycuda_include_path()]
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 126, in _find_pycuda_include_path
raise RuntimeError("could not find path PyCUDA's C header files")
RuntimeError: could not find path PyCUDA's C header files
======================================================================
ERROR: test_large_smem (__main__.TestCuda)
----------------------------------------------------------------------
Traceback (most recent call last):
File "pycuda-0.93rc2/test/test_driver.py", line 245, in test_large_smem
""" % n)
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 180, in __init__
arch, code, cache_dir, include_dirs)
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 165, in compile
include_dirs = include_dirs + [_find_pycuda_include_path()]
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 126, in _find_pycuda_include_path
raise RuntimeError("could not find path PyCUDA's C header files")
RuntimeError: could not find path PyCUDA's C header files
======================================================================
ERROR: test_multichannel_2d_texture (__main__.TestCuda)
----------------------------------------------------------------------
Traceback (most recent call last):
File "pycuda-0.93rc2/test/test_driver.py", line 211, in
test_multichannel_2d_texture
""")
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 180, in __init__
arch, code, cache_dir, include_dirs)
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 165, in compile
include_dirs = include_dirs + [_find_pycuda_include_path()]
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 126, in _find_pycuda_include_path
raise RuntimeError("could not find path PyCUDA's C header files")
RuntimeError: could not find path PyCUDA's C header files
======================================================================
ERROR: test_multiple_2d_textures (__main__.TestCuda)
----------------------------------------------------------------------
Traceback (most recent call last):
File "pycuda-0.93rc2/test/test_driver.py", line 176, in
test_multiple_2d_textures
""")
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 180, in __init__
arch, code, cache_dir, include_dirs)
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 165, in compile
include_dirs = include_dirs + [_find_pycuda_include_path()]
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 126, in _find_pycuda_include_path
raise RuntimeError("could not find path PyCUDA's C header files")
RuntimeError: could not find path PyCUDA's C header files
======================================================================
ERROR: test_prepared_invocation (__main__.TestCuda)
----------------------------------------------------------------------
Traceback (most recent call last):
File "pycuda-0.93rc2/test/test_driver.py", line 372, in
test_prepared_invocation
""")
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 180, in __init__
arch, code, cache_dir, include_dirs)
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 165, in compile
include_dirs = include_dirs + [_find_pycuda_include_path()]
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 126, in _find_pycuda_include_path
raise RuntimeError("could not find path PyCUDA's C header files")
RuntimeError: could not find path PyCUDA's C header files
======================================================================
ERROR: test_simple_kernel (__main__.TestCuda)
----------------------------------------------------------------------
Traceback (most recent call last):
File "pycuda-0.93rc2/test/test_driver.py", line 33, in test_simple_kernel
""")
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 180, in __init__
arch, code, cache_dir, include_dirs)
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 165, in compile
include_dirs = include_dirs + [_find_pycuda_include_path()]
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 126, in _find_pycuda_include_path
raise RuntimeError("could not find path PyCUDA's C header files")
RuntimeError: could not find path PyCUDA's C header files
======================================================================
ERROR: test_simple_kernel_2 (__main__.TestCuda)
----------------------------------------------------------------------
Traceback (most recent call last):
File "pycuda-0.93rc2/test/test_driver.py", line 54, in
test_simple_kernel_2
""")
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 180, in __init__
arch, code, cache_dir, include_dirs)
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 165, in compile
include_dirs = include_dirs + [_find_pycuda_include_path()]
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 126, in _find_pycuda_include_path
raise RuntimeError("could not find path PyCUDA's C header files")
RuntimeError: could not find path PyCUDA's C header files
======================================================================
ERROR: test_streamed_kernel (__main__.TestCuda)
----------------------------------------------------------------------
Traceback (most recent call last):
File "pycuda-0.93rc2/test/test_driver.py", line 89, in
test_streamed_kernel
""")
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 180, in __init__
arch, code, cache_dir, include_dirs)
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 165, in compile
include_dirs = include_dirs + [_find_pycuda_include_path()]
File
"/usr/local/lib/python2.6/dist-packages/pycuda-0.93rc2-py2.6-linux-x86_64.egg/pycuda/compiler.py",
line 126, in _find_pycuda_include_path
raise RuntimeError("could not find path PyCUDA's C header files")
RuntimeError: could not find path PyCUDA's C header files
----------------------------------------------------------------------
Ran 16 tests in 0.104s
FAILED (errors=10)
It seems 10 of the 16 tests fail with this "RuntimeError: could not find
path PyCUDA's C header files". Any ideas?
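For what it's worth, the failing check just probes a list of candidate
directories for PyCUDA's bundled headers. Something along these lines
reproduces the probe by hand (the marker header name here is illustrative,
not necessarily what compiler.py actually looks for):

```python
import os

def find_include_path(candidates, marker="pycuda-complex.hpp"):
    """Return the first candidate directory containing the marker header,
    mimicking the search that raises the RuntimeError above."""
    for directory in candidates:
        if os.path.isfile(os.path.join(directory, marker)):
            return directory
    raise RuntimeError("could not find path PyCUDA's C header files")
```

Running a probe like this over the egg's install location would show
whether the headers were left out of the install entirely or are just in
an unexpected place.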
Thanks,
Rob
Here are the results I get:
0.865973949432 CPU
0.582780122757 GPU
I kind of expected more... (the GPU is a GTX280)
Of course, I never rule out that I did something stupid; in fact, I
expect it...
Is this the acceleration I should expect from this kind of matrix
operation? If yes, well, cool... I guess.
If not, did I miss something?
Here is the code I use:
import pycuda.gpuarray as gpuarray
import pycuda.driver as cuda
import pycuda.autoinit
import numpy
import time
a = numpy.random.randn(10000, 10000)
tic=time.time()
a_square=a*a
toc=time.time()-tic
print toc,"CPU"
a_gpu = gpuarray.to_gpu(a.astype(numpy.float32))
tic=time.time()
a_squared = (a_gpu*a_gpu).get()
toc=time.time()-tic
print toc,"GPU"
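A fairer comparison would warm up first (absorbing one-time costs such as
kernel compilation) and take the best of several runs, keeping transfers
either inside or outside the timed region on both sides. A small host-side
harness along those lines (pure Python, hypothetical names):

```python
import time

def benchmark(fn, warmup=1, repeats=5):
    """Return the best wall-clock time of `fn` over several runs,
    after warm-up calls that absorb one-time setup costs."""
    for _ in range(warmup):
        fn()
    best = float("inf")
    for _ in range(repeats):
        tic = time.time()
        fn()
        best = min(best, time.time() - tic)
    return best
```

E.g. compare benchmark(lambda: a * a) against
benchmark(lambda: (a_gpu * a_gpu).get()), with both arrays in float32, as
two like-for-like measurements.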
Hua Wong wrote:
> Darn... you are right... Back to square one.
>
> Per B. Sederberg wrote:
>> I'm not sure your code is doing what you mean it to. I get totally
>> different results when running a dot product of two gpuarrays. Did
>> you check the output to show that it was doing what you expect? I'm
>> actually surprised it ran at all.
>>
>> As far as I know, you can't simply replace numpy arrays with gpuarrays
>> in any numpy method, but I would love to be wrong about this...
>>
>> Best,
>> Per
>>
>> On Thu, May 28, 2009 at 10:34 AM, Hua Wong <hua.wong(a)pasteur.fr> wrote:
>>
>>> Following Andreas remark, I replaced the following in the code :
>>>
>>> import pycuda.gpuarray as gpuarray
>>> import pycuda.driver as cuda
>>> import pycuda.autoinit
>>> import numpy
>>> import time
>>>
>>> a=numpy.float32(numpy.random.randn(4000,4000))
>>> b=numpy.float32(numpy.random.randn(4000,4000))
>>>
>>> tic=time.time()
>>> axb=numpy.dot(a,b) # I assume this time it is matrix multiplication,
>>> according to numpy tutorials I've read...
>>> toc=time.time()-tic
>>> print toc,"CPU"
>>>
>>>
>>> tic=time.time()
>>> a_gpu = gpuarray.to_gpu(a)
>>> b_gpu = gpuarray.to_gpu(b)
>>> axbGPU = (numpy.dot(a_gpu,b_gpu)).get() # ditto here
>>> toc=time.time()-tic
>>> print toc,"GPU"
>>>
>>> Here are the results I get :
>>> 2.06739115715 CPU
>>> 0.171211004257 GPU
>>>
>>> It speeds up the calculation 11 times :)
>>> But I can't try bigger matrices, I lack RAM :(
>>>
>>> Hua Wong wrote:
>>>
>>>> Thanks, I'm also puzzled by the results because I thought a 1e4*1e4
>>>> matrix
>>>> was already ginormous...
>>>>
>>>> I expected something like a 49 time speedup like in the
>>>> test_gpuarray_speed_random.py (size ~16000000 give a x49 speedup).
>>>>
>>>> So I guess I'm doing something wrong somewhere. I will check the test
>>>> script...
>>>>
>>>> Getting :
>>>> 0.46285700798 CPU
>>>> 0.728541851044 GPU
>>>>
>>>> with your code on a CentOS machine, with a GTX280 and 2x quad core
>>>> E5410
>>>>
>>>> Per B. Sederberg wrote:
>>>>
>>>>> I modified your code slightly to make it so you are comparing apples
>>>>> to apples a bit better and I'm getting even worse performance for the
>>>>> GPU (GTX285 on Debian Testing):
>>>>>
>>>>> 0.652935028076 CPU
>>>>> 1.61081981659 GPU
>>>>>
>>>>> Here's the new code, which puts the sending and receiving of the data
>>>>> to/from the card in the loop and also has the CPU perform a float32
>>>>> operation just like the GPU:
>>>>>
>>>>> import pycuda.gpuarray as gpuarray
>>>>> import pycuda.driver as cuda
>>>>> import pycuda.autoinit
>>>>> import numpy
>>>>> import time
>>>>>
>>>>> a=numpy.float32(numpy.random.randn(1e4,1e4))
>>>>>
>>>>> tic=time.time()
>>>>> a_square=a*a
>>>>> toc=time.time()-tic
>>>>> print toc,"CPU"
>>>>>
>>>>>
>>>>> tic=time.time()
>>>>> a_gpu = gpuarray.to_gpu(a)
>>>>> a_squared = (a_gpu*a_gpu).get()
>>>>> toc=time.time()-tic
>>>>> print toc,"GPU"
>>>>>
>>>>> It looks like you'll need to have even larger matrices before you'll
>>>>> see a major GPU benefit, though I'm a bit surprised by these results.
>>>>>
>>>>> Best,
>>>>> Per
>>>>>
>>>>>
>>>>> On Thu, May 28, 2009 at 6:55 AM, Hua Wong <hua.wong(a)pasteur.fr>
>>>>> wrote:
>>>>>
>>>>>
>>>>>> Here is the results I get
>>>>>> 0.865973949432 CPU
>>>>>> 0.582780122757 GPU
>>>>>>
>>>>>> I kind of expected more... (the GPU is a GTX280)
>>>>>>
>>>>>> Of course, I never exclude that I did something stupid, in fact, I
>>>>>> expect
>>>>>> it...
>>>>>> Is it the acceleration I should expect from this kind of matrix
>>>>>> operation?
>>>>>> If yes, well cool... I guess.
>>>>>> If not, did I miss something?
>>>>>>
>>>>>> Here is the code I use :
>>>>>>
>>>>>> import pycuda.gpuarray as gpuarray
>>>>>> import pycuda.driver as cuda
>>>>>> import pycuda.autoinit
>>>>>> import numpy
>>>>>> import time
>>>>>>
>>>>>> a=numpy.random.randn(1e4,1e4)
>>>>>>
>>>>>> tic=time.time()
>>>>>> a_square=a*a
>>>>>> toc=time.time()-tic
>>>>>> print toc,"CPU"
>>>>>>
>>>>>> a_gpu = gpuarray.to_gpu(a.astype(numpy.float32))
>>>>>>
>>>>>> tic=time.time()
>>>>>> a_squared = (a_gpu*a_gpu).get()
>>>>>> toc=time.time()-tic
>>>>>> print toc,"GPU"
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> PyCuda mailing list
>>>>>> PyCuda(a)tiker.net
>>>>>> http://tiker.net/mailman/listinfo/pycuda_tiker.net
>>>>>>
>>>>>>
>>>>>>
>>>>
>>> _______________________________________________
>>> PyCuda mailing list
>>> PyCuda(a)tiker.net
>>> http://tiker.net/mailman/listinfo/pycuda_tiker.net
>>>
>>>
>
>
It says "OK" at the end, so I should be golden (o rly?).
I'd just like to know what the warning is about (maybe it's something I
can correct), something about sqlite?
What does it do?
python test_driver.py
/work/hwong/site-packages/lib64/python2.4/site-packages/pytools-9-py2.4.egg/pytools/diskdict.py:119:
UserWarning: DiskDict will memory-only: a usable version of sqlite was
not found.
warnings.warn("DiskDict will memory-only: "
...............
----------------------------------------------------------------------
Ran 15 tests in 6.295s
OK
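To narrow it down, I can at least check which sqlite binding (if any) my
Python can import; a quick sketch (the module names are the usual
Python 2.4-era candidates, the stdlib sqlite3 and the older external
pysqlite2):

```python
def sqlite_available():
    """Report which usable sqlite binding can be imported, if any:
    the stdlib module first, then the old external one."""
    for module in ("sqlite3", "pysqlite2.dbapi2"):
        try:
            __import__(module)
            return module
        except ImportError:
            pass
    return None
```

If this returns None, DiskDict presumably has nothing to persist its cache
with and falls back to the memory-only mode the warning describes.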