Hey everyone,
Out of curiosity, has anyone had any success with OpenGL in pycuda?
I have searched through the past messages on the list that I could find, and
it did not seem that the issue was ever resolved. Does anyone have any
insight?
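One thing worth checking first: OpenGL interop is a compile-time switch in PyCUDA's build configuration (the CUDA_ENABLE_GL flag, visible in a siteconf.py quoted later in this thread). A sketch of the change, hedged since the exact behavior depends on your PyCUDA version:

```python
# siteconf.py fragment: enable PyCUDA's GL interop wrapper so that
# pycuda.gl becomes importable. Rebuild and reinstall PyCUDA afterwards.
CUDA_ENABLE_GL = True
```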
Thanks,
>>>Chris Heuser
Hi, I'm surprised that Google isn't turning up anything on this topic,
because I thought it had been discussed before, so my apologies if it
has.
I'm trying the NVIDIA CUDA Visual Profiler (v 2.2.05) in Windows XP
with a fairly recent PyCUDA git, on CUDA 2.2
(pycuda.driver.get_driver_version() returns 2020).
I provide the Visual Profiler with a Windows batch file that calls
"python my_pycuda_script.py -some -flags", but the Visual Profiler
(after running the script 4 times) reports only two "memcopy" method
entries. All other counters are zero (so they aren't displayed in the
table), and changing which counters are enabled makes no difference.
Any assistance would be much appreciated. My application runs only
~10% faster on a Tesla C1060 than a G80 Quadro (despite having twice
as many MPs) so I'm hoping the profiler will help me understand why.
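As a cross-check against the Visual Profiler, the CUDA 2.x driver also ships a command-line profiler controlled by environment variables. A sketch (the variable names are from NVIDIA's profiler documentation; the counter names in the config file are examples to adapt):

```shell
# Enable the CUDA command-line profiler before launching the script;
# per-kernel timings land in the log file named below.
export CUDA_PROFILE=1
export CUDA_PROFILE_LOG=cuda_profile.log
# Optional: a config file listing which hardware counters to record,
# one per line (e.g. gld_incoherent, gst_incoherent).
export CUDA_PROFILE_CONFIG=profile_config.txt
# Then run the same command the Visual Profiler's batch file uses:
# python my_pycuda_script.py -some -flags
```

If the command-line profiler records kernel launches but the Visual Profiler does not, the problem is likely in how the GUI invokes the batch file rather than in PyCUDA itself.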
Thanks!
Ahmed
I am trying to get .94 beta up on my Mac, and despite a seemingly successful
install, I get the following error:
Fatal Python error: Interpreter not initialized (version mismatch?)
Abort trap
After some googling, I gather that this is a common error for Mac
builds, caused by linking against the wrong (system) Python. But despite
trying a few suggested fixes, I have not been able to solve the problem.
Any suggestions?
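A quick diagnostic, since "Interpreter not initialized" usually means _driver.so was built against a different Python than the one running it: print identifying details of the interpreter you actually invoke, then compare them with the Python framework that _driver.so links to (via `otool -L` on the extension; path not shown here). This is a stdlib-only sketch:

```python
# Print identifying details of the Python actually running; compare them
# with the Python framework that _driver.so links against (otool -L).
import sys

interp = {
    "executable": sys.executable,   # which binary is running
    "version": sys.version.split()[0],
    "prefix": sys.prefix,           # which installation it belongs to
}
for key in ("executable", "version", "prefix"):
    print("%s: %s" % (key, interp[key]))
```

If the prefix points at the system Python but the egg was built for the python.org framework (or vice versa), that mismatch is the likely culprit.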
Thank you,
>>>Chris Heuser
Alan,
I'm forwarding your message to the PyCUDA mailing list, maybe someone there
has an idea.
Andreas
---------- Forwarded Message ----------
Subject: Re: about pyopencl
Date: Monday, 27 July 2009
From: Alan <alanwilter(a)gmail.com>
To: Andreas Klöckner <kloeckner(a)dam.brown.edu>
Hi Andreas,
Thank you very much for your comments.
I did try PyCUDA first. After some struggle to get Boost to work (I use
Fink x86_64), I did this for PyCUDA:
./configure.py --boost-inc-dir=/sw/include --boost-lib-dir=/sw/lib \
    --boost-thread-libname=boost_thread-mt \
    --boost-python-libname=boost_python-mt
(I edited my siteconf.py:
BOOST_INC_DIR = ['/sw/include']
BOOST_LIB_DIR = ['/sw/lib']
BOOST_COMPILER = 'gcc-4.2'
BOOST_PYTHON_LIBNAME = ['boost_python-mt']
BOOST_THREAD_LIBNAME = ['boost_thread-mt']
CUDA_TRACE = False
CUDA_ENABLE_GL = False
CUDADRV_LIB_DIR = []
CUDADRV_LIBNAME = ['cuda']
CXXFLAGS = ['-m64']
LDFLAGS = []
)
make
[snip]
g++ -L/sw/lib -bundle -L/sw/lib/python2.6/config -lpython2.6
build/temp.macosx-10.5-i386-2.6/src/cpp/cuda.o
build/temp.macosx-10.5-i386-2.6/src/cpp/bitlog.o
build/temp.macosx-10.5-i386-2.6/src/wrapper/wrap_cudadrv.o
build/temp.macosx-10.5-i386-2.6/src/wrapper/mempool.o -L/sw/lib
-L/usr/local/cuda/lib -lboost_python-mt -lboost_thread-mt -lcuda -o
build/lib.macosx-10.5-i386-2.6/pycuda/_driver.so -arch i386
ld warning: in /sw/lib/python2.6/config/libpython2.6.dylib, file is not of
required architecture
ld warning: in build/temp.macosx-10.5-i386-2.6/src/cpp/cuda.o, file is not
of required architecture
ld warning: in build/temp.macosx-10.5-i386-2.6/src/cpp/bitlog.o, file is not
of required architecture
ld warning: in build/temp.macosx-10.5-i386-2.6/src/wrapper/wrap_cudadrv.o,
file is not of required architecture
ld warning: in build/temp.macosx-10.5-i386-2.6/src/wrapper/mempool.o, file
is not of required architecture
ld warning: in /sw/lib/libboost_python-mt.dylib, file is not of required
architecture
ld warning: in /sw/lib/libboost_thread-mt.dylib, file is not of required
architecture
I know these warnings all too well, and as expected, "make tests"
failed. Here's why:
otool -L build/lib.macosx-10.5-i386-2.6/pycuda/_driver.so
build/lib.macosx-10.5-i386-2.6/pycuda/_driver.so:
/usr/local/cuda/lib/libcuda.dylib (compatibility version 1.1.0, current
version 2.3.0)
/usr/lib/libstdc++.6.dylib (compatibility version 7.0.0, current version
7.4.0)
/usr/lib/libgcc_s.1.dylib (compatibility version 1.0.0, current version
1.0.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version
111.1.4)
file /usr/local/cuda/lib/libcuda.dylib
/usr/local/cuda/lib/libcuda.dylib: Mach-O dynamically linked shared library
i386
All the others are x86_64. So basically my problem is that libcuda is
32-bit. I guess I'll be stuck at this dead end until Nvidia releases a
64-bit CUDA for Mac, and maybe that will only happen once Mac OS X
becomes 64-bit only...
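A quick way to confirm the mismatch from the Python side (stdlib-only sketch): check the pointer size of the running interpreter and compare it with what `file` reports for libcuda.dylib. A 64-bit Python cannot load an i386-only dylib:

```python
# A 32-bit process reports 4-byte pointers, a 64-bit one 8-byte;
# a 64-bit Python process cannot load the i386-only libcuda.dylib.
import struct

bits = struct.calcsize("P") * 8
print("Python is running as a %d-bit process" % bits)
```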
Cheers,
Alan
On Fri, Jul 24, 2009 at 10:50, Andreas Klöckner
<kloeckner(a)dam.brown.edu> wrote:
> Hi Alan,
>
> On Friday, 24 July 2009, you wrote:
> > I am about to start playing with PyCUDA, but I am keen to consider
> > OpenCL, and when I tried to find any information about pyopencl I
> > got nothing. Since your name is behind both projects, I was
> > wondering if pyopencl is no longer being developed.
>
> It's not that PyOpenCL isn't developed any more--I just haven't found
> time to grow it beyond its embryonic stage just yet, but it'll happen.
> Funnily, a different project called PyOpenCL has cropped up [1], which
> is a minimal, but functional, ctypes-based wrapper. I've gotten in
> touch with its author regarding working together or at least changing
> one of the names, but so far haven't gotten any response. Since I
> personally don't like ctypes for big-ish wrappers, I'll continue to
> develop "my" PyOpenCL as-is. Contributions to the code would be more
> than welcome.
>
> [1] http://pyopencl.next-touch.com/
>
> > I see CUDA has been ahead but for the future and the good of GPU
> > applications I believe opencl has to be the way and I am afraid of
> > developing my applications in CUDA only.
>
> While OpenCL promises vendor independence, I believe that switching
> devices will still involve significant code changes, so in that sense
> you're committing yourself to a device (and hence a vendor) anyway.
> Relatedly, it appears that Nvidia is the only vendor with a credible
> CL story right now, and it's plausible that they'll make sure CUDA
> stays 'ahead' in some sense on their own hardware--be that in terms
> of features or performance. If "just wait for a year or two and see
> what happens" is a valid choice for you, that may be what you want to
> do, but if you want to do GPU computing right now, CUDA is probably
> the best choice.
>
> HTH
> Andreas
>
> --
> Andreas Kloeckner
> Applied Mathematics, Brown University
> http://www.dam.brown.edu/people/kloeckner
> +1-401-648-0599
>
--
Alan Wilter S. da Silva, D.Sc. - CCPN Research Associate
Department of Biochemistry, University of Cambridge.
80 Tennis Court Road, Cambridge CB2 1GA, UK.
>>http://www.bio.cam.ac.uk/~awd28<<
-------------------------------------------------------
Hi,
first of all: nice piece of work Andreas!
Now to my problem: I installed CUDA 2.1 on my T61 laptop computer with a
Nvidia Quadro NVS 140M graphics card w/ 128MB memory.
The git version of PyCUDA (0.93 beta, I assume) installs fine, but I get
some errors when I try to run the tests. First, test_driver:
/opt/PyCUDA/pycuda-git/test$python test_driver.py
...E....Eterminate called after throwing an instance of 'cuda::error'
what(): cuMemFree failed: invalid context
Aborted
Or running the gpuarray speed test:
/opt/PyCUDA/pycuda-git/test$python undistributed/test_gpuarray_speed.py
1024
2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
2097152
Traceback (most recent call last):
File "undistributed/test_gpuarray_speed.py", line 83, in <module>
main()
File "undistributed/test_gpuarray_speed.py", line 27, in main
b = gpuarray.zeros((size,), dtype=numpy.float32)
File
"/usr/lib/python2.5/site-packages/pycuda-0.93beta-py2.5-linux-x86_64.egg/pycuda/gpuarray.py",
line 409, in zeros
result = GPUArray(shape, dtype, stream, allocator)
File
"/usr/lib/python2.5/site-packages/pycuda-0.93beta-py2.5-linux-x86_64.egg/pycuda/gpuarray.py",
line 75, in __init__
self.gpudata = self.allocator(self.size * self.dtype.itemsize)
pycuda._driver.MemoryError: cuMemAlloc failed: out of memory
-----------------------------------------------------------
PyCUDA WARNING: I'm being asked to destroy a
context that's part of the current context stack.
-----------------------------------------------------------
I will pick the next lower active context from the
context stack. Since this choice is happening
at an unspecified point in time, your code
may be making false assumptions about which
context is active at what point.
Call Context.pop() to avoid this warning.
-----------------------------------------------------------
If Python is terminating abnormally (eg. exiting upon an
unhandled exception), you may ignore this.
-----------------------------------------------------------
test_gpuarray only threw some warnings:
python test_gpuarray.py
/usr/lib/python2.5/site-packages/pycuda-0.93beta-py2.5-linux-x86_64.egg/pycuda/gpuarray.py:498:
UserWarning: behavior change: arange guessed dtype other than float32.
suggest specifying explicit dtype.
warn("behavior change: arange guessed dtype other than float32. "
..................
----------------------------------------------------------------------
Ran 18 tests in 14.756s
OK
Here is some debug information you might need:
In [61]: pycuda.autoinit.device.get_attributes()
Out[62]:
{pycuda._driver.device_attribute.MAX_THREADS_PER_BLOCK: 512,
pycuda._driver.device_attribute.MAX_BLOCK_DIM_X: 512,
pycuda._driver.device_attribute.MAX_BLOCK_DIM_Y: 512,
pycuda._driver.device_attribute.MAX_BLOCK_DIM_Z: 64,
pycuda._driver.device_attribute.MAX_GRID_DIM_X: 65535,
pycuda._driver.device_attribute.MAX_GRID_DIM_Y: 65535,
pycuda._driver.device_attribute.MAX_GRID_DIM_Z: 1,
pycuda._driver.device_attribute.MAX_SHARED_MEMORY_PER_BLOCK: 16384,
pycuda._driver.device_attribute.TOTAL_CONSTANT_MEMORY: 65536,
pycuda._driver.device_attribute.WARP_SIZE: 32,
pycuda._driver.device_attribute.MAX_PITCH: 262144,
pycuda._driver.device_attribute.MAX_REGISTERS_PER_BLOCK: 8192,
pycuda._driver.device_attribute.CLOCK_RATE: 800000,
pycuda._driver.device_attribute.TEXTURE_ALIGNMENT: 256,
pycuda._driver.device_attribute.GPU_OVERLAP: 1,
pycuda._driver.device_attribute.MULTIPROCESSOR_COUNT: 2}
In [63]: pycuda.autoinit.device.compute_capability()
Out[63]: (1, 1)
In [65]: pycuda.autoinit.device.total_memory()
Out[65]: 133496832
In [66]: pycuda.autoinit.device.make_context()
terminate called after throwing an instance of 'cuda::error'
what(): cuMemFree failed: invalid context
Aborted
All the other tests were fine, so PyCUDA works nicely.
Could you please check where the "invalid context" error comes from?
And maybe add a memory check to your tests, so we low-mem GPU users
aren't discriminated against ;-).
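Along the lines of that suggestion, here is a GPU-free sketch of such a check: cap the benchmark's largest array at a fraction of the device's total memory instead of growing until cuMemAlloc fails. The function is hypothetical glue; on a real device you would pass pycuda.autoinit.device.total_memory():

```python
# Sketch: cap a benchmark's largest float32 array so it uses at most a
# fraction of device memory, instead of growing until cuMemAlloc fails.
# total_memory stands in for pycuda.autoinit.device.total_memory().
def max_benchmark_size(total_memory, itemsize=4, fraction=0.25):
    """Largest power-of-two element count whose array fits the budget."""
    budget = int(total_memory * fraction)
    size = 1
    while size * 2 * itemsize <= budget:
        size *= 2
    return size

# The 128 MB NVS 140M above reports total_memory() == 133496832.
print(max_benchmark_size(133496832))
```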
Keep up the good work!
Philipp
Hey everybody,
I am trying to install .94 beta on a MacBook Pro. I went through the
installation instructions on the wiki, and everything went fairly well.
It all seemed to work and returned 0 errors. But when I run
test_driver.py, this error comes up:
Traceback (most recent call last):
File "test/test_driver.py", line 472, in <module>
import pycuda.autoinit
File
"/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/pycuda-0.94beta-py2.5-macosx-10.3-i386.egg/pycuda/autoinit.py",
line 1, in <module>
import pycuda.driver as cuda
File
"/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/pycuda-0.94beta-py2.5-macosx-10.3-i386.egg/pycuda/driver.py",
line 1, in <module>
from _driver import *
ImportError:
dlopen(/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/pycuda-0.94beta-py2.5-macosx-10.3-i386.egg/pycuda/_driver.so,
2): Library not loaded: libboost_python-xgcc40-mt-1_39.dylib
Referenced from:
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/pycuda-0.94beta-py2.5-macosx-10.3-i386.egg/pycuda/_driver.so
Reason: image not found
Any ideas how I could fix this?
Thanks!
>>>Chris Heuser
Hi Guys-
I'm trying to write re-usable code for automatically tuning (CUDA)
kernels and their associated (python) calling functions for optimal
execution time.
My main goals:
1. Write re-usable code to handle the optimization of kernels (and
their associated support code), given templates that contain:
2. Inline definition of logic for conditional unrolling of loops,
disabling sections of code, etc...
3. Inline definition of the tunable parameters, along with the sets of
possible values they can take
I've found two templating languages that I think are promising for this goal:
Cheetah
http://www.cheetahtemplate.org
and
Mako
http://www.makotemplates.org
Both are pretty rich and Pythonic in terms of the logic you can
incorporate into your templates, so that covers 2. pretty well.
I'm stuck on 3.
To give you some idea what I'm talking about, here's a dummy template
that generates python code that just waits different amounts of time
based on the (after the template is rendered) hard coded values of x1,
x2, y1, y2.
__________contents of mainFileName__________________
## This is a mako template demonstrating how to make a template for
## use with the AutoTunedFunction class
<%!
tuneableParameterSets = []
%>
import time
def main():
## Two tuneable parameters that must be optimized together
<%!
tuneableParameterSets.append({})
tuneableParameterSets[0]['x1'] = [0, 3, 2]
tuneableParameterSets[0]['x2'] = [0, 5, 4]
%>
time.sleep(.01 * ${x1} * ${x2})
## Another set of parameters that should be optimized together.
<%!
tuneableParameterSets.append({})
tuneableParameterSets[1]['y1'] = [True, False]
tuneableParameterSets[1]['y2'] = [False, True]
%>
if ${y1} or ${y2}:
time.sleep(.02)
____________________________
So there is a list of dictionaries, tuneableParameterSets, that I want
to be able to access from outside the template. When you do:
>>> from mako.template import Template
>>> t = Template(filename=mainTemplateFileName)
the template source gets compiled to template code. I was hoping
there'd be a nice way to inspect the resulting class instance t to get
the tuneableParameterSets variable, and use that to iterate through
the possible values for the tuned parameters. Unfortunately, there
doesn't seem to be a nice clean way to do this. One ugly hack would
be to scrape the code that sets the variables out of the compiled (but
not rendered/filled) template code, perhaps with the help of some
added delimiters. This is really ugly, but so far it's the only thing
that I've come up with that would work.
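For what it's worth, once the parameter sets are recovered (by whatever means), iterating the configurations is straightforward. A sketch, assuming "optimized together" means the values within one set are index-linked (x1[i] pairs with x2[i]) while independent sets combine as a Cartesian product:

```python
# Enumerate every tuning configuration from index-linked parameter sets.
# Names mirror the dummy template above.
import itertools

tuneableParameterSets = [
    {'x1': [0, 3, 2], 'x2': [0, 5, 4]},          # tuned together
    {'y1': [True, False], 'y2': [False, True]},  # tuned together
]

def configurations(param_sets):
    # Within a set, parameters vary in lockstep (same index);
    # across sets, we take the Cartesian product of the indices.
    indexed = [range(len(next(iter(s.values())))) for s in param_sets]
    for choice in itertools.product(*indexed):
        config = {}
        for param_set, i in zip(param_sets, choice):
            for name, values in param_set.items():
                config[name] = values[i]
        yield config

combos = list(configurations(tuneableParameterSets))
print(len(combos))  # 3 index choices * 2 index choices = 6
```

(It may also be worth checking whether Mako exposes the compiled module as Template.module; if `<%!` blocks run at module scope, tuneableParameterSets might already be reachable as t.module.tuneableParameterSets, though I haven't verified that against your Mako version.)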
In Cheetah there is a syntax #attr that lets you set attributes of the
resulting class (after the template is compiled, before it is
rendered/filled). That almost worked, but unfortunately it doesn't
seem like you can set a dictionary, only string and numeric literals.
Anyone familiar with these templating engines have any clever ideas?
Is anyone else using generic tuning code, or is everyone writing a
separate auto-tuning script for each module?
I also looked a bit at codepy, but I don't quite understand what to do
with it; it doesn't seem suited to manipulating chunks of code written
in another language. I may be misunderstanding it completely, though,
and there's not much documentation.
Thanks!
Drew
Hey everyone, it turns out PyCUDA on Linux was such a big hit that some
colleagues are asking for the same thing on their Windows boxes too. I
have made sure the NVIDIA CUDA SDK examples work, and I am following the
wiki installation instructions for MSVC 9.0, with some trouble.
I used the BoostPro 1.38 installer to install default variants for
Visual C++ 9.0 (Visual Studio 2008), with variants Multithread, DLL;
Multithread; Multithread, static runtime; and Single thread, static
runtime. I install the default components as well as Boost DateTime,
Boost Python, and Boost Threads, without which the installation of
PyCUDA won't happen. I've tried various other combinations of variants
and compilers but test_driver.py fails with the same error message:
C:\home\fasiha\pycuda\test>python test_driver.py
Traceback (most recent call last):
File "test_driver.py", line 472, in <module>
import pycuda.autoinit
File "C:\Python26\lib\site-packages\pycuda-0.94beta-py2.6-win32.egg\pycuda\autoinit.py",
line 1, in <module>
import pycuda.driver as cuda
File "C:\Python26\lib\site-packages\pycuda-0.94beta-py2.6-win32.egg\pycuda\driver.py",
line 1, in <module>
from _driver import *
ImportError: DLL load failed: The specified module could not be found.
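On Windows, "DLL load failed" for the _driver extension usually means a dependency (commonly the Boost DLLs) is not on PATH. A stdlib-only sketch of the check; the BoostPro install path below is an assumption, so adjust it to your machine:

```python
# Check whether the directory holding the Boost DLLs is on PATH, the
# usual cause of "DLL load failed" when importing pycuda._driver.
# NOTE: boost_dir is a guess at the BoostPro 1.38 default; adjust it.
import os

boost_dir = r"C:\Program Files\boost\boost_1_38\lib"
on_path = any(os.path.normcase(p.strip()) == os.path.normcase(boost_dir)
              for p in os.environ.get("PATH", "").split(os.pathsep))
print("Boost DLL dir on PATH:", on_path)
```

Dependency Walker (depends.exe) on _driver.pyd can likewise show exactly which DLL the loader fails to find.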
I was slightly confused by the instruction to "set HOME=123456"; I'm not
too familiar with Windows, so I tried that as well as "set
HOME=c:/home/fasiha" (and with backslashes) -- same error.
I'm also slightly confused by this sentence in the wiki: "To work with
pycuda You need to have Microsoft Visual Studio 2005 (VS 8) included
in the PATH variable (for example c:\mvs8\vc\bin, where mvs8 - is your
vc8 installation part) (otherwise, CUDA complains about version 6 and
7.1 !)" I assume that this is not a problem for me since PyCUDA
doesn't seem to complain about MSVC.
I'm sorry to unload yet another plea for installation assistance; I
never thought I'd be installing this on Windows, but I guess it's a
small price for success :) I'll keep trying some things and report any
further info.
Thanks all,
Ahmed
On Wednesday, 15 July 2009, Andreas Klöckner wrote:
> On Wednesday, 15 July 2009, you wrote:
> > Andreas,
> >
> > I added a quick installation tutorial for Ubuntu 9.04 (8.10 is
> > similar, only the libboost version changes).
> > http://wiki.tiker.net/PyCuda/Installation/Linux#Installing_PyCUDA_on_Ubun
> >tu _9.04
Is there any reason this tutorial installs numpy 1.3.0 from source? The
python-numpy package (in 9.04) already provides numpy 1.2.1. And isn't
ATLAS also not actually a requirement?
--
Vincent Favre-Nicolin
CEA/Grenoble Institut Nanosciences & Cryogénie -
http://inac.cea.fr/
Université Joseph Fourier - http://physique-eea.ujf-grenoble.fr/
tel: (+33) 4 38 78 95 40 fax: (+33) 4 38 78 51 38