Hello all,
I finally bit the bullet and got radix sort working in PyOpenCL :)
It also improves on the SDK example because it sorts keys and values,
mostly thanks to my advisor.
Additionally, this sort will handle an array of any size as long as the size is a power
of 2. The shipped example does not allow for arrays smaller than 32768, but
I've hooked up their naive scan to allow all smaller arrays.
https://github.com/enjalot/adventures_in_opencl/tree/master/experiments/rad…
all you really need are radix.py, RadixSort.cl and Scan_b.cl
some simple tests are at the bottom of radix.py
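For anyone who wants a host-side reference to compare the GPU output against, here is a minimal pure-Python sketch of a stable least-significant-digit radix sort over key/value pairs. It is not the code in radix.py or RadixSort.cl. The function name `radix_sort_pairs` and the 4-bit digit width are assumptions made for illustration. Each pass is the usual histogram, exclusive scan, and scatter.

```python
def radix_sort_pairs(keys, values, key_bits=32, radix_bits=4):
    """Stable LSD radix sort of (key, value) pairs with non-negative integer keys.

    Hypothetical CPU reference, not the OpenCL implementation from the post.
    """
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    num_buckets = 1 << radix_bits
    mask = num_buckets - 1
    keys, values = list(keys), list(values)

    for shift in range(0, key_bits, radix_bits):
        # 1. Histogram: count how many keys fall into each digit bucket.
        counts = [0] * num_buckets
        for k in keys:
            counts[(k >> shift) & mask] += 1

        # 2. Exclusive prefix sum (scan): start offset of each bucket.
        offsets = [0] * num_buckets
        running = 0
        for digit in range(num_buckets):
            offsets[digit] = running
            running += counts[digit]

        # 3. Stable scatter: move each key together with its value.
        out_keys = [0] * len(keys)
        out_values = [None] * len(keys)
        for k, v in zip(keys, values):
            digit = (k >> shift) & mask
            out_keys[offsets[digit]] = k
            out_values[offsets[digit]] = v
            offsets[digit] += 1
        keys, values = out_keys, out_values

    return keys, values


sorted_keys, sorted_values = radix_sort_pairs([5, 1, 4, 1, 3], ["a", "b", "c", "d", "e"])
```

Because every pass is stable, values that share a key keep their original relative order, which makes the result easy to compare element by element with a device-side sort.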
I hammered this out because I need it for a project, so it's not all that clean,
and I didn't add support for sorting on keys only (although it wouldn't take
much to add that, and I intend to at a later time when I need the
functionality). Hopefully this helps someone else out there. I'll also be
porting it using my own OpenCL C++ wrappers to include in my fluid
simulation library at some point.
I also began looking at AMD's radix sort from their SPH tutorial, but they use
local atomics, which are not supported on my 9600M.
--
Ian Johnson
http://enja.org
Hello,
Fwiw, I posted the outputs from the 'test_array.py' script at http://mypage.iu.edu/~heiland/pyopencl/
I realize the failures (and fixes) that occur here may be due to Andreas not having an OS X platform to test on, so I don't intend for this post to be a nuisance. I'm simply curious what results people are getting (from tests and examples) on various platforms, and thought I'd offer up some from OS X.
Btw, if there is some guidance on debugging pyopencl, I'd certainly welcome it and try to contribute.
-Randy
Hi all,
I'm hoping someone will offer some advice for a newbie. I'm getting the following error:
python demo_mandelbrot.py
Traceback (most recent call last):
  File "demo_mandelbrot.py", line 161, in <module>
    test = Mandelbrot()
  File "demo_mandelbrot.py", line 120, in __init__
    self.create_image()
  File "demo_mandelbrot.py", line 147, in create_image
    self.draw(-2.13, 0.77, -1.3, 1.3)
  File "demo_mandelbrot.py", line 133, in draw
    output = calc_fractal(q, maxiter)
  File "demo_mandelbrot.py", line 66, in calc_fractal_opencl
    output_opencl, np.uint16(maxiter))
  File "/Library/Python/2.7/site-packages/pyopencl-2011.1.2-py2.7-macosx-10.7-intel.egg/pyopencl/__init__.py", line 221, in kernel_call
    global_offset, wait_for, g_times_l=g_times_l)
pyopencl.LogicError: clEnqueueNDRangeKernel failed: invalid work group size
To avoid cluttering this email, I've provided a dump of my properties (AMD card) here:
http://mypage.iu.edu/~heiland/pyopencl/
I'm working out of git:
In [11]: pyopencl.version.VERSION
Out[11]: (2011, 1, 2)
and also have:
PyOpenGL-3.0.1-py2.7
I did a search here http://search.gmane.org/?query=invalid+work+group+size&group=gmane.comp.pyt…
but it didn't shed much light, for me anyway.
thanks, Randy
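One general constraint that can produce this error (not necessarily the cause in this case): when a local size is given explicitly, OpenCL 1.x requires it to divide the global size evenly in each dimension and to stay within the device and kernel limits. Below is a minimal 1-D sketch of choosing sizes that satisfy this. The helper names `pick_local_size` and `round_up_global` are hypothetical. In PyOpenCL the limit itself can be read from `device.max_work_group_size` or from `kernel.get_work_group_info(cl.kernel_work_group_info.WORK_GROUP_SIZE, device)`.

```python
def pick_local_size(global_size, max_work_group_size):
    """Largest divisor of global_size that does not exceed max_work_group_size."""
    if global_size < 1 or max_work_group_size < 1:
        raise ValueError("sizes must be positive")
    for candidate in range(min(global_size, max_work_group_size), 0, -1):
        if global_size % candidate == 0:
            return candidate


def round_up_global(global_size, local_size):
    """Smallest multiple of local_size that is >= global_size.

    If the global size is padded this way, the kernel needs a bounds check
    such as `if (gid >= n) return;` for the extra work items.
    """
    return ((global_size + local_size - 1) // local_size) * local_size


# Example with assumed numbers: 1000 work items, device limit of 256.
local_size = pick_local_size(1000, 256)
padded_global = round_up_global(1000, 256)
```

Passing `None` as the local size leaves the choice to the OpenCL implementation, so an explicit size chosen this way is mainly useful for ruling out one possible cause.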
Hi,
I've been using PyCUDA a lot (see http://pynx.sf.net) and I am just
switching to PyOpenCL, which seems to work very nicely, including with CPUs.
Right now I am working under Ubuntu with the AMD, Intel and NVIDIA SDKs, which seem
to work nicely alongside each other and PyOpenCL.
The only slightly annoying thing is that the Intel compiler seems to be a bit verbose
when it does not need to be, e.g. I get messages like:
##########
Build on <pyopencl.Device 'Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz' at 0x1af72c0> succeeded, but said:
Build started
Kernel <Fhkl> was successfully vectorized
Done.
##########
This is a pretty minor issue, but does anyone know how to turn off this
message from the Intel compiler while still keeping real warnings? I don't
want to completely turn off PyOpenCL's reporting of messages. I tried looking in
the Intel manual, but did not find a compiler option for that.
Incidentally, the Intel compiler seems much more efficient (3.5x faster) than
AMD's for my Intel Core 2 Q9550. I'm not sure why: either better vectorization or
faster sin/cos functions.
regards,
--
Vincent Favre-Nicolin http://inac.cea.fr
CEA/Grenoble Institut Nanosciences & Cryogénie
Laboratoire SP2M/Nano-structures et Rayonnement Synchrotron
17, rue des Martyrs
38054 Grenoble Cedex 9 - France
Université Joseph Fourier http://www.ujf-grenoble.fr
tél: (+33) 4 38 78 95 40 fax: (+33) 4 38 78 51 38
On Mon, 29 Aug 2011 08:56:01 -0400, Sean True <sean.true(a)gmail.com> wrote:
> Has been uploaded to PyPI. It's still a bit raw, but adds a useful
> thin layer between Python and the PyOpenCL runtime.
>
> Declaring an interface:
>
> interface mandelbrot {
> kernel mandelbrot(in complex64 *q, outlike int16 *q, in int32 maxiter);
> }
>
> lets you call mandelbrot :
>
>
> output = calc_fractal(q, maxiter)
>
> No explicit buffer management, note that the output buffer q the
> return value, and that no declarations have to be made for it at all
> in the python code.
>
> I'm now using this actively for my personal work, and the package
> dependencies are primarily the same as PyOpenCL (pyparsing is the
> primary addition).
>
> Licensed PSF, use as you like.
This looks like good stuff. A friend of mine had a need for something
like this recently, and I was happy I could point him towards it. I've
also added a link to the FAQ.
Quick request: Please add a link to the PyPI page to the doc front page,
so that there's a central place to link to.
Andreas
Has been uploaded to PyPI. It's still a bit raw, but adds a useful
thin layer between Python and the PyOpenCL runtime.
Declaring an interface:
interface mandelbrot {
kernel mandelbrot(in complex64 *q, outlike int16 *q, in int32 maxiter);
}
lets you call mandelbrot:
output = calc_fractal(q, maxiter)
No explicit buffer management is needed. Note that the output buffer is the
return value, and that no declarations have to be made for it at all
in the Python code.
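To make the calling convention concrete, here is a small pure-Python sketch of the "allocate the output for me and return it" idea. It is not how Yapocis is implemented and it does not touch OpenCL. The `outlike` decorator, its parameters, and the Python stand-in for the kernel are all hypothetical illustrations.

```python
from array import array


def outlike(like_index, typecode):
    """Wrap kernel(..., like, out, ...) so callers get `out` back as the result.

    The output is allocated with the same length as args[like_index] and is
    inserted right after that argument before the wrapped function is called.
    """
    def decorate(kernel):
        def wrapper(*args):
            like = args[like_index]
            out = array(typecode, [0] * len(like))
            full_args = list(args)
            full_args.insert(like_index + 1, out)
            kernel(*full_args)
            return out
        return wrapper
    return decorate


@outlike(like_index=0, typecode="h")  # "h" is a signed 16-bit integer
def calc_fractal(q, output, maxiter):
    """Pure-Python stand-in for a kernel; records the first escape iteration."""
    for gid, c in enumerate(q):
        z = 0j
        for curiter in range(maxiter):
            z = z * z + c
            if z.real * z.real + z.imag * z.imag > 4.0:
                output[gid] = curiter
                break


# The caller passes only the inputs and receives the allocated output.
output = calc_fractal([0j, 1 + 0j], 10)
```

The caller never names or sizes the output buffer, which is the convenience the interface declaration is described as providing.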
I'm now using this actively for my personal work, and the package
dependencies are primarily the same as PyOpenCL (pyparsing is the
primary addition).
Licensed PSF, use as you like.
-- Sean
Sean True
Swapwizard Consulting
Hello.
I am sending one email to two groups - just call me lazy ;-)
Yesterday python-pycuda got accepted into Debian unstable,
and today it landed in the archives.
This means that PyCUDA 2011.1.3 with
git patches from 2011-08-13 (smem_alloc_granularity) and
from 2011-08-14 (fix of debug code leak) is available in Debian
unstable. The package should migrate into testing (and thus
become a candidate for inclusion in the next stable Debian)
after two weeks if there are no problems with it.
At the same time, Debian unstable and Debian testing
both contain PyOpenCL 2011.1.2 with changes from git
up to 2011-07-16. After 2011.2 is released I intend to
upload it into Debian.
Unfortunately Ubuntu still contains PyOpenCL 0.92.
The automatic upload of the Debian version into Ubuntu did
not work, probably because of bug
https://bugs.launchpad.net/ubuntu/+source/pyopencl/+bug/763457
which forces users to install NVIDIA drivers even
if they use another OpenCL provider. The Debian version
contains a fix (at least I hope that it fixes this problem),
but I got no reply to my message asking whether the problem
is fixed or not.
As for PyCUDA in Ubuntu, it does not look like it will
be included soon. Ubuntu has a different driver architecture
and does not contain the NVIDIA CUDA toolkit, and both are heavily
used by python-pycuda.
Best regards.
--
Tomasz Rybak <bogomips(a)post.pl> GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A 488E C654 FB33 2AD5 9860
http://member.acm.org/~tomaszrybak
Jan,
Thanks for that tip. I realized after reading the OpenCL specification
that the type "double" is not required by the specification, so users should
check their hardware documentation, for both Nvidia and ATI, to determine
whether double is supported. In my case it is not.
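Besides the hardware documentation, the device's extension string can be checked at run time. The sketch below is a pure-Python helper that takes such a string (in PyOpenCL it is available as `device.extensions`, a space-separated list) and returns the matching pragma, or None when no double-precision extension is listed. The helper name `fp64_pragma` and the sample strings are hypothetical.

```python
def fp64_pragma(extensions):
    """Return the pragma enabling doubles, or None if no fp64 extension is listed.

    `extensions` is a space-separated extension string as reported by a device.
    The Khronos extension is preferred over the AMD-specific one.
    """
    names = extensions.split()
    for ext in ("cl_khr_fp64", "cl_amd_fp64"):
        if ext in names:
            return "#pragma OPENCL EXTENSION %s : enable" % ext
    return None


# Example with made-up extension strings:
cpu_pragma = fp64_pragma("cl_khr_icd cl_amd_fp64 cl_khr_byte_addressable_store")
gpu_pragma = fp64_pragma("cl_khr_icd cl_khr_global_int32_base_atomics")
```

Kernel source can then be built with the pragma prepended only when one is returned, and fall back to float otherwise.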
Thanks, Max
On Tue, Aug 23, 2011 at 3:00 PM, <pyopencl-request(a)tiker.net> wrote:
> Today's Topics:
>
> 1. Re: yapocis? Yet another python opencl interface
> specification layer? (Sean True)
> 2. Re: error importing pyopencl (Jan Meinke)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 22 Aug 2011 20:56:51 +0000 (UTC)
> From: Sean True <sean.true(a)gmail.com>
> To: pyopencl(a)tiker.net
> Subject: Re: [PyOpenCL] yapocis? Yet another python opencl interface
> specification layer?
> Message-ID: <loom.20110822T225431-979(a)post.gmane.org>
> Content-Type: text/plain; charset=us-ascii
>
> Andreas:
>
> http://yapocis.readthedocs.org/ and https://github.com/seantrue/Yapocis have rev 0.1 of the layer I
> mentioned before.
>
> I'd be interested in your reaction ;-)
>
> -- Sean
>
> ------------------------------
>
> Message: 2
> Date: Mon, 22 Aug 2011 22:52:28 +0200
> From: Jan Meinke <jan.meinke(a)gmail.com>
> To: "M.Gelman" <m.gelman08(a)gmail.com>
> Cc: pyopencl(a)tiker.net
> Subject: Re: [PyOpenCL] error importing pyopencl
> Message-ID: <CAEFwH5DGD8CzdTjzLxLibNt7-_355JMOiH=rSi_=T8+Li-8Zdw(a)mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Dear Max,
>
> AMD does not enable double precision support in OpenCL kernels by default.
> Instead you need to add
>
> #pragma OPENCL EXTENSION cl_amd_fp64 : enable
>
> to your kernel. This should work on the CPU but it will break the
> compilation on your GPU since it doesn't support double precision as far as
> I know.
>
> Jan
>
>
> On Thu, Aug 4, 2011 at 1:19 AM, M.Gelman <m.gelman08(a)gmail.com> wrote:
>
> > Thanks for your suggestion. I was able to fix it by copying the
> > libOpenCL.so provided by the ati stream sdk to /usr/lib.
> >
> > However, when running benchmark-all.py, I get the following errors about the
> > "double" type not being enabled. It looks to me like everything is working
> > otherwise, including the other tests.
> >
> > ('Execution time of test without OpenCL: ', 8.684561014175415, 's')
> > ===============================================================
> > ('Platform name:', 'ATI Stream')
> > ('Platform profile:', 'FULL_PROFILE')
> > ('Platform vendor:', 'Advanced Micro Devices, Inc.')
> > ('Platform version:', 'OpenCL 1.1 ATI-Stream-v2.3 (451)')
> > ---------------------------------------------------------------
> > ('Device name:', 'ATI RV710')
> > ('Device type:', 'GPU')
> > ('Device memory: ', 256, 'MB')
> > ('Device max clock speed:', 600, 'MHz')
> > ('Device compute units:', 2)
> >
> > /usr/local/lib/python2.6/dist-packages/pyopencl-2011.1.2-py2.6-linux-x86_64.egg/pyopencl/cache.py:343:
> > UserWarning: Build succeeded, but resulted in non-empty logs:
> > Build on <pyopencl.Device 'ATI RV710' at 0x21a9b20> succeeded, but said:
> >
> > /tmp/OCLQdtqga.cl(11): warning: double-precision constant is represented as
> > single-precision constant because double is not enabled
> > c[gid] = c[gid] * (a[gid] / 2.0);
> > ^
> >
> >
> > warn("Build succeeded, but resulted in non-empty logs:\n"+message)
> > Execution time of test: 0.00788741 s
> > benchmark-all.py:70: DeprecationWarning: 'enqueue_read_buffer' has been
> > deprecated in version 2011.1. Please use enqueue_copy() instead.
> > cl.enqueue_read_buffer(queue, dest_buf, c).wait()
> > Results OK
> > ===============================================================
> > ('Platform name:', 'ATI Stream')
> > ('Platform profile:', 'FULL_PROFILE')
> > ('Platform vendor:', 'Advanced Micro Devices, Inc.')
> > ('Platform version:', 'OpenCL 1.1 ATI-Stream-v2.3 (451)')
> > ---------------------------------------------------------------
> > ('Device name:', 'Intel(R) Core(TM) i7 CPU 870 @ 2.93GHz')
> > ('Device type:', 'CPU')
> > ('Device memory: ', 3072, 'MB')
> > ('Device max clock speed:', 2934, 'MHz')
> > ('Device compute units:', 8)
> >
> > /usr/local/lib/python2.6/dist-packages/pyopencl-2011.1.2-py2.6-linux-x86_64.egg/pyopencl/cache.py:343:
> > UserWarning: Build succeeded, but resulted in non-empty logs:
> > Build on <pyopencl.Device 'Intel(R) Core(TM) i7 CPU 870 @ 2.93GHz' at 0x21bb140> succeeded, but said:
> >
> > /tmp/OCLLl8sG4.cl(11): warning: double-precision constant is represented as
> > single-precision constant because double is not enabled
> > c[gid] = c[gid] * (a[gid] / 2.0);
> > ^
> >
> >
> > warn("Build succeeded, but resulted in non-empty logs:\n"+message)
> > Execution time of test: 0.00134482 s
> > Results OK
> >
> >
> > On Wed, Aug 3, 2011 at 4:36 AM, Andreas Kloeckner <lists(a)informa.tiker.net> wrote:
> >
> >> On Tue, 2 Aug 2011 19:05:33 -0400, "M.Gelman" <m.gelman08(a)gmail.com> wrote:
> >> > I have been trying to get pyopencl to work with my ati card and ati stream
> >> > for a while. I followed the tutorial and was able to get a good compilation
> >> > using this script:
> >> >
> >> > python configure.py \
> >> > --boost-inc-dir=/usr/include/boost \
> >> > --boost-lib-dir=/usr/lib \
> >> > --boost-python-libname=boost_python-mt-py26 \
> >> > --cl-inc-dir=/opt/ati-stream-sdk-v2.3-lnx64/include \
> >> > --cl-lib-dir=/opt/ati-stream-sdk-v2.3-lnx64/lib/x86_64 \
> >> > --cl-libname=OpenCL
> >> >
> >> > However when I import pyopencl, I get:
> >> >
> >> > >>> import pyopencl
> >> > Traceback (most recent call last):
> >> > File "<stdin>", line 1, in <module>
> >> > File "/usr/local/lib/python2.6/dist-packages/pyopencl-2011.1.2-py2.6-linux-x86_64.egg/pyopencl/__init__.py", line 4, in <module>
> >> > import pyopencl._cl as _cl
> >> > ImportError: /usr/local/lib/python2.6/dist-packages/pyopencl-2011.1.2-py2.6-linux-x86_64.egg/pyopencl/_cl.so:
> >> > symbol clEnqueueWriteBufferRect, version OPENCL_1.1 not defined in file
> >> > libOpenCL.so with link time reference
> >> >
> >> > It's probably something simple; however, I cannot get past it.
> >> >
> >> > Here are my env vars:
> >> > export ATISTREAMSDKROOT=/opt/ati-stream-sdk-v2.3-lnx64
> >> > export ATISTREAMSDKSAMPLEROOT=/opt/ati-stream-sdk-v2.3-lnx64
> >> > export LD_LIBRARY_PATH=$ATISTREAMSDKROOT/lib/x86_64:$LD_LIBRARY_PATH
> >>
> >> Your OpenCL header does not match your OpenCL library. Check with
> >>
> >> $ ldd /usr/local/lib/python2.6/dist-packages/pyopencl-2011.1.2-py2.6-linux-x86_64.egg/pyopencl/_cl.so
> >> (one line)
> >>
> >> to see whether the library is the one you expect.
> >>
> >> HTH,
> >> Andreas
> >>
> >> PS: Please send email to pyopencl@, not pyopencl-owner@.
> >>
> >
> >
> >
> > --
> > M.Gelman
> > (412)-540-5238
> > LinkedIn Profile <http://www.linkedin.com/in/maxgelman>
> >
> > _______________________________________________
> > PyOpenCL mailing list
> > PyOpenCL(a)tiker.net
> > http://lists.tiker.net/listinfo/pyopencl
> >
> >
>
I'm wondering if yet-another-python-opencl-interface-layer would be of interest.
This layer is built on top of pyopencl, and is intended to use familiar
RPC definitions to ease the special joys of talking to opencl.
I've implemented this on OS/X and it is working for me and has been stable, oh,
for several days ;-) It eliminates the need for routine glue code,
and can be extended to minimize movement of data between host and opencl service.
This code is currently private, and I will be happy to either fork pyopencl and
submit a pull request after integration, or publish it as a separate GitHub project.
-- Sean
Sean True
Swapwizard Consulting
Presumed highlights:
# Use the rpc extension to define and load the kernel as a callable.
from rpc import kernels, interfaces
calc_fractal_opencl = kernels.loadProgram(interfaces.mandelbrot).mandelbrot
# Call it the way we like to call Python callables:
output = calc_fractal_opencl(q, maxiter)
# RPC definition language loosely based on Apollo NCS/OSF DCE/Microsoft IDL
# outlike is a novel keyword that says: allocate for me, return as part
# of return vals.
interface mandelbrot {
kernel mandelbrot(in complex64 *q, outlike int16 *q, in int32 maxiter);
}
# mandelbrot.mako is just what it has always been.
__kernel void mandelbrot(__global float2 *q,
                         __global short *output, int const maxiter)
{
    int gid = get_global_id(0);
    float nreal, real = 0;
    float imag = 0;
    output[gid] = 0;
    for(int curiter = 0; curiter < maxiter; curiter++) {
        nreal = real*real - imag*imag + q[gid].x;
        imag = 2* real*imag + q[gid].y;
        real = nreal;
        if (real*real + imag*imag > 4.0f)
            output[gid] = curiter;
    }
}