I finally bit the bullet and got radix working in PyOpenCL :)
It's also improved over the SDK example because it does keys and values,
mostly thanks to my advisor.
Additionally, this sort will handle an array of any size as long as the size is a power
of 2. The shipped example does not allow for arrays smaller than 32768, but
I've hooked up their naive scan to allow all smaller arrays.
All you really need are radix.py, RadixSort.cl and Scan_b.cl.
Some simple tests are at the bottom of radix.py.
I hammered this out because I need it for a project, so it's not all that clean,
and I didn't add support for sorting on keys only (although it wouldn't take
much to add that, and I intend to at a later time when I need the
functionality). Hopefully this helps someone else out there. I'll also be
porting it using my own OpenCL C++ wrappers to include in my fluid
simulation library at some point.
I also began looking at AMD's radix sort from their SPH tutorial, but they use
local atomics, which are not supported on my 9600M.
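For anyone who wants a CPU reference to check a key/value GPU sort against, here is a minimal pure-Python sketch of a stable LSD radix sort that carries values along with the keys. It is not the code from radix.py or the SDK example. The function name, the 4-bit digit width and the 32-bit unsigned key assumption are illustrative choices. The histogram, scan and scatter steps correspond only loosely to what a GPU radix sort does per pass.

```python
def radix_sort_pairs(keys, values, bits_per_pass=4, key_bits=32):
    """Stable LSD radix sort of unsigned integer keys, carrying values along.

    Hypothetical reference helper, assuming keys fit in `key_bits` bits.
    """
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    radix = 1 << bits_per_pass
    mask = radix - 1
    keys, values = list(keys), list(values)
    for shift in range(0, key_bits, bits_per_pass):
        # Histogram: count how many keys fall into each digit bucket.
        counts = [0] * radix
        for k in keys:
            counts[(k >> shift) & mask] += 1
        # Exclusive prefix sum (scan) turns counts into bucket start offsets.
        offsets = [0] * radix
        total = 0
        for d in range(radix):
            offsets[d] = total
            total += counts[d]
        # Scatter: place each pair at its bucket offset, preserving input order.
        out_keys = [0] * len(keys)
        out_values = [None] * len(values)
        for k, v in zip(keys, values):
            d = (k >> shift) & mask
            out_keys[offsets[d]] = k
            out_values[offsets[d]] = v
            offsets[d] += 1
        keys, values = out_keys, out_values
    return keys, values
```

Passing the original indices as the values gives back the permutation, which makes it easy to compare against the GPU result element by element.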
I am writing with a rather naive question, but what is
the status of those two, and when can we expect new
I am asking this because I am trying to build a Debian
package of PyOpenCL. Debian finally has the AMD/ATI OpenCL
library, so I want to change the PyOpenCL package to take
this into consideration and not require the NVIDIA OpenCL
libraries. For now, the PyOpenCL tests crash:
============================= test session starts
platform linux2 -- Python 2.7.2 -- pytest-1.3.4
test path 1: test_array.py
ERROR collecting /usr/src/cuda/pyopencl/pyopencl-2011.1.2
test_array.py:669: in <module>
> register_dtype(mmc_dtype, "minmax_collector")
> % (dtype, DTYPE_TO_NAME[dtype], ", ".join(c_names)))
E RuntimeError: dtype '[('cur_min', '<i4'), ('cur_max',
'<i4'), ('pad', '<i4')]' already registered (as 'minmax_collector', new
=========================== 1 error in 0.21 seconds
So my question is: should I just wait a few days until the situation
gets better, or is this more serious, requiring more time to fix,
so that I should just package an older version for now?
Tomasz Rybak <bogomips(a)post.pl> GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A 488E C654 FB33 2AD5 9860
Did anyone get the PyDev code completion to work with pyopencl? Adding
pyopencl to the "Forced Builtins" list didn't help. I don't get any code
completion, and Eclipse shows me a lot of errors because it doesn't find
Or is there a trick to at least get rid of the error messages?
On Tue, 22 Nov 2011 07:57:24 +0100, Matthias Vogelgesang <matthias.vogelgesang(a)gmail.com> wrote:
> 2011/11/22 Andreas Kloeckner <lists(a)informa.tiker.net>:
> > Fixed in git. The problem was that options defaulted to a list, which
> > the backend code didn't know how to process.
> Much appreciated, thanks.
Btw--PyOpenCL already caches the binaries that get compiled. So
if you're after time savings by building from binary, there likely
won't be any. Just FYI. :)
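To illustrate why a cache removes most of the benefit of building from binary, here is a toy sketch of a binary cache keyed by the build inputs. This is not PyOpenCL's actual implementation. The class name, the `compile_fn` callback and the `device_name` parameter are all hypothetical, and the store is in memory only.

```python
import hashlib


class BinaryCache:
    """Toy in-memory cache of compiled binaries, keyed by the build inputs."""

    def __init__(self, compile_fn):
        # compile_fn is a hypothetical callback: (source, options) -> bytes
        self._compile = compile_fn
        self._store = {}
        self.misses = 0

    def _key(self, source, options, device_name):
        # Hash everything that can change the resulting binary.
        h = hashlib.sha256()
        for part in (source, " ".join(options), device_name):
            h.update(part.encode("utf-8"))
            h.update(b"\0")  # separator so adjacent parts cannot run together
        return h.hexdigest()

    def build(self, source, options=(), device_name="device0"):
        key = self._key(source, options, device_name)
        if key not in self._store:
            # Only the first build with these inputs pays the compile cost.
            self.misses += 1
            self._store[key] = self._compile(source, options)
        return self._store[key]
```

With something like this in place, a second build of the same source with the same options never reaches the compiler, so loading a saved binary by hand would not be noticeably faster.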
At 01.11.2011 14:06, Frédéric Bastien wrote:
> The CUDA developer version of the driver always worked correctly when
> some other versions failed for me. So you can try it if you want to
> check that possibility. For the hardware, can you try on another computer?
Now I've tried it on an ATI Radeon HD 6970. And ... it works!
So it really seems to be a driver or hardware bug.
The problem for me is that the developer driver (v.197.16) is not
compatible with my device, and the new VERDE v.285.62 driver freezes my
system after installation (on a freshly installed Win 7 x64 Pro).
This is why I have to use v.280.26.
I've written to the NVIDIA support team about this but still have no answer.
On Mon, 21 Nov 2011 21:09:20 +0100, Matthias Vogelgesang <matthias.vogelgesang(a)gmail.com> wrote:
> Hi folks,
> according to the reference, it should be possible to re-load the
> output from a built program. However, the following short code is not
> working:
> import pyopencl as cl
>
> ctx = cl.create_some_context(False)
> # Use the devices the context was created for, so the binaries match them
> devices = ctx.devices
> program = cl.Program(ctx, """
> __kernel void simple(__global float *in, __global float *out)
> {
>     out[get_global_id(0)] = in[get_global_id(0)];
> }
> """).build()
> # BINARIES holds one binary per device, in the same order as ctx.devices
> binaries = program.get_info(cl.program_info.BINARIES)
> foo = cl.Program(ctx, devices, binaries)
> # I cannot build ...
> foo.build()
> # ... nor call the kernel
> Is there something wrong with this code or PyOpenCL?
Fixed in git. The problem was that options defaulted to a list, which
the backend code didn't know how to process.
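As a rough illustration of the kind of fix described here, the sketch below normalizes build options so that None, a single string, or a list all end up in one form before being handed to a backend. It is not the actual change made in PyOpenCL's git repository. Both function names are hypothetical, and the whitespace splitting is a simplifying assumption.

```python
def normalize_build_options(options=None):
    """Return build options as a list of strings.

    Hypothetical helper: accepts None, a single string, or an iterable.
    """
    if options is None:
        return []
    if isinstance(options, str):
        # Naive whitespace split; include paths containing spaces would break.
        return options.split()
    return [str(opt) for opt in options]


def options_to_string(options=None):
    """Join normalized options into the single string a C-level API expects."""
    return " ".join(normalize_build_options(options))
```

With this approach an empty list default and an empty string default behave the same, which is the case that tripped up the binary build.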
according to the reference, it should be possible to re-load the
output from a built program. However, the following short code is not
working:
import pyopencl as cl

ctx = cl.create_some_context(False)
# Use the devices the context was created for, so the binaries match them
devices = ctx.devices
program = cl.Program(ctx, """
__kernel void simple(__global float *in, __global float *out)
{
    out[get_global_id(0)] = in[get_global_id(0)];
}
""").build()
# BINARIES holds one binary per device, in the same order as ctx.devices
binaries = program.get_info(cl.program_info.BINARIES)
foo = cl.Program(ctx, devices, binaries)
# I cannot build ...
foo.build()
# ... nor call the kernel
Is there something wrong with this code or PyOpenCL?