The clpp project is also out there. The cantankerous issue with OpenCL is
getting the workgroups mapped properly to each compute unit so that, in each
phase, they can stream independently.
I CC'd Sean Baxter, who is the current guru. I will be traveling to San
Diego later this week, so I won't be able to help out till next Monday.
On Jan 16, 2012 2:19 PM, "Dieter Morgenroth" <dieter.morgenroth(a)web.de>
I also worked on a radix sort implementation. I had a rough working
version, but I found that numpy.argsort was much faster on my
machine, so I put that task on hold for now. If someone comes up with a
fast generic solution, I would also be interested.
I used the sorting for an SPH simulation.
I only have a notebook with an ATI graphics card. On that machine, at least,
the numpy sort was about 5 times faster, even on several million entries.
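For anyone who wants a CPU-side reference to check a GPU key/value radix sort against numpy.argsort, the following is a minimal sketch only. It is not the code from any of the projects mentioned in this thread; the function name radix_sort_pairs and the choice of 4 bits per pass are assumptions made for illustration. Each pass builds a histogram of the current digit, turns it into bucket offsets with an exclusive scan, and scatters keys and values stably.

```python
import numpy as np

def radix_sort_pairs(keys, values, bits_per_pass=4):
    """LSD radix sort of unsigned 32-bit keys, carrying values along.

    Hypothetical reference helper, written for clarity rather than speed.
    """
    keys = np.array(keys, dtype=np.uint32)
    values = np.array(values)
    n_buckets = 1 << bits_per_pass
    for shift in range(0, 32, bits_per_pass):
        # Extract the current digit of every key.
        digits = ((keys >> np.uint32(shift)) & np.uint32(n_buckets - 1)).astype(np.intp)
        # Histogram, then exclusive prefix sum to get each bucket's start.
        counts = np.bincount(digits, minlength=n_buckets)
        offsets = np.concatenate(([0], np.cumsum(counts)[:-1]))
        out_keys = np.empty_like(keys)
        out_values = np.empty_like(values)
        for d in np.flatnonzero(counts):
            # Source indices come back in ascending order, so the pass is stable.
            src = np.flatnonzero(digits == d)
            dst = slice(offsets[d], offsets[d] + counts[d])
            out_keys[dst] = keys[src]
            out_values[dst] = values[src]
        keys, values = out_keys, out_values
    return keys, values

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    keys = rng.integers(0, 2**32, size=1 << 12, dtype=np.uint32)
    values = np.arange(keys.size)
    sorted_keys, sorted_values = radix_sort_pairs(keys, values)
    # With values = arange(n), a stable sort must reproduce a stable argsort.
    order = np.argsort(keys, kind="stable")
    assert np.array_equal(sorted_keys, keys[order])
    assert np.array_equal(sorted_values, order)
```

Because the passes are stable, using the original indices as values gives the same permutation as a stable numpy.argsort, which makes it easy to compare results from a GPU implementation.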
On 15.01.2012 22:22, Ian Johnson wrote:
That code is the latest; I haven't touched it in a long time since my
work has taken me away from OpenCL for the time being. As for the
licensing, I put in an MIT license, so it's free as far as I'm concerned.
Some of the radix code comes straight from the NVIDIA SDK example; we had
to modify it a good bit to sort keys and values, but I'm not sure what their
This is also definitely not the best implementation of radix, as there
is a much faster (and open) CUDA implementation. I would have hoped it
would be ported to OpenCL by now, and there is this project:
which is GPL.
Good luck! I'd like to hear about any improvements that come along!
On Sun, Jan 15, 2012 at 9:35 AM, Andreas Kloeckner <
> Hi Ian,
> On Sun, 17 Apr 2011 22:29:41 -0400, Ian Johnson <enjalot(a)gmail.com>
> > I finally bit the bullet and got radix working in PyOpenCL :)
> > It's also improved over the SDK example because it does keys and values,
> > mostly thanks to my advisor.
> > Additionally, this sort will handle any size array as long as it is a power
> > of 2. The shipped example does not allow for arrays smaller than 32768, so
> > I've hooked up their naive scan to allow all smaller arrays.
> > all you really need are radix.py, RadixSort.cl and Scan_b.cl
> > some simple tests are at the bottom of radix.py
> > I hammered this out because I need it for a project, it's not all that
> > and I didn't add support for sorting on keys only (although it wouldn't take
> > much to add that, and I intend to at a later time when I need the
> > functionality). Hopefully this helps someone else out there. I'll also be
> > porting it using my own OpenCL C++ wrappers to include in my fluid
> > simulation library at some point.
> > I also began looking at AMD's radix from their SPH tutorial, but they use
> > local atomics, which are not supported on my 9600M.
> Out of personal need, I'm thinking of bringing some kind of sort
> functionality into PyOpenCL. I saw that you made a number of
> enhancements to your sort code since you sent the announcement. Is your
> most recent sort code still in the repo above? What is the license for
> that code? More generally, what course of action would you recommend?
> Thanks in advance for your help,
PyOpenCL mailing list
PyOpenCL@tiker.net
http://lists.tiker.net/listinfo/pyopencl