I also worked on a radix sort implementation. I had a rough working
version, but I found that numpy.argsort was much faster on my
machine, so I have put that task aside for now. If someone comes up with a
fast generic solution, I would also be interested.
I used the sorting for an SPH simulation.
I only have a notebook with an ATI graphics card. On that machine, at least,
the NumPy sort was about 5 times faster, even on several million entries.
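For reference, here is a minimal CPU sketch of a least-significant-digit radix argsort that can be checked against numpy.argsort. It is not the OpenCL code discussed in this thread; the function name radix_argsort and the bits_per_pass parameter are made up for illustration, and the scatter step is a plain Python loop, so it is only meant for checking correctness, not for timing.

```python
import numpy as np

def radix_argsort(keys, bits_per_pass=8):
    """Return indices that sort unsigned integer keys (stable LSD radix sort)."""
    keys = np.asarray(keys)
    if not np.issubdtype(keys.dtype, np.unsignedinteger):
        raise TypeError("this sketch only handles unsigned integer keys")

    radix = 1 << bits_per_pass
    mask = radix - 1
    order = np.arange(keys.size, dtype=np.intp)

    # One pass per digit, starting with the least significant one.
    for shift in range(0, keys.dtype.itemsize * 8, bits_per_pass):
        digits = ((keys[order] >> shift) & mask).astype(np.intp)

        # Histogram of digit values, then an exclusive prefix sum
        # to get the first output slot of each bucket.
        counts = np.bincount(digits, minlength=radix)
        offsets = np.concatenate(([0], np.cumsum(counts)[:-1]))

        # Stable scatter: equal digits keep their relative order.
        new_order = np.empty_like(order)
        for i, d in enumerate(digits):
            new_order[offsets[d]] = order[i]
            offsets[d] += 1
        order = new_order

    return order
```

Because every pass is stable, the result should match numpy.argsort(keys, kind="stable"), and values can be reordered alongside the keys with values[order].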
On 15.01.2012 22:22, Ian Johnson wrote:
That code is the latest. I haven't touched it in a long time, since my
work has taken me away from OpenCL for the time being. As for the
licensing, I put in an MIT license, so it's free as far as I'm
concerned. Some of the radix code comes straight from the NVIDIA SDK
example; we had to modify it a good bit to sort keys and values, but
I'm not sure what their licenses are.
This is also definitely not the best implementation of radix, as there
is a much faster (and open) CUDA implementation. I would have hoped it
would be ported to OpenCL by now, and there is this project:
which is GPL.
Good luck! I'd like to hear about any improvements that come along!
On Sun, Jan 15, 2012 at 9:35 AM, Andreas Kloeckner
<lists(a)informa.tiker.net> wrote:
On Sun, 17 Apr 2011 22:29:41 -0400, Ian Johnson <enjalot(a)gmail.com> wrote:
I finally bit the bullet and got radix working in
It's also improved over the SDK example because it does keys and
mostly thanks to my advisor.
Additionally this sort will handle any size array as long as it
is a power
of 2. The shipped example does not allow for
arrays smaller than
I've hooked up their naive scan to allow all
all you really need are radix.py, RadixSort.cl
some simple tests are at the bottom of radix.py
I hammered this out because I need it for a project, it's not
and I didn't add support for sorting on keys
only (although it
much to add that, and I intend to at a later time
when I need the
functionality). Hopefully this helps someone else out there.
I'll also be
porting it using my own OpenCL C++ wrappers to
include in my fluid
simulation library at some point.
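One common way around a power-of-two size restriction like the one mentioned above is to pad the input on the host before sorting. The following is only a sketch of that idea under stated assumptions, not something the code in this thread does: pad_to_power_of_two is a hypothetical helper, it assumes unsigned integer keys, and it relies on the sort being stable so that the padding entries end up behind any real entries that carry the maximum key.

```python
import numpy as np

def pad_to_power_of_two(keys, values):
    """Pad keys/values to the next power-of-two length (hypothetical helper)."""
    n = keys.size
    padded_n = 1 if n == 0 else 1 << (n - 1).bit_length()

    # Pad keys with the largest representable key so the padding
    # sorts to the end; the padded values are never read back.
    max_key = np.iinfo(keys.dtype).max
    padded_keys = np.full(padded_n, max_key, dtype=keys.dtype)
    padded_values = np.zeros(padded_n, dtype=values.dtype)
    padded_keys[:n] = keys
    padded_values[:n] = values
    return padded_keys, padded_values, n
```

After sorting the padded arrays, only the first n entries are kept, for example sorted_keys[:n] and sorted_values[:n].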
I also began looking at AMD's radix from their SPH tutorial, but
local atomics which are not supported on my 9600M
Out of personal need, I'm thinking of bringing some kind of sort
functionality into PyOpenCL. I saw that you made a number of
enhancements to your sort code since you sent the announcement. Is the
most recent sort code still in the repo above? What is the license for
that code? More generally, what course of action would you recommend?
Thanks in advance for your help,
PyOpenCL mailing list