On Sat, Apr 2, 2011 at 10:35 PM, Patric Holmvall <patric.hol(a)gmail.com> wrote:
As many of you might know, there are a lot of issues with clrandom, for
example the high execution time on some platforms. Also, some of my
colleagues recently discovered that as more and more random numbers are
generated, the quality quickly becomes worse and worse. I've been talking a
bit to Andreas
about implementing a better PRNG in PyOpenCL, for example Ranlux or Mersenne
Twister (which I believe numpy is based on). Unfortunately, I am not skilled
enough to do so, and Andreas seems to have a lot on his plate at the moment.
So what I'm asking is if anyone has already implemented a better PRNG in
PyOpenCL, or is interested in doing so. I suspect that it would be possible
to get a lot of the job done for free by using implementations done directly
in OpenCL. Here is an example of implementing Mersenne Twister:
I also happen to have the source code for a Ranlux implementation with C++
as host programming language, if anyone is interested.
I think I can do that, but I would propose a slightly different
architecture. In my opinion, GPU-powered libraries should be separated
from API wrappers (PyCuda and PyOpenCL), since library code is mostly
identical for both platforms. So, if I were to implement an RNG, I would
create something similar to PyFFT, which uses a single template to
create both Cuda and OpenCL versions of kernels.
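
To make the single-template idea concrete, here is a rough sketch of what I
have in mind. It is not PyFFT's actual code: the BACKENDS table, the
render_kernel() helper and the lcg_fill kernel are names I made up for
illustration, and I use string.Template from the standard library only to
keep the example self-contained. The kernel body is a toy LCG step standing
in for a real generator such as Ranlux or Mersenne Twister.

```python
from string import Template

# Hypothetical table of backend-specific spellings for the same concepts.
BACKENDS = {
    "cuda": {
        "kernel": 'extern "C" __global__ void',
        "global_mem": "",
        "global_id": "blockIdx.x * blockDim.x + threadIdx.x",
    },
    "opencl": {
        "kernel": "__kernel void",
        "global_mem": "__global ",
        "global_id": "get_global_id(0)",
    },
}

# One kernel source shared by both platforms; only the placeholders differ.
KERNEL_TEMPLATE = Template("""
${kernel} lcg_fill(${global_mem}unsigned int *out, unsigned int seed, unsigned int n)
{
    unsigned int i = ${global_id};
    if (i < n) {
        /* toy LCG step: a placeholder, not a quality PRNG */
        out[i] = 1664525u * (seed + i) + 1013904223u;
    }
}
""")


def render_kernel(backend):
    """Return the kernel source for 'cuda' or 'opencl' (illustrative helper)."""
    return KERNEL_TEMPLATE.substitute(BACKENDS[backend])


if __name__ == "__main__":
    for name in ("cuda", "opencl"):
        print("// ---", name, "---")
        print(render_kernel(name))
```

The rendered string would then be handed to whichever wrapper is in use
(PyCuda or PyOpenCL) for compilation, so the library itself would only
depend on the template and not on either API.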