[PyOpenCL] Trouble understanding/applying ReductionKernel

Andreas Kloeckner lists at informa.tiker.net
Thu Jan 19 12:27:38 PST 2012


On Thu, 19 Jan 2012 12:30:54 -0700, Steve Spicklemire <steve at spvi.com> wrote:
> OpenCL/CUDA newbie here, trying to use PyOpenCL/PyCUDA to learn my way
> around (I use Python a lot!). I have examples of what I've been trying
> to do to get familiar with the software. I'm trying to do an MC
> calculation of pi using the ReductionKernel. Here's what I've found:
> 
> <http://spvi.com/files/pyopencl-monte-carlo>
> 
> <http://spvi.com/files/pyopencl-mc-profile>
> 
> <http://spvi.com/files/pycuda-monte-carlo>
> 
> <http://spvi.com/files/pycuda-mc-profile>
> 
> I'm running on a macbook pro with GeForce GT 330M graphics.
> 
> I must be missing something basic. Both of these approaches are very
> slow.

I.e. 10**8 samples in 15 s, which is roughly 6.7M samples/s. What's your
reference value? Also note that clrandom has a 'luxury' value that can be
turned down to get random numbers faster. Further, it would be good to know
which part is slow. Python profiles are unfortunately unhelpful here, as the
GPU runs asynchronously and the host only blocks on the outbound data
transfer (that's clearly visible in the CL profile; the PyCUDA one seems a
bit more complicated).

Use cl.enqueue_marker with a profiling-enabled command queue to figure
out what is actually taking the time, the reduction or the RNG.
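
A minimal sketch of that measurement, under some assumptions: the function
names (marker_interval_seconds, profile_mc_pi) and the sample count are
made up for illustration, the random numbers come from clrandom.rand with
its default settings, and the OpenCL implementation is assumed to report
profiling timestamps for marker events (not all of them do). It brackets
the RNG and the reduction with markers and compares device timestamps, so
host-side kernel compilation is not counted.

```python
def marker_interval_seconds(start_evt, end_evt):
    """Device time between two completed marker events, in seconds."""
    # Event.profile timestamps are device clock values in nanoseconds.
    return (end_evt.profile.end - start_evt.profile.end) * 1e-9


def profile_mc_pi(n=10**6):
    """Time the RNG and the reduction of a Monte Carlo pi estimate."""
    # Imported here so the helper above can be used without an OpenCL device.
    import numpy as np
    import pyopencl as cl
    import pyopencl.clrandom as clrandom
    from pyopencl.reduction import ReductionKernel

    ctx = cl.create_some_context()  # may prompt for a device
    queue = cl.CommandQueue(
        ctx, properties=cl.command_queue_properties.PROFILING_ENABLE)

    # Count the points (x, y) that fall inside the unit quarter circle.
    count_inside = ReductionKernel(
        ctx, np.int32, neutral="0", reduce_expr="a+b",
        map_expr="(x[i]*x[i] + y[i]*y[i] < 1.0f) ? 1 : 0",
        arguments="__global const float *x, __global const float *y")

    m0 = cl.enqueue_marker(queue)
    x = clrandom.rand(queue, n, np.float32)
    y = clrandom.rand(queue, n, np.float32)
    m1 = cl.enqueue_marker(queue)
    inside = count_inside(x, y)
    m2 = cl.enqueue_marker(queue)
    queue.finish()  # profiling data is only valid once the events complete

    return {
        "rng_s": marker_interval_seconds(m0, m1),
        "reduction_s": marker_interval_seconds(m1, m2),
        "pi_estimate": 4.0 * int(inside.get()) / n,
    }
```

Calling profile_mc_pi() on a machine with PyOpenCL and a working device
returns the two stage timings plus the estimate; whichever of rng_s and
reduction_s dominates is the part worth tuning first.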

HTH,
Andreas
