On Sat, 21 Apr 2012 20:55:48 -0300, Ezequiel Alfíe <ealfie(a)gmail.com> wrote:
> Until now I was resorting to the 'trick' below to avoid the limitation
> of 2D grids. I guess it could be useful for anyone who, for whatever
> reason, needs to use CUDA < 4.0 or an older version of PyCUDA.
>
>     // Gy = logical grid size in y; the launch uses a 2D grid
>     // with gridDim.y = Gy * Gz
>     const int bx = blockIdx.x;
>     const int by = blockIdx.y % Gy;
>     const int bz = blockIdx.y / Gy;

That's of course possible, but integer mod and div are very slow on
Nvidia's hardware--as far as I recall. 3D grids are better. And I'm
comfortable saying that, now that they're in. :)
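
For anyone who wants to check the index arithmetic of the quoted trick without a GPU, here is a minimal host-side sketch in plain C. The names `BlockIndex3`, `folded_grid_y` and `unfold_block` are made up for illustration and are not part of CUDA or PyCUDA. It assumes the logical y and z extents (Gy, Gz) are folded into the y dimension of a 2D launch grid, which is what the modulo and division in the quoted lines imply.

```c
#include <assert.h>

/* Logical 3D block index recovered from a 2D launch grid.
 * (Hypothetical type, for illustration only.) */
typedef struct {
    int x, y, z;
} BlockIndex3;

/* y extent of the folded 2D grid: the logical y and z extents
 * are packed into one dimension. (Hypothetical helper.) */
static int folded_grid_y(int Gy, int Gz)
{
    assert(Gy > 0 && Gz > 0);
    return Gy * Gz;
}

/* Mirrors the three quoted kernel lines, with block_x and block_y
 * standing in for blockIdx.x and blockIdx.y. (Hypothetical helper.) */
static BlockIndex3 unfold_block(int block_x, int block_y, int Gy)
{
    BlockIndex3 b;
    assert(Gy > 0 && block_y >= 0);
    b.x = block_x;
    b.y = block_y % Gy;   /* position within one z-slice */
    b.z = block_y / Gy;   /* which z-slice */
    return b;
}
```

With Gy = 4 and Gz = 3 the 2D launch would use gridDim.y = 12, and a block with blockIdx.y = 9 maps to (by, bz) = (1, 2). On a native 3D grid neither the division nor the modulo is needed.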