I think Thrust has a similar feature, but it is limited to 1-D cases.
It shouldn't be too hard to adapt it to 2-D.
Jared from the Thrust list has put up some code, which I can dig up if you
It is certainly doable and useful.
2012/11/11 Andreas Kloeckner <lists(a)informa.tiker.net>
Freddie Witherden <freddie(a)witherden.org>
I am wondering if anyone has worked up a class to automatically select a
suitable thread block dimension given a function, nrow and ncol. I know
using OccupancyRecord I can determine the occupancy for a given number
of threads, but it does not appear to be able to solve the inverse problem.
While I know there is more to performance than just occupancy, it does
often correlate with performance.
I know of no such thing, but I do see the usefulness. Whether it should
be a class or a function and many of the details are of course up in the
air, but I could imagine accepting something like this into PyCUDA, if
you've got time to work on it.
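
As a rough illustration of the inverse problem discussed above, here is a
minimal sketch in plain Python. It is not existing PyCUDA or Thrust code. The
names pick_block_size, shape_2d and toy_occupancy are hypothetical, and the
occupancy model is a made-up stand-in (1536 threads and 8 blocks per
multiprocessor are assumed values). With PyCUDA, the occupancy_of callback
could presumably wrap pycuda.tools.OccupancyRecord for the kernel's register
and shared memory usage, but that wiring is left out here.

```python
from typing import Callable, Tuple


def pick_block_size(occupancy_of: Callable[[int], float],
                    max_threads: int = 1024,
                    warp_size: int = 32) -> int:
    """Try every multiple of warp_size and return the smallest thread
    count that reaches the highest occupancy."""
    best_threads, best_occ = warp_size, -1.0
    for threads in range(warp_size, max_threads + 1, warp_size):
        occ = occupancy_of(threads)
        if occ > best_occ:  # strict comparison: ties keep the smaller block
            best_threads, best_occ = threads, occ
    return best_threads


def shape_2d(threads: int, nrow: int, ncol: int,
             warp_size: int = 32) -> Tuple[int, int]:
    """Split a 1-D thread count into (block_x, block_y), with x running
    along columns. Heuristic only: halve x while it stays a multiple of
    warp_size, still spans ncol columns, and y does not exceed nrow."""
    block_x, block_y = threads, 1
    while (block_x % (2 * warp_size) == 0
           and block_x // 2 >= ncol
           and block_y * 2 <= nrow):
        block_x //= 2
        block_y *= 2
    return block_x, block_y


def toy_occupancy(threads: int) -> float:
    """Made-up occupancy model (assumption): a multiprocessor holds at
    most 1536 threads and at most 8 resident blocks."""
    max_threads_per_mp, max_blocks_per_mp = 1536, 8
    blocks = min(max_blocks_per_mp, max_threads_per_mp // threads)
    return blocks * threads / max_threads_per_mp


if __name__ == "__main__":
    threads = pick_block_size(toy_occupancy)
    print(threads, shape_2d(threads, nrow=1000, ncol=10))
```

The 2-D split always keeps block_x * block_y equal to the chosen thread
count, so the occupancy found in the 1-D search still applies. Whether the
resulting shape is actually the fastest would need to be measured per kernel.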
PyCUDA mailing list