On Thu, 15 Jul 2010 12:34:45 -0400, Frédéric Bastien <nouiz(a)nouiz.org> wrote:
1) elemwise1.patch: This modification allows passing the block and grid
to the function generated by ElementwiseKernel. If not provided, it continues
Can you comment on the necessity of this patch? What in particular does
it help you achieve?
2) tools1.patch: recognize the npy_float[32,64] data type.
I'm not sure this patch makes sense--the npy_* types are not really
defined on the GPU, so you can't use them in actual kernel code. And if
you'd like to specify numpy types as arguments, you can do the following:
from pycuda.tools import VectorArg, ScalarArg
from pycuda.elementwise import ElementwiseKernel
lin_comb = ElementwiseKernel(
"z[i] = a*x[i] + b*y[i]",
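To illustrate why npy_* names are not needed in kernel code, here is a minimal sketch of how numpy dtypes can be turned into plain C type names and a C argument list. The lookup table and the helpers `ctype_name` and `c_argument_list` are hypothetical and written only for this example; they are not PyCUDA's implementation (PyCUDA ships its own `pycuda.tools.dtype_to_ctype`). The sketch needs only numpy, so it runs without a GPU.

```python
import numpy as np

# Hypothetical lookup table for this sketch (not PyCUDA code):
# numpy dtype -> name of the matching CUDA C type.
_CTYPE_NAMES = {
    np.dtype(np.float32): "float",
    np.dtype(np.float64): "double",
    np.dtype(np.int32): "int",
    np.dtype(np.uint32): "unsigned int",
}


def ctype_name(dtype):
    """Return the C type name for a numpy dtype, or raise ValueError."""
    dtype = np.dtype(dtype)
    try:
        return _CTYPE_NAMES[dtype]
    except KeyError:
        raise ValueError(f"no C type known for dtype {dtype!r}") from None


def c_argument_list(args):
    """Build a C argument list from (dtype, name, is_vector) triples.

    Vector arguments become pointers, scalar arguments are passed by value.
    """
    parts = []
    for dtype, name, is_vector in args:
        star = "*" if is_vector else ""
        parts.append(f"{ctype_name(dtype)} {star}{name}")
    return ", ".join(parts)


c_args = c_argument_list([
    (np.float32, "a", False),
    (np.float32, "x", True),
    (np.float32, "b", False),
    (np.float32, "y", True),
    (np.float32, "z", True),
])
print(c_args)

# With PyCUDA and a CUDA device available, the string could be passed on:
# from pycuda.elementwise import ElementwiseKernel
# lin_comb = ElementwiseKernel(
#     c_args, "z[i] = a*x[i] + b*y[i]", "linear_combination")
```

The point of the sketch is that the kernel source only ever sees ordinary C types such as float and double; the numpy types live on the host side and are translated before the kernel is generated.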
I don't use the gpuarray class for the objects that are passed to the
pycuda functions. I modified mine to mimic its interface. While doing so,
I saw that you use the attributes size and mem_size, which seem to always
have the same value. Is that true? If so, why both?
The docs say this:
.. attribute :: size

    The number of meaningful entries in the array. Can also be computed by
    multiplying up the numbers in :attr:`shape`.

.. attribute :: mem_size

    The total number of entries, including padding, that are present in
    the array. Padding may arise for example because of pitch adjustment by
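To make the difference concrete, here is a minimal sketch of how the two counts can diverge for a row-major 2D array whose rows are padded to an alignment boundary. The function `padded_entry_counts` and the 256-byte alignment are assumptions chosen for illustration only; the real pitch is picked by the CUDA driver and this is not how PyCUDA computes either attribute.

```python
import math


def padded_entry_counts(shape, itemsize, alignment):
    """Return (size, mem_size) for a row-major 2D array with padded rows.

    shape     -- (rows, cols) of the logical array
    itemsize  -- bytes per entry
    alignment -- assumed row alignment in bytes (hypothetical value;
                 the actual pitch is chosen by the driver)
    """
    if alignment % itemsize != 0:
        raise ValueError("alignment must be a multiple of itemsize")
    rows, cols = shape
    # Meaningful entries: the product of the shape.
    size = math.prod(shape)
    # Round the row length in bytes up to the next multiple of alignment.
    row_bytes = cols * itemsize
    pitch = -(-row_bytes // alignment) * alignment
    # Entries actually present in memory, padding included.
    mem_size = rows * (pitch // itemsize)
    return size, mem_size


# Rows of 100 float32 entries (400 bytes) get padded to 512 bytes.
print(padded_entry_counts((3, 100), itemsize=4, alignment=256))
# Rows of 64 float32 entries (256 bytes) already fit exactly.
print(padded_entry_counts((4, 64), itemsize=4, alignment=256))
```

In the first call the two numbers differ because each row carries padding; in the second they coincide, which matches the observation that size and mem_size are often equal in practice.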
Thanks for your contributions!