On Wed, 14 Dec 2011 13:41:12 -0600, "Pazzula, Dominic J "
The subset index seems to drop (or be exclusive of) the last value.
This can be confusing if you are using range() to generate the array of indices. Instead
of the usual range(start,end+1), you need to use range(start,end+2), because both
subset_dot() and range() appear to exclude the last value.
Is this behavior intended?
import numpy as np
import pycuda.autoinit  # creates a CUDA context on the default device
import pycuda.gpuarray as gpuarray
n = 3
a = np.array(range(0, n**2), ndmin=2)
a = a.astype(np.float32)
print(a)
g_a = gpuarray.to_gpu(a)
g_b = gpuarray.to_gpu(a)
# indices of the elements to include in the dot product
subset = gpuarray.to_gpu(np.array(range(2, 5)))
print("Subset Array", subset)
x = gpuarray.subset_dot(subset, g_a, g_b)
print("a[2:5] dot g[2:5]", x)
[[ 0. 1. 2. 3. 4. 5. 6. 7. 8.]]
Subset Array [2 3 4]
a[2:5] dot g[2:5] 13.0
The issue is that 'subset' ends up being an array of 64-bit integers,
but the subset_* routines tacitly assume that you're giving them 32-bit
integers. This is fixed in git, although for performance reasons I would
still recommend 32-bit ints for 'subset', if you can get away with it.
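
For what it's worth, the 13.0 above can be reproduced on the host with NumPy alone. This is only a sketch under some assumptions: a little-endian machine, a default integer dtype of int64, and a kernel that reads the first len(subset) 32-bit words of the index buffer. The names subset64, as32 and seen are made up for the illustration.

```python
import numpy as np

a = np.arange(9, dtype=np.float32)

# The indices as NumPy would typically create them on a 64-bit Linux box
# (explicit little-endian int64 so the sketch behaves the same everywhere).
subset64 = np.array([2, 3, 4], dtype="<i8")

# Reinterpret the same bytes as 32-bit ints: each index is followed by a
# zero word holding its upper 32 bits.
as32 = subset64.view("<i4")

# Assumption: only len(subset) 32-bit entries are read from the buffer.
seen = as32[: len(subset64)]

wrong = float(np.dot(a[seen], a[seen]))          # uses indices 2, 0, 3
right = float(np.dot(a[subset64], a[subset64]))  # uses indices 2, 3, 4

# Workaround on the caller's side: cast the indices before the transfer.
subset32 = subset64.astype(np.int32)

print(as32.tolist(), wrong, right, subset32.dtype)
```

Under those assumptions the misread indices are 2, 0, 3, which gives 4 + 0 + 9 = 13, while the intended indices 2, 3, 4 give 29. So nothing is being dropped at the end of the range; the indices themselves are being misread. Passing np.array(range(2,5)).astype(np.int32) to to_gpu() should avoid the problem on versions without the fix.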