Hello Andreas,
thank you for your feedback:
Which prerequisites must a data structure have to be well suited for the GPU?
Should I allocate the exact amount of memory for each array?
Is it OK to use numpy data structures to build arrays and execute
operations on the GPU, instead of the Python 'set' data structure?
e.g.
np.intersect1d(['a','beta','gamma'],['gamma','delta','omega'])
Could it be parallelized?
As an approach, is it better to try to parallelize the intersection
between two keys of a dictionary,
or rather to import the whole dictionary (or a partition of it) into the GPU?
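One way to think about the intersect1d question: strings themselves are awkward on the GPU, but if each keyword is first mapped to an integer ID, the sets become plain int arrays, which are GPU-friendly. A minimal CPU sketch of that encoding (the vocabulary here is just the example words above, and the `vocab`/`encode` names are made up for illustration):

```python
import numpy as np

# Hypothetical vocabulary: map each keyword to a small integer ID.
vocab = {w: i for i, w in enumerate(['a', 'beta', 'gamma', 'delta', 'omega'])}

def encode(words):
    """Map a list of keywords to a sorted array of int32 IDs."""
    return np.sort(np.array([vocab[w] for w in words], dtype=np.int32))

set1 = encode(['a', 'beta', 'gamma'])
set2 = encode(['gamma', 'delta', 'omega'])

# np.intersect1d itself runs on the CPU, but the same merge-style
# intersection of two sorted int arrays is what a GPU kernel would do.
common = np.intersect1d(set1, set2)
print(len(common))  # → 1  ('gamma' is the only shared keyword)
```

This is only a sketch of the encoding idea, not a GPU implementation; the point is that the device would then only ever see fixed-size numeric buffers.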
On Sun, Dec 21, 2014 at 9:44 PM, Andreas Kloeckner <lists(a)informa.tiker.net>
wrote:
Luigi,
here are a few problems with your approach:
- The contents of your SourceModule are not valid C (as in, C the
programming language)
- 'set' is a Python data structure. PyCUDA will not magically swap out
the code of 'set' and execute its operations on the GPU.
- Working with arrays of variable-size objects (such as strings) on the
GPU is somewhat tricky. You'll have to come up with a good data
structure. In particular, just copying over a Python data structure
will not help--if it succeeds, the pointers in the structure will
point to CPU memory and be entirely useless on the GPU.
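To illustrate the last point: a common way around variable-size objects is to flatten all sets into one contiguous values array plus an offsets array (a CSR-style layout), so no CPU pointers are involved. A minimal sketch with made-up integer data (assuming keywords were already mapped to int IDs):

```python
import numpy as np

# Three variable-length "sets", already encoded as integer IDs.
sets = [[0, 1, 2], [2, 3, 4], [1, 4]]

# Flatten into one data array plus offsets (CSR-style): set i lives in
# data[offsets[i]:offsets[i+1]]. Both buffers are plain int32 arrays
# that can be copied to the GPU as-is.
data = np.concatenate([np.sort(np.int32(s)) for s in sets])
offsets = np.cumsum([0] + [len(s) for s in sets]).astype(np.int32)

i = 1
print(data[offsets[i]:offsets[i + 1]].tolist())  # → [2, 3, 4]
```

A kernel thread then reaches set i with two integer loads instead of chasing host pointers.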
Andreas
Luigi Assom <luigi.assom(a)gmail.com> writes:
I need to parallelize the computation of intersections of sets of
keywords on the GPU.
As an example, I will take a cosine similarity computing the intersection
between two sets.
(see also post:
http://stackoverflow.com/questions/22381939/python-calculate-cosine-similar…
)
I want to compute the similarity for each key-value pair of large
dictionaries.
The value of each key is a set of thousands of elements, and they can
be strings.
Using multiprocessing I was able to improve by 4x, but I would like to
try out the GPU to really speed up the computation.
In the source module, I actually don't know how to declare my parameters,
because they are not floats, and I haven't found a tutorial using data
structures other than numerical numpy arrays.
That's why I converted my lists of keywords with np.asarray() and
tried the following:
# convert lists of strings into numpy arrays
key1 = 'key1'
key2 = 'key2'
array1 = np.asarray(D[key1])
array2 = np.asarray(D[key2])
# allocate device memory
array1_cuda = cuda.mem_alloc(sys.getsizeof(array1))
array2_cuda = cuda.mem_alloc(sys.getsizeof(array2))
# and tried
mod = SourceModule("""
__global__ void cosine(*a, *b)
{
int idx = threadIdx.x + threadIdx.y*4;
proxy = len(set(a[idx])&set(b[idx]))/math.sqrt(len(set(a[idx]))*len(set(b[idx])))
}
""")
a_gpu = gpuarray.to_gpu(array1)
b_gpu = gpuarray.to_gpu(array2)
proxy = len(set(a_gpu)&set(b_gpu))/math.sqrt(len(set(a_gpu))*len(set(b_gpu)))
but I get
TypeError: GPUArrays are not hashable.
Is it a problem with the data structure, or am I making a conceptual
mistake?
with multiprocessing (without pyCuda) my code is:
## Measuring performance: 4x!
with Timer() as t:
    key = 'key1'
    setParent = D[key]
    ngbrProxy = set([])
    p = Pool()
    for ngbr in p.imap_unordered(cosine, setParent):
        ngbrProxy.add(ngbr)
print "=> elapsed lpush: %s s" % t.secs
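The `cosine` worker passed to imap_unordered is not shown in the snippet; a minimal CPU version consistent with the |a ∩ b| / sqrt(|a|·|b|) formula used in the kernel attempt above might look like this (the function name and example sets are assumptions for illustration):

```python
import math

def cosine_sets(a, b):
    """Set-based cosine similarity: |a & b| / sqrt(|a| * |b|)."""
    return len(a & b) / math.sqrt(len(a) * len(b))

# Example with the keyword sets from earlier in the thread.
print(cosine_sets({'a', 'beta', 'gamma'}, {'gamma', 'delta', 'omega'}))
# → 0.3333333333333333  (one shared element, both sets of size 3)
```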
I wonder how I could exploit the GPU for this type of computation: I am
not working with numerical matrices; in the PyCUDA documentation I read
that it is possible to use any type of data structure, even str, but I
couldn't find an example.
Could you please help me work this out?
_______________________________________________
PyCUDA mailing list
PyCUDA(a)tiker.net
http://lists.tiker.net/listinfo/pycuda