Received from Ashwin Srinath on Wed, Dec 24, 2014 at 02:51:42PM EST:
> On Tue, Dec 23, 2014 at 6:45 PM, Lev Givon <lev(a)columbia.edu> wrote:
>
> > (Not sure if this is more of an mpi4py or a pycuda issue at this point.)
> >
> > I recently tried running a gist I wrote in the past [1] to test
> > communication of
> > data stored in GPU memory with pycuda using mpi4py compiled against OpenMPI
> > 1.8.* (which contains CUDA support). Using the latest revision (9a70e69)
> > compiled against OpenMPI 1.8.4 (which was in turn compiled against CUDA
> > 6.5 on
> > Ubuntu 14.04.1) and installed in a Python 2.7.6 virtualenv along with
> > pycuda
> > 2014.1 (also manually compiled against CUDA 6.5), I was able to run the
> > gist
> > without any problems. However, when I changed line 55 from
> >
> > x_gpu = gpuarray.arange(100, 200, 10, dtype=np.double)
> >
> > to
> >
> > x_gpu = gpuarray.to_gpu(np.arange(100, 200, 10, dtype=np.double))
> >
> > the data transfer succeeded but was immediately followed by the following
> > error:
> >
> > [avicenna:32494] *** Process received signal ***
> > [avicenna:32494] Signal: Segmentation fault (11)
> > [avicenna:32494] Signal code: Address not mapped (1)
> > [avicenna:32494] Failing at address: (nil)
> > [avicenna:32494] [ 0]
> > /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x2ba2e8fe2340]
> > [avicenna:32494] [ 1]
> > /usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x1f60f5)[0x2ba2fd19b0f5]
> > [avicenna:32494] [ 2]
> > /usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x20470b)[0x2ba2fd1a970b]
> > [avicenna:32494] [ 3]
> > /usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x17ac02)[0x2ba2fd11fc02]
> > [avicenna:32494] [ 4]
> > /usr/lib/x86_64-linux-gnu/libcuda.so.1(cuStreamDestroy_v2+0x52)[0x2ba2fd0eeb32]
> > [avicenna:32494] [ 5]
> > /opt/openmpi-1.8.4/lib/libmpi.so.1(mca_common_cuda_fini+0x1c3)[0x2ba2f57718a3]
> > [avicenna:32494] [ 6]
> > /opt/openmpi-1.8.4/lib/libmpi.so.1(+0xf5e3e)[0x2ba2f57aee3e]
> > [avicenna:32494] [ 7]
> > /opt/openmpi-1.8.4/lib/libopen-pal.so.6(mca_base_component_close+0x19)[0x2ba2f6122099]
> > [avicenna:32494] [ 8]
> > /opt/openmpi-1.8.4/lib/libopen-pal.so.6(mca_base_components_close+0x42)[0x2ba2f6122112]
> > [avicenna:32494] [ 9]
> > /opt/openmpi-1.8.4/lib/libmpi.so.1(+0xd7515)[0x2ba2f5790515]
> > [avicenna:32494] [10]
> > /opt/openmpi-1.8.4/lib/libopen-pal.so.6(mca_base_framework_close+0x63)[0x2ba2f612b3c3]
> > [avicenna:32494] [11]
> > /opt/openmpi-1.8.4/lib/libopen-pal.so.6(mca_base_framework_close+0x63)[0x2ba2f612b3c3]
> > [avicenna:32494] [12]
> > /opt/openmpi-1.8.4/lib/libmpi.so.1(ompi_mpi_finalize+0x56d)[0x2ba2f573693d]
> > [avicenna:32494] [13]
> > /home/lev/Work/virtualenvs/PYTHON/lib/python2.7/site-packages/mpi4py/MPI.so(+0x2e694)[0x2ba2f53b2694]
> > [avicenna:32494] [14] python(Py_Finalize+0x1a6)[0x42fb0f]
> > [avicenna:32494] [15] python(Py_Main+0xbed)[0x46ac10]
> > [avicenna:32494] [16]
> > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x2ba2e9211ec5]
> > [avicenna:32494] [17] python[0x57497e]
> > [avicenna:32494] *** End of error message ***
> >
> > I also tried replacing line 55 with
> >
> > x_gpu = gpuarray.zeros(10, dtype=np.double)
> > x_gpu.set(np.arange(100, 200, 10, dtype=np.double))
> >
> > which resulted in no error and
> >
> > x_gpu = gpuarray.empty(10, dtype=np.double)
> > x_gpu.set(np.arange(100, 200, 10, dtype=np.double))
> >
> > which resulted in the same error as mentioned earlier.
> >
> > Any ideas as to what could be going on?
> >
> > [1] https://gist.github.com/8514d3456a94a6c73e6d
>
> Hi Lev,
>
> This code worked for me (even after changing line 55 to use
> 'gpuarray.to_gpu(np.arange...'). I'm on an environment very similar to
> yours.
Did the code run without error on your system after modifying line 55 even
without MPI.Finalize() added to the end of the code?
> Just a couple of suggestions:
>
> 1. Insert MPI.Finalize() at the end of your code.
> 2. If you're not already, pass the parameter '--mca pml ob1' to your
> mpiexec command.
Adding the call to MPI.Finalize() made the error go away even when using
gpuarray.to_gpu(); adding the extra mca parameters didn't appear to have any effect.
My understanding is that the call to MPI.Finalize() should be automatically
registered to be executed when the processes exit; this makes me wonder whether
my explicitly registering the pycuda method that cleans up the current context
is causing problems. I'll see what the folks on the mpi4py list have to say.
Thanks,
--
Lev Givon
Bionet Group | Neurokernel Project
http://www.columbia.edu/~lev/
http://lebedov.github.io/
http://neurokernel.github.io/
Lev,
The code ran without MPI.Finalize(), but I got an error about
cuMemHostUnregister. Not sure why this happens, but I've mentioned it
before in the mpi4py forum:
https://groups.google.com/forum/#!msg/mpi4py/xd-SR1b6GZ0/CdyHFWUNhskJ
Thanks,
Ashwin
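A minimal sketch of the workaround discussed in this thread (illustrative only, not the gist itself; the device selection and the registered cleanup are assumptions): call MPI.Finalize() explicitly at the end of the script, so that Open MPI's CUDA teardown (mca_common_cuda_fini) runs while the PyCUDA context still exists.

import atexit
import pycuda.driver as drv
from mpi4py import MPI

drv.init()
ctx = drv.Device(0).make_context()
atexit.register(ctx.pop)   # context cleanup runs later, at interpreter exit

# ... create gpuarrays and exchange them with CUDA-aware Send/Recv as in the gist ...

# Explicit finalize before interpreter shutdown, while the context is still alive.
MPI.Finalize()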
... and if you want to stick with the non-repo version, look into how to
upgrade six using pip. It's something like:
pip install --upgrade --user six
Eric
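For reference, a quick way to confirm after the upgrade which copy of six the interpreter actually picks up (just a sanity check; on Ubuntu the distro copy lives under /usr/lib/python2.7/dist-packages, while a --user install goes under ~/.local):

import six
print(six.__version__)   # should report 1.8.0 or newer after the upgrade
print(six.__file__)      # shows which copy is actually being imported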
On Dec 21, 2014 9:38 PM, "Eric Larson" <larson.eric.d(a)gmail.com> wrote:
> I installed pycuda from the Ubuntu 14.04 repos and that worked fine. That
> might be why this hadn't been seen before, because other folks took the
> simpler route of either using all repo versions (apt-get) or all outside
> versions (e.g., anaconda). Generally speaking, I've found that repo
> versions are packaged to be compatible with themselves, but if you
> build/install your own versions of certain libraries, you will likely have
> to update other library versions as well in the process.
>
> Eric
> On Dec 21, 2014 6:44 PM, "Donald Osmeyer" <d_osmeyer(a)hotmail.com> wrote:
>
>>
>> I followed your advice and checked out why I have six version 1.5.2....
>>
>> It looks like six 1.5.2 is the version that Ubuntu 14.04 ships with (at
>> least the desktop x64 version).
>> I downloaded the most recent version of Ubuntu desktop from their website
>> and made a bootable USB with it.
>> I booted up Ubuntu from the USB and in the terminal did this:
>> $python
>> ...
>> >>>import six
>> >>>six.__version__
>> '1.5.2'
>>
>> Then I made a clean install of Ubuntu onto my hard drive and booted to
>> that. The same version was present.
>> So, of course the next step is to do apt-get update and upgrade. Still
>> 1.5.2.
>> I guess the question now is whether this is the version of six that is
>> supposed to come packed with Ubuntu or not.
>> After looking for solutions online I discovered the 'apt-cache policy'
>> command which seemed useful. Here's the output:
>>
>> donald@Beast:~$ apt-cache policy python-six
>> python-six:
>> Installed: 1.5.2-1
>> Candidate: 1.5.2-1
>> Version table:
>> *** 1.5.2-1 0
>> 500 http://us.archive.ubuntu.com/ubuntu/ trusty/main amd64
>> Packages
>> 100 /var/lib/dpkg/status
>>
>>
>> Based on what I found on the launchpad website (
>> https://launchpad.net/ubuntu/+source/six) this seems to be the right
>> version of six
>> for Ubuntu 14.04. But if that is the case, then I would think a lot more
>> people would have the same problem as I do. Unless there is a simple
>> workaround that I don't know about. I find it unlikely that I'm the first
>> person to try to install pycuda on Ubuntu 14.04.
>> Are there any other suggestions/hints/helpful ideas??
>>
>>
>> > From: lists(a)informa.tiker.net
>> > To: d_osmeyer(a)hotmail.com; pycuda(a)tiker.net
>> > Subject: Re: [PyCUDA] Running test_driver.py, Import Error: cannot
>> import name intern
>> > Date: Sat, 20 Dec 2014 17:03:31 -0600
>> >
>> > Donald Osmeyer <donald.osmeyer(a)outlook.com> writes:
>> >
>> > > I just installed Ubuntu 14.04, the Nvidia driver 340.29, cuda version
>> 6.5.12. I tried to install pycuda-2014.1 using the instructions found at
>> http://wiki.tiker.net/PyCuda/Installation/Linux/Ubuntu
>> > >
>> > > Everything seems to install fine. In particular, I get this message
>> as part of the install:
>> > >
>> > >
>> > > Using /usr/local/lib/python2.7/dist-packages/py-1.4.26-py2.7.egg
>> > > Searching for six==1.5.2
>> > > Best match: six 1.5.2
>> > > Adding six 1.5.2 to easy-install.pth file
>> > >
>> > > Unfortunately, when I try to run the test I get this error:
>> > >
>> > > donald@Beast:~/pycuda-2014.1/test$ python test_driver.py
>> > > Traceback (most recent call last):
>> > > File "test_driver.py", line 17, in <module>
>> > > import pycuda.gpuarray as gpuarray
>> > > File
>> "/usr/local/lib/python2.7/dist-packages/pycuda-2014.1-py2.7-linux-x86_64.egg/pycuda/gpuarray.py",
>> line 3, in <module>
>> > > import pycuda.elementwise as elementwise
>> > > File
>> "/usr/local/lib/python2.7/dist-packages/pycuda-2014.1-py2.7-linux-x86_64.egg/pycuda/elementwise.py",
>> line 34, in <module>
>> > > from pytools import memoize_method
>> > > File
>> "/usr/local/lib/python2.7/dist-packages/pytools-2014.3.5-py2.7.egg/pytools/__init__.py",
>> line 5, in <module>
>> > > from six.moves import range, zip, intern, input
>> > > ImportError: cannot import name intern
>> > > donald@Beast:~/pycuda-2014.1/test$
>> >
>> > six 1.5.2 is very outdated. (1.8.0 is current.) I wonder how your system
>> > wound up figuring that was the best version to use. Could you try and
>> > troubleshoot that?
>> >
>> > Andreas
>>
I just installed Ubuntu 14.04, the Nvidia driver 340.29, cuda version 6.5.12. I tried to install pycuda-2014.1 using the instructions found at http://wiki.tiker.net/PyCuda/Installation/Linux/Ubuntu
Everything seems to install fine. In particular, I get this message as part of the install:
Using /usr/local/lib/python2.7/dist-packages/py-1.4.26-py2.7.egg
Searching for six==1.5.2
Best match: six 1.5.2
Adding six 1.5.2 to easy-install.pth file
Unfortunately, when I try to run the test I get this error:
donald@Beast:~/pycuda-2014.1/test$ python test_driver.py
Traceback (most recent call last):
File "test_driver.py", line 17, in <module>
import pycuda.gpuarray as gpuarray
File "/usr/local/lib/python2.7/dist-packages/pycuda-2014.1-py2.7-linux-x86_64.egg/pycuda/gpuarray.py", line 3, in <module>
import pycuda.elementwise as elementwise
File "/usr/local/lib/python2.7/dist-packages/pycuda-2014.1-py2.7-linux-x86_64.egg/pycuda/elementwise.py", line 34, in <module>
from pytools import memoize_method
File "/usr/local/lib/python2.7/dist-packages/pytools-2014.3.5-py2.7.egg/pytools/__init__.py", line 5, in <module>
from six.moves import range, zip, intern, input
ImportError: cannot import name intern
donald@Beast:~/pycuda-2014.1/test$
I'm not sure where to go from here?
Donald
Luigi Assom <luigi.assom(a)gmail.com> writes:
> Hello Andreas,
>
> thank you for your feedback:
>
> Which prerequisite must have a data structure to be good for GPU?
> Should I allocate exact size of memory for each array ?
I hate to say it, but let me just state two facts: (1) There's no canned
functionality for what you'd like to accomplish (intersection of string
lists, IIUC). You'd have to piece this together out of parallel
primitives (scan/sort) or just handwritten (in C) kernels. String
sorting on the GPU is something that people publish legitimate research
papers on. (2) Learning to do these things (unfortunately) takes a
considerable time investment--think months. So please understand that
the friendly folks on the mailing list won't be able to give you a quick
tutorial, or even provide you with an 'easy answer'.
Sorry,
Andreas
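For what it's worth, here is a rough sketch (an illustration only, not canned PyCUDA functionality; all names and the example data are made up) of the kind of restructuring this requires: map the keyword strings to integer IDs on the CPU, keep one side sorted, and count matches with a hand-written binary-search kernel. The set-based cosine similarity then follows from the intersection count. It assumes each keyword list has no duplicates.

import numpy as np
import pycuda.autoinit
import pycuda.gpuarray as gpuarray
from pycuda.compiler import SourceModule

mod = SourceModule("""
__global__ void count_common(const int *a, int n_a,
                             const int *b, int n_b,
                             unsigned int *count)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n_a)
        return;
    int key = a[i];
    int lo = 0, hi = n_b - 1;
    while (lo <= hi)   /* binary search in the sorted array b */
    {
        int mid = (lo + hi) / 2;
        if (b[mid] == key) { atomicAdd(count, 1u); return; }
        if (b[mid] < key) lo = mid + 1; else hi = mid - 1;
    }
}
""")
count_common = mod.get_function("count_common")

# Map strings to integer IDs on the CPU; GPU kernels want fixed-size numeric data.
vocab = {}
def to_ids(words):
    return np.array(sorted(vocab.setdefault(w, len(vocab)) for w in words),
                    dtype=np.int32)

a = to_ids(["gpu", "cuda", "python", "kernel"])
b = to_ids(["python", "kernel", "set", "hash"])

a_gpu, b_gpu = gpuarray.to_gpu(a), gpuarray.to_gpu(b)
count = gpuarray.zeros(1, dtype=np.uint32)

threads_per_block = 256
n_blocks = (len(a) + threads_per_block - 1) // threads_per_block
count_common(a_gpu.gpudata, np.int32(len(a)),
             b_gpu.gpudata, np.int32(len(b)), count.gpudata,
             block=(threads_per_block, 1, 1), grid=(n_blocks, 1))

n_common = int(count.get()[0])
print(n_common / np.sqrt(len(a) * len(b)))   # set-based cosine similarity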
Luigi Assom <luigi.assom(a)gmail.com> writes:
> pycuda.VERSION
>
> (2014, 1)
That's not specific enough unfortunately--where did you get your version
of PyCUDA?
Andreas
Luigi Assom <luigi.assom(a)gmail.com> writes:
> Hello,
>
> I installed the CUDA driver and PyCUDA on my Mac; it looks to be working OK
> after running the tests in the NVIDIA installation guide.
>
> When I try to run mod = SourceModule(),
> I run into an error that looks like a problem with the nvcc compiler.
>
> I cannot figure out where the problem may be.
> I checked the PATH for nvcc and it looks OK...
What version of PyCUDA are you using?
Andreas
Luigi,
here are a few problems with your approach:
- The contents of your SourceModule are not valid C (as in, C the
programming language)
- 'set' is a Python data structure. PyCUDA will not magically swap out
the code of 'set' and execute its operations on the GPU.
- Working with arrays of variable-size objects (such as strings) on the
GPU is somewhat tricky. You'll have to come up with a good data
structure. In particular, just copying over a Python data structure
will not help--if it succeeds, the pointers in the structure will
point to CPU memory and be entirely useless on the GPU.
Andreas
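To illustrate the last point with a sketch (the names and the fixed width are assumptions, not anything PyCUDA provides): one workable representation for strings on the device is a fixed-width byte matrix, so a kernel sees one contiguous numeric buffer instead of Python objects whose pointers only mean something on the CPU.

import numpy as np
import pycuda.autoinit
import pycuda.gpuarray as gpuarray

words = ["gpu", "cuda", "python"]   # example data
width = 8                           # assumed maximum keyword length

# One row per word, padded with zero bytes.
packed = np.zeros((len(words), width), dtype=np.uint8)
for i, w in enumerate(words):
    raw = np.frombuffer(w.encode("ascii")[:width], dtype=np.uint8)
    packed[i, :len(raw)] = raw

words_gpu = gpuarray.to_gpu(packed)  # a single contiguous device buffer
# A kernel can now treat this as a (len(words) x width) char array.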
Luigi Assom <luigi.assom(a)gmail.com> writes:
> I need to parallelize a computation of the intersection of sets of keywords
> over the GPU.
>
> As an example, I will take a cosine similarity computing the intersection
> between two sets.
> (see also post:
> http://stackoverflow.com/questions/22381939/python-calculate-cosine-similar…
> )
>
> I want to compute the similarity, for each key-value pair of large
> dictionaries.
>
> The value of a key is indeed a set of thousands of elements, and they can
> be strings.
>
> Using multiprocessing I was able to improve by 4x, but I would like to try
> out the GPU to really speed up the computation.
>
> In the source module, I actually don't know how to declare my parameters
> because they are not floats, and I haven't found a tutorial using data
> structures other than numerical arrays with numpy.
> That's why I converted my lists of keywords with np.asarray() and
> tried the following:
>
>
>
> # convert list of strings into numpy array
> key1 = 'key1'
> array1 = np.asarray(D[key1])
>
> # convert list of strings into numpy array
> array2 = np.asarray(D[key2])
>
> # assign memory to cuda
>
> array1_cuda = cuda.mem_alloc(sys.getsizeof(array1))
> array2_cuda = cuda.mem_alloc(sys.getsizeof(array2))
>
> # and tried
>
> mod = SourceModule("""
> __global__ void cosine(*a, *b)
> {
> int idx = threadIdx.x + threadIdx.y*4;
> proxy =
> len(set(a[idx])&set(b[idx]))/math.sqrt(len(set(a[idx]))*len(set(b[idx])))
>
> }
> """)
>
>
>
> a_gpu = gpuarray.to_gpu(array1)
> b_gpu = gpuarray.to_gpu(array2)
>
> proxy =
> len(set(a_gpu)&set(b_gpu))/math.sqrt(len(set(a_gpu))*len(set(b_gpu)))
>
>
>
>
> but I get
>
> TypeError: GPUArrays are not hashable.
>
>
> Is it a problem with the data structure, or am I making a conceptual mistake?
>
>
> With multiprocessing (without PyCUDA), my code is:
>
> ## Measuring Performance: 4x !
> with Timer() as t:
> key = 'key1'
> setParent = D[key]
> ngbrProxy = set([])
> p = Pool()
> for ngbr in p.imap_unordered(cosine,setParent):
> ngbrProxy.add(ngbr)
>
> print "=> elasped lpush: %s s" % t.secs
>
> I wonder how I could exploit the GPU for this type of computation: I am not
> working with numerical matrices; in the PyCUDA documentation I read that it
> is possible to use any type of data structure, even str, but I
> couldn't find an example.
>
> Could you please help in working this out?
Hello,
I installed the CUDA driver and PyCUDA on my Mac; it looks to be working OK
after running the tests in the NVIDIA installation guide.
When I try to run mod = SourceModule(),
I run into an error that looks like a problem with the nvcc compiler.
I cannot figure out where the problem may be.
I checked the PATH for nvcc and it looks OK...
echo $PATH
/Developer/NVIDIA/CUDA-6.5/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/bin/ruby
echo $DYLD_LIBRARY_PATH
/Developer/NVIDIA/CUDA-6.5/lib:
Traceback (most recent call last):
File "<stdin>", line 12, in <module>
File
"/Library/Python/2.7/site-packages/pycuda-2014.1-py2.7-macosx-10.9-x86_64.egg/pycuda/compiler.py",
line 251, in __init__
arch, code, cache_dir, include_dirs)
File
"/Library/Python/2.7/site-packages/pycuda-2014.1-py2.7-macosx-10.9-x86_64.egg/pycuda/compiler.py",
line 241, in compile
return compile_plain(source, options, keep, nvcc, cache_dir)
File
"/Library/Python/2.7/site-packages/pycuda-2014.1-py2.7-macosx-10.9-x86_64.egg/pycuda/compiler.py",
line 79, in compile_plain
checksum.update(get_nvcc_version(nvcc).encode("utf-8"))
File "<string>", line 2, in get_nvcc_version
File
"/Library/Python/2.7/site-packages/pytools-2014.3.5-py2.7.egg/pytools/__init__.py",
line 430, in _deco
result = func(*args)
File
"/Library/Python/2.7/site-packages/pycuda-2014.1-py2.7-macosx-10.9-x86_64.egg/pycuda/compiler.py",
line 13, in get_nvcc_version
result, stdout, stderr = call_capture_output(cmdline)
File
"/Library/Python/2.7/site-packages/pytools-2014.3.5-py2.7.egg/pytools/prefork.py",
line 197, in call_capture_output
return forker[0].call_capture_output(cmdline, cwd, error_on_nonzero)
File
"/Library/Python/2.7/site-packages/pytools-2014.3.5-py2.7.egg/pytools/prefork.py",
line 54, in call_capture_output
% ( " ".join(cmdline), e))
pytools.prefork.ExecError: error invoking 'nvcc --version': [Errno 20] Not
a directory
As an example, I use:
mod = SourceModule("""
__global__ void process(char **dest, char **line)
{
int tID = threadIdx.x ;//+ blockIdx.x * blockDim.x;
dest[tID] = line[tID];
}
""")
from this example
http://stackoverflow.com/questions/18808279/pycuda-using-string-but-not-arr…
Could you please help?
--
Luigi Assom
Skype contact: oggigigi
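Regarding the ExecError above: "[Errno 20] Not a directory" is reported by the operating system while resolving or launching the executable, so it may help to check, from the same Python that runs PyCUDA, how nvcc is found. A quick diagnostic under that assumption (not a confirmed fix):

import os
import subprocess

# Any PATH entry that is not an actual directory is suspect.
for d in os.environ["PATH"].split(os.pathsep):
    print("%r is a directory: %s" % (d, os.path.isdir(d)))

# This mirrors what pycuda.compiler does before building a SourceModule.
print(subprocess.check_output(["nvcc", "--version"]))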