Kambiz Tavabi <ktavabi(a)uw.edu> writes:
> I am at a complete loss. I did a fresh reinstall of the OS (ubuntu 16.04)
> and the first thing I did was:
>
>
> - apt update; sudo apt upgrade
> - apt-get install nvidia-cuda-toolkit nvidia-361 nvidia-modprobe
> - Build install pycuda as before; nosetests failed with same error
> - pip uninstall pycuda
> - pip install pycuda
> - python -c "from pycuda import gpuarray, driver" > Same import error
>
> I wonder if I am hitting this issue
> <https://github.com/NervanaSystems/neon/issues/246> on the installation?
If you're seeing an error message that says
unsupported GNU version! gcc versions later than 4.9 are not supported
then yes, that's your problem. In general, if you don't quote (or
mis-quote, as you did) your error message, don't be surprised if nobody
can or wants to help you.
Andreas
I am at a complete loss. I did a fresh reinstall of the OS (ubuntu 16.04)
and the first thing I did was:
- apt update; sudo apt upgrade
- apt-get install nvidia-cuda-toolkit nvidia-361 nvidia-modprobe
- Build install pycuda as before; nosetests failed with same error
- pip uninstall pycuda
- pip install pycuda
- python -c "from pycuda import gpuarray, driver" > Same import error
I wonder if I am hitting this issue
<https://github.com/NervanaSystems/neon/issues/246> on the installation?
On Fri, Oct 28, 2016 at 2:41 PM Andreas Kloeckner <lists(a)informa.tiker.net>
wrote:
> Please keep the list cc'd.
>
> Kambiz Tavabi <ktavabi(a)uw.edu> writes:
> > It literally said
> > _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEED1Ev
> > which in retrospect is partly gibberish.
>
> It's not. :) For example, c++filt will decode it to
>
> std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> >::~basic_string()
>
> It's arguable whether that's gibberish--that depends on how much you
> like C++.
>
> At any rate, it tells me that different parts of your build process link
> against two different versions of the C++ standard library, which could
> very well be the result of the issue you describe regarding two versions
> of the toolkit being installed, or being built with two different
> compilers at least. (clang vs gcc maybe)
>
> > Sorry about that. Also, it was
> > just made clear to me that both Nvidia's toolkit package (i.e., CUDA 8) and
> > Ubuntu's nvidia-cuda-toolkit package were installed, so this could all be
> > due to conflicts caused by faulty installations.
>
> Andreas
>
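
The demangling step mentioned in the quoted reply can also be done from Python. This is only a minimal sketch, assuming the c++filt tool from GNU binutils is on the PATH; the helper name demangle is made up for illustration, and it returns None when the tool is missing.

```python
import shutil
import subprocess


def demangle(symbol):
    """Return the demangled form of a C++ symbol, or None if c++filt is unavailable."""
    tool = shutil.which("c++filt")  # assumption: binutils provides c++filt
    if tool is None:
        return None
    result = subprocess.run(
        [tool, symbol],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


mangled = "_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEED1Ev"
print(demangle(mangled))
```

The __cxx11 part of the name is the namespace libstdc++ has used for its newer std::string since GCC 5. An undefined symbol in that namespace is therefore consistent with _driver.so having been compiled against a newer libstdc++ than the one found at import time, although that is only one way such a mismatch can arise.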
Kambiz Tavabi <ktavabi(a)uw.edu> writes:
> Hi
>
> I am trying to get packages including pycuda-2016.1.2 working in a python
> 2.7 (Anaconda) environment. I am running Ubuntu 16.04 with a working nvidia
> driver and CUDA 8.
> I installed pycuda via
>
> $ git clone http://git.tiker.net/trees/pycuda.git
> $ cd pycuda
> $ ./configure.py --cuda-enable-gl
> $ git submodule update --init
> $ make -j 4
> $ python setup.py install
>
> nosetests fail with ImportErrors referencing
> ...anaconda2/lib/python2.7/site-packages/pycuda-2016.1.2-py2.7-linux-x86_64.egg/pycuda/_driver.so:
> undefined symbol: gibbirish
Did it literally say 'gibbirish'? Because that symbol is the one thing
that might help me figure out what happened.
Andreas
Hi
I am trying to get packages including pycuda-2016.1.2 working in a python
2.7 (Anaconda) environment. I am running Ubuntu 16.04 with a working nvidia
driver and CUDA 8.
I installed pycuda via
$ git clone http://git.tiker.net/trees/pycuda.git
$ cd pycuda
$ ./configure.py --cuda-enable-gl
$ git submodule update --init
$ make -j 4
$ python setup.py install
nosetests fail with ImportErrors referencing
...anaconda2/lib/python2.7/site-packages/pycuda-2016.1.2-py2.7-linux-x86_64.egg/pycuda/_driver.so:
undefined symbol: gibbirish
The error persists after a new Anaconda Python install with a clean pycuda
rebuild, and also after setting the environment as described at
<https://wiki.tiker.net/BoostInstallationHowto#LD_LIBRARY_PATH> using
export LD_LIBRARY_PATH=$HOME/pool/lib:${LD_LIBRARY_PATH}
although I don't really understand what $HOME/pool/lib means, since it doesn't
even exist. I read a few short reports about this on various forums, where the
problem is identified as a mismatch between Python headers, which
unfortunately makes little sense to me; I have no idea what kind of problem I
am hitting here.
Thanks in advance for any help.
Hello all,
I get "pycuda._driver.LogicError: cuMemcpyDtoH failed: an illegal memory access was encountered" errors when I use pycuda with matrices above a certain size. Only a restart of Spyder remedies the issue. The matrix sizes are still well below what I believe my graphics card (a GeForce GTX 1060, 3GB) should be able to handle. Is there a pycuda-driven limit?
I've created a fairly simple example that computes the cross products of pairs of 3D vectors.
The code works fine for up to approx. N = 35000 vectors. Above that, I get the following error:
Traceback (most recent call last):
  File "C:\owncloud\Python\float3_example.py", line 68, in <module>
    dest = c_gpu.get()
  File "C:\WinPython-64bit-3.5.2.2Qt5\python-3.5.2.amd64\lib\site-packages\pycuda-2016.1.2-py3.5-win-amd64.egg\pycuda\gpuarray.py", line 271, in get
    _memcpy_discontig(ary, self, async=async, stream=stream)
  File "C:\WinPython-64bit-3.5.2.2Qt5\python-3.5.2.amd64\lib\site-packages\pycuda-2016.1.2-py3.5-win-amd64.egg\pycuda\gpuarray.py", line 1190, in _memcpy_discontig
    drv.memcpy_dtoh(dst, src.gpudata)
pycuda._driver.LogicError: cuMemcpyDtoH failed: an illegal memory access was encountered
Assuming the problem lies with my code rather than PyCUDA: is there a problem with my usage of the float3 vector types inside but not outside the CUDA kernel? (The results are correct for small matrices.) I couldn't find a succinct best-practice example of passing lists of 3D vectors (or float3s) to a kernel using PyCUDA. Or is it the way I have set up blocks and grids (I tried many)?
Many thanks!
Here's the very simple example:
from __future__ import print_function
from __future__ import absolute_import
import pycuda.autoinit
import numpy
from pycuda.compiler import SourceModule
from pycuda import gpuarray

mod = SourceModule("""
__global__ void cross_products(float3* vCs, float3* vAs, float3* vBs, int w, int h)
{
    const int c = blockIdx.x * blockDim.x + threadIdx.x;
    const int r = blockIdx.y * blockDim.y + threadIdx.y;
    int i = r * w + c; // 1D flat index

    // Check if within array bounds.
    if ((c >= w) || (r >= h))
    {
        return;
    }

    float3 vA = vAs[i];
    float3 vB = vBs[i];
    float3 vC = make_float3(vA.y*vB.z - vA.z*vB.y, vA.z*vB.x - vA.x*vB.z, vA.x*vB.y - vA.y*vB.x);
    vCs[i] = vC;
}
""")

cross_products = mod.get_function("cross_products")
N = 32000  # on my machine, this fails if N > 36000
M = 3
a = numpy.ndarray((N,M), dtype = numpy.float32)
b = numpy.ndarray((N,M), dtype = numpy.float32)
for i in range(0,N):
    a[i] = [1,0,0]
    b[i] = [0,1,0]

c = numpy.zeros((N,M), dtype = numpy.float32)

print("a x b")
print(numpy.cross(a,b))

M_gpu = numpy.int32(M)
N_gpu = numpy.int32(N)
a_gpu = gpuarray.to_gpu(a)
b_gpu = gpuarray.to_gpu(b)
c_gpu = gpuarray.to_gpu(c)

bx = 32  #256
by = 32  #1
gdimX = (int)((M + bx-1) / bx)
gdimY = (int)((N + by-1) / by)
print("grid")
print(gdimX)
print(gdimY)
cross_products(c_gpu, a_gpu, b_gpu, M_gpu, N_gpu, block=(bx,by,1), grid = (gdimX, gdimY))

dest = c_gpu.get()

print("dest")
print(dest)
print("diff")
print(dest-numpy.cross(a,b))
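
One possible reading of the failure, offered as an assumption rather than a confirmed diagnosis: the kernel indexes its buffers in units of float3, yet it is called with w = M = 3, so the flat index r * w + c runs up to 3N - 1 while each buffer only holds N float3 values. The arithmetic can be checked on the host without a GPU; the helper name largest_flat_index is made up for this sketch.

```python
FLOAT3_BYTES = 12  # three 32-bit floats, as in CUDA's float3


def largest_flat_index(w, h):
    """Largest i = r * w + c that passes the kernel's bounds check (c < w, r < h)."""
    return (h - 1) * w + (w - 1)


N = 32000
M = 3
buffer_bytes = N * M * 4                      # size of each (N, M) float32 array
float3_slots = buffer_bytes // FLOAT3_BYTES   # number of float3 values that fit

as_posted = largest_flat_index(w=M, h=N)              # kernel called with w = 3
one_thread_per_vector = largest_flat_index(w=1, h=N)  # kernel called with w = 1

print("float3 slots per buffer:", float3_slots)
print("largest index as posted:", as_posted)
print("largest index with w = 1:", one_thread_per_vector)
```

If that reading is right, passing w = 1 (one thread per vector, with a grid that still covers all N rows) would keep every index below N. Out-of-bounds accesses that happen to land in memory the process already owns may go unnoticed, which could be why small N appears to work.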
Hello everybody,
I'm currently using pycuda and scikit-cuda to parallelize a simple code.
Basically I repeat this structure inside a for loop:
1-matrix/vector product (cublas.cublasDgemv)
2-elementwise division(cumisc.divide)
3-matrix/vector product
4-elementwise division
5-Error calculation
and I leave the loop when the error is small enough (you can see the
code at the end of the mail). I want to calculate the error on the GPU
and check with an if condition whether it is small enough before breaking the
loop. error_dev and error_min_dev are both (1,) arrays, but when I try to
compare them in the if condition, I get the following error:
  File "./lib/Solvers/IPFP_GPU/functionsGPU.py", line 109, in solve_IPFP_simple_gpu
    if(error_dev < error_min_dev):
TypeError: an integer is required
and if I try to access the only element of these arrays:
  File "./lib/Solvers/IPFP_GPU/functionsGPU.py", line 129, in solve_IPFP_simple_gpu
    if(error_dev[0] < error_min_dev[0]):
  File "/home/slegrand/miniconda/lib/python2.7/site-packages/pycuda/gpuarray.py", line 838, in __getitem__
    array_shape = self.shape[array_axis]
IndexError: tuple index out of range
The only solution I found was to use get_async() and compare both
arrays on the CPU, but I guess this is not the best solution. I
wonder whether there is a way to compare these arrays without sending them
back to the CPU.
I also wonder how the for loop is controlled: how are
the iterations synchronized with the GPU calculations?
Thanks for your time!
Best regards,
Simon Legrand
def solve_IPFP_simple_gpu(Mu, Nu, epsilon):
    dtype = np.float64
    mu = np.reshape(Mu.values,(1,np.size(Mu.values)))
    nu = np.reshape(Nu.values,(np.size(Nu.values),1))
    a = np.copy(nu)
    C = quad_cost_matrix(Mu.vertices, Nu.vertices)
    K = np.exp(-C/epsilon).astype(dtype)
    handle = cublas.cublasCreate()
    m = np.shape(K)[0]
    n = np.shape(K)[1]
    alpha = np.float64(1.0)
    beta = np.float64(0.0)
    mu_dev = gpuarray.to_gpu(mu)
    s1_dev = gpuarray.empty(mu.T.shape,dtype)
    nu_dev = gpuarray.to_gpu(nu)
    s2_dev = gpuarray.empty(nu.shape,dtype)
    K_dev = gpuarray.to_gpu(K)
    a_dev = gpuarray.to_gpu(a)
    an_dev = gpuarray.empty(a.shape,dtype)
    b_dev = gpuarray.to_gpu(mu)
    error_min_dev = gpuarray.to_gpu(np.array(1e-3).astype(np.float64))
    niter_max = 1000
    culinalg.init()
    for i in xrange(0, niter_max):
        cublas.cublasDgemv(handle, 't', m, n, alpha, K_dev.gpudata, m,
                           a_dev.gpudata, 1, beta, s1_dev.gpudata, 1)
        b_dev = cumisc.divide(mu_dev,culinalg.transpose(s1_dev))
        cublas.cublasDgemv(handle, 'n', n, m, alpha, K_dev.gpudata, n,
                           b_dev.gpudata, 1, beta, s2_dev.gpudata, 1)
        an_dev = cumisc.divide(nu_dev, s2_dev)
        error_dev = cumisc.divide(cumisc.sum(cumisc.subtract(an_dev,a_dev)),
                                  cumisc.sum(a_dev))
        a_dev = an_dev
        print(error_dev.get_async(), error_min_dev.get_async())
        if(error_dev < error_min_dev):
            break
    a = a_dev.get()
    b = b_dev.get()
    psi = np.reshape(epsilon*np.log(a),(np.size(a),))
    phi = np.reshape(epsilon*np.log(b),(np.size(b),))
    Gamma = K*a*b
    cublas.cublasDestroy(handle)
    return Gamma, phi, psi
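
A Python if statement needs a boolean on the host, so breaking out of the loop requires copying at least one scalar back from the device at some point. The sketch below is one hedged way to limit that cost: it copies the error back only every few iterations. The names iterate_until_converged, check_every and HostScalar are hypothetical, and HostScalar merely stands in for a device array with a .get() method so that the example runs without a GPU. Separately, error_min_dev is built from np.array(1e-3), which is zero-dimensional (shape () rather than (1,)); indexing a zero-dimensional array with [0] would be consistent with the "tuple index out of range" above.

```python
def iterate_until_converged(step, tolerance, max_iter, check_every=10):
    """Call step() repeatedly; step() returns a device scalar exposing .get()."""
    iterations_done = 0
    error = None
    for i in range(max_iter):
        error_dev = step()  # with PyCUDA this would typically only queue GPU work
        iterations_done = i + 1
        if iterations_done % check_every == 0 or iterations_done == max_iter:
            # .get() is a blocking device-to-host copy, so the host waits here
            # until the queued GPU work that produced error_dev has finished.
            error = float(error_dev.get())
            if abs(error) < tolerance:
                break
    return iterations_done, error


class HostScalar(object):
    """Stand-in for a device scalar so the sketch runs without a GPU."""

    def __init__(self, value):
        self.value = value

    def get(self):
        return self.value


state = {"error": 1.0}


def fake_step():
    # Pretend each iteration halves the error.
    state["error"] *= 0.5
    return HostScalar(state["error"])


print(iterate_until_converged(fake_step, 1e-3, 1000))
```

On the synchronization question: GPU work issued from the host is generally queued and executed in order, and the host loop can run ahead of the device until it reaches a blocking call such as get(). Checking less often trades a few extra iterations for fewer stalls.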
I know that PyCUDA runs on the driver API, so it might be a little different,
but I am looking for the equivalent of cudaGraphicsResourceGetMappedPointer().
I am manipulating an array in CUDA and need to point the PBO to it so that it
can be used in interop between OpenGL and PyCUDA. It looks like
pycuda.gl.RegisteredMapping might be what I need, but I am unsure.
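
For orientation, a rough sketch of how the mapping objects in pycuda.gl fit together, assuming a pycuda.gl.RegisteredBuffer has already been created for the PBO inside a GL-enabled CUDA context. The function name copy_device_array_to_pbo and its memcpy_dtod parameter are hypothetical; the parameter exists only so the logic can be exercised without a GPU. Note that this copies the data into the PBO's mapped memory rather than re-pointing the PBO at the existing array.

```python
def copy_device_array_to_pbo(registered_buffer, src_ptr, nbytes, memcpy_dtod=None):
    """Copy nbytes of device memory at src_ptr into a registered GL buffer.

    registered_buffer is expected to behave like pycuda.gl.RegisteredBuffer:
    map() returns a mapping with device_ptr_and_size() and unmap().
    """
    if memcpy_dtod is None:
        # Imported lazily so this sketch can be loaded without PyCUDA installed.
        import pycuda.driver as drv
        memcpy_dtod = drv.memcpy_dtod

    mapping = registered_buffer.map()
    try:
        # Driver-API counterpart of asking for the mapped pointer and its size.
        dst_ptr, size = mapping.device_ptr_and_size()
        if nbytes > size:
            raise ValueError("PBO holds %d bytes, need %d" % (size, nbytes))
        memcpy_dtod(dst_ptr, src_ptr, nbytes)
    finally:
        # Unmap so that OpenGL may use the buffer again.
        mapping.unmap()
```

With real objects, src_ptr could be the gpudata attribute of a GPUArray and nbytes its nbytes attribute. The buffer has to be unmapped before OpenGL draws from it, which is why the unmap call sits in a finally block.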