[PyCUDA] failed test_gpuarray on GTX480
bogomips at post.pl
Sun Oct 3 10:20:23 PDT 2010
On Tue, 2010-09-28 at 23:51 -0400, Andreas Kloeckner wrote:
> On Tue, 28 Sep 2010 23:56:47 +0200, Tomasz Rybak <bogomips at post.pl> wrote:
> > I have an idea for (maybe) checking whether the problem is with PyCUDA,
> > the CUDA toolkit, or the driver.
> > Can you force PyCUDA to generate sm_1x code instead of sm_20?
> > I have found that it is determined in line 190 of file
> > pycuda/compiler.py:
> > arch = "sm_%d%d" % Context.get_device().compute_capability()
> > Try to change it to
> > arch = "sm_10"
> > and so on, and check whether you get the incorrect 14 in that
> > case.
> > If there is a simpler way of changing the architecture for which
> > PyCUDA generates code, feel free to use it and share this
> > information.
> arch can be overridden from the SourceModule arguments:
Yes, but the code in this thread calls GPUArray.dot,
which in turn calls ReductionKernel, and in neither of them
have I seen a way to pass an arch='sm_10' argument.
I have checked and
dot_ab_gpu = gpuarray.dot(a_gpu, b_gpu, arch='sm_11').get()
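
For reference, the default-versus-override choice discussed in this thread can be sketched without a GPU. This is only an illustration: the helper name pick_arch is hypothetical and not part of PyCUDA; it simply mirrors the format string quoted above from pycuda/compiler.py and assumes the compute capability is given as a (major, minor) tuple.

```python
def pick_arch(compute_capability, override=None):
    """Return an nvcc architecture string such as 'sm_20'.

    Hypothetical helper, not a PyCUDA function. `compute_capability`
    is assumed to be a (major, minor) tuple; `override`, when given,
    wins over the value derived from the device.
    """
    if override is not None:
        return override
    major, minor = compute_capability
    return "sm_%d%d" % (major, minor)


# A GTX 480 reports compute capability (2, 0).
default_arch = pick_arch((2, 0))
forced_arch = pick_arch((2, 0), override="sm_10")
print(default_arch, forced_arch)
```

With SourceModule the override can be passed as the arch keyword argument; the point above is that the reduction path used by dot does not appear to forward such an argument.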
Tomasz Rybak <bogomips at post.pl> GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A 488E C654 FB33 2AD5 9860