[PyCUDA] Inconsistent results between numpy and gpuarray dot products

Andreas Kloeckner lists at informa.tiker.net
Fri Jan 27 08:23:58 PST 2012


Hi Bogdan,

On Fri, 27 Jan 2012 13:27:04 +1100, Bogdan Opanchuk <mantihor at gmail.com> wrote:
> Hello,
> 
> As far as I understand, this is somehow connected with pycuda::complex
> having a default constructor. When you write (in the reduction kernel):
> 
> __shared__ out_type sdata[BLOCK_SIZE];
> 
> and "out_type" has a default constructor, it is called from _every_
> thread for _every_ value of the array. The reduction kernel starts
> like this:
> 
> __shared__ out_type sdata[BLOCK_SIZE];
> ... collecting values ...
> sdata[tid] = <some value>
> 
> which means that there is a race between the default constructor and
> actual initialization. I am not sure why nvcc does not complain about
> this (bug?), but the following things do help:
> 1. Using "extern __shared__ out_type sdata[]" and setting the size of
> shared memory when preparing the kernel.
> or
> 2. Putting "__syncthreads()" between the default initialization and the
> actual initialization (not very good, since it leaves all those calls to
> the default constructor, but it still removes the symptom).
> 
> I googled a bit, but could not find any actual rules about default
> constructors and shared memory.

Thanks for this insight! I don't think this would've occurred to me just
from staring at the code. :) Indeed, inserting __syncthreads() after the
shared array declaration brings the error down to more reasonable values
for me. Jesse, my recommendation would be to use that as a workaround
while we figure out a more permanent fix.
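
For comparison, the first option in the quoted message (dynamically sized
shared memory) might look roughly like the sketch below. This is not the
PyCUDA reduction kernel. The names cplx, block_sum and gpu_sum are
hypothetical, cplx is only a stand-in for a type with a non-trivial default
constructor, and the shared buffer is declared as raw bytes and cast, which
is a variation on "extern __shared__ out_type sdata[]" chosen here so that
no constructor can run on the shared array. Error checking is omitted.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for a complex type with a non-trivial default
// constructor (an assumption for illustration, not pycuda::complex itself).
struct cplx {
    float re, im;
    __host__ __device__ cplx() : re(0.f), im(0.f) {}
    __host__ __device__ cplx(float r, float i) : re(r), im(i) {}
};

// Each block sums blockDim.x elements into out[blockIdx.x].
// Assumes blockDim.x is a power of two.
__global__ void block_sum(const cplx *in, cplx *out, unsigned int n)
{
    // Dynamically sized shared memory: the size comes from the launch
    // configuration, and no default constructor is called on it.
    extern __shared__ unsigned char smem_raw[];
    cplx *sdata = reinterpret_cast<cplx *>(smem_raw);

    unsigned int tid = threadIdx.x;
    unsigned int i = blockIdx.x * blockDim.x + tid;

    // Every thread writes only its own slot, so there is no competing write.
    sdata[tid] = (i < n) ? in[i] : cplx(0.f, 0.f);
    __syncthreads();

    // Standard tree reduction within the block.
    for (unsigned int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) {
            sdata[tid].re += sdata[tid + s].re;
            sdata[tid].im += sdata[tid + s].im;
        }
        __syncthreads();
    }

    if (tid == 0)
        out[blockIdx.x] = sdata[0];
}

// Host helper: reduces per block on the GPU, then sums the partial
// results on the CPU.
cplx gpu_sum(const std::vector<cplx> &host_in)
{
    cplx total;  // starts at (0, 0)
    if (host_in.empty())
        return total;

    const unsigned int threads = 256;  // power of two, see kernel
    const unsigned int n = static_cast<unsigned int>(host_in.size());
    const unsigned int blocks = (n + threads - 1) / threads;

    cplx *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(cplx));
    cudaMalloc(&d_out, blocks * sizeof(cplx));
    cudaMemcpy(d_in, host_in.data(), n * sizeof(cplx),
               cudaMemcpyHostToDevice);

    // Third launch parameter: bytes of dynamic shared memory per block.
    block_sum<<<blocks, threads, threads * sizeof(cplx)>>>(d_in, d_out, n);

    std::vector<cplx> partial(blocks);
    cudaMemcpy(partial.data(), d_out, blocks * sizeof(cplx),
               cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);

    for (const cplx &p : partial) {
        total.re += p.re;
        total.im += p.im;
    }
    return total;
}
```

Calling gpu_sum on a std::vector<cplx> from a small main() and comparing
against a plain CPU loop is one way to check whether the discrepancy goes
away with this kind of declaration.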

I just searched as well, but couldn't find anything. I've pinged Nathan
Bell at Nvidia (coauthor of thrust) to see what he thinks about this.

Andreas
