[PyOpenCL] Curious Nvidia CL bug

Andreas Kloeckner lists at informa.tiker.net
Mon Jan 16 17:11:59 PST 2012


Hi all,

Here's what I think is a bug in the Nvidia CL implementation. The
assertion below fails on my GTX 260, but passes just about anywhere else.

8< --------------------------------------------------------
#! /usr/bin/env python

import pyopencl as cl
import pyopencl.array
import numpy as np


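ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

# one element past a power of two; the failure depends on this size
n = 2**20 + 1
dtype = np.int32
host_data = np.random.randint(0, 10, n).astype(dtype)

# copy host -> device, then device -> host
dev_data = cl.array.to_device(queue, host_data)
host_data_2 = dev_data.get()

# the round trip should return exactly what was sent
assert (host_data == host_data_2).all()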
8< --------------------------------------------------------

Yes, you read that right: it seems they messed up a plain transfer of a
bit of memory to the device and back. As is to be expected, the bug is
very sensitive to the size (n).
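
To see which sizes are affected, something like the following rough
sketch might help. It repeats the round trip for a range of n and records
where the first mismatch shows up. The helper names (find_bad_sizes,
make_pyopencl_roundtrip) and the fixed seed are made up for illustration;
only the to_device()/get() round trip is taken from the script above.

```python
import numpy as np


def make_pyopencl_roundtrip():
    """Return a function that copies an array to the device and back.

    Needs a working OpenCL installation; the import is deferred so the
    rest of this sketch can be used without one.
    """
    import pyopencl as cl
    import pyopencl.array  # noqa: F401 (makes cl.array available)

    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)

    def roundtrip(host_data):
        return cl.array.to_device(queue, host_data).get()

    return roundtrip


def find_bad_sizes(roundtrip, sizes, dtype=np.int32, seed=0):
    """Return (n, first_bad_index, mismatch_count) for each failing size."""
    rng = np.random.RandomState(seed)  # fixed seed: an arbitrary choice
    bad = []
    for n in sizes:
        host_data = rng.randint(0, 10, n).astype(dtype)
        result = roundtrip(host_data)
        mismatches = np.flatnonzero(host_data != result)
        if mismatches.size:
            bad.append((n, int(mismatches[0]), int(mismatches.size)))
    return bad


# Usage on a machine with OpenCL (not executed here):
#   rt = make_pyopencl_roundtrip()
#   print(find_bad_sizes(rt, range(2**20 - 4, 2**20 + 5)))
```

An empty list means every size in the range survived the round trip. The
index of the first mismatch may hint at where in the buffer the
corruption starts, though that is only a guess.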

Infuriatingly, this happened in the unit test for parallel scan, so for
a very long time I hunted for a scan bug that didn't exist. :(

I hope I'm not crazy. In any case, I thought this might be important
enough to warn you guys about. It happens for me with the 290.x and 295.x
drivers on a GTX 260. No problem on Fermi.
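
If you try this on your own hardware, it would help to report the device
and driver version along with the result. A minimal sketch, assuming the
usual PyOpenCL platform and device attributes (name, driver_version); the
describe_devices helper itself is hypothetical:

```python
def describe_devices(platforms):
    """Return one 'platform | device | driver' line per device."""
    lines = []
    for platform in platforms:
        for device in platform.get_devices():
            lines.append("%s | %s | driver %s"
                         % (platform.name, device.name,
                            device.driver_version))
    return lines


# Usage on a machine with OpenCL (not executed here):
#   import pyopencl as cl
#   print("\n".join(describe_devices(cl.get_platforms())))
```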

Andreas