I've tested your code on an AMD APU: running it on the GPU does
not crash, but running it on the CPU does.
It is not related to memory transfer: the crash occurs even
when I comment out cl.enqueue_copy.
OTOH, the code does not crash when I comment out the kernel call,
or when I comment out a line in the kernel.
Most likely the code in your kernel accesses memory outside the
allocated buffers, and that is what causes the crashes.
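A common cause of CPU-only crashes like this is work items beyond the padded global size writing out of bounds. A generic guard sketch (device-side OpenCL C; the kernel name and arguments are illustrative, not taken from the original code):

```c
// OpenCL C fragment (illustrative): check the work-item id against the
// real problem size before touching global memory.
__kernel void simulate(__global float *result, const int n_particles)
{
    int gid = get_global_id(0);
    if (gid >= n_particles)
        return;  // padded work items must not write out of bounds

    result[3*gid + 0] = 0.0f;
    result[3*gid + 1] = 0.0f;
    result[3*gid + 2] = 0.0f;
}
```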
Tomasz Rybak GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A 488E C654 FB33 2AD5 9860
After fiddling with my (rudiments of a) particle simulation further, I ran
into some interesting bugs. Here's what the code is supposed to do:
* Set up an array named position of shape (number of particles, 3)
* Set up an array result of shape (smallest number >= number of particles
that is a multiple of the work-group size, 3)
* Run the kernel which, for every global_id < number of particles,
calculates the average of the positions of all other particles and writes
it into the result buffer, and writes the zero vector for every other
global_id
* Copy the result buffer into the result array
* Print the result array
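The padded size in the second step can be computed with a small helper (my own sketch; the name round_up is not from the original code):

```python
def round_up(n, multiple):
    """Smallest integer >= n that is a multiple of `multiple`."""
    return ((n + multiple - 1) // multiple) * multiple

# e.g. 1000 particles with a work-group size of 64 pad out to 1024
print(round_up(1000, 64))
```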
What it actually *does* do is segfault when trying to print(result), or
print an array mostly full of nan followed by "*** Error in `python':
free(): invalid next size (normal): 0x0000000001c6a000 ***" and a memory
map (though I think I might have just fixed that by .wait()ing for the
event returned by the kernel invocation; I assume that means I tried
to print() the array before the kernel had finished, i.e. that I copied an
as-yet unwritten buffer into the array?).
Again, thanks in advance and for your patience.
Sebastian Hoffmann <null.s.hoffmann(a)googlemail.com> writes:
> I've just finished writing my first pyopencl program. While the OpenCL C
> part build()s, calling it fails with a TypeError in pyopencl/__init__.py
> I'm trying to run this on an Ubuntu 13.10 saucy with the python-pyopencl
> 2013.1~git20130613-1 provided in the Ubuntu repo (I've tried installing the
> 2013.2 version via pip, but run into compile errors due to it not finding
> CL/cl.h in /opt/AMDAPP/include/; how can I tell pip that it should make the
> compiler look there?).
> Here are call, traceback and error:
> program.simulate(queue, instances, None,
> Traceback (most recent call last):
> File "./particle_sim.py", line 88, in <module>
> File "./particle_sim.py", line 65, in simulate
> File "/usr/lib/python2.7/dist-packages/pyopencl/__init__.py", line 567,
> in kernel_call
> global_offset, wait_for, g_times_l=g_times_l)
> TypeError: object of type 'int' has no len()
> The full code is in the attachment for error-reproduction (and, as I've
> learned in the meantime, as an example on how not to do pairwise
> computation of particle influences). Thanks in advance.
program.simulate(queue, (instances,), None, ...)

The global ("grid") size needs to be a tuple with one entry per dimension,
not a plain int - hence the TypeError about an 'int' having no len().
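The error message itself points at the problem: the number of dimensions is derived via len() of the global size, and len() fails on a plain int (a quick illustration; `instances` stands in for the original variable):

```python
instances = 1024

# a bare int has no length...
try:
    len(instances)
except TypeError as e:
    print(e)              # object of type 'int' has no len()

# ...but a 1-tuple does, which is how the grid dimension is determined
print(len((instances,)))  # 1
```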
First, I wish to thank you for developing and maintaining the PyOpenCL
library, which is a great tool.
I'm using it for prototyping code for a research article.
I stumbled upon a problem in the past few days.
Specifically, I was doing a cl.array / scalar division when I noticed in
the array.py __div__ source code that the case scalar == 1 is optimized.
My test code is:
import numpy
a = numpy.array([1, 2, 3, 4]).astype(numpy.float32)
a_gpu = cl.array.to_device(queue, a)
b = float(1)
c_gpu = a_gpu / b
At the end, the array c_gpu ends up being an alias of a_gpu, which was a
very bad surprise.
I was expecting the '/' operator to allocate a new array and leave the
arguments untouched.
Actually the surprise came later in the code, when I modified the contents
of the c_gpu array (using a kernel) and noticed a_gpu was modified too.
This is a very nasty side effect.
When inspecting the array.py __div__ source code, I found out that these
cases were purposely optimized out:
def __div__(self, other):
    """Divides an array by an array or a scalar, i.e. ``self / other``."""
    if isinstance(other, Array):
        result = self._new_like_me(
                _get_common_dtype(self, other, self.queue))
        self._div(result, self, other)
    else:
        if other == 1:
            return self  # the scalar == 1 case returns an alias of self
        else:
            # create a new array for the result
            common_dtype = _get_common_dtype(self, other, self.queue)
            result = self._new_like_me(common_dtype)
            self._axpbz(result,
                    common_dtype.type(1/other), self, self.dtype.type(0))
    return result
Now I'm questioning: is this behaviour intended? For me the consequence of
the side effect was disastrous, but maybe I'm missing something.
Note that when working with pure numpy arrays, the result is always a new
copy.
I'm using the regular __div__ operator, not __idiv__, which I understand
is the in-place modifier.
I noticed a similar optimization for the neutral elements in the pyopencl
array __add__ and __sub__ operators.
One might ask why I am dividing by '1' in the first place, but the '1'
actually comes out as the result of a previous calculation.
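For comparison, numpy's scalar division allocates a fresh array even when the scalar is 1, so mutating the result never touches the operand (a quick pure-numpy check):

```python
import numpy as np

a = np.array([1, 2, 3, 4], dtype=np.float32)
c = a / 1.0        # numpy allocates a new array here, even for 1
c[0] = 99.0        # mutating the result...
print(c is a)      # False
print(a[0])        # ...leaves the original untouched: 1.0
```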
Thank you very much,
Is there a particular reason why when I declare a read-only image as
input to my kernel, I have to make the input data buffer read-write?
Attached is a test case; you can find a copy online here:
And here is the output; it tries various buffer types:
ERR: <type 'str'> buffer fails: Cannot use string as modifiable buffer
OK: <class 'ctypes.c_char_Array_65536'> buffer
OK: <type 'buffer'> buffer
ERR: <type 'buffer'> buffer fails: buffer is read-only
Since the memory is only ever used as input, I would much prefer keeping
it in a read-only buffer.
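The distinction being probed above is whether the host object exposes a writable buffer. A rough modern illustration using memoryview (the original test case used Python 2 str/buffer/ctypes objects):

```python
import ctypes

data = b"\x00" * 16                      # bytes: exposes a read-only buffer
print(memoryview(data).readonly)         # True

arr = ctypes.create_string_buffer(16)    # ctypes array: writable buffer
print(memoryview(arr).readonly)          # False

ba = bytearray(16)                       # bytearray: writable buffer
print(memoryview(ba).readonly)           # False
```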
This is a bit inconvenient for us, as PyOpenCL is just one of many data
consumers we have, and I've had to make all the memory buffers
read-write, which potentially exposes us to more bugs.
Is there a known workaround, or anything we can do to not require