Today I received notification that the git repository at
https://git.tiker.net/trees/pyopencl.git contains an older
version of PyOpenCL (currently 2019.1.1), while the one at GitHub
(https://github.com/inducer/pyopencl/tags) has newer
commits and newer version tags (currently 2020.2.2). Is it just a lag,
or does development now happen only on GitHub, leaving the git.tiker.net repository dead?
Until now the Debian packages have been checking the git.tiker.net repository for the latest
version. Should I change the URL to GitHub? I do not mind using GitHub
(although there were some voices inside Debian arguing against that), but
before applying this change I'd like to understand the situation fully. The
same goes for PyCUDA and pytools.
Tomasz Rybak, Debian Developer <serpent(a)debian.org>
GPG: A565 CE64 F866 A258 4DDC F9C7 ECB7 3E37 E887 AA8C
Dear OpenCL users,
This issue is not directly related to pyopencl but rather to nvidia's OpenCL driver.
I discovered that support for OpenCL varies depending on the underlying hardware.
A typical example is `enqueue_fill_buffer` from OpenCL, which is used for initialisation in pyopencl.array.
This function is properly exposed on recent hardware (Kepler and newer)
but not on older Fermi cards (and we still have a cluster full of Tesla
cards).
In [7]: ary = pyopencl.array.zeros(queue, (10,10), "float32")
LogicError Traceback (most recent call last)
<ipython-input-7-d85341757b00> in <module>()
----> 1 ary = pyopencl.array.zeros(queue, (10,10), "float32")
/usr/lib/python3/dist-packages/pyopencl/array.py in zeros(queue, shape, dtype, order, allocator)
1973 result = Array(queue, shape, dtype,
1974 order=order, allocator=allocator)
-> 1975 result._zero_fill()
1976 return result
/usr/lib/python3/dist-packages/pyopencl/array.py in _zero_fill(self, queue, wait_for)
1191 cl.enqueue_fill_buffer(queue, self.base_data, np.int8(0),
-> 1192 self.offset, self.nbytes, wait_for=wait_for))
1194 zero = np.zeros((), self.dtype)
/usr/lib/python3/dist-packages/pyopencl/__init__.py in enqueue_fill_buffer(queue, mem, pattern, offset, size, wait_for)
1850 pattern = np.asarray(pattern)
-> 1852 return _cl._enqueue_fill_buffer(queue, mem, pattern, offset, size, wait_for)
1854 # }}}
LogicError: clEnqueueFillBuffer failed: INVALID_OPERATION
The same "bug" occurs in the PoCL driver when addressing nvidia GPUs,
since the corresponding low-level primitive is absent in NVVM.
I wonder whether we should best address this issue within our code, or
whether it could be addressed at a higher level. Expecting nvidia to fix
their driver to conform to the specification is an illusion. But does it
make sense to address this as part of pyopencl?
If so, I am willing to contribute a patch.
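One possible user-side workaround (a sketch of my own, not pyopencl's actual behaviour; the function name and fallback strategy are assumptions) is to catch the failure and fall back to copying an explicitly zeroed host array:

```python
import numpy as np

def zero_fill(queue, buf, nbytes, offset=0):
    """Zero `nbytes` of `buf`, falling back to a host-side copy when
    clEnqueueFillBuffer is unsupported (e.g. Fermi-era NVIDIA drivers)."""
    import pyopencl as cl  # deferred so this file imports without a GPU present
    try:
        cl.enqueue_fill_buffer(queue, buf, np.int8(0), offset, nbytes)
    except cl.LogicError:
        # Fallback: transfer a zeroed host array into the device buffer.
        cl.enqueue_copy(queue, buf, np.zeros(nbytes, dtype=np.int8),
                        device_offset=offset)
```

The fallback costs a host-to-device transfer, so it is slower than a native fill, but it keeps the same observable result.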
Thanks for your advice,
I’m interested in using PyOpenCL with a Bittware 520MX board that has an Intel Stratix 10 with HBM memory. This shows up as 32 banks of 256 MB rather than as a single DDR memory system.
In order to use it, you must specify the flag CL_MEM_HETEROGENEOUS_INTELFPGA in calls to clCreateBuffer.
This seems like an easy addition to PyOpenCL, and I wonder if anyone has done it already?
Otherwise, I suppose I need to get the PyOpenCL source tree, add the flag and use that rather than installing with pip? So far I haven't found instructions for how to install from source.
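For what it's worth, pyopencl's mem_flags are plain integers, so a vendor extension bit can often be OR'd in without rebuilding from source. A sketch under that assumption; the flag's numeric value must be taken from the vendor's cl_ext_intelfpga.h header (deliberately not hard-coded here, since I have not verified it):

```python
def create_fpga_buffer(ctx, size, intelfpga_flag):
    """Sketch: pass a vendor extension bit alongside standard mem_flags.
    `intelfpga_flag` must be the numeric value of CL_MEM_HETEROGENEOUS_INTELFPGA
    taken from the Intel FPGA SDK headers (an assumption, not verified here)."""
    import pyopencl as cl  # deferred import so this snippet parses anywhere
    # Buffer flags are a plain bitmask, so the vendor bit can simply be OR'd in.
    return cl.Buffer(ctx, cl.mem_flags.READ_WRITE | intelfpga_flag, size=size)
```

Whether the driver accepts an unknown-to-pyopencl bit this way is untested on the 520MX, but it avoids patching the C++ layer.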
Dear PyOpenCL community,
I am facing a bug where programs crash randomly when pyopencl is used
in combination with matplotlib displaying results.
If one or the other is not used, the program runs fine.
Apparently it is the Python garbage collector which destroys C++
NannyEvents, but matplotlib is also in the stack (which I don't
understand). I tested on Debian 10, and on Debian 10 with backports from sid.
Within a virtualenv with everything up to date, I do not manage to reproduce the bug.
It has probably been corrected somewhere, but do you have the reference of the bug?
Thanks for your help,
I am trying to make a class that would be passable to Kernel's set_arg()
and otherwise behave like a Buffer (it will have a real pyopencl.Buffer
inside). Has someone done that before? I am having a hard time going
through the C++/generated code in pyopencl, and still don't understand what
exactly is required from such a class. The docs for Kernel.set_arg() list a
number of types that it accepts, but these are all defined at the C level
and are not very easy to subclass.
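Lacking duck typing in set_arg(), one workaround (my own sketch; the class and helper names are hypothetical) is to wrap the real Buffer and unwrap it just before the call:

```python
class BufferWrapper:
    """Wraps a real pyopencl.Buffer and delegates attribute access to it."""

    def __init__(self, buf):
        self.buf = buf

    def __getattr__(self, name):
        # Forward everything else (size, int_ptr, ...) to the wrapped Buffer.
        return getattr(self.buf, name)


def set_kernel_arg(kernel, index, arg):
    """Unwrap BufferWrapper instances before handing them to set_arg()."""
    if isinstance(arg, BufferWrapper):
        arg = arg.buf
    kernel.set_arg(index, arg)
```

This keeps the wrapper usable wherever a Buffer's attributes are read, at the cost of routing all kernel-argument setting through the helper.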
A question to Andreas: in PyCUDA, it is enough for a kernel argument to
just have a `.gpudata` attribute that is convertible to integer - would it
be possible to do the same in PyOpenCL (at least the first part, where
set_arg() would use duck typing and look for a specific attribute)?
Yves Surrel <yves(a)surrel.org> writes:
> Hi Andreas
> Before logging an issue, I just want to know whether I am simply stupid…
> AttributeError: module 'pyopencl' has no attribute 'enqueue_task'. But it is in the doc at the same level as pyopencl.enqueue_nd_range_kernel
> So what’s wrong (it does not appear in __init__.py) ?
It's documented to be equivalent to
enqueue_nd_range_kernel with unit grid size. Commit
removed it, but (by the looks of it) left the docs in place. Do you see
a good reason to keep/reintroduce it?
removes the docs.
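Since the docs describe it as equivalent to enqueue_nd_range_kernel with a unit grid, a user-side shim is straightforward (a sketch; argument handling is simplified and the name merely mirrors the removed function):

```python
def enqueue_task(queue, kernel, wait_for=None):
    """User-side replacement for the removed pyopencl.enqueue_task:
    an NDRange launch with a single work-item grid."""
    import pyopencl as cl  # deferred import
    return cl.enqueue_nd_range_kernel(queue, kernel, (1,), (1,),
                                      wait_for=wait_for)
```

So existing callers can be migrated with a one-line helper rather than reintroducing the API.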
I have an old macbook that has a discrete GeForce on it, and have run into
the following problem. The simplified example is here:
Basically, what I'm trying to do:
- create a two-device context
- create a buffer
- split it into two subregions to use on each device
- run a kernel on each device in parallel working with the corresponding subregion
- get the result back on the host
(the expected result is [0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7])
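The steps above can be sketched roughly like this (my own illustrative code, not the poster's actual program; the kernel source and names are assumptions):

```python
import numpy as np

KERNEL_SRC = """
__kernel void fill(__global int *a) { a[get_global_id(0)] = get_global_id(0); }
"""

def run_split(ctx):
    """One buffer, one sub-buffer and one queue per device, then gather."""
    import pyopencl as cl  # deferred import
    queues = [cl.CommandQueue(ctx, d) for d in ctx.devices[:2]]
    host = np.zeros(16, dtype=np.int32)
    # ALLOC_HOST_PTR: reportedly needed to avoid the sub-buffer crash on nVidia
    buf = cl.Buffer(ctx, cl.mem_flags.READ_WRITE | cl.mem_flags.ALLOC_HOST_PTR,
                    size=host.nbytes)
    half = host.nbytes // 2
    subs = [buf.get_sub_region(0, half), buf.get_sub_region(half, half)]
    prg = cl.Program(ctx, KERNEL_SRC).build()
    # One launch per device, each on its own half of the buffer.
    evts = [prg.fill(q, (8,), None, s) for q, s in zip(queues, subs)]
    cl.enqueue_copy(queues[0], host, buf, wait_for=evts)
    return host  # expected per the poster: [0..7, 0..7]
```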
First, it turned out that if the context includes an nVidia card, the
Buffer must be created with the cl.mem_flags.ALLOC_HOST_PTR
flag; otherwise, if one uses its subregion in a kernel, the program crashes.
If the context is created on a CPU + Iris Pro (the other two devices
available), everything works fine without this flag, giving the expected result.
After fixing that, the program finishes without crashing when run on a CPU
+ GeForce or Iris Pro + GeForce context, but the result is [0 1 2 3 4 5 6 7
0 0 0 0 0 0 0 0] - that is, the second kernel (on the GeForce device)
either did not run, or its changes to the subregion were not incorporated
into the whole buffer. Uncommenting the explicit migration at the end does
not help either. Does anyone know what I'm missing here? Or is it an nVidia driver bug?
I installed pyopencl using apt on Ubuntu 16.04, but it installs an older
version (which doesn't work when CUDA is installed):
python-pyopencl/xenial 2015.1-2build3 amd64
Python module to access OpenCL parallel computation API
I was wondering if this package will be updated to install a newer pyopencl.
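One common workaround in the meantime (assuming pip is acceptable on your system) is to install a current release from PyPI instead of the distro package:

```shell
# Install a recent PyOpenCL from PyPI for the current user only,
# leaving the apt-managed python-pyopencl package untouched.
python3 -m pip install --user --upgrade pyopencl
```

The --user install shadows the older apt version on sys.path without touching files owned by dpkg.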