Hello.
Today I received a notification
(https://salsa.debian.org/opencl-team/python-pyopencl/-/merge_requests/1)
that the git repository at https://git.tiker.net/trees/pyopencl.git
contains an old version of PyOpenCL (currently 2019.1.1), while the one
on GitHub (https://github.com/inducer/pyopencl/tags) has newer commits
and newer version tags (currently 2020.2.2). Is this just an omission,
or does development now happen only on GitHub, with the tiker.net
repository dead? Until now the Debian packages have checked the
tiker.net repository for the latest version. Should I change the URL to
GitHub? I do not mind using GitHub (although some voices inside Debian
have argued against that), but before applying this change I'd like to
understand the situation fully. The same goes for PyCUDA and pytools.
Best regards.
--
Tomasz Rybak, Debian Developer <serpent(a)debian.org>
GPG: A565 CE64 F866 A258 4DDC F9C7 ECB7 3E37 E887 AA8C
Dear OpenCL users,
This issue is not directly related to pyopencl but rather to nvidia's OpenCL driver.
I discovered that support for OpenCL varies depending on the underlying hardware.
A typical example is `enqueue_fill_buffer`, which pyopencl.array uses for initialisation.
This function is properly exposed on recent hardware (Kepler and newer)
but not on older Fermi cards (and we still have a cluster full of Tesla
M2075):
In [7]: ary = pyopencl.array.zeros(queue, (10,10), "float32")
---------------------------------------------------------------------------
LogicError Traceback (most recent call last)
<ipython-input-7-d85341757b00> in <module>()
----> 1 ary = pyopencl.array.zeros(queue, (10,10), "float32")
/usr/lib/python3/dist-packages/pyopencl/array.py in zeros(queue, shape, dtype, order, allocator)
1973 result = Array(queue, shape, dtype,
1974 order=order, allocator=allocator)
-> 1975 result._zero_fill()
1976 return result
1977
/usr/lib/python3/dist-packages/pyopencl/array.py in _zero_fill(self, queue, wait_for)
1190 self.add_event(
1191 cl.enqueue_fill_buffer(queue, self.base_data, np.int8(0),
-> 1192 self.offset, self.nbytes, wait_for=wait_for))
1193 else:
1194 zero = np.zeros((), self.dtype)
/usr/lib/python3/dist-packages/pyopencl/__init__.py in enqueue_fill_buffer(queue, mem, pattern, offset, size, wait_for)
1850 pattern = np.asarray(pattern)
1851
-> 1852 return _cl._enqueue_fill_buffer(queue, mem, pattern, offset, size, wait_for)
1853
1854 # }}}
LogicError: clEnqueueFillBuffer failed: INVALID_OPERATION
The same "bug" occurs in the PoCL driver when targeting nvidia GPUs,
since the corresponding low-level primitive is absent in NVVM.
I wonder whether we should address this issue within our own code or
whether it could be addressed at a higher level. Expecting nvidia to fix
their code to conform to the specification is an illusion. But does it
make sense to address this as part of pyopencl?
If so, I am willing to contribute a patch.
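For illustration, a workaround in user code could try the fill and fall back to uploading zeros from the host when the driver refuses. This is a minimal sketch, assuming the failure surfaces as `LogicError` as in the traceback above. `zero_fill_with_fallback` is a hypothetical helper, not part of pyopencl; it only zeroes a buffer from offset 0, and its `cl` parameter exists only so that a stand-in module can be passed for testing.

```python
import numpy as np


def zero_fill_with_fallback(queue, buf, nbytes, cl=None):
    """Zero the first `nbytes` bytes of `buf` (hypothetical helper)."""
    if cl is None:
        import pyopencl as cl  # imported lazily so a stand-in can be passed

    try:
        # Preferred path: relies on clEnqueueFillBuffer (OpenCL 1.2).
        return cl.enqueue_fill_buffer(queue, buf, np.int8(0), 0, nbytes)
    except cl.LogicError:
        # Fallback for drivers that reject the fill: copy zeros from the host.
        # enqueue_copy blocks by default for host transfers, so `zeros`
        # stays alive for the duration of the copy.
        zeros = np.zeros(nbytes, dtype=np.int8)
        return cl.enqueue_copy(queue, buf, zeros)
```

The fallback costs a host-to-device transfer, so it is only worth taking when the fill really fails.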
Thanks for your advice,
Cheers,
Jerome
I’m interested in using PyOpenCL with a Bittware 520MX board that has an Intel Stratix 10 with HBM memory. This shows up as 32 banks of 256 MB rather than as a single DDR memory system.
In order to use it, you must specify the flag CL_MEM_HETEROGENEOUS_INTELFPGA in calls to clCreateBuffer.
This seems like an easy addition to PyOpenCL, and I wonder if anyone has done it already?
Otherwise, I suppose I need to get a PyOpenCL source kit, add the flag and use that rather than installing with pip? So far I haven’t found instructions for how to install from source.
Thanks!
-Larry
Dear PyOpenCL community,
I am facing a bug where programs crash randomly when pyopencl is used
in combination with matplotlib to display results.
If one or the other is not used, the program runs fine.
Apparently it is the Python garbage collector that destroys C++
NannyEvents, but matplotlib is also in the stack (I don't understand
why).
https://github.com/silx-kit/pyFAI/issues/1341
I tested on Debian 10 and on Debian 10 with backports from sid.
Within a virtualenv with everything up to date, I cannot reproduce the bug.
It has probably been fixed somewhere, but do you have a reference for the bug?
Thanks for your help,
Jérôme
I am trying to make a class that would be passable to Kernel's set_arg()
and otherwise behave like a Buffer (it will have a real pyopencl.Buffer
inside). Has someone done that before? I am having a hard time going
through the C++/generated code in pyopencl, and still don't understand what
exactly is required from such a class. The docs for Kernel.set_arg() list a
number of types that it accepts, but these are all defined at the C level
and are not very easy to subclass.
A question to Andreas: in PyCUDA, it is enough for a kernel argument to
just have a `.gpudata` attribute that is convertible to integer - would it
be possible to do the same in PyOpenCL (at least the first part, where
set_arg() would use duck typing and look for a specific attribute)?
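For reference, one possible workaround is to unwrap at the call site. This is a minimal sketch: `WrappedBuffer`, `unwrap` and `call_kernel` are hypothetical names, not part of PyOpenCL. It does not make the wrapper acceptable to set_arg() itself; it only hides the unwrapping, relying on Kernel objects being callable as kernel(queue, global_size, local_size, *args).

```python
class WrappedBuffer:
    """Hypothetical wrapper that holds a real pyopencl.Buffer inside."""

    def __init__(self, buf):
        self._buf = buf

    @property
    def buffer(self):
        """The underlying object that is actually passed to the kernel."""
        return self._buf

    def __getattr__(self, name):
        # Only called when normal lookup fails: forward to the real buffer.
        if name == "_buf":
            # Avoid infinite recursion if the instance is not initialised yet.
            raise AttributeError(name)
        return getattr(self._buf, name)


def unwrap(arg):
    """Return the inner buffer for wrappers, anything else unchanged."""
    return arg.buffer if isinstance(arg, WrappedBuffer) else arg


def call_kernel(kernel, queue, global_size, local_size, *args):
    """Call `kernel` with every WrappedBuffer argument unwrapped first."""
    return kernel(queue, global_size, local_size, *[unwrap(a) for a in args])
```

Usage would look like `call_kernel(prg.my_kernel, queue, (n,), None, wrapped, np.int32(n))`, where `prg.my_kernel` stands for any built kernel.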
Yves Surrel <yves(a)surrel.org> writes:
> Hi Andreas
>
> Before logging an issue, I just want to know whether I am simply stupid…
>
> AttributeError: module 'pyopencl' has no attribute ‘enqueue_task’. But it is in the doc at the same level as pyopencl.enqueue_nd_range_kernel
>
> So what’s wrong (it does not appear in __init__.py) ?
It's documented to be equivalent to
enqueue_nd_range_kernel with unit grid size. Commit
https://github.com/inducer/pyopencl/commit/6ca59dca1ae44fa6cafedaea3d193376…
removed it, but (by the looks of it) left the docs in place. Do you see
a good reason to keep/reintroduce it?
https://gitlab.tiker.net/inducer/pyopencl/-/merge_requests/105
removes the docs.
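For code that still needs the old name, the documented equivalence can be sketched as below. This is a hypothetical stand-in, not the removed implementation; the `cl` parameter is an assumption added only so that a substitute module can be passed in for testing.

```python
def enqueue_task(queue, kernel, wait_for=None, cl=None):
    """Run `kernel` as a single work-item (hypothetical replacement)."""
    if cl is None:
        import pyopencl as cl  # imported lazily so a stand-in can be passed

    # Unit grid: one work-item in one work-group.
    return cl.enqueue_nd_range_kernel(
        queue, kernel, (1,), (1,), wait_for=wait_for)
```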
Andreas
Hello all,
I have an old macbook that has a discrete GeForce on it, and have run into
the following problem. The simplified example is here:
https://gist.github.com/fjarri/9aff0474868e2faf438f7e8229d194ec
Basically, what I'm trying to do:
- create a two-device context
- create a buffer
- split it into two subregions to use on each device
- run a kernel on each device in parallel working with the corresponding
subregion
- get the result back on the host
(the expected result is [0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7])
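The splitting step can be sketched as follows. This is a minimal illustration, not the code from the gist: `split_into_subregions` and its `align_bytes` parameter are hypothetical, while `Buffer.size` and `Buffer.get_sub_region(origin, size)` are existing PyOpenCL calls. OpenCL requires a sub-buffer origin to be aligned to CL_DEVICE_MEM_BASE_ADDR_ALIGN (reported in bits) for at least one device in the context, so the helper can optionally check that.

```python
def split_into_subregions(buf, n_parts, align_bytes=1):
    """Split `buf` into `n_parts` equally sized sub-buffers.

    `align_bytes` is the required origin alignment in bytes, for example
    device.mem_base_addr_align // 8; the default of 1 disables the check.
    """
    total = buf.size
    if total % n_parts != 0:
        raise ValueError("buffer size is not divisible by the number of parts")
    part = total // n_parts
    if part % align_bytes != 0:
        raise ValueError("sub-region origins would not be aligned")
    # One sub-region per device, laid out back to back.
    return [buf.get_sub_region(i * part, part) for i in range(n_parts)]
```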
First, it turned out that if the context includes an nVidia card, the
Buffer must be created with the cl.mem_flags.ALLOC_HOST_PTR
flag; otherwise, if one uses its subregion in a kernel, the program crashes.
If the context is created on a CPU + Iris Pro (the other two devices
available), everything works fine without this flag, giving the expected
result.
After fixing that, the program finishes without crashing when run on a CPU
+ GeForce or Iris Pro + GeForce context, but the result is [0 1 2 3 4 5 6 7
0 0 0 0 0 0 0 0] - that is, the second kernel (on the GeForce device)
either did not run, or its changes to the subregion were not incorporated
into the whole buffer. Uncommenting the explicit migration at the end does
not help either. Does anyone know what I'm missing here? Or is it an
nVidia/Apple bug?
Hi,
I installed pyopencl using apt on Ubuntu 16.04, but it installs an older
version (which doesn't work when CUDA is installed):
python-pyopencl/xenial 2015.1-2build3 amd64
Python module to access OpenCL parallel computation API
I was wondering whether this package will be updated to a newer pyopencl.
Thanks,
Shivangi Agarwal
Dear all,
I know this is not the right mailing list for this question,
so I am sorry, but I will ask anyway.
I have a simple OpenCL kernel that works fine with nvidia-icd, neo-icd and
pocl-icd.
But as of today it no longer compiles with pocl-icd.
I get this error message:
pocl error: lt_dlopen("(null)") or lt_dlsym() failed with 'can't close resident module'.
note: missing symbols in the kernel binary might be reported as 'file not found' errors.
I don't think it is related to pyopencl, but rather to pocl.
I am on Ubuntu 18.04, Python 3.6.8, pyopencl 2019.1.2, pocl-opencl-icd
1.1-5.
Google has not helped me much.
Maybe an OpenCL expert here could help me.
Thanks a lot.
Best,
Samuel