Freddie Witherden <freddie(a)witherden.org> writes:
I have been having quite a few issues creating sub-buffers in my
The first (minor) issue relates to the convention of buf[start:end]
inheriting the flags of buf when creating the sub-buffer. This is
problematic when buf was created with the COPY_HOST_PTR flag. That
flag is not meaningful when creating sub-buffers and so results in an
exception. It would therefore be nice if this flag could be masked out.
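
The masking requested above could look roughly like the sketch below. It
is only an illustration, not the code in PyOpenCL: sub_buffer_flags is a
hypothetical helper, and the constants are the bit values from the OpenCL
headers, written out so that the snippet runs without an OpenCL
installation (pyopencl exposes the same values under cl.mem_flags). It
assumes that all three host-pointer flags should be dropped, since the
OpenCL specification rejects each of them in clCreateSubBuffer.

```python
# Bit values as defined in CL/cl.h; pyopencl exposes the same values as
# cl.mem_flags.READ_WRITE, cl.mem_flags.COPY_HOST_PTR, and so on.
CL_MEM_READ_WRITE = 1 << 0
CL_MEM_WRITE_ONLY = 1 << 1
CL_MEM_READ_ONLY = 1 << 2
CL_MEM_USE_HOST_PTR = 1 << 3
CL_MEM_ALLOC_HOST_PTR = 1 << 4
CL_MEM_COPY_HOST_PTR = 1 << 5

# Flags that describe how the parent buffer got its host memory.
# None of them may be passed when creating a sub-buffer.
HOST_PTR_FLAGS = (
    CL_MEM_USE_HOST_PTR | CL_MEM_ALLOC_HOST_PTR | CL_MEM_COPY_HOST_PTR
)


def sub_buffer_flags(parent_flags):
    """Hypothetical helper: parent flags with host-pointer bits cleared."""
    return parent_flags & ~HOST_PTR_FLAGS


parent = CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR
child = sub_buffer_flags(parent)  # only the READ_WRITE bit remains
```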
Secondly, the __getitem__ handler seems to adopt a non-standard
convention for the slice range, buf[start:size] rather than
buf[start:end]. This causes the following snippet to fail on AMD hardware
(which is strict about buffer lengths):
import pyopencl as cl
ctx = cl.create_some_context()
buf = cl.Buffer(ctx, cl.mem_flags.READ_WRITE, 55296000)
bufslc = buf[36864000:]
See line 1728 of wrap_cl.hpp, which calls get_sub_region(start, end,
my_flags); but the prototype on line 1686 takes (size_t origin,
size_t size, cl_mem_flags flags), hence the issue.
Fixed. Thanks for these reports.
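
For reference, the mapping from a Python slice to the (origin, size) pair
that get_sub_region expects can be sketched as below. This is an
illustration in plain Python, not the actual change in wrap_cl.hpp;
slice_to_region is a hypothetical name, and the choice to reject strided
and empty slices is an assumption of the sketch.

```python
def slice_to_region(key, buffer_size):
    """Hypothetical helper: turn buf[start:end] into (origin, size) in bytes."""
    # slice.indices() resolves None and negative bounds against the length.
    start, stop, step = key.indices(buffer_size)
    if step != 1:
        raise ValueError("sub-buffers must be contiguous (step must be 1)")
    if stop <= start:
        raise ValueError("sub-buffer would be empty")
    # get_sub_region takes a size, not an end offset.
    return start, stop - start


# The failing case from the report: buf[36864000:] on a 55296000-byte buffer.
origin, size = slice_to_region(slice(36864000, None), 55296000)
# origin == 36864000, size == 18432000 (passing `end` as the size would
# request a region that runs past the end of the parent buffer).
```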
Given the above two issues, I decided to switch to calling
get_sub_region directly:
buf[a:a + sz] => buf.get_sub_region(a, sz)
While my code now functions on all platforms I now get a segfault when
my application terminates:
with the only difference being the above change. (I am stumped here!)
One thing that sort of killed sub-buffers for me as far as usefulness is
concerned is that they are allowed to have restrictive (and
platform-dependent!) alignment requirements, based on
cl.device_info.MEM_BASE_ADDR_ALIGN. On all the platforms where I
checked, that value is 1024 or 4096.
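
A quick way to check an offset against that requirement is sketched
below. is_aligned_origin is a hypothetical helper, not a PyOpenCL
function. The sketch assumes the value is reported in bits, as the OpenCL
specification describes CL_DEVICE_MEM_BASE_ADDR_ALIGN, so 1024 and 4096
correspond to 128 and 512 bytes. On a real device the value would be read
from the device object (dev.mem_base_addr_align in pyopencl); here it is
passed in as a plain number so that the snippet runs without OpenCL.

```python
def is_aligned_origin(origin_bytes, mem_base_addr_align_bits):
    """Hypothetical helper: does a sub-buffer origin satisfy the alignment?"""
    # Assumption: the device reports the alignment in bits, not bytes.
    align_bytes = mem_base_addr_align_bits // 8
    return origin_bytes % align_bytes == 0


# The offset from the report, checked against both observed values.
for align_bits in (1024, 4096):
    print(align_bits, is_aligned_origin(36864000, align_bits))
```

Under that assumption, an origin that fails this check would be expected
to be rejected by clCreateSubBuffer with
CL_MISALIGNED_SUB_BUFFER_OFFSET.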
PyOpenCL's arrays for example store an offset rather than using
Your offsets appear fine with respect to that, so I'm not sure what
you're up against here. Looks like an Nvidia bug from here...