I have a question about pinned memory and hope that you can help me.
I found out that copying data from the device to the host takes
up a large part of my runtime, so I read up on the issue
and came across "pinned memory".
There are several examples on the mailing list, but I am not
sure whether I am doing this the right way.
Do I need to initialize the context with drv.ctx_flags.MAP_HOST,
or is this activated automatically when one of the
functions below is used?
dev = drv.Device(0)
ctx = dev.make_context(drv.ctx_flags.SCHED_AUTO | drv.ctx_flags.MAP_HOST)
Is drv.mem_host_register_flags.DEVICEMAP also needed if
the context is initialized with drv.ctx_flags.MAP_HOST?
I found several methods that should give me pinned memory,
but none of them seems to work.
Are they all equivalent?
x = drv.register_host_memory(x, flags=drv.mem_host_register_flags.DEVICEMAP)
x_gpu_ptr = np.intp(x.base.get_device_pointer())
x = drv.pagelocked_empty(shape=x.shape, dtype=np.float32,
from pycuda.tools import PageLockedMemoryPool
pool = PageLockedMemoryPool()
x_ptr = pool.allocate(dest.shape, np.float32)
If I use
there is an error message
"BufferError: Object is not writable."
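As far as I can tell, "Object is not writable." is the message CPython raises when a writable buffer is requested from a read-only object. One possible cause (an assumption, since the failing call is not shown here) is that the array handed to the pinning function is read-only. A minimal sketch of how this could be checked with the standard library only; the helper names is_writable_buffer and writable_copy are hypothetical and not part of PyCUDA:

```python
def is_writable_buffer(obj):
    """Return True if obj exposes a writable buffer via the buffer protocol."""
    # memoryview works on any buffer-protocol object, including NumPy arrays.
    with memoryview(obj) as view:
        return not view.readonly


def writable_copy(obj):
    """Return obj unchanged if it is writable, else a writable copy of its bytes."""
    if is_writable_buffer(obj):
        return obj
    # bytearray makes a fresh, writable copy of the raw bytes.
    return bytearray(obj)


read_only = bytes(8)                  # bytes objects are always read-only
print(is_writable_buffer(read_only))  # False

buf = writable_copy(read_only)
print(is_writable_buffer(buf))        # True
```

For a NumPy array the same information is available as x.flags.writeable, and x.copy() returns a writable array that keeps the shape and dtype, which the bytearray copy in this sketch does not.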