Hi, I have written an opencl program that involves relatively small
kernels. For a certain benchmarking script, I have added up the time used
by kernels as 0.06 seconds, while the tightest python loop I can think of
still requires .2 seconds to execute the 5000-or-so kernel calls. The
program involves repeatedly looping through the same kernels, with the same
arguments, so I was wondering if there was a way to enqueue several nd
range kernels at once, at least from Python's perspective. Is there such a
In other words, supposing I have kernels A and B, taking arguments x and y,
my program consists of:
A(x); B(y); A(x); B(y); ....
Ideally, I would like to enqueue 100 copies of the kernel sequence [(A, x),
(B, y)], but being able to enqueue even [(A, x), (B, y)] with one call
instead of 2 could be a big help.
Show replies by date