After some reading and profiling I realized that as long as the kernels aren't
launched in different streams they'll be serialized. Sorry for the
On Tue, Dec 4, 2012 at 11:51 AM, Leandro Demarco Vedelago
Hi there. I have 2 kernels, one of which must be launched only after the
first one has finished its computations, as it uses the results
computed by the first.
My code looks something like:
----- Something ------
src = SourceModule(cudaCode)
f1 = src.get_function("f1")
f2 = src.get_function("f2")
f1(args1, res1, block=..., grid=...)
f2(args2, res2, block=..., grid=...)
---- Something else -----
where res1 is contained in args2.
So, I was wondering whether under "normal" conditions (I'm using the same
context and just one stream for both launches) f2 is launched only
after f1 has finished, or whether it's possible (maybe because of the CUDA
scheduler) that it's launched before f1 is done, in which case I
should look for a way to prevent this.
Thanks in advance, Leandro Demarco.
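
For reference, here is a minimal sketch of the situation discussed in this thread: two kernels launched back to back in the same (default) stream, where the second one reads what the first one wrote. The names f1 and f2 come from the original message. The kernel bodies, the array size, and the helpers run_dependent_kernels and expected_result are made up for illustration. It assumes a CUDA-capable GPU with PyCUDA and NumPy installed.

```python
# Hypothetical kernels: f1 doubles the input, f2 adds one to f1's output.
CUDA_SOURCE = r"""
__global__ void f1(const float *in, float *res1, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) res1[i] = in[i] * 2.0f;
}

__global__ void f2(const float *res1, float *res2, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) res2[i] = res1[i] + 1.0f;
}
"""


def expected_result(n):
    """Host-side reference: what f2 should produce if it ran after f1."""
    return [2.0 * i + 1.0 for i in range(n)]


def run_dependent_kernels(n=1024):
    """Launch f1 then f2 in the default stream and return f2's output."""
    # Imports are local so the module can be loaded on a machine without a GPU.
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates a context on import)
    import pycuda.gpuarray as gpuarray
    from pycuda.compiler import SourceModule

    src = SourceModule(CUDA_SOURCE)
    f1 = src.get_function("f1")
    f2 = src.get_function("f2")

    x = gpuarray.to_gpu(np.arange(n, dtype=np.float32))
    res1 = gpuarray.empty_like(x)
    res2 = gpuarray.empty_like(x)

    threads = 256
    block = (threads, 1, 1)
    grid = ((n + threads - 1) // threads, 1)

    # No stream argument is given, so both launches go to the same stream
    # and f2 does not start until f1 has finished.
    f1(x, res1, np.int32(n), block=block, grid=grid)
    f2(res1, res2, np.int32(n), block=block, grid=grid)

    # get() copies the result back to the host once the kernels are done.
    return res2.get()
```

On a machine with a GPU, comparing run_dependent_kernels(8) against expected_result(8) (for example with numpy.allclose) is one way to check that f2 saw the finished output of f1. If the two launches were placed in different streams, this ordering would no longer be implied and explicit synchronization would be needed.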