Irving Enrique Reyna Nolasco <irvingenrique.reynanolasco(a)kaust.edu.sa>
I am a student in physics, and I am pretty new
to PyCUDA. Currently I am interested in finite volume methods running on
multiple GPUs in a single node. I have not found relevant documentation
on this, specifically how to communicate between different contexts
or how to run the same kernel on different devices at the same time.
Could you suggest some literature/documentation about that?
I think the common approach is to have multiple (CPU) threads and have
each thread manage one GPU. Less common (but also possible, if
cumbersome) is to only use one thread and switch contexts. (FWIW,
(Py)OpenCL makes it much easier to talk to multiple devices from a
single process.)
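A minimal sketch of the thread-per-GPU pattern, in case it helps. The
actual PyCUDA calls are shown as comments because they need a
CUDA-capable machine; the worker function here is a stand-in that only
records which device each thread would manage:

```python
import threading

def gpu_worker(device_id, results):
    # On a real system, each thread would own one device:
    #   import pycuda.driver as cuda
    #   cuda.init()
    #   ctx = cuda.Device(device_id).make_context()
    #   ... compile and launch your finite-volume kernel here ...
    #   ctx.pop()
    # Stand-in: record which device this thread would drive.
    results[device_id] = "kernel launched on device %d" % device_id

def run_on_all_devices(num_devices):
    # One CPU thread per GPU; each thread manages exactly one device.
    results = {}
    threads = [
        threading.Thread(target=gpu_worker, args=(i, results))
        for i in range(num_devices)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

print(run_on_all_devices(2))
```

The important point is that each context is created and popped inside
the thread that uses it; CUDA contexts are per-thread, so sharing one
context across threads is what makes the single-thread variant
cumbersome.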
Lastly, if you're thinking of scaling up, you could just have one MPI
rank per device.
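For the MPI route, the usual trick is to map each rank to a local
device index. A sketch of that mapping (the mpi4py/pycuda calls are in
comments since they need MPI and a GPU; the mapping itself is just
modular arithmetic):

```python
def device_for_rank(rank, gpus_per_node):
    # With one MPI rank per GPU, a common choice is rank modulo the
    # number of GPUs on the node. In a real program this would look like:
    #   from mpi4py import MPI
    #   rank = MPI.COMM_WORLD.Get_rank()
    #   import pycuda.driver as cuda
    #   cuda.init()
    #   dev = cuda.Device(device_for_rank(rank, cuda.Device.count()))
    #   ctx = dev.make_context()
    return rank % gpus_per_node

# e.g. 4 ranks on a node with 4 GPUs:
print([device_for_rank(r, 4) for r in range(4)])  # -> [0, 1, 2, 3]
```

Ghost-cell exchange between subdomains then happens with ordinary MPI
sends/receives, which also gets you across node boundaries for free
when you scale beyond one machine.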
Hope that helps,