On Mon, Mar 9, 2009 at 6:44 AM, Nicholas S-A <novanasa(a)gmail.com> wrote:
* Chris Heuser <drummerdude3791(a)charter.net> [2009-03-09 00:30:58 -0400]:
Is there a way for me to use multiple python threads in order to run cuda
code on multiple GPUs?
I have created several threads, and in each I attempt to create a context
for a different cuda device, but I am getting an "invalid context" error
when I try to copy an array over.
I use pp (http://www.parallelpython.com/) to run different python
instances. It is very easy and even allows execution on different
machines (though I have not tried this with CUDA code). Essentially it
spawns a new python instance, so each CUDA call runs in a different
process instead of a different thread, and the Global Interpreter Lock is
avoided. There could be ways that involve less overhead, but this works
fine for me.
Hope that helps!
I may end up doing just that. But before I change my implementation, I want
to make sure I am not simply making a noob error. A little bit more detail:
Before the program splits into threads, I run cuda.Device.count() and store
that in a variable, in this case we will call it *cudaCnt*. Then later in
execution the program finally splits into *cudaCnt* threads. I am using a
thread class inheriting from threading.Thread, and in the overloaded
__init__ function, I create a context for a specific device. I am afraid
that the error might be in creating this context:
    self.dev = cuda.Device(self.ID)
    self.cntxt = self.dev.make_context()
Where self.ID represents one of the devices counted in *cudaCnt*. The
first thread will have self.ID = 0, the second has self.ID = 1, and so on,
up to *cudaCnt*. Then I begin executing CUDA code in the class's *run()*
method.
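A stdlib-only sketch of the structure described above (the PyCUDA calls
are shown only as comments, with a placeholder in their place; note the
sketch sets up the per-device state inside run() rather than __init__,
since run() executes in the new thread while __init__ executes in the
parent thread):

```python
import threading

class GPUThread(threading.Thread):
    def __init__(self, device_id):
        super().__init__()
        self.ID = device_id
        self.result = None

    def run(self):
        # Per-thread CUDA setup would go here, e.g.:
        #   self.dev = cuda.Device(self.ID)
        #   self.cntxt = self.dev.make_context()
        #   ... copy arrays, launch kernels ...
        #   self.cntxt.pop()
        # Placeholder computation standing in for the per-device work:
        self.result = self.ID * 2

cuda_cnt = 2  # stand-in for cuda.Device.count()
threads = [GPUThread(i) for i in range(cuda_cnt)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print([t.result for t in threads])
```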
1. Am I leaving something out in my creation of the individual contexts?
2. Am I correct in thinking that this will create contexts on separate cuda
devices?