Forwarding off-list reply.
-------- Original Message --------
Subject: Re: [PyCUDA] Contexts and Threading
Date: Sat, 29 Sep 2012 17:57:43 +0200
From: Eelco Hoogendoorn <e.hoogendoorn(a)uva.nl>
To: Freddie Witherden <freddie(a)witherden.org>
That is an interesting thought; I have been thinking about a similar
application, where the aim is to cleanly execute algorithms that are
embarrassingly parallel, in the sense of not requiring any inter-stream
communication. Indeed it would seem to me that Python threads should be a
good fit for this type of application. One process per device and one thread
per stream seems to be a natural match, given the implementation of those
concepts in Python.
Don't know what kind of issues you'd run into though; best to just start
trying and see. But for what it's worth, I imagine a design pattern with an
abstract subclass of Thread, which creates and holds a CUDA stream. You
could then implement your algorithm in a subclass thereof, and that'd be
fairly clean.
What bothers me though is that the overloaded gpuarray operators do not
support stream arguments; I can't really think of an elegant way to solve
that, and I suppose there are lots of problems of that nature if you start
down this path.
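The pattern sketched above, an abstract Thread subclass that owns one stream, might look roughly like this. This is a minimal sketch: the names StreamWorker and run_algo are hypothetical, and the stream is a placeholder object so the structure is visible without a GPU; in real PyCUDA code the thread would create a pycuda.driver.Stream() after making the shared context current.

```python
import threading

class StreamWorker(threading.Thread):
    """Abstract worker thread that creates and holds one stream.

    In actual PyCUDA code, self.stream would be a pycuda.driver.Stream()
    created after the shared context is made current on this thread.
    """
    def __init__(self):
        super().__init__()
        self.stream = object()  # placeholder for pycuda.driver.Stream()
        self.result = None

    def run(self):
        # Run the subclass's algorithm on this worker's private stream.
        self.result = self.run_algo(self.stream)

    def run_algo(self, stream):
        raise NotImplementedError("subclasses implement the algorithm here")

class DoubleWorker(StreamWorker):
    def run_algo(self, stream):
        # Real code would enqueue kernels on `stream`; this is a stand-in.
        return [2 * x for x in range(4)]

w = DoubleWorker()
w.start()
w.join()
print(w.result)  # [0, 2, 4, 6]
```

Each worker's stream lives exactly as long as its thread, which keeps the one-thread-per-stream correspondence explicit.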
From: Freddie Witherden
Sent: Friday, September 28, 2012 7:26 PM
Subject: [PyCUDA] Contexts and Threading
I have a question regarding how PyCUDA interacts with CUDA 4.x's
support for sharing contexts across threads.
Broadly speaking I wish to create an analogue of CUDA streams that
also support invoking arbitrary Python functions (as opposed to just
CUDA kernels and memcpy operations).
My idea is to associate a Python thread with each CUDA stream in my
application and use a Queue (import Queue) to submit either CUDA
kernels or Python functions to the queue with the core code being
along the lines of:
def queue_worker(q, comm, stream):
    while True:
        item = q.get()
        item(comm, stream)
Allowing one to do:
q1, q2 = Queue(), Queue()
t1 = Thread(target=queue_worker, args=(q1, comm, a_stream1))
t2 = Thread(target=queue_worker, args=(q2, comm, a_stream2))
# Stick items into the queue for the thread to consume
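A self-contained version of that sketch could look like the following. Here comm and the stream arguments are placeholder strings (real code would pass a communicator and pycuda.driver.Stream() objects), each queued item is a callable, and a None sentinel shuts the worker down; all of these specifics are assumptions added for illustration. Note that on Python 3 the module is `queue` rather than the `Queue` of the original post.

```python
import queue      # `import Queue` on the Python 2 of the original post
import threading

def queue_worker(q, comm, stream):
    while True:
        item = q.get()
        if item is None:        # sentinel: shut this worker down
            break
        item(comm, stream)      # a CUDA kernel launch or arbitrary Python call

results = []
q1, q2 = queue.Queue(), queue.Queue()
# comm and the streams are stand-ins for an MPI communicator and
# pycuda.driver.Stream() objects.
t1 = threading.Thread(target=queue_worker, args=(q1, "comm", "stream1"))
t2 = threading.Thread(target=queue_worker, args=(q2, "comm", "stream2"))
t1.start()
t2.start()

# Stick callables into each queue for its thread to consume.
q1.put(lambda comm, stream: results.append(("work-on", stream)))
q2.put(lambda comm, stream: results.append(("work-on", stream)))
q1.put(None)
q2.put(None)
t1.join()
t2.join()
print(sorted(results))  # [('work-on', 'stream1'), ('work-on', 'stream2')]
```

Because the items are plain callables, the same queue can carry both kernel launches and arbitrary Python functions, which is exactly the mix described above.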
However, this is only meaningful if it is possible to share a PyCUDA
context between threads. Can someone tell me whether this is possible
at all (at the CUDA driver level) and whether PyCUDA supports it?