Thanks for the quick reply. So the reason my original message was 'a bit
off topic' is that I don't use pyopencl :) I write my own C code. But if
what you're saying is correct (and I guess it is) then if I request
those events explicitly and, as you say, release them
occasionally then I should have no more problems.
Have you ever actually run into this problem? (using pyopencl)
On 15/10/2013 18:07, Andreas Kloeckner wrote:
David Higgins <dave(a)uiginn.com> writes:
> Hi guys,
> This is perhaps a tiny bit off topic, but the people here are the most
> experienced at actually using OpenCL in the real world. I'm running a
> simulation where I need to run the same OpenCL kernel an enormous number
> of times (greater than 10^8). It's the exact same kernel each time:
> the output of the previous call is transformed a tiny bit and fed back
> in for the next update. My code runs perfectly under AMD GPU and CPU, Intel and
> Apple GPU and CPU. But my compute cluster uses Nvidia under Linux and
> that's where I run into problems.
> If I run my code for long enough it eventually gobbles up all of the
> memory. It takes approx 250,000,000 kernel calls for this to happen each
> time but I need to run a simulation 2-3 times longer than this to get
> meaningful results.
> I've read that Nvidia prefers to handle resources in a way that seems a
> little odd to anyone who's played with the Linux kernel, though I guess it's
> somewhat in keeping with the OpenCL spec if you're not so aware of how
> things are done elsewhere. This conversation was a lot of help to me.
> My question is, has anyone on the list experienced this problem before?
> If you have, is it enough for me to actually pass an 'event'
> argument to my buffer reads and kernel launches and then release
> this event after each use? (It appears from the conversations
> online that Nvidia may be allocating these event handles even if you
> pass in a NULL parameter to the calls, and it's these event handles which
> are using up all of the memory.) I intend to try this approach tomorrow,
> but as the simulation takes over a day to reach memory saturation I'd
> rather somebody with experience fed back to me anything they know
> about the issue.
Try calling 'queue.finish()' now and then. Also, as far as events are
concerned, PyOpenCL always asks for those, returns them, and if the user
doesn't hold onto them, they get garbage-collected. At some point, all
the enqueue functions may grow a get_event=True default kwarg, but not