Michael Boulton <michael.boulton(a)bristol.ac.uk> writes:
On 19/09/13 15:18, Andreas Kloeckner wrote:
>> The problem is that whenever I get the OpenCL platforms (whether it be
>> indirectly by doing create_some_context() or by directly calling
>> get_platforms()) it allocates either 32 or 64 gigabytes of memory
>> (seemingly at random depending on the system and type of devices). If I
>> try to delete the platform objects then the memory still stays there, so
>> it means that whenever I start a run I'm allocating a huge chunk of
>> memory that I can never deallocate.
> I don't think those are "real" memory allocations in the sense that
> are backed by physical system memory. You're probably seeing them in
> "top" (or similar). I'm guessing they might be some sort of aperture
> into which the driver maps GPU memory and various other stuff. Looking
> at /proc/self/maps from within the process should give you a better idea
> of what exactly is being mapped.
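The suggestion above is easy to script: here's an illustrative, Linux-only sketch (not from the original thread) that parses /proc/self/maps and prints the largest regions, which should show whether the 32/64 GB figure is an unbacked aperture mapping rather than real allocation.

```python
# Parse /proc/self/maps and report the largest address-space regions.
# Linux-only; on other systems it just returns an empty list.
def largest_mappings(path="/proc/self/maps", top=5):
    regions = []
    try:
        with open(path) as f:
            for line in f:
                addr = line.split()[0]          # e.g. "7f1c00000000-7f2c00000000"
                lo, hi = (int(x, 16) for x in addr.split("-"))
                regions.append((hi - lo, line.rstrip()))
    except OSError:
        pass  # procfs unavailable
    return sorted(regions, reverse=True)[:top]

if __name__ == "__main__":
    for size, line in largest_mappings():
        print(f"{size / 2**20:10.1f} MiB  {line}")
```

A huge region with no backing file (or mapping a device node) would support the aperture theory.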
I get that it's not actually using the memory, but I've done plenty of
OpenCL stuff in C/C++ before and I've never seen that behaviour. It's
harmless, but I was wondering if there was something wrong, and if it was
related to the other problems I was having.
Huh, weird. PyOpenCL doesn't have any right to behave differently than
OpenCL as used directly from C/C++.
I get the out of resources error even if I spawn one thread at a time.
The devices are definitely different in each thread - they're named
different things and they have different memory addresses. (I originally
tried creating a context just with
cl.Context(dev_type=cl.device_type.GPU), but then it throws a
"LogicError: Context failed: invalid platform".)
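One way around that "invalid platform" LogicError is to skip dev_type entirely and pin each context to the device's own platform via context properties. A sketch (assuming pyopencl is installed; guarded so it degrades quietly where it isn't):

```python
# Create one context per GPU, explicitly tied to the device's platform.
try:
    import pyopencl as cl
except ImportError:
    cl = None  # pyopencl not installed; functions below won't be exercised

def context_for_device(dev):
    # Pinning the platform avoids "invalid platform" on multi-platform
    # systems, where a bare dev_type lookup can pick the wrong ICD.
    props = [(cl.context_properties.PLATFORM, dev.platform)]
    return cl.Context(devices=[dev], properties=props)

if cl is not None:
    try:
        platforms = cl.get_platforms()
    except cl.Error:
        platforms = []  # no ICDs registered on this machine
    for plat in platforms:
        for dev in plat.get_devices(device_type=cl.device_type.GPU):
            print(dev.name, "->", context_for_device(dev))
```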
For what it's worth, I'm also getting this problem when trying to run
between a CPU and an Intel Xeon Phi. When I try to run it using
two AMD 7970s, it creates the context fine, but then the thread just
silently exits without throwing an exception when I try to create a
command queue with the context.
Can you boil this down to a simple reproducing test case that I could
try on my hardware?
>> (which I'm also guessing should actually show up as a
> Could you check the type of the exception? I don't see how the current
> code would throw a non-pyopencl exception.
I worded this badly, I meant that the error shows up as "RuntimeError"
when all the other pyopencl exceptions show up as
>> then the command queue will 'become' invalid at some point. Calling
>> queue.finish() would throw an 'invalid queue' exception, but trying to
>> launch a kernel using the queue would cause it to just hang silently
>> and I'd have to kill the process in linux.
> That's also how (Nvidia) OpenCL "reports" segmentation faults (for
> instance), i.e. bugs in your code. Are you sure there aren't any bugs in
> your code that might cause the device to crash?
> Alternatively, have you looked at the output of 'dmesg' to see if
> there's anything incriminating? (The messages may look like gibberish,
> but they might say something important.)
I checked dmesg on all the platforms I was testing it on:
- on the one with the AMD GPUs where the thread silently exits, it seems
to be because of a seg fault
- nothing new shows up in dmesg when I get the "out of resources" error
when trying to create a context on 2 different NVIDIA GPUs
- same as above when trying to run across an Intel CPU and an Intel
Xeon Phi
This is just odd--I've never seen anything like this, but since it's
occurring on wildly different implementations, it shifts the 'blame' away
from those. Another question--what ICD loader are you using?
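For reference, on Linux the ICD loader discovers vendor drivers from *.icd files in a conventional directory; a quick way to see what's registered (a Linux-only sketch, added for illustration):

```python
import os

def installed_icds(vendor_dir="/etc/OpenCL/vendors"):
    # Each *.icd file names one vendor OpenCL library,
    # e.g. "libnvidia-opencl.so.1". Path is the conventional Linux
    # location; other OSes differ.
    try:
        names = sorted(n for n in os.listdir(vendor_dir) if n.endswith(".icd"))
    except OSError:
        return {}  # directory absent: no ICDs registered
    icds = {}
    for name in names:
        with open(os.path.join(vendor_dir, name)) as f:
            icds[name] = f.read().strip()
    return icds

if __name__ == "__main__":
    for name, lib in installed_icds().items():
        print(name, "->", lib)
```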