On Fri, 5 Nov 2010 09:15:33 +0100, Simone Mannori <simone.mannori(a)gmail.com> wrote:
I have successfully installed pyopencl on an i7 laptop with a Mobile
5850 ATI video card.
My software configuration is:
- Ubuntu 9.10 64 bit
- Stream 2.2 (AMD/ATI)
- latest AMD/ATI mobile video driver (10.10)
Playing with the various demos included in pyopencl:
1./ if the code runs on the CPU (using OpenCL), all 8 "cores" of
the i7 go to 100%, but the machine is still usable; the mouse reacts
and the other windows are active;
2./ if the same code runs on the GPU (using OpenCL), the machine
"freezes" completely during the execution of the kernel.
N.B.: "freeze" means that the PC does not react during kernel execution
but, at the end, my laptop correctly resumes normal operation: no
stability problems, sudden crashes or memory leaks. Numerical results
are ok in both cases. I suspect that this is not a stability-related
issue but only a matter of how "wait" is implemented (probably there is
a "busy wait" in the GPU code).
My "newbie" questions are:
- is situation 2./ normal? Can I avoid the "freeze" of the machine?
I don't think the machine *actually* freezes--it's just that while the
GPU is busy computing, your display will not update. (Which I find
understandable.) Try running a (CPU) program in parallel to your
computation and see if it makes progress while your GPU is active. I'd
bet it does.
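
A minimal sketch of that check, for what it's worth. It counts how often
a CPU-side thread gets to run while some blocking call is in progress.
The helper name progress_during is made up for illustration, and
time.sleep stands in for the real GPU wait. With PyOpenCL the blocking
call would be whatever waits on the kernel, and this only works in-process
if that call releases the GIL; otherwise run the counter as a separate
program.

```python
import threading
import time


def progress_during(blocking_call):
    """Count ticks made by a CPU thread while blocking_call() runs."""
    ticks = 0
    stop = threading.Event()

    def worker():
        nonlocal ticks
        while not stop.is_set():
            ticks += 1
            time.sleep(0.001)  # yield the CPU between ticks

    thread = threading.Thread(target=worker)
    thread.start()
    try:
        blocking_call()  # hypothetical stand-in for waiting on the GPU
    finally:
        stop.set()
        thread.join()
    return ticks


# time.sleep is only a placeholder for the real GPU wait.
ticks = progress_during(lambda: time.sleep(0.2))
print("CPU-side ticks while waiting:", ticks)
```

If the tick count keeps growing while the display is stuck, the CPU side
is still making progress and only the screen updates are stalled.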
This behavior is actually the same on Nvidia GPUs. The only recipe
against that is to only submit compute kernels that finish in ~10 ms or
less, making the interruptions bearable. The mouse movement still feels
a bit jittery, but not so severe as to make the machine unusable.
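
One way to get short kernels is to split a big index range into several
smaller launches. A rough sketch of the bookkeeping, in plain Python: the
helper chunk_ranges and the sizes below are made up for illustration, and
the chunk size that actually lands near ~10 ms has to be found by timing
on the device in question.

```python
def chunk_ranges(total_items, chunk_items):
    """Yield (offset, size) pairs that cover range(total_items) in order."""
    if chunk_items <= 0:
        raise ValueError("chunk_items must be positive")
    for offset in range(0, total_items, chunk_items):
        # The last chunk may be shorter than the others.
        yield offset, min(chunk_items, total_items - offset)


# Illustrative numbers only; tune chunk_items by timing real launches.
launches = list(chunk_ranges(total_items=1000, chunk_items=256))
print(launches)
```

Each (offset, size) pair would then drive one kernel launch, for example
by passing the offset to the kernel as an extra argument, so the display
gets a chance to update between launches.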
- can I use the CPU _and_ the GPU at the same time? How do I do that
from pyopencl?
You just create a context on each and go to town.
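
Roughly like this, as a sketch rather than a recipe. The function name
contexts_by_device_type is made up, and the pyopencl module is passed in
as the cl argument so the snippet does not depend on a particular install.
It assumes the usual PyOpenCL calls get_platforms(), Platform.get_devices(),
Context(devices=...) and CommandQueue(context, device), and it just takes
the first CPU and the first GPU device it finds.

```python
def contexts_by_device_type(cl):
    """Return {"cpu": (context, queue), "gpu": (context, queue)} for the
    device types that are actually present."""
    wanted = {"cpu": cl.device_type.CPU, "gpu": cl.device_type.GPU}
    result = {}
    for platform in cl.get_platforms():
        devices = platform.get_devices()
        for name, type_bit in wanted.items():
            if name in result:
                continue  # already have a device of this type
            # Device.type is a bitfield, so test the bit rather than equality.
            matching = [dev for dev in devices if dev.type & type_bit]
            if matching:
                device = matching[0]
                context = cl.Context(devices=[device])
                result[name] = (context, cl.CommandQueue(context, device))
    return result
```

Usage would be something like contexts_by_device_type(pyopencl), after
which work enqueued on the two queues can proceed side by side. Buffers
belong to one context, so data shared between CPU and GPU has to be
copied through host memory.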
Another observation is about the amount of available memory:
- pyopencl reports 3 GB for the CPU and 512 MB for the GPU, but
I have 6 GB of RAM and 1 GB of video RAM ....
That seems implementation-dependent, and not likely the fault of
PyOpenCL. For my Nvidia card, I get the full ~1GB. For the AMD CPU CL
implementation, I get 3 out of six GBytes like you. Not sure what's