[Hedge] Problem with Pycuda & hedge

Peter17 peter017 at gmail.com
Thu Jul 21 02:53:29 PDT 2011


2011/7/20 Paul Cazeaux <paul_cazeaux at brown.edu>:
>  - I had the same problem as Peter, with a "pycuda._driver.LogicError: cuFuncSetBlockShape failed: invalid value". More precisely, I get
>  block = (
>                given.devdata.smem_granularity,
>                plan.parallelism.parallel,
>                plan.aligned_image_dofs_per_microblock
>                #//given.devdata.smem_granularity)
> set to (512,1,0). Now I think that given.devdata.smem_granularity is always going to be 512 for my laptop's GeForce 9600 GT - so I hardcoded (512, 1, 1)
> as the block shape, and it seems to work. Is there a big problem with that?

Hi, Paul!

Thanks for the tip! I had not tried hardcoding the value before. It
also works for me if I hardcode block to (512, 1, 1) in the three
functions function_prepare, function_prepared_call and
function_prepared_timed_call in pycuda/pycuda/driver.py.
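
As far as I understand, (512,1,0) is rejected because of the zero in the
last entry: every block dimension has to be at least 1, and the total
number of threads must not exceed the device limit. Below is a small
sketch of a check that could be run before the kernel launch. The name
check_block_shape is made up for illustration, and the default limits are
the ones I assume for compute capability 1.x devices (512 threads per
block, at most 512 x 512 x 64); the real values should be read from the
device rather than hardcoded.

```python
def check_block_shape(block, max_threads_per_block=512,
                      max_dims=(512, 512, 64)):
    """Return a list of problems with a CUDA block shape (empty if valid).

    The default limits are assumptions for compute capability 1.x
    devices; query the actual device for the real values.
    """
    if len(block) != 3:
        return ["block must have exactly 3 entries, got %d" % len(block)]

    problems = []
    for axis, (size, limit) in enumerate(zip(block, max_dims)):
        if size < 1:
            # A zero (or negative) dimension is never a valid block shape.
            problems.append("dimension %d is %d, must be >= 1" % (axis, size))
        elif size > limit:
            problems.append("dimension %d is %d, limit is %d"
                            % (axis, size, limit))

    threads = block[0] * block[1] * block[2]
    if threads > max_threads_per_block:
        problems.append("%d threads per block, limit is %d"
                        % (threads, max_threads_per_block))
    return problems


if __name__ == "__main__":
    for shape in [(512, 1, 1), (512, 1, 0)]:
        print(shape, check_block_shape(shape) or "ok")
```

A check like this only reports the bad shape earlier; it does not explain
why the third entry is computed as 0 in the first place.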

However, execution is quite slow (about the same speed as on the CPU)
and uses 100% of a CPU core. Is this normal? I am using a Tesla
C1060.

Andreas, do you know of a way to display the usage of the GPU cores? I
could not find a good way to check that the calculation is actually
being done by the graphics card in my configuration.
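
One thing that might work is polling nvidia-smi while the computation
runs. The sketch below assumes the installed driver ships an nvidia-smi
that supports the --query-gpu option and that the card reports
utilization at all; I have not verified either on my setup. The function
names are made up for illustration.

```python
import shutil
import subprocess


def parse_utilization(text):
    """Parse one integer percentage per non-empty line of nvidia-smi output."""
    return [int(line.strip()) for line in text.splitlines() if line.strip()]


def query_gpu_utilization():
    """Return a list of GPU utilization percentages, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None  # tool not installed or not on PATH
    try:
        out = subprocess.check_output(
            ["nvidia-smi",
             "--query-gpu=utilization.gpu",
             "--format=csv,noheader,nounits"],
            universal_newlines=True,
            timeout=10,
        )
        return parse_utilization(out)
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired,
            OSError, ValueError):
        # Covers a failing tool, a hang, and non-numeric output such as
        # a "not supported" message from the driver.
        return None


if __name__ == "__main__":
    print(query_gpu_utilization())
```

If the reported value stays at 0 while the solver is running, that would
suggest the work is not reaching the card.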

Also, when trying to use the CUDA backend, I got a lot of errors like:
    ImportError: No module named cuda
because the backend tries to import codepy.cgen.cuda instead of cgen.cuda.

I committed a fix here: https://github.com/peter17/hedge/commit/1006674
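
For an installation that has to cope with both module layouts, a
fallback import is another option. This is only a sketch and not
necessarily what the commit above does; import_first is a hypothetical
helper, not part of hedge, codepy or cgen.

```python
import importlib


def import_first(candidates):
    """Import and return the first module in `candidates` that is available."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # try the next candidate
    raise ImportError("none of %r could be imported" % (list(candidates),))


# Intended use (not executed here, since it needs cgen or codepy installed):
#     cuda = import_first(["cgen.cuda", "codepy.cgen.cuda"])
```

Listing the new location first means the old one is only tried when
cgen is not installed as a separate package.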

Best regards

--
Peter Potrowl


