[Hedge] Problem with Pycuda & hedge
Peter17
peter017 at gmail.com
Thu Jul 21 02:53:29 PDT 2011
2011/7/20 Paul Cazeaux <paul_cazeaux at brown.edu>:
> - I had the same problem as Peter, with a "pycuda._driver.LogicError: cuFuncSetBlockShape failed: invalid value". More precisely, I get
> block = (
> given.devdata.smem_granularity,
> plan.parallelism.parallel,
> plan.aligned_image_dofs_per_microblock
> #//given.devdata.smem_granularity)
> set to (512,1,0). Now I think that given.devdata.smem_granularity is always going to be 512 for my laptop's GeForce 9600 GT - so I hardcoded (512, 1, 1)
> as the block shape, and it seems to work. Is there a big problem with that?
Hi, Paul!
Thanks for the tip! I had not tried hardcoding the value before. It
also works for me if I hardcode block to (512, 1, 1) in the three
functions function_prepare, function_prepared_call and
function_prepared_timed_call of pycuda/pycuda/driver.py.
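To see why (512, 1, 0) is rejected while (512, 1, 1) is accepted, here is
a minimal sketch of a block shape sanity check. It is not part of PycUDA
or hedge: the name check_block_shape is hypothetical, and the default
limit of 512 threads per block is an assumption that holds for compute
capability 1.x cards such as the GeForce 9600 GT and the Tesla C1060.

```python
def check_block_shape(block, max_threads_per_block=512):
    """Return a list of problems found in a CUDA block shape.

    Hypothetical helper for illustration only. The default limit of 512
    threads per block is an assumption (compute capability 1.x devices).
    An empty list means no problem was found by these checks.
    """
    problems = []
    if len(block) != 3:
        problems.append("block must have exactly 3 dimensions")
        return problems

    # Every dimension must be a positive integer; a 0 in any axis
    # describes an empty block, which the driver refuses.
    for axis, size in zip("xyz", block):
        if not isinstance(size, int) or size < 1:
            problems.append(
                "block.{} = {!r} must be an integer >= 1".format(axis, size))

    # Only check the total thread count once every axis is sane.
    if not problems:
        total = block[0] * block[1] * block[2]
        if total > max_threads_per_block:
            problems.append(
                "{} threads exceed the limit of {}".format(
                    total, max_threads_per_block))
    return problems


if __name__ == "__main__":
    for shape in [(512, 1, 0), (512, 1, 1)]:
        print(shape, check_block_shape(shape) or "looks valid")
```

Note that hardcoding (512, 1, 1) only avoids the error. If the third
component was computed as 0 by an integer division, the kernel may have
been generated for a different shape, so results are worth checking.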
However, execution is quite slow (about the same speed as on the CPU)
and it uses 100% of one CPU core. Is this normal? I use a Tesla
C1060.
Andreas, do you know a way to display the usage of the GPU cores? I
could not find a good way to check that the calculation is
actually done by the graphics card in my configuration.
Also, when trying to use the CUDA backend, I got many errors like:
ImportError: no module named cuda
because the backend tries to import codepy.cgen.cuda instead of cgen.cuda.
I committed a fix here: https://github.com/peter17/hedge/commit/1006674
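For code that has to run with either layout, one option is to try the
new module path first and fall back to the old one. The sketch below
shows that idea and is not necessarily what the commit above does; the
helper name import_first_available is hypothetical.

```python
import importlib


def import_first_available(candidates):
    """Import and return the first module in `candidates` that exists.

    Hypothetical helper for illustration only. Raises ImportError if
    none of the candidate names can be imported.
    """
    last_error = None
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            # Remember the failure and try the next candidate.
            last_error = exc
    raise ImportError(
        "none of {} could be imported".format(list(candidates))
    ) from last_error


# Intended use (assumes one of the two packages is installed):
# cuda_gen = import_first_available(["cgen.cuda", "codepy.cgen.cuda"])
```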
Best regards
--
Peter Potrowl