On Sun, 26 Dec 2010 01:20:22 +0700, arief nur andono <ariefnurandono(a)gmail.com> wrote:
> I'm trying to translate some .cu code to PyCUDA, but I get an unexpected
> result: the FLOPS value is far off.
> I think that's because I don't really know how to translate it to PyCUDA,
> especially CUDA_CALL_SAFE and the time measurement.
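Kernel launches are asynchronous, so a host-side timer that is stopped
right after the launch does not cover the kernel's actual run time, which
can throw the FLOPS figure far off. You need some sort of synchronization
with the GPU before reading the timer, either Context.synchronize() or
events, just like in CUDA C.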
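Something along these lines should work for the event variant. This is only a sketch: `gflops`, `time_kernel_ms`, `launch` and `repeats` are names I made up for illustration, and `launch` stands for whatever callable enqueues your kernel. As for CUDA_CALL_SAFE, PyCUDA raises a Python exception when a driver call fails, so no wrapper macro should be needed.

```python
def gflops(flop_count, elapsed_ms):
    """Convert an operation count and a time in milliseconds to GFLOP/s."""
    return flop_count / (elapsed_ms * 1e-3) / 1e9


def time_kernel_ms(launch, repeats=10):
    """Return the average time in ms of one call to `launch`, using CUDA events.

    `launch` is a hypothetical zero-argument callable that enqueues GPU work.
    pycuda is imported here so that gflops() stays usable without a GPU.
    """
    import pycuda.autoinit  # noqa: F401  (creates a context on the default device)
    import pycuda.driver as drv

    start, end = drv.Event(), drv.Event()

    launch()                   # warm-up launch, excluded from the timing
    drv.Context.synchronize()  # wait until the warm-up has finished

    start.record()
    for _ in range(repeats):
        launch()
    end.record()
    end.synchronize()          # block until the GPU has reached `end`

    # time_till() reports the elapsed time between the two events in ms.
    return start.time_till(end) / repeats
```

With a compiled kernel `func`, usage would look like `ms = time_kernel_ms(lambda: func(a_gpu, block=(256, 1, 1), grid=(64, 1)))` followed by `gflops(total_flops, ms)`, where `total_flops` is the operation count you work out for your own kernel.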