I have a weird problem when using the visual profiler: for about two
seconds my program works fine, but after that the kernel launches
become extremely slow (total running time goes up over a hundredfold). I
made a small signal handler that reacts to SIGUSR1, and saw that while
the program does move on slowly, it's always busy waiting at
func._launch_kernel of driver.py. I then tried reducing the number
of loops my simulation does to bring the total run time down to those
crucial two seconds for testing purposes, but the profiler runs the
program multiple times, and on the second run it's as slow right from the
start.
I also tried running the program with CUDA_PROFILE=1, and everything
works just fine, runtime being roughly doubled compared to running
without any profiling.
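For anyone who wants to reproduce the SIGUSR1 check mentioned above, here is a minimal sketch of that kind of handler. It is not necessarily the exact code used here: the names format_stack, on_sigusr1 and install are made up for illustration, and it assumes a Unix system where SIGUSR1 exists.

```python
import signal
import sys
import traceback


def format_stack(frame):
    """Return the call stack leading to `frame` as one string."""
    return "".join(traceback.format_stack(frame))


def on_sigusr1(signum, frame):
    """Signal handler: print where the main thread currently is."""
    sys.stderr.write("--- SIGUSR1: current stack ---\n")
    sys.stderr.write(format_stack(frame))
    sys.stderr.flush()


def install():
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGUSR1, on_sigusr1)
```

Call install() once at startup, then run `kill -USR1 <pid>` from another shell while the program is slow; the stack is printed to stderr and the program carries on.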
Trying to use nvprof (which the visual profiler uses underneath, IIUC)
just gives "Warning: Application received signal 139".
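If that 139 is a shell-style exit status rather than a raw signal number (an assumption on my part, nvprof does not say), then it would mean 128 + 11, i.e. the program died from signal 11. A small sketch for decoding such values; describe_status is a made-up helper name:

```python
import signal


def describe_status(status):
    """Interpret a shell-style exit status.

    Shells report "killed by signal N" as 128 + N, so values above 128
    are mapped back to a signal name where the platform knows one.
    """
    if status > 128:
        signum = status - 128
        try:
            return signal.Signals(signum).name
        except ValueError:
            return "unknown signal %d" % signum
    return "exit code %d" % status


# On Linux, signal 11 is SIGSEGV (segmentation fault).
print(describe_status(139))
```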
Have you used the visual profiler or nvprof successfully? Or noticed
In case it matters, I'm running the program on a remote headless
server with ssh -X, and using CUDA 5.0.
Tomi Pieviläinen, +358 400 487 504
A: Because it disrupts the natural way of thinking.
Q: Why is top posting frowned upon?