On Friday, 31 July 2009, Ahmed Fasih wrote:
Hi, I'm very surprised that Google isn't turning up anything about
this topic, because I thought it had been discussed previously, so my
apologies if it has.
I'm trying the NVIDIA CUDA Visual Profiler (v 2.2.05) in Windows XP
with a fairly recent PyCUDA git, on CUDA 2.2
(pycuda.driver.get_driver_version() returns 2020).
I provide the Visual Profiler with a Windows batch file that calls
"python my_pycuda_script.py -some -flags", but the Visual Profiler
(after running the script 4 times) only reports two methods, both
"memcopy". All other counters are zero (so it doesn't display them in
the table). Changing which counters are enabled doesn't change this.
Any assistance would be much appreciated. My application runs only
~10% faster on a Tesla C1060 than a G80 Quadro (despite having twice
as many MPs) so I'm hoping the profiler will help me understand why.
On Linux, I've had good success with just using the profiler from the command
line.
Every one of my attempts to achieve the same thing using the visual profiler
has ended in tears so far. I'm not sure if the command line way of doing
things works in Windows, but I'd imagine so.
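A minimal sketch of the command-line approach, in case it helps. It assumes
the CUDA 2.x driver-side profiler is switched on through the environment
variables CUDA_PROFILE, CUDA_PROFILE_LOG and CUDA_PROFILE_CSV (please check
the profiler documentation shipped with your toolkit, since the names may
differ between versions). The helper names profiler_env and run_profiled are
made up for illustration, and the default log path is just an example.

```python
import os
import subprocess


def profiler_env(log_path="cuda_profile.log", csv=True, base_env=None):
    """Return a copy of the environment with the CUDA profiler enabled.

    Assumption: the variable names below are the ones used by the
    CUDA 2.x command-line profiler; verify them against your toolkit docs.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_PROFILE"] = "1"          # turn profiling on
    env["CUDA_PROFILE_LOG"] = log_path  # where the log should be written
    if csv:
        env["CUDA_PROFILE_CSV"] = "1"  # comma-separated output
    return env


def run_profiled(argv, **kwargs):
    """Run argv as a child process with the profiler variables set.

    Returns the child's exit code.
    """
    return subprocess.call(argv, env=profiler_env(**kwargs))
```

With that in place, something like
run_profiled(["python", "my_pycuda_script.py", "-some", "-flags"])
should leave a log file next to the script if the profiler picks up the
variables. Setting the same variables in the batch file before calling
python ought to be equivalent.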
Once you figure out what's up, please add an FAQ entry!