Recently I have been playing with the profiler and logger in CUDA.
Detailed description is in
Basically, I set the environment variable
COMPUTE_PROFILE to 1 and ran PyCUDA programs.
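
For anyone who wants to reproduce this, here is a minimal sketch of
one way to switch the profiler on for a single run. The helper names
(profiler_env, run_profiled) and the script path are my own
placeholders, not part of PyCUDA or CUDA; only COMPUTE_PROFILE=1
comes from the setup described above.

```python
import os
import subprocess
import sys


def profiler_env(base_env=None):
    """Return a copy of the environment with the CUDA profiler switched on."""
    env = dict(os.environ if base_env is None else base_env)
    env["COMPUTE_PROFILE"] = "1"  # the variable named in the text
    return env


def run_profiled(script_path):
    """Run a Python script in a child process with profiling enabled.

    script_path is a hypothetical placeholder, e.g. "examples/demo.py".
    """
    result = subprocess.run([sys.executable, script_path], env=profiler_env())
    return result.returncode
```

Setting the variable only for the child process keeps the profiler
from being left on for unrelated programs in the same shell.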
I observed that the logger does not write every function
call to its text files. I also observed that test cases
(functions decorated with pycuda.tools.mark_cuda_test)
generated full logs. The only difference I
found was the call to context.detach() in mark_cuda_test.
I then experimented a little and observed
that, indeed, when I did not use pycuda.autoinit but
instead created the context manually and then popped
_and detached_ it, the full log was generated.
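
The manual variant can be sketched roughly as below. This is an
illustration under assumptions, not the exact script I ran: the
function names and the device_number parameter are made up for the
example, while cuda.init(), Device.make_context(), pop() and detach()
are the PyCUDA driver calls involved.

```python
def release_context(ctx):
    """Pop the context off the context stack, then detach it."""
    ctx.pop()
    ctx.detach()  # the step that seemed to make the log complete


def run_with_manual_context(work, device_number=0):
    """Create a context by hand, run work(), then pop and detach it."""
    # Imported lazily so the module loads without PyCUDA installed;
    # actually calling this function needs PyCUDA and a CUDA device.
    import pycuda.driver as cuda

    cuda.init()
    ctx = cuda.Device(device_number).make_context()
    try:
        return work()
    finally:
        release_context(ctx)
```

The try/finally is only there so that the context is released even
when work() raises; the order (pop first, then detach) is the one
described above.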
I am attaching a patch that adds ctx.detach() to the functions
called at program exit in pycuda.autoinit.
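
To show the idea without opening the attachment, here is a rough
sketch of an exit handler that pops and then detaches a context. It
is not the patch itself: register_exit_cleanup and _finish_up are
hypothetical names chosen for this example, and the usage shown in
the trailing comment assumes pycuda.tools.make_default_context().

```python
import atexit


def register_exit_cleanup(context):
    """Register an exit handler that pops and then detaches `context`."""

    def _finish_up():
        context.pop()
        context.detach()  # the extra call; without it the log was incomplete

    atexit.register(_finish_up)
    return _finish_up


# Possible usage (needs PyCUDA and a CUDA device):
#
#   import pycuda.driver as cuda
#   from pycuda.tools import make_default_context
#   cuda.init()
#   context = make_default_context()
#   register_exit_cleanup(context)
```

Returning the handler makes it possible to call or unregister it
explicitly, which is handy when testing.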
I have tested PyCUDA with this patch, and all programs
from test/* run without problems.
I am also attaching two logs from examples/demo.py.
One is the result of using autoinit with detach, the other without.
As you can see, the latter misses some of the function calls,
such as (2*gpuarray).get() (the axpb kernel).
So Andreas, please apply this patch before finalising 2011.1.
Tomasz Rybak <bogomips(a)post.pl> GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A 488E C654 FB33 2AD5 9860