I am implementing an iterative algorithm in PyCUDA.
Inside the while loop I need to do a reduction. I have implemented it with a
gpuarray and it works nicely, but I think that using a gpuarray inside a
loop is a bad idea because of the allocation and deallocation overhead.
So I was thinking that I could use the reduction example from the SDK.
The problem is that I need only the first value of the reduced array
returned to the host.
I couldn't find anywhere in the documentation how to do that.
Thank you in advance.