You are computing the product of a [160080, 3] and a [3, 160080] matrix,
so the result is a [160080, 160080] matrix. To store a matrix of that
size (as float32) you would need about 95GB of RAM. That's a tough fit for a
3GB GPU ;-)
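The sizes above can be checked with a quick back-of-the-envelope calculation (a minimal sketch in Python, using the dimensions quoted in the thread):

```python
# Memory estimate for the matrices in the thread (all sizes assumed float32).
n = 160080               # rows of A, and rows of B (B is transposed in the dot call)
bytes_per_float32 = 4

# Each (160080, 3) input matrix -- roughly the ~1875 KB mentioned below:
input_kb = n * 3 * bytes_per_float32 / 1024

# The (160080, 160080) result of A @ B^T:
result_gib = n * n * bytes_per_float32 / 2**30

print(f"input: {input_kb:.0f} KB, result: {result_gib:.1f} GiB")
```

The inputs are tiny, but the result alone is roughly 30x larger than a 3GB card, which is why cuMemAlloc fails regardless of how cleverly the BLAS library manages memory.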
On 2015-11-23 17:10, Keith Brown wrote:
I have 2 small matrices (160080, 3) of type float32 and am
calculating their dot product. While doing this, I keep getting
pycuda._driver.MemoryError: cuMemAlloc failed: out of memory.
I have 2 cards, each with 3GB of memory. Each matrix takes about 1875
kilobytes. I am not sure why this is occurring.
c_gpu = linalg.dot(a_gpu,b_gpu,'N','T',handle=handle)
My handle is a cublasXt handle (not regular cublas, since cublasXt
apparently does better memory handling).
Any idea what is going on?
PyCUDA mailing list