Hello,

I am trying to compare CPU and GPU speed for a linear solver, using numpy.linalg.solve() and cula.culaDeviceDgesv().
When I test with about 300 samples, the GPU result is correct, but when I increase the number of samples I run into two problems:

1- When I increase it to 400 or 500, I get this error:

numpy array time: 0.030175s
correctness= True

Traceback (most recent call last):
  File "/home/jadidi/python-workespace/kernel/linear regression/solver.py", line 78, in <module>
    gpu_result=gpu_solve(k,y)
  File "/home/jadidi/python-workespace/kernel/linear regression/solver.py", line 61, in gpu_solve
    t=cula.culaDeviceDgesv(n, nrhs, k_gpu.ptr, lda, ipiv_gpu.ptr, y_gpu.ptr, ldb)
  File "/usr/local/lib/python2.7/dist-packages/scikits.cuda-0.042-py2.7.egg/scikits/cuda/cula.py", line 489, in culaDeviceDgesv
    culaCheckStatus(status)
  File "/usr/local/lib/python2.7/dist-packages/scikits.cuda-0.042-py2.7.egg/scikits/cuda/cula.py", line 210, in culaCheckStatus
    raise culaExceptions[status]
scikits.cuda.cula.culaRuntimeError: 4
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuEventDestroy failed: launch failed
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuEventDestroy failed: launch failed

2- When I increase it to 600, there is no error, but the GPU results are incorrect!
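
To make clear what I mean by "correct" and "incorrect": below is a minimal sketch of the CPU-side timing and check, not my exact script. It assumes random, well-conditioned test data, and the helper names cpu_solve and is_correct are only for illustration. The names k, y, n and nrhs follow the traceback above.

```python
import time

import numpy as np


def cpu_solve(k, y):
    """Solve k x = y on the CPU with NumPy's LAPACK-backed solver."""
    return np.linalg.solve(k, y)


def is_correct(k, x, y, rtol=1e-6):
    """Accept x if the relative residual ||k x - y|| / ||y|| is small."""
    residual = np.linalg.norm(k.dot(x) - y)
    return bool(residual <= rtol * np.linalg.norm(y))


# Assumption: random test data, made diagonally dominant so that the
# system is well conditioned and the check is not limited by round-off.
rng = np.random.RandomState(0)
n, nrhs = 500, 1
k = rng.rand(n, n) + n * np.eye(n)
y = rng.rand(n, nrhs)

start = time.time()
x_cpu = cpu_solve(k, y)
elapsed = time.time() - start

print("numpy array time: %fs" % elapsed)
print("correctness= %s" % is_correct(k, x_cpu, y))
```

The array copied back from the GPU would go through the same is_correct(k, x_gpu, y) call, so that both solvers are judged by the same residual test rather than by comparing their outputs element by element.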

My code:

https://docs.google.com/document/d/1Owb20-6K_ffRuZH3FX2Vjgp4VD5YsLqXqF5jkWIL_wA/edit

I appreciate any help!

Mohsen