Yes, I understand that you were using relative error as the metric of
comparison. My point was that the pointwise absolute deviation from
numpy's values was diverging quickly, and when I run tests on my data
(I am working on convolutions of voxelized solid models) using numpy's
FFT I get exactly the results I am looking for, with single precision.
Therefore, while the relative error remains small, and justifiably so,
the absolute deviation tends to be the important metric for my
application, i.e. comparing two shapes created via convolutions. So I
guess I will try using double-precision numbers, or stick to smaller
arrays. Thank you for your help.
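The distinction being drawn here (relative error staying flat while absolute deviation grows with array size) can be sketched with plain numpy. This is an illustrative stand-in for the thread's CPU-vs-GPU comparison: a float32 cast of the input plays the role of the single-precision GPU path.

```python
import numpy as np

rng = np.random.default_rng(0)

for n in (2**10, 2**14, 2**18):
    x = rng.random(n)
    f64 = np.fft.fft(x)                      # double-precision reference
    f32 = np.fft.fft(x.astype(np.float32))   # input rounded to single precision
    abs_err = np.abs(f64 - f32).max()
    rel_err = np.linalg.norm(f64 - f32) / np.linalg.norm(f64)
    # abs_err grows as the array gets larger; rel_err stays small
    print(n, abs_err, rel_err)
```

Note that numpy.fft itself computes in double precision, so the cast models only the input quantization; that alone is enough to show the scaling behaviour being discussed.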
On Tue, Jan 18, 2011 at 12:39 AM, Bogdan Opanchuk <mantihor(a)gmail.com> wrote:
On Tue, Jan 18, 2011 at 5:14 PM, Saigopal Nelaturi <saigopal(a)gmail.com> wrote:
Thanks for the quick response. My operating specs are exactly the same
as yours, and when I run your test I get an error of ~3e-7. But I think
that number may have to do with dividing by the norm of the convolution
in the expression in the last line of your test.
Of course, it is the relative error. Absolute error would depend on
the size of the array and hence would not provide any information.
Relative error of the order of 1e-6 - 1e-7 when working with
single-precision numbers is normal.
If the GPU convolution norm is high and the difference between the cpu
and gpu values of the convolution is relatively low, you would get a
low value for that ratio.
Exactly. That's what I am checking in my code. You can try comparing
numpy.fft.fftn() results for single- and double-precision numbers -
you will get the same relative error.
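The comparison suggested above can be written out as a short sketch (illustrative only; numpy.fft transforms internally in double precision, so casting the input to float32 measures the single-precision rounding of the data):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.random((32, 32, 32))  # stand-in for a voxelized model

f64 = np.fft.fftn(a)                      # double-precision input
f32 = np.fft.fftn(a.astype(np.float32))   # single-precision input

# relative error in the norm sense, as in the thread
rel_err = np.linalg.norm(f64 - f32) / np.linalg.norm(f64)
print(rel_err)  # small: single-precision rounding of the input only
```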
The ratio of those two numbers is ~3e-7. But the norm of the
difference between the two convolutions (cpu vs gpu) is high (69734).
Is there something I am missing?
It is high because the array size is extremely large. Perhaps the
following replacement for the last line would illustrate my point better:
print numpy.max(numpy.abs((corr_cpu - corr_gpu) / corr_cpu))
This gives 6.6e-7 on my machine. This means that the relative
difference between every pair of elements with the same index in
CPU-produced array and GPU-produced array is smaller than 6.6e-7. You
cannot really ask for much more when you are using single-precision
numbers — it is defined by the size of the mantissa.
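The mantissa-size argument can be checked directly from numpy's float metadata (not from the thread; just a quick confirmation of where the ~1e-7 floor comes from):

```python
import numpy as np

# float32 stores a 24-bit significand (23 explicit bits), so the relative
# spacing of representable numbers near 1.0 is 2**-23 ~ 1.2e-7 — the floor
# for relative accuracy in single precision.
print(np.finfo(np.float32).nmant)  # 23
print(np.finfo(np.float32).eps)    # 1.1920929e-07
print(np.finfo(np.float64).eps)    # 2.220446049250313e-16
```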
So, if this small difference is the actual reason you are getting
"garbage data", the only solution is to start using double-precision
numbers (or maybe review your algorithms).
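For completeness, the double-precision route can be sketched with plain numpy (fft_convolve here is a hypothetical helper, not code from the thread; the actual GPU code is not shown):

```python
import numpy as np

def fft_convolve(a, b):
    """Circular convolution of two equal-shape real arrays via the
    convolution theorem; all arithmetic is done in double precision."""
    return np.fft.ifftn(np.fft.fftn(a) * np.fft.fftn(b)).real

rng = np.random.default_rng(2)
a = rng.random((16, 16, 16))  # stand-ins for voxelized solids
b = rng.random((16, 16, 16))

c = fft_convolve(a, b)
```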