I realize the hardware is of different vintages. What I am really concerned with is how
AMD compares to NVIDIA specifically with global atomic operations. In my initial profiling
and research, the culprit appeared to be the global atomic addition, i.e. the conclusion
was that AMD is slower than NVIDIA when doing global atomic addition, period. BUT
crusaderky's comment about 32 vs 64 threads being held up during operations got me
thinking. So I dug into the NVIDIA profiler and found that about 30% of the threads were
idle, and most were idle in a search function. I had originally written this search
function using a brute-force method. Now I have changed it to use a binary search. AND I
think crusaderky hit the nail on the head and pushed it through the board. The times I get
now are a lot faster. The AMD card showed a roughly 9x speedup (344 seconds down to 39
seconds) and the NVIDIA card showed a roughly 4.4x speedup (11 seconds down to 2.5
seconds). Since the speedup is about double for the AMD card, I infer it was holding up 64
threads (a full AMD wavefront) in this search function while the NVIDIA card was holding
up only 32 threads (a warp). Removing this holdup was the key, thanks crusaderky! I have
yet to do much with the memory access. That is the next task.
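For anyone following along, here is a rough sketch of the brute-force vs binary search change and why lockstep execution makes it matter. This is plain Python, not the actual kernel code; the names `linear_search`, `binary_search`, `lockstep_cost`, and the sorted `edges` array are made up for illustration, and the cost function is only a toy model that assumes every lane in a group waits for the slowest lane.

```python
def linear_search(edges, x):
    """Brute force: scan bins until edges[i] <= x < edges[i + 1].

    Returns (bin index or -1, number of loop steps taken).
    """
    steps = 0
    for i in range(len(edges) - 1):
        steps += 1
        if edges[i] <= x < edges[i + 1]:
            return i, steps
    return -1, steps


def binary_search(edges, x):
    """Bisection over the same sorted edges: O(log n) steps.

    Returns (bin index or -1, number of loop steps taken).
    """
    steps = 0
    if not (edges[0] <= x < edges[-1]):
        return -1, steps
    lo, hi = 0, len(edges) - 1
    # Invariant: edges[lo] <= x < edges[hi]
    while hi - lo > 1:
        steps += 1
        mid = (lo + hi) // 2
        if x < edges[mid]:
            hi = mid
        else:
            lo = mid
    return lo, steps


def lockstep_cost(step_counts, width):
    """Toy model (an assumption, not a measurement): lanes run in groups
    of `width`, and each group occupies all `width` lanes for as long as
    its slowest lane needs. Returns the total lane-steps consumed.
    """
    total = 0
    for start in range(0, len(step_counts), width):
        group = step_counts[start:start + width]
        total += max(group) * width
    return total
```

With the linear search the step count varies a lot from lane to lane, so wider groups waste more lane-steps waiting; with the binary search every lane takes nearly the same small number of steps, so the group width matters much less. Passing the per-lane step counts to `lockstep_cost` with a width of 32 and then 64 shows the effect in this model.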
This does set my mind at ease about buying a laptop with an AMD card or an NVIDIA card;
either will probably do in the long run as long as I keep my algorithms efficient.
From: William Shipman [mailto:email@example.com]
Sent: Monday, August 24, 2015 3:10 PM
To: Joe Haywood
Subject: Re: [PyOpenCL] Opinions
Just thought I should point out that the FirePro V4800 is 3 years older than the GTX 780 Ti
and has far fewer cores. Its bandwidth to global memory is 57.6 GB/s vs the 780 Ti's
336.5 GB/s. Comparing the two is pointless; the FirePro V4800 will always lose.
On 14 August 2015 at 19:12, CRV§ADER//KY
Look up opencl / cuda coalesced memory access on stack overflow, there's plenty of