So this is resolved.


The reason Ubuntu was giving an incorrect answer is that I created a NumPy array with randint and passed it to an OpenCL buffer, but the kernel code treats that buffer as unsigned int. I changed the NumPy definition to dtype=uint32, and the code now works as expected on Ubuntu as well. I have not tried to figure out how this affected the rest of the code, but changing it fixed the problem I was seeing. I assume the context buffer was corrupt and that this affected the simulation in a non-trivial way.
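A likely explanation for the platform difference, offered as an assumption rather than something verified against this code: with the NumPy versions of that era, randint without an explicit dtype returned the platform's C long, which is 64-bit on 64-bit Linux and 32-bit on Windows. The sketch below needs only NumPy (no OpenCL device) and shows what a kernel that declares its argument as unsigned int would read from a 64-bit host array. The array names and the 0..999 value range are made up for illustration.

```python
import numpy as np

# Hypothetical stand-in for the particle data; the seed and range are arbitrary.
rng = np.random.RandomState(0)
values = rng.randint(0, 1000, size=4)

# Force a little-endian 64-bit layout to mimic a default randint result
# on 64-bit Linux, independent of the machine running this sketch.
host_i64 = values.astype(np.dtype("<i8"))

# Reinterpret the same bytes as 32-bit unsigned ints, which is what a kernel
# argument declared as `__global unsigned int *` effectively does.
as_seen_by_kernel = host_i64.view(np.dtype("<u4"))

# The fix described above: make the host dtype match the kernel's type.
host_u32 = values.astype(np.uint32)

print("host values:        ", values)
print("kernel view (int64):", as_seen_by_kernel)  # each value is followed by a zero word
print("kernel view (fixed):", host_u32)
print("bytes int64 / uint32:", host_i64.nbytes, "/", host_u32.nbytes)
```

With the 64-bit layout, a kernel indexing the first N elements reads only the first N/2 real values, interleaved with zeros, and the buffer is twice the expected size. Whether that accounts for the stacked particles and the doubled result here is not confirmed.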






On Windows everything calculates as expected; that is, I get the results I expect in my code test. On Ubuntu there appears to be a problem with my sorting function. (The code on Windows and Ubuntu is exactly the same except for the program working directory.) Specifically, particles in the simulation appear stacked to one side instead of being uniformly randomly distributed across the simulation grid. If I remove the sorting function, that problem goes away, but the final answer I expect (100) is doubled (200). I see this on my Ubuntu 14.04 workstation with an NVIDIA GTX 780 Ti as well.


I assume the problem is in the combination of the CUDA toolkit and PyOpenCL. No other system shows this behavior. I have the code running on Windows on Intel, AMD, and now NVIDIA graphics cards. It doesn't work on Linux. The equivalent CUDA version works as expected on Linux, and an OpenCL version using SciPy Weave inline works as expected (only a little slower).


