I have been converting my Monte Carlo code from CUDA to PyOpenCL and have run into the following problem. I am working between two machines: 1) an Ubuntu 14.04 machine with the CUDA 7.0 proprietary drivers and libraries, Intel's most recent OpenCL drivers, AMD's OpenCL drivers, and PyOpenCL from the default Ubuntu repo, and 2) a Windows 7 Ultimate machine with the AMD SDK from about 2 years ago, Python(x,y), and the PyOpenCL distributed with it, also from about 2 years ago.
On the Ubuntu machine I have both the original CUDA code and the PyOpenCL code. I made a lot of changes to the CUDA code to improve performance, such as removing branching, changing ifs to switches, and removing unused functions and variables. I then converted the CUDA kernels to OpenCL kernels, copying and pasting much of the code directly. When I run the program in CUDA on my test dataset I get the answer I expect, 100 +- small deviations due to randomness. The PyOpenCL code returns about twice that, 200 +- small deviations. It returns 200 no matter which OpenCL library is called: Nvidia (GPU calc), Intel (CPU calc), or AMD (CPU calc).
On the Windows machine, the exact same PyOpenCL code returns the expected value of 100.
Before running on the Windows machine I figured I had copied something incorrectly, but every variable I checked prints approximately the same value that CUDA prints, except for the final value I check, the maximum of the output array. I expect the two to be very close, but the PyOpenCL one is ~half the CUDA value. This is where the factor of 2 comes from. No other variables are different.
Any help or ideas are appreciated.
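Since the same PyOpenCL code behaves differently on the two machines, one way to start narrowing this down is to record exactly which PyOpenCL version and which OpenCL platforms and drivers each machine uses. The following is only a minimal sketch of that idea; the helper name describe_opencl_environment is made up for illustration, and it simply returns an empty platform list if PyOpenCL or an OpenCL platform is not available.

```python
import platform
import sys


def describe_opencl_environment():
    """Collect version info so two machines can be compared side by side.

    Hypothetical helper, not part of PyOpenCL.
    """
    info = {
        "python": sys.version.split()[0],
        "os": platform.platform(),
        "pyopencl": None,
        "platforms": [],
    }
    try:
        import pyopencl as cl
    except ImportError:
        return info  # PyOpenCL is not installed on this machine

    info["pyopencl"] = cl.VERSION_TEXT
    try:
        for plat in cl.get_platforms():
            for dev in plat.get_devices():
                info["platforms"].append(
                    (plat.name, plat.version, dev.name, dev.driver_version)
                )
    except cl.Error:
        pass  # no usable OpenCL platform was found
    return info


if __name__ == "__main__":
    for key, value in describe_opencl_environment().items():
        print(key, value)
```

Running this on both the Ubuntu and the Windows machine and comparing the output would show whether the PyOpenCL versions differ, which may or may not be related to the factor of 2.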
Joe Reese Haywood, Ph.D., DABR
Johnson Family Cancer Center
Mercy Health Muskegon
1440 E. Sherman Blvd, Suite 300
Muskegon, MI 49444
On Tue, 2015-09-08 at 15:22 -0500, Andreas Kloeckner wrote:
> Tomasz Rybak <tomasz.rybak(a)post.pl> writes:
> > As I'm not using PyOpenCL every day, I abstained from vote.
> > But - what's the result? If there was such a change, I'd like
> > to describe it in Debian package changelog to warn users.
> The complex numbers support was changed as described, since I didn't
> hear loud objections. I think this change, while incompatible, is
> overall a win because it turned up a number of subtle bugs in my code
> that were hidden by the vector nature of the prior complex number
OK, thanks for the information - I've added a note about this change
to Debian.NEWS; it'll be displayed to users when they update PyOpenCL.
Tomasz Rybak GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A 488E C654 FB33 2AD5 9860
Thanks to the hard work of Marko Bencun and Yichao Yu, the next version
of PyOpenCL will be substantially different internally from the previous
one. In particular, the wrapper will no longer be built using
Boost.Python but instead using cffi 1.0's ahead-of-time mode. One main
consequence of this is that PyOpenCL now works on PyPy.
This new code is now on the git master branch. (It used to live on the
'cffi' branch. The old Boost wrapper is now on the
From a user's perspective, nothing should have changed--on all machines
I have access to, PyOpenCL passes the same tests as before, on any
Python version more recent than 2.6, including PyPy. Nonetheless, before
I go ahead and release a new PyOpenCL based on this code, I'd like to
get as many of you as I can to try it and report back. If you package
PyOpenCL, or if you have a Mac or a Windows machine, I'd especially like
to hear from you.
I'm writing about PyOpenCL's support for complex numbers. As of right
now, PyOpenCL's complex numbers are typedefs of float2 and double2,
which makes complex+complex addition and real*complex multiplication match the
desired semantics, but lots of other operators silently do the wrong
thing, such as
I've come to regard this as a major flaw, and I can't count the number
of times I've had to hunt bugs related to this, and so I'd like to get
rid of it. I've thought about ways of doing this in a
backward-compatible manner, and they all strike me as flawed, and so I'd
prefer to move to a simple struct (supporting both .real and .imag as
well as the old .x and .y members) in one big change.
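To illustrate the kind of silent error described above: OpenCL vector types such as float2 multiply component-wise, so multiplying two values typedef'd to float2 does not give the complex product. The sketch below mimics both behaviours in plain Python on (real, imag) pairs; the function names float2_mul and complex_mul are made up for this illustration and are not PyOpenCL API.

```python
def float2_mul(a, b):
    """Component-wise product, which is what float2 * float2 computes in OpenCL."""
    return (a[0] * b[0], a[1] * b[1])


def complex_mul(a, b):
    """Complex product of two (real, imag) pairs."""
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])


a = (1.0, 2.0)  # 1 + 2i
b = (3.0, 4.0)  # 3 + 4i

vector_result = float2_mul(a, b)    # (3.0, 8.0): compiles and runs, but is not a*b
complex_result = complex_mul(a, b)  # (-5.0, 10.0): the intended complex product

print(vector_result, complex_result)
```

With a struct type instead of a vector typedef, an expression like a*b on two complex values would presumably fail to compile rather than quietly produce the first result.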
If you have code depending on PyOpenCL's complex number support and are
opposed to this change, please speak up now. I'll make the change in git
today to give you a preview.
What do you think?