Thanks. Configuration on Stampede seems to be a little tricky due to
their choice of MPI and operating system, which at present is missing
MPSS. Some revised information is here:
Jianbin Fang (http://www.pds.ewi.tudelft.nl/fang/publications/) seems to
get reasonable performance for some kernels with straight OpenCL, though
setup on Xeon Phi is unclear and I am hoping to find out more.
On 7/30/15 6:57 PM, Andreas Kloeckner wrote:
Benson Muite <benson_muite(a)yahoo.com> writes:
> Has anyone tried OpenCL on Xeon Phi systems (e.g. Stampede)? If so, how
> did you get it to work, in particular what runtime libraries did you use?
I have tried using PyOpenCL on the Xeon Phi, and overall I would rate the
experience 2/10. :-) I should preface this by saying that I did this bit of
experimentation in about 2012, so it is not impossible that the situation may
have improved, even considerably. First, when I tried this, TACC did not
support OpenCL as one of the programming models for their Xeon Phi cards. That
was easy enough to work around: I installed an ICD loader (the AMD one
actually, because that one can load ICDs from a user-specified path rather than
the system-wide one) as well as Intel's ICD from their webpage in my home
directory.
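For anyone wanting to reproduce the user-level ICD setup described above,
here is a minimal sketch. It assumes the ICD loader in use honors the
OPENCL_VENDOR_PATH environment variable (AMD's loader did, as far as I know;
other loaders may use a different variable or none at all). The helper name
register_user_icd and all paths and file names in the usage example are
hypothetical and not taken from the original message.

```python
import os
from pathlib import Path


def register_user_icd(vendor_dir, icd_name, library_path):
    """Create a .icd file in vendor_dir and point OPENCL_VENDOR_PATH at it.

    An .icd file is a plain-text file whose single line names the vendor's
    OpenCL shared library. Returns the path of the file that was written.
    """
    vendor_dir = Path(vendor_dir)
    vendor_dir.mkdir(parents=True, exist_ok=True)

    icd_file = vendor_dir / icd_name
    icd_file.write_text(str(library_path) + "\n")

    # Assumption: the loader reads this variable when it is first loaded,
    # so it has to be set before any OpenCL call (e.g. before importing
    # pyopencl and querying platforms).
    os.environ["OPENCL_VENDOR_PATH"] = str(vendor_dir)
    return icd_file


if __name__ == "__main__":
    # Hypothetical locations under the user's home directory.
    home = Path.home()
    written = register_user_icd(
        home / "opencl" / "vendors",
        "intel.icd",
        home / "opencl" / "intel" / "lib64" / "libintelocl.so",
    )
    print("wrote", written)
```

The user-built loader library itself also has to be found ahead of any
system-wide libOpenCL, for instance through LD_LIBRARY_PATH; that part is
left out here.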
Once that was done, I was able to see that Intel has (seemingly) not put
very much effort into their OpenCL implementation for the Phi. In
particular, getting performance out of the card for even simple
axpy-like streaming kernels was a real pain. While the card advertises
something like 300 GB per second of memory bandwidth, I was only able to
get maybe 100, and that was after extensive mucking about with unrolling
and vectorizing the kernel. This essentially made me give up on Intel's
Phi hardware for the time being. I seriously hope that the next
iteration isn't quite as desperately bad, and/or that Intel makes (or
has made) a much needed investment in their software infrastructure.
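To make the bandwidth figures above concrete, here is a rough sketch of how
effective bandwidth for an axpy-like kernel can be estimated with PyOpenCL.
This is not the code used in the experiment described above: the kernel is
the plain, untuned version, and the function names, the problem size, and
the choice of single precision are all assumptions made for illustration.
The accounting assumes each element costs three memory accesses (read x,
read y, write y).

```python
AXPY_SRC = """
__kernel void axpy(const float a,
                   __global const float *x,
                   __global float *y)
{
    size_t i = get_global_id(0);
    y[i] = a * x[i] + y[i];
}
"""


def axpy_bytes_moved(n, itemsize=4):
    """Bytes transferred by y = a*x + y: read x, read y, write y."""
    return 3 * n * itemsize


def effective_bandwidth_gb_s(n, seconds, itemsize=4):
    """Effective bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return axpy_bytes_moved(n, itemsize) / seconds / 1e9


def measure_axpy_seconds(n=1 << 24, a=2.0):
    """Time one axpy launch via OpenCL event profiling.

    Needs numpy, pyopencl and a working OpenCL platform; imported lazily
    so the two helpers above stay usable without them.
    """
    import numpy as np
    import pyopencl as cl

    ctx = cl.create_some_context()  # may prompt for a device choice
    queue = cl.CommandQueue(
        ctx, properties=cl.command_queue_properties.PROFILING_ENABLE)

    x = np.random.rand(n).astype(np.float32)
    y = np.random.rand(n).astype(np.float32)
    mf = cl.mem_flags
    x_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=x)
    y_buf = cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR, hostbuf=y)

    prg = cl.Program(ctx, AXPY_SRC).build()
    evt = prg.axpy(queue, (n,), None, np.float32(a), x_buf, y_buf)
    evt.wait()

    # Profiling timestamps are reported in nanoseconds.
    return (evt.profile.end - evt.profile.start) * 1e-9


if __name__ == "__main__":
    n = 1 << 24
    t = measure_axpy_seconds(n)
    print("effective bandwidth: %.1f GB/s" % effective_bandwidth_gb_s(n, t))
```

By this accounting, the roughly 100 GB/s mentioned above against an
advertised 300 GB/s would be about one third of peak. A single launch is a
noisy measurement, so averaging over several runs after a warm-up launch
would be more reliable.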
If you do end up trying, I, and I imagine many other people on the list,
would love to hear what you find.