I've done some analysis comparing the CPU (on a Nehalem) and GPU (on a Tesla)
performance of PyOpenCL with parallel Cython using OpenMP. The performance of
PyOpenCL on the CPU (Intel Nehalem with AMD OpenCL 1.1) was very poor, even
slower than a single-threaded Cython program. I realize that my OpenCL
implementation was fairly poor, but I expected performance to be a bit
better than it was.
The analysis is available here:
I'm hoping that someone can give some insight into how to improve it or why
it is so bad.
Also, I would like to run the analysis again with the Intel OpenCL driver,
but I can't get PyOpenCL to recognize both the Intel and AMD platforms: when
I run get_platforms, it only shows AMD. Here is my siteconf.py file:
rcloud@Vertex:~/sources/pyopencl-2011.1.2$ cat siteconf.py
BOOST_INC_DIR = 
BOOST_LIB_DIR = 
BOOST_COMPILER = 'gcc43'
BOOST_PYTHON_LIBNAME = ['boost_python-gcc43-mt']
USE_SHIPPED_BOOST = True
CL_TRACE = False
CL_ENABLE_GL = False
CL_ENABLE_DEVICE_FISSION = True
CL_INC_DIR = ['/home/rcloud/sources/amd/AMD-APP-SDK-v2.5-RC2-lnx64/include']
CL_LIBNAME = ['OpenCL']
CXXFLAGS = 
LDFLAGS = 
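
A minimal sketch of the platform check described above, assuming PyOpenCL is importable. The helper names list_platforms and list_icd_files are hypothetical, and /etc/OpenCL/vendors is only the usual location where a Linux ICD loader looks for vendor registrations; it may differ on this machine.

```python
import glob


def list_platforms():
    """Return (name, vendor, device names) for each OpenCL platform PyOpenCL sees."""
    try:
        import pyopencl as cl

        return [
            (p.name, p.vendor, [d.name for d in p.get_devices()])
            for p in cl.get_platforms()
        ]
    except Exception as exc:  # PyOpenCL missing, or no platform registered
        print("could not query OpenCL platforms:", exc)
        return []


def list_icd_files(vendor_dir="/etc/OpenCL/vendors"):
    """List vendor .icd registration files (assumed default directory on Linux)."""
    return sorted(glob.glob(vendor_dir + "/*.icd"))


if __name__ == "__main__":
    for name, vendor, devices in list_platforms():
        print(name, "|", vendor, "|", devices)
    print("ICD files:", list_icd_files())
```

If only the AMD platform is printed, comparing that output with the list of .icd files may show whether the Intel runtime is registered with the loader at all.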
Thanks in advance,
Robert L Cloud
"Why do you want to distance yourself from all of us
and from our opinion?"
I do not write to please you;
you are meant to learn something.