Hello.
Debian has two sets of OpenCL headers:
1.1 available in main branch
1.2 available in "experimental"
Trying to build PyOpenCL with the 1.2 headers and the 11-12 AMD drivers
fails. PyOpenCL expects to find 1.2 functions in libOpenCL.so
(e.g. for splitting devices) and the library does not provide
those functions.
I have downloaded AMD drivers 851.6 (or 8-93-10, as they are called
on the web page
http://developer.amd.com/sdks/AMDAPPSDK/downloads/Pages/default.aspx
under "OpenCL™ 1.2 and Static C++ kernel language preview")
and was able to install them on my 64-bit Debian.
clinfo (utility to list OpenCL devices) lists all my devices
(CPU and GPU) as 1.1, but I was able to build PyOpenCL with
1.2 headers. I guess that this testing driver exports the needed
functions, even though the devices do not support them yet.
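A small sketch of how one might compare the reported device version against 1.2 before relying on 1.2 entry points. It assumes the version string layout "OpenCL <major>.<minor> <vendor-specific text>" described in the OpenCL specification; the function name parse_cl_version and the sample string are illustrative only and are not taken from the clinfo output mentioned above.

```python
import re


def parse_cl_version(version_string):
    """Return (major, minor) from a string such as 'OpenCL 1.1 <vendor info>'."""
    match = re.match(r"OpenCL (\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("unexpected version string: %r" % (version_string,))
    return int(match.group(1)), int(match.group(2))


# Hypothetical sample string; a real one would come from the device's
# version property.
reported = parse_cl_version("OpenCL 1.1 vendor-specific text")

# Tuples compare element by element, so (1, 1) >= (1, 2) is False.
supports_1_2 = reported >= (1, 2)
```

A check like this only describes what the device claims; as noted above, the library may still export 1.2 symbols while the devices report 1.1.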
At the same time unit tests fail on this setup.
test_wrapper works OK.
test_array shows 5 failed tests
test_clmath shows 13 failed tests
(see attachments)
I will try to look into those problems; I am not
sure whether this is a problem with PyOpenCL or with my setup,
because at the same time I changed my drivers
and built a new PyOpenCL with complex SIMD and Fortran
support (previously I had PyOpenCL 402cc8340e934574de
from 2012-01-22).
Best regards
--
Tomasz Rybak GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A 488E C654 FB33 2AD5 9860
http://member.acm.org/~tomaszrybak
Hello,
I'm facing this exact same issue. The types of the arguments I am giving are:
('1st arg type: ', <class 'pyopencl.Program'>)
('2nd arg type: ', <type 'str'>)
('cl.Kernel type: ', <type 'Boost.Python.class'>)
The exact error I see is the same, although a little less verbose on the types:
Boost.Python.ArgumentError: Python argument types in
Kernel.__init__(Kernel, Program, str)
did not match C++ signature:
__init__(_object*, pyopencl::program, std::string)
I get this error both on a Linux system with Python 2.7.1 in a virtual environment built using gcc4.6 and on a Mac OS X system running Python 2.7.2 in a virtual environment built using gcc4.2.
I tried out pyopencl-2011.1 as well as pyopencl-2011.1; both have the same error.
Creating the kernel via attribute access works, though. If you do know the kernel name ahead of time (which doesn't seem to be Kiri's situation), the code looks like the following for a kernel named "trsm":
kernel = program.trsm
If you're in Kiri's situation and the kernel name is only known as a string variable, you can still use this method in the following way:
kernel_name = "trsm"
kernel = program.__getattr__(kernel_name)
Hope this helps while we wait for a fix.
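For what it's worth, the built-in getattr() is the more usual spelling of the same lookup. The sketch below uses a hypothetical stand-in class (StandInProgram, not part of PyOpenCL) so that it runs without an OpenCL installation; the same getattr(program, kernel_name) call should apply to a real built program, though that is an assumption not tested here.

```python
class StandInProgram(object):
    """Hypothetical stand-in for a built program; kernels appear as attributes."""

    def __init__(self, kernel_names):
        self._kernels = dict(
            (name, "<kernel %s>" % name) for name in kernel_names)

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup fails.
        try:
            return self.__dict__["_kernels"][name]
        except KeyError:
            raise AttributeError(name)


program = StandInProgram(["trsm", "gemm"])

# The kernel name is only known as a string at runtime.
kernel_name = "trsm"
kernel = getattr(program, kernel_name)
```

getattr() also accepts a default, e.g. getattr(program, kernel_name, None), which avoids an exception when the name is not present.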
Hi all,
It looks like the Image constructor at line 152 of
pyopencl/__init__.py is incorrectly warning when hostbuf is None and
no memory flags are passed.
I believe line 149 should read "if hostbuf is not None and not \" instead.
139 class Image(_cl._ImageBase):
140     def __init__(self, context, flags, format, shape=None, pitches=None,
141             hostbuf=None, is_array=False, buffer=None):
142
143         if shape is None and hostbuf is None:
144             raise Error("'shape' must be passed if 'hostbuf' is not given")
145
146         if shape is None and hostbuf is not None:
147             shape = hostbuf.shape
148
149         if hostbuf is None and not \
150                 (flags & (mem_flags.USE_HOST_PTR | mem_flags.COPY_HOST_PTR)):
151             from warnings import warn
152             warn("'hostbuf' was passed, but no memory flags to make use of it.")
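A minimal sketch of the proposed condition, pulled out into a standalone function so it can be tried without PyOpenCL. The flag constants are stand-ins defined here for illustration (real code would use pyopencl.mem_flags), and the helper names check_hostbuf_flags and warnings_for are hypothetical.

```python
import warnings

# Stand-in flag values for illustration only.
READ_ONLY = 1 << 2
USE_HOST_PTR = 1 << 3
COPY_HOST_PTR = 1 << 5


def check_hostbuf_flags(flags, hostbuf):
    """Warn only when a host buffer is supplied but no flag would use it."""
    if hostbuf is not None and not (flags & (USE_HOST_PTR | COPY_HOST_PTR)):
        warnings.warn(
            "'hostbuf' was passed, but no memory flags to make use of it.")


def warnings_for(flags, hostbuf):
    """Return how many warnings the check emits for the given arguments."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        check_hostbuf_flags(flags, hostbuf)
    return len(caught)
```

With this condition, passing no hostbuf stays silent, while passing a hostbuf together with only READ_ONLY triggers the warning.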
Best,
Luke
Hello,
I am trying to build pyopencl without boost.
The build uses the following siteconf.py file:
# siteconf.py
CL_TRACE = False
CL_ENABLE_GL = True
CL_ENABLE_DEVICE_FISSION = False
CL_INC_DIR = [r'C:\Program Files (x86)\NVIDIA GPU Computing Toolkit\CUDA\v4.1\include',
    r'C:\Program Files\Microsoft SDKs\Windows\v7.0\Include']
CL_LIB_DIR = [r'C:\Program Files (x86)\NVIDIA GPU Computing Toolkit\CUDA\v4.1\lib\Win32',
    r'C:\Program Files\Microsoft SDKs\Windows\v7.0\Lib']
CL_LIBNAME = ['OpenCL']
CXXFLAGS = ['/EHsc']
LDFLAGS = ['/FORCE']
But the installation fails with the following message:
# "python .\setup.py build" standard output
.....
C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\Bin\cl.exe /c /nologo /Ox /MD /W3 /GS- /DNDEBUG -DPYGPU_PACKAGE=pyopencl -DBOOST_PYTHON_SOURCE=1 -DPYGPU_PYOPENCL=1 -DBOOST_MULTI_INDEX_DISABLE_SERIALIZATION=1 -Dboost=pyopenclboost -DHAVE_GL=1 -Ibpl-subset/bpl_subset "-IC:\Program Files (x86)\NVIDIA GPU Computing Toolkit\CUDA\v4.1\include" "-IC:\Program Files\Microsoft SDKs\Windows\v7.0\Include" -IC:\dev\x86\Python27\lib\site-packages\numpy\core\include -IC:\dev\x86\Python27\include -IC:\Users\cyril\dev\traits\sandbox\PC /Tpbpl-subset/bpl_subset/libs\thread\src\pthread\once.cpp /Fobuild\temp.win32-2.7\Release\bpl-subset/bpl_subset/libs\thread\src\pthread\once.obj /EHsc
once.cpp
bpl-subset/bpl_subset/libs\thread\src\pthread\once.cpp(9) : fatal error C1083: Cannot open include file: 'pthread.h': No such file or directory
error: command 'cl.exe' failed with exit status 2
I suppose it is looking for something that comes from boost.
Is there a flag I should add to siteconf.py?
Thanks a lot,
Cyril.
On Sat, 18 Feb 2012 13:46:55 +0000, Giridhar Tammana <giridhar.t(a)gmail.com> wrote:
> After some debugging I'm able to fix this problem by below patch,
>
> In cache.py
> - if cache_dir != False:
> + if cache_dir != False and cache_dir != None:
>
> I see that cache_dir is None when I run in Eclipse.
cache_dir == None means 'use the default cache directory'. cache_dir ==
False means 'don't use caching'. Your patch has effectively defeated
kernel caching, which circumvented the code that gave you trouble. If
you'd like to investigate further, could you please print prg_devs and
devices where the assertion fails?
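To make the three-way meaning concrete, here is a small sketch, not the actual cache.py code. The function name resolve_cache_dir and the default location are hypothetical, chosen only for this illustration.

```python
import os
import tempfile


def resolve_cache_dir(cache_dir=None):
    """Map the three-way cache_dir setting to a directory, or None for no caching."""
    if cache_dir is False:
        # Caching explicitly disabled.
        return None
    if cache_dir is None:
        # Hypothetical default location, used only in this sketch.
        return os.path.join(tempfile.gettempdir(), "example-kernel-cache")
    # Anything else is taken as a user-supplied directory.
    return cache_dir
```

Identity tests (is False, is None) keep the two special values apart, so a check that also skips None turns caching off for every caller that relies on the default.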
Thanks,
Andreas
Hi all,
I'm trying to speed up my code for AES encryption by overlapping the
encryption phase with the reading/writing into device memory as
explained here
http://developer.download.nvidia.com/compute/cuda/3_2_prod/toolkit/docs/Ope…, paragraph 3.1.3. Basically, I use two separate queues with the same context and split my data in two halves: while the first half is being encrypted by the first queue, the second one is moved into device memory. After the first half gets encrypted, I start reading the result on the first queue, and launch the encryption of the second half on the second queue. Finally, I read the encrypted second half from the second queue. Here's the code:
halveSize = len(data) // 2
remSize = len(data) - halveSize
T0buff = cl.Buffer(self.context, clmem.READ_ONLY | clmem.COPY_HOST_PTR,
                   hostbuf=T0)
T1buff = cl.Buffer(self.context, clmem.READ_ONLY | clmem.COPY_HOST_PTR,
                   hostbuf=T1)
T2buff = cl.Buffer(self.context, clmem.READ_ONLY | clmem.COPY_HOST_PTR,
                   hostbuf=T2)
T3buff = cl.Buffer(self.context, clmem.READ_ONLY | clmem.COPY_HOST_PTR,
                   hostbuf=T3)
# Create input and output buffers
inBuffers = [cl.Buffer(self.context, clmem.READ_ONLY, halveSize),
             cl.Buffer(self.context, clmem.READ_ONLY, remSize)]
outBuffers = [cl.Buffer(self.context, clmem.WRITE_ONLY, halveSize),
              cl.Buffer(self.context, clmem.WRITE_ONLY, remSize)]
# Non-blocking copy of the first half
cl.enqueue_copy(self.cmdQueues[0],
                inBuffers[0],
                np.fromstring(data[0:halveSize], dtype=np.uint8),
                is_blocking=False)
self.cmdQueues[0].flush()
# Launch kernel on the first half
program.aes_ecb(self.cmdQueues[0], (halveSize >> 4,), (256,),
                keyschedBuffer,
                inBuffers[0], outBuffers[0],
                T0buff, T1buff, T2buff, T3buff)
# Start copying the second half
cl.enqueue_copy(self.cmdQueues[1],
                inBuffers[1],
                np.fromstring(data[halveSize:], dtype=np.uint8),
                is_blocking=False)
self.cmdQueues[0].flush()
self.cmdQueues[1].flush()
# Launch kernel on the second half
program.aes_ecb(self.cmdQueues[1], (remSize >> 4,), (256,),
                keyschedBuffer,
                inBuffers[1], outBuffers[1],
                T0buff, T1buff, T2buff, T3buff)
# Non-blocking read of the first half
result = np.empty(len(data), dtype=np.uint8)
cl.enqueue_copy(self.cmdQueues[0], result, outBuffers[0],
                is_blocking=False)
self.cmdQueues[0].flush()
self.cmdQueues[1].flush()
# Finally, read the second half
cl.enqueue_copy(self.cmdQueues[1], result, outBuffers[1],
                device_offset=halveSize)
When I try to read the result of the first encryption (that is, I
execute the enqueue_copy of outBuffers[0]) I get this error:
Traceback (most recent call last):
  File "./pyclaes.py", line 40, in <module>
    exit(main())
  File "./pyclaes.py", line 31, in main
    cipherText = enc.encrypt(args.key, data)
  File "/home/muogoro/Univ/sicurezza/tesina_gpu/pyclaes/pyclaes/pyclaes_ecb.py", line 401, in encrypt
    cl.enqueue_copy(self.cmdQueues[1], result, outBuffers[1],
    device_offset=halveSize)
  File "/usr/lib64/python2.7/site-packages/pyopencl-2011.2-py2.7-linux-x86_64.egg/pyopencl/__init__.py", line 780, in enqueue_copy
    return _cl._enqueue_read_buffer(queue, src, dest, **kwargs)
pyopencl.LogicError: clEnqueueReadBuffer failed: invalid value
I cannot figure out what I'm doing wrong. Any hint would be really
appreciated :)
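One possible reading of the error, offered as an assumption rather than a confirmed diagnosis: a buffer-to-host copy transfers as many bytes as the host destination holds, and device_offset is an offset into the device buffer, not into the host array. If so, using the full-length result array as the destination asks for len(data) bytes from a buffer that holds only half of them. The sketch below models that bookkeeping with the standard library only; fake_read is a hypothetical stand-in for the copy call, not a PyOpenCL function.

```python
def fake_read(device_buf, host_view, device_offset=0):
    """Hypothetical stand-in for a buffer-to-host copy.

    Transfers len(host_view) bytes, starting at device_offset in device_buf.
    """
    nbytes = len(host_view)
    if device_offset + nbytes > len(device_buf):
        raise ValueError("invalid value: read past end of device buffer")
    host_view[:] = device_buf[device_offset:device_offset + nbytes]


data_len = 10
half = data_len // 2

# Two "device" output buffers, one per half of the data.
out_buffers = [bytes(range(half)), bytes(range(half, data_len))]

result = bytearray(data_len)
view = memoryview(result)

# The whole array as destination requests data_len bytes from a
# half-sized buffer, which the stand-in rejects.
try:
    fake_read(out_buffers[0], view)
    whole_array_ok = True
except ValueError:
    whole_array_ok = False

# Destination slices sized to each buffer; the device offset stays 0.
fake_read(out_buffers[0], view[:half])
fake_read(out_buffers[1], view[half:])
```

With a NumPy array, the analogous destinations would be result[:halveSize] and result[halveSize:], without a device_offset.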
Daniele
After some debugging I'm able to fix this problem by below patch,
In cache.py
- if cache_dir != False:
+ if cache_dir != False and cache_dir != None:
I see that cache_dir is None when I run in Eclipse.
Cheers,
Giri
On Sat, Feb 18, 2012 at 2:10 AM, Giridhar Tammana <giridhar.t(a)gmail.com> wrote:
> Hi,
>
> I'm trying to use Eclipse to run/test a pyopencl program.
> If I run from command line I'm able to run the script but when I run or
> debug from Eclipse I see following error.
>
> Traceback (most recent call last):
> File "D:\work\eclipse\eclipse_workspace\test\test.py", line 406, in
> <module>
> prg = cl.Program(ctx, source).build()
> File "C:\Python27\lib\site-packages\pyopencl\__init__.py", line 116, in
> build
> cache_dir=cache_dir)
> File "C:\Python27\lib\site-packages\pyopencl\cache.py", line 457, in
> create_built_program_from_source_cached
> ctx, src, options, devices, cache_dir)
> File "C:\Python27\lib\site-packages\pyopencl\cache.py", line 389, in
> _create_built_program_from_source_cached
> assert prg_devs[i] == devices[dest_index]
> AssertionError
>
> Please help.
>
> Thank you.
> Regards,
> Giri
>
Hi,
I'm trying to use Eclipse to run/test a pyopencl program.
If I run from command line I'm able to run the script but when I run or
debug from Eclipse I see following error.
Traceback (most recent call last):
  File "D:\work\eclipse\eclipse_workspace\test\test.py", line 406, in <module>
    prg = cl.Program(ctx, source).build()
  File "C:\Python27\lib\site-packages\pyopencl\__init__.py", line 116, in build
    cache_dir=cache_dir)
  File "C:\Python27\lib\site-packages\pyopencl\cache.py", line 457, in create_built_program_from_source_cached
    ctx, src, options, devices, cache_dir)
  File "C:\Python27\lib\site-packages\pyopencl\cache.py", line 389, in _create_built_program_from_source_cached
    assert prg_devs[i] == devices[dest_index]
AssertionError
Please help.
Thank you.
Regards,
Giri
Hi,
I'm trying to write my first pyopencl program.
I have more than one kernel function in my kernel source.
On executing the following,
prg = cl.Program(ctx, source).build()
kern = cl.Kernel(prg, args.kernel_name)  # args.kernel_name contains the kernel function name
I see the following error:
Boost.Python.ArgumentError: Python argument types in
    Kernel.__init__(Kernel, Program, str)
did not match C++ signature:
    __init__(struct _object *, class pyopencl::program, class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >)
I tried casting args.kernel_name to a string, but the error is the same.
Why am I seeing this?
How can I choose which kernel to run at runtime?
Thank you.
Regards,
Giri
Hello Everyone,
I have been very happy using pyopencl built with gcc & I am trying to
get pyopencl up + running on another platform using:
icc
NVIDIA Cuda 4.0 SDK
python 2.6.4
numpy from github
The first issue was resolved by adding "-no-gcc" to CFLAGS, so the CUDA
SDK would avoid some gcc specific logic.
After this I ran into issues with missing files in bpl-subset:
For example, it appears the "edg" directory (which is needed for icc &
other edg based compilers) is missing at following path:
pyopencl-2011.1.2/bpl-subset/bpl_subset/boost/preprocessor/list/detail/
I looked at the bpl-subset repo and found the script that extracts the
desired subset of boost.
Running this on a fresh copy of the newest boost (1.48.0), I found bcp
to be the culprit - for some reason it omits this directory.
After replacing bpl-subset/bpl_subset/boost with boost_1_48_0/boost, I got
further, but I hit some compiler errors.
The first error: icc is confused by the use of the PyArray_BYTES macro as an lvalue:
src/wrapper/wrap_cl.hpp(2107): error: expression must be a modifiable lvalue
PyArray_BYTES(result.get()) = reinterpret_cast<char *>(mapped);
^
I was able to get it to compile with the following code:
PyArrayObject *py_ptr = (PyArrayObject*)result.get();
py_ptr->data = reinterpret_cast<char *>(mapped);
The second error occurs in logic for creating a context with gl sharing:
src/wrapper/wrap_cl.hpp(741): error: expected a statement
else
^
I think the compiler takes a bad path through the current #defines.
I stripped the code down to:
if (prop == CL_CONTEXT_PLATFORM)
{
  py::extract<const platform &> value(prop_tuple[1]);
  props.push_back(
      reinterpret_cast<cl_context_properties>(value().data()));
}
else
  throw error("Context", CL_INVALID_VALUE, "invalid context property");
And I was able to compile, but of course this is not a good general
solution.
At this point, the module compiles, but when I try to use pyopencl I get
a runtime error due to a missing symbol:
import pyopencl._cl as _cl
ImportError:
zzz/site-packages/lib/python2.6/site-packages/pyopencl-2011.1beta3-py2.6-linux-x86_64.egg/pyopencl/_cl.so:
undefined symbol: _ZTIN5boost6python17error_already_setE (typeinfo for
boost::python::error_already_set)
Could this be a consequence of using 1_48_0, instead of the modified
version of 1_47_0 in the bpl_subset repo?
Thanks for any help you can provide. I am eager to work through these
issues and willing to test and contribute back any progress.
-Cyrus Harrison