Dear Andreas,
I am currently working on a Cython-based wrapper for the OpenCL FFT library from AMD: https://github.com/geggo/gpyfft
For this I need to create a pyopencl Event instance from a cl_event returned by the library. I attached a patch against recent pyopencl that adds this possibility, similar to the from_cl_mem_as_int() method of the MemoryObject class. Could you please add this to pyopencl?
Thanks for your help
Gregor
Hi all,
now that we have a flexible scan, a lot of stuff becomes quite easy:
http://documen.tician.de/pyopencl/array.html#sorting
:)
Performance isn't a dream yet, but I've also done exactly zero
tuning. It manages 34 MKeys/s on Fermi and 42 MKeys/s on Tahiti. For
comparison, numpy does about 10 MKeys/s on a CPU with a decent memory
system. The CL code on the CPU achieves about 10 MKeys/s on 4+ cores,
with the AMD implementation being 50% faster than Intel. (All this is on
32-bit integers.) If you've got some time to help tune this... :P
But the real good news here is that a) this was pretty easy to put
together on top of the existing scan primitive, and b) it actually
yields code that works on quite a bunch of CL implementations.
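To illustrate why a scan makes sorting easy, here is a minimal pure-Python sketch of a radix sort built from an exclusive prefix sum. It is not the PyOpenCL kernel: the function names are made up for this example, it processes one bit per pass, and it assumes non-negative integer keys that fit in `bits` bits.

```python
from itertools import accumulate


def exclusive_scan(values):
    """Exclusive prefix sum: result[i] is the sum of values[:i]."""
    running = list(accumulate(values))
    return [0] + running[:-1] if running else []


def split_by_bit(keys, bit):
    """Stable partition: keys with `bit` clear come first, then keys with it set."""
    ones = [(k >> bit) & 1 for k in keys]
    zeros = [1 - f for f in ones]
    zero_pos = exclusive_scan(zeros)   # destination among the "bit clear" keys
    one_pos = exclusive_scan(ones)     # destination among the "bit set" keys
    n_zeros = sum(zeros)
    out = [None] * len(keys)
    for i, key in enumerate(keys):
        dest = n_zeros + one_pos[i] if ones[i] else zero_pos[i]
        out[dest] = key
    return out


def radix_sort(keys, bits=32):
    """Least-significant-bit-first radix sort, one scan-based split per bit."""
    for bit in range(bits):
        keys = split_by_bit(keys, bit)
    return keys
```

On a GPU the scatter loop and the scans run in parallel; here they are sequential, so this only shows the structure, not the performance.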
Hope you're finding this as exciting as me. :)
Andreas
Dear Andreas,
With the current git version, when I called cl.image_from_array with a (21,29,29,2) array (and num_channels = 2), I got a
LogicError: Image failed: invalid value - buffer too small
I think there is a bug in the array size check in wrap_cl.hpp/create_image() for 3d images. In short, instead of
std::max(pitch_x, width*itemsize) * std::max(height, pitch_y) * depth > cl_uint(len))
it should be
std::max(std::max(pitch_x, width*itemsize)*height, pitch_y) * depth > cl_uint(len))
I attached a patch, including a test for image_from_array for a 3d array of float2.
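A small Python sketch of the two checks, using the numbers from the failing case. It assumes that pitch_x is the row pitch in bytes and pitch_y the slice pitch in bytes, both taken from the strides of a C-contiguous float32 array; the function names are only for this illustration.

```python
def old_check_rejects(width, height, depth, itemsize, pitch_x, pitch_y, buf_len):
    # Multiplies a row size in bytes by max(rows, slice pitch in bytes),
    # which mixes units and badly overestimates the required size.
    return max(pitch_x, width * itemsize) * max(height, pitch_y) * depth > buf_len


def new_check_rejects(width, height, depth, itemsize, pitch_x, pitch_y, buf_len):
    # Slice size is the larger of (row size * rows) and the slice pitch.
    return max(max(pitch_x, width * itemsize) * height, pitch_y) * depth > buf_len


# (21, 29, 29, 2) float32 array viewed as a 29 x 29 x 21 image of float2
width, height, depth = 29, 29, 21
itemsize = 2 * 4                      # two float32 channels per pixel
pitch_x = width * itemsize            # assumed row pitch: 232 bytes
pitch_y = height * pitch_x            # assumed slice pitch: 6728 bytes
buf_len = 21 * 29 * 29 * 2 * 4        # 141288 bytes in the host buffer

args = (width, height, depth, itemsize, pitch_x, pitch_y, buf_len)
print(old_check_rejects(*args), new_check_rejects(*args))
```

Under these assumptions the old expression demands 232 * 6728 * 21 bytes and rejects the buffer, while the corrected one demands exactly 6728 * 21 bytes and accepts it.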
Best regards
Gregor
[resent after subscribing to the ML to avoid the message being rejected]
On 22/08/2012 10:24, Vincent Danjean wrote:
> On 21/08/2012 22:49, Tomasz Rybak wrote:
>> Added Vincent Danjean (co-author of free OpenCL ICD loader) to CC.
>
> Adding Brice, the other OpenCL ICD Loader co-author
>
>> On 2012-08-21 (Tue) at 11:25 -0400, Andreas Kloeckner wrote:
>>> Andreas Kloeckner <lists(a)informa.tiker.net> writes:
>>>> Tomasz Rybak <tomasz.rybak(a)post.pl> writes:
>>>>> As I wrote in my email from 2012-08-14, I experienced
>>>>> crashes in image-related functions in test_wrapper.py
>>>>> on NVIDIA hardware. I managed to find the reason of that
>>>>> crash and fix it (patch in attachment). Below you can find
>>>>> explanation.
>>>>>
>>>>> Unlike all other vendors, NVIDIA still have not released
>>>>> OpenCL 1.2. Image creation functions have changed in
>>>>> OpenCL 1.2 - now clCreateImage expects to get cl_image_desc
>>>>> instead of bunch of arguments like height, width, etc.
>>>>> PyOpenCL tests in Image constructor (pyopencl/__init__.py,
>>>>> line 200-ish) whether it is run on OpenCL 1.2 or 1.1,
>>>>> and runs appropriate code based on this. It uses
>>>>> get_cl_header_version() for this check which fails in
>>>>> some situations, e.g. on Debian. In Debian we have opencl-headers
>>>>> (currently in 1.2), ICD loader (1.2) and ICD implementations
>>>>> with different versions. This means that headers will always
>>>>> have version 1.2 (or higher - but it'll be the highest
>>>>> possible version) but platforms might have lower versions.
>>>>> This was the case of this segfault. PyOpenCL expected to have
>>>>> new clCreateImage, ICD loader was ready to give pointer
>>>>> to this function to PyOpenCL, but platform (NVIDIA)
>>>>> was not providing it.
>>>>>
>>>>> I have changed Image constructor to base usage of clCreateImage on
>>>>> devices' platform version. I assumed that Context always have
>>>>> at least one device - if not, please change this code.
>>>>
>>>> I'm wondering--isn't this an issue with the ICD loader? I had sort of
>>>> expected that the header needs to match the loader, and if the loader
>>>> exports a 1.2 interface, then all of those functions are at least safe
>>>> to call--i.e. I as a user don't have to go around checking versions just
>>>> to determine what API I can call.
>>>>
>>>> In this particular instance, I had thought the ICD loader would
>>>> translate the call to the old 1.1 interface, or, if impossible, provide
>>>> an error.
>>>>
>>>> A segfault is most definitely not an appropriate response...
>>>>
>>>> OTOH, it seems both the AMD ICD loader and the open-source ICD loader
>>>> (as you indicate) behave this way, so we might not get a choice in this
>>>> matter.
>>>>
>>>> Does the spec say anything on this? What's your assessment? Are these
>>>> transient bugs in the ICD loader, design flaws in the spec, or something
>>>> completely different?
>
> I think this is a design flaw in the spec.
> An ICD Loader has no information at all (unless we decide to hardcode
> some of it) about the OpenCL functions supported by the loaded ICD.
>
> An ICD Loader merely gets the address of an array of function pointers. It
> does not even know the size of the array it gets. This means there is
> no reliable way to know whether the address we got for a 1.2 function is
> garbage (beyond the end of the table of a 1.1 implementation) or
> correct.
> Looking at the version advertised by the ICD implementation is not a
> solution: the Intel implementation advertises 1.1 but implements (part of)
> 1.2.
>
> I'm willing to add/patch anything required in ocl-icd. We can add some
> more functions to the interface (which means that a program using these
> functions will not work with other OpenCL ICD loaders) or provide
> them in an additional library (so that it works with any ICD Loader
> implementation).
> But, for now, I see no other way than using hardcoded information.
> If we go down this path, we should think about exactly which information
> we want and how we want it to be presented (i.e. API/ABI).
>
> What I can propose is that, for any public symbol, we try to check
> whether the corresponding function exists in the targeted ICD. Some
> sanity checks can be done automatically (non-null pointer, ...)
> but some hard-coded information will be required.
>
> I also wonder how/if we can divert the internal function pointer
> structure provided by the implementation in order to fill in
> a complete structure (with error functions for the missing ones).
> I think that it is feasible. But it would be possible for
> a strange ICD implementation to respect the standard and still break
> with what I have in mind (i.e. my implementation would be borderline
> with respect to the ABI specifications).
>
>
> About the specific problem of clCreateImage, I tried to look at
> it. If I understand correctly,
> OpenCL 1.1 defines clCreateImage2D and clCreateImage3D
> OpenCL 1.2 defines clCreateImage and deprecates clCreateImage2D
> and clCreateImage3D
> I.e., contrary to the initial message, I do not think that any
> prototype changed (this would be a severe bug with respect to
> the ICD Loader specifications). Instead, we have 2/3 functions with
> similar prototypes, some provided by some implementations,
> others provided by other implementations.
> But, for now, I do not think there is any way an implementation
> can currently reliably detect at run time whether a specific platform
> implements these functions or not.
>
> Regards,
> Vincent
>
>>>> I must admit I'm pretty reluctant to call a bunch of GetInfo functions
>>>> and then do a bunch of string processing just to figure out what
>>>> function is safe to call. Maybe as a temporary workaround, but not as a
>>>> permanent thing.
>>>>
>>>> Any opinions/insights?
>>>
>>> Hi Tomasz,
>>>
>>> any news on this front?
>>
>> Sorry - did not have time to investigate it deeply.
>> Description of OpenCL ICD extension does not deal with case
>> of different platform versions.
>>
>> Vincent - any thoughts about how OpenCL should behave
>> in current case, when loader has version 1.2 and
>> tries to use ICD with version 1.1 (here NVIDIA one)?
>>
>> Best regards.
>>
>
>
--
Vincent Danjean
Telephone: +33 4 76 61 20 11
Fax: +33 4 76 61 20 99
Email: Vincent.Danjean(a)imag.fr
Address: Laboratoire d'Informatique de Grenoble
         ENSIMAG - antenne de Montbonnot
         ZIRST 51, avenue Jean Kuntzmann
         38330 Montbonnot Saint Martin
[resent on behalf of Brice, who is not subscribed to the pyopencl ML]
On 22/08/2012 11:39, Brice Videau wrote:
> On 08/22/2012 10:24 AM, Vincent Danjean wrote:
>> On 21/08/2012 22:49, Tomasz Rybak wrote:
>>> Added Vincent Danjean (co-author of free OpenCL ICD loader) to CC.
>> Adding Brice, the other OpenCL ICD Loader co-author
>>
>>> On 2012-08-21 (Tue) at 11:25 -0400, Andreas Kloeckner wrote:
>>>> Andreas Kloeckner <lists(a)informa.tiker.net> writes:
>>>>> Tomasz Rybak <tomasz.rybak(a)post.pl> writes:
>>>>>> As I wrote in my email from 2012-08-14, I experienced
>>>>>> crashes in image-related functions in test_wrapper.py
>>>>>> on NVIDIA hardware. I managed to find the reason of that
>>>>>> crash and fix it (patch in attachment). Below you can find
>>>>>> explanation.
>>>>>>
>>>>>> Unlike all other vendors, NVIDIA still have not released
>>>>>> OpenCL 1.2. Image creation functions have changed in
>>>>>> OpenCL 1.2 - now clCreateImage expects to get cl_image_desc
>>>>>> instead of bunch of arguments like height, width, etc.
>>>>>> PyOpenCL tests in Image constructor (pyopencl/__init__.py,
>>>>>> line 200-ish) whether it is run on OpenCL 1.2 or 1.1,
>>>>>> and runs appropriate code based on this. It uses
>>>>>> get_cl_header_version() for this check which fails in
>>>>>> some situations, e.g. on Debian. In Debian we have opencl-headers
>>>>>> (currently in 1.2), ICD loader (1.2) and ICD implementations
>>>>>> with different versions. This means that headers will always
>>>>>> have version 1.2 (or higher - but it'll be the highest
>>>>>> possible version) but platforms might have lower versions.
>>>>>> This was the case of this segfault. PyOpenCL expected to have
>>>>>> new clCreateImage, ICD loader was ready to give pointer
>>>>>> to this function to PyOpenCL, but platform (NVIDIA)
>>>>>> was not providing it.
>>>>>>
>>>>>> I have changed Image constructor to base usage of clCreateImage on
>>>>>> devices' platform version. I assumed that Context always have
>>>>>> at least one device - if not, please change this code.
>>>>> I'm wondering--isn't this an issue with the ICD loader? I had sort of
>>>>> expected that the header needs to match the loader, and if the loader
>>>>> exports a 1.2 interface, then all of those functions are at least safe
>>>>> to call--i.e. I as a user don't have to go around checking versions just
>>>>> to determine what API I can call.
>>>>>
>>>>> In this particular instance, I had thought the ICD loader would
>>>>> translate the call to the old 1.1 interface, or, if impossible, provide
>>>>> an error.
>>>>>
>>>>> A segfault is most definitely not an appropriate response...
>>>>>
>>>>> OTOH, it seems both the AMD ICD loader and the open-source ICD loader
>>>>> (as you indicate) behave this way, so we might not get a choice in this
>>>>> matter.
>>>>>
>>>>> Does the spec say anything on this? What's your assessment? Are these
>>>>> transient bugs in the ICD loader, design flaws in the spec, or something
>>>>> completely different?
>> I think this is a design flaw in the spec.
>> An ICD Loader has no information at all (unless we decide to hardcode
>> some of it) about the OpenCL functions supported by the loaded ICD.
>>
>> An ICD Loader merely gets the address of an array of function pointers. It
>> does not even know the size of the array it gets. This means there is
>> no reliable way to know whether the address we got for a 1.2 function is
>> garbage (beyond the end of the table of a 1.1 implementation) or
>> correct.
>> Looking at the version advertised by the ICD implementation is not a
>> solution: the Intel implementation advertises 1.1 but implements (part of)
>> 1.2.
>>
>> I'm willing to add/patch anything required in ocl-icd. We can add some
>> more functions to the interface (which means that a program using these
>> functions will not work with other OpenCL ICD loaders) or provide
>> them in an additional library (so that it works with any ICD Loader
>> implementation).
>> But, for now, I see no other way than using hardcoded information.
>> If we go down this path, we should think about exactly which information
>> we want and how we want it to be presented (i.e. API/ABI).
>>
>> What I can propose is that, for any public symbol, we try to check
>> whether the corresponding function exists in the targeted ICD. Some
>> sanity checks can be done automatically (non-null pointer, ...)
>> but some hard-coded information will be required.
> The problem is that we don't get the array of function addresses once per OpenCL implementation. It is carried by each object created by the implementation, and it could potentially be allocated at each creation (unlikely, but valid within the spec). That means that, in order to know which object is associated with which platform, we would have to store all of them for future reference, which means parsing function return values for newly created objects and analysing error conditions from every function. All this is clearly outside the scope of the ICD loader as specified by Khronos.
>
> Null pointer checking can be done, but it will only cover a few segfault cases, and the spec does not define a suitable error. Maybe discussing with Khronos whether the spec can be improved is a better way to address this problem. The objective of the free loader is to be able to get rid of the proprietary ones. I really fear that being non-compliant is the way to bring them back in force.
>> I also wonder how/if we can divert the internal function pointer
>> structure provided by the implementation in order to fill in
>> a complete structure (with error functions for the missing ones).
>> I think that it is feasible. But it would be possible for
>> a strange ICD implementation to respect the standard and still break
>> with what I have in mind (i.e. my implementation would be borderline
>> with respect to the ABI specifications).
>>
>>
>> About the specific problem of clCreateImage, I tried to look at
>> it. If I understand correctly,
>> OpenCL 1.1 defines clCreateImage2D and clCreateImage3D
>> OpenCL 1.2 defines clCreateImage and deprecates clCreateImage2D
>> and clCreateImage3D
>> I.e., contrary to the initial message, I do not think that any
>> prototype changed (this would be a severe bug with respect to
>> the ICD Loader specifications). Instead, we have 2/3 functions with
>> similar prototypes, some provided by some implementations,
>> others provided by other implementations.
>> But, for now, I do not think there is any way an implementation
>> can currently reliably detect at run time whether a specific platform
>> implements these functions or not.
>>
>> Regards,
>> Vincent
>>
>>>>> I must admit I'm pretty reluctant to call a bunch of GetInfo functions
>>>>> and then do a bunch of string processing just to figure out what
>>>>> function is safe to call. Maybe as a temporary workaround, but not as a
>>>>> permanent thing.
>>>>>
>>>>> Any opinions/insights?
> Unfortunately, this is exactly what has to be done from a software point of view. As an OpenCL application developer myself, I had hoped to find another way, but there was none I could discover. The easiest way is to attach the platform version to a newly created context.
>
> Regards,
>
> Brice
>>>> Hi Tomasz,
>>>>
>>>> any news on this front?
>>> Sorry - did not have time to investigate it deeply.
>>> Description of OpenCL ICD extension does not deal with case
>>> of different platform versions.
>>>
>>> Vincent - any thoughts about how OpenCL should behave
>>> in current case, when loader has version 1.2 and
>>> tries to use ICD with version 1.1 (here NVIDIA one)?
>>>
>>> Best regards.
>>>
>>
>
--
Vincent Danjean
Telephone: +33 4 76 61 20 11
Fax: +33 4 76 61 20 99
Email: Vincent.Danjean(a)imag.fr
Address: Laboratoire d'Informatique de Grenoble
         ENSIMAG - antenne de Montbonnot
         ZIRST 51, avenue Jean Kuntzmann
         38330 Montbonnot Saint Martin
Adding list to CC - please keep discussion on the list to allow
others to find it later.
On 2012-08-23 (Thu) at 09:56 +1000, Matthew.Chadwick(a)csiro.au wrote:
> yes, but should I first uninstall the version of pyopencl I've built? If so, how?
If you installed it without any changes it installed files into:
/usr/local/lib/python3/dist-packages/pyopencl*
/usr/local/include/pyopencl
Remove all those directories and then install PyOpenCL packages.
Regards.
> ________________________________________
> From: Tomasz Rybak [tomasz.rybak(a)post.pl]
> Sent: Wednesday, August 22, 2012 5:59 PM
> To: Andreas Kloeckner
> Cc: Chadwick, Matthew (ICT Centre, Marsfield); pyopencl(a)tiker.net
> Subject: Re: [PyOpenCL] pyopencl with python3
>
> On 2012-08-22 (Wed) at 02:48 -0400, Andreas Kloeckner wrote:
> > Hi Matthew,
> >
> > Matthew.Chadwick(a)csiro.au writes:
> > > hi, new to all this, just installed python3 & (think) successfully built pyopencl and installed it, but I get:
> > >
> > > Python 3.2.3 (default, May 3 2012, 15:51:42)
> > > [GCC 4.6.3] on linux2
> > > Type "help", "copyright", "credits" or "license" for more information.
> > > >>> import pyopencl as cl
> > > Traceback (most recent call last):
> > > File "<stdin>", line 1, in <module>
> > > File "pyopencl/__init__.py", line 357
> > > except Exception, e:
> > > ^
> > > SyntaxError: invalid syntax
> > >
> > >
> > > I am on a fresh 64-bit Ubuntu 12.04 install, latest ubuntu nvidia driver, python3-numpy & scipy are installed, as is latest cuda & nvidia gpusdk (I have built & ran the OpenCL examples)
> > >
> > > here's how I built it:
> > >
> > > python3 configure.py --cl-inc-dir=/usr/local/cuda/include/ --cl-lib-dir=/usr/lib/nvidia-current/ --cl-libname=OpenCL --cl-enable-gl
> > > python3 setup.py build
> > > sudo python3 setup.py install
> >
> > cd ..
> >
> > should fix your problem. (I.e. don't try to run within the pyopencl
> > package directory)
>
> Also, you do not need to build PyOpenCL manually, as Ubuntu contains
> PyOpenCL packages. 12.04 contains an old version (0.92, which is very old),
> but Quantal Quetzal contains the same version as Debian, 2012.1:
> https://launchpad.net/ubuntu/+source/pyopencl/2012.1-1ubuntu1
> There are packages for Python2 and Python3.
>
> Can you test whether it works? I am regularly testing
> Debian packages but do not have Ubuntu machine on which I could
> test OpenCL-related stuff.
>
> Best regards.
>
> --
> Tomasz Rybak <tomasz.rybak(a)post.pl> GPG/PGP key ID: 2AD5 9860
> Fingerprint A481 824E 7DD3 9C0E C40A 488E C654 FB33 2AD5 9860
> http://member.acm.org/~tomaszrybak
--
Tomasz Rybak <tomasz.rybak(a)post.pl> GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A 488E C654 FB33 2AD5 9860
http://member.acm.org/~tomaszrybak
Hi Matthew,
Matthew.Chadwick(a)csiro.au writes:
> hi, new to all this, just installed python3 & (think) successfully built pyopencl and installed it, but I get:
>
> Python 3.2.3 (default, May 3 2012, 15:51:42)
> [GCC 4.6.3] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import pyopencl as cl
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> File "pyopencl/__init__.py", line 357
> except Exception, e:
> ^
> SyntaxError: invalid syntax
>
>
> I am on a fresh 64-bit Ubuntu 12.04 install, latest ubuntu nvidia driver, python3-numpy & scipy are installed, as is latest cuda & nvidia gpusdk (I have built & ran the OpenCL examples)
>
> here's how I built it:
>
> python3 configure.py --cl-inc-dir=/usr/local/cuda/include/ --cl-lib-dir=/usr/lib/nvidia-current/ --cl-libname=OpenCL --cl-enable-gl
> python3 setup.py build
> sudo python3 setup.py install
cd ..
should fix your problem. (I.e. don't try to run within the pyopencl
package directory)
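A quick way to see which copy of a package Python would pick up, without importing it (and so without triggering the SyntaxError), is sketched below. The helper name is made up for this example, and importlib.util.find_spec needs Python 3.4 or later, which is newer than the 3.2 in the report above.

```python
import importlib.util


def locate_module(name):
    """Return the file a top-level module would be loaded from, or None.

    For a top-level name, find_spec only searches sys.path; it does not
    execute the module, so unconverted Python 2 syntax cannot break it.
    """
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Run from the source tree, this would point at the local pyopencl/
# directory; run from elsewhere, it should point at the installed copy.
print(locate_module("pyopencl"))
```

The relative path "pyopencl/__init__.py" in the traceback suggests the import picked up the source tree in the current directory, which presumably still contains the Python 2 syntax rather than the converted files that were installed.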
HTH,
Andreas
hi, new to all this, just installed python3 & (think) successfully built pyopencl and installed it, but I get:
Python 3.2.3 (default, May 3 2012, 15:51:42)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyopencl as cl
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyopencl/__init__.py", line 357
    except Exception, e:
                    ^
SyntaxError: invalid syntax
I am on a fresh 64-bit Ubuntu 12.04 install, latest ubuntu nvidia driver, python3-numpy & scipy are installed, as is latest cuda & nvidia gpusdk (I have built & ran the OpenCL examples)
here's how I built it:
python3 configure.py --cl-inc-dir=/usr/local/cuda/include/ --cl-lib-dir=/usr/lib/nvidia-current/ --cl-libname=OpenCL --cl-enable-gl
python3 setup.py build
sudo python3 setup.py install
As I wrote in my email from 2012-08-14, I experienced
crashes in image-related functions in test_wrapper.py
on NVIDIA hardware. I managed to find the cause of the
crash and fix it (patch attached). The explanation
follows.
Unlike all other vendors, NVIDIA still has not released
OpenCL 1.2. The image creation functions changed in
OpenCL 1.2: clCreateImage now expects a cl_image_desc
instead of a bunch of arguments like height, width, etc.
The Image constructor in PyOpenCL (pyopencl/__init__.py,
line 200-ish) tests whether it is running on OpenCL 1.2 or 1.1
and runs the appropriate code. It uses
get_cl_header_version() for this check, which fails in
some situations, e.g. on Debian. In Debian we have opencl-headers
(currently at 1.2), an ICD loader (1.2), and ICD implementations
with different versions. This means that the headers will always
have version 1.2 (or higher; it will be the highest
possible version), but platforms might have lower versions.
That was the cause of this segfault: PyOpenCL expected to have the
new clCreateImage, the ICD loader was ready to give PyOpenCL a pointer
to this function, but the platform (NVIDIA)
did not provide it.
I have changed the Image constructor to base its use of clCreateImage on
the version of the devices' platform. I assumed that a Context always has
at least one device; if not, please change this code.
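A minimal sketch of the kind of check described above, not the attached patch itself. It relies on the format the OpenCL specification prescribes for the platform version string, "OpenCL <major>.<minor> <platform-specific information>"; the function names and the sample strings are only illustrative.

```python
import re

# Layout of CL_PLATFORM_VERSION as prescribed by the OpenCL specification.
_VERSION_RE = re.compile(r"^OpenCL (\d+)\.(\d+)(?: |$)")


def parse_platform_version(version_string):
    """Turn e.g. 'OpenCL 1.1 CUDA 4.2.1' into the tuple (1, 1)."""
    match = _VERSION_RE.match(version_string)
    if match is None:
        raise ValueError("unexpected version string: %r" % (version_string,))
    return int(match.group(1)), int(match.group(2))


def can_use_cl_create_image(platform_version_strings):
    """True only if every platform involved advertises OpenCL 1.2 or later."""
    versions = [parse_platform_version(v) for v in platform_version_strings]
    if not versions:
        raise ValueError("no devices, so no platform version to check")
    return min(versions) >= (1, 2)


# Illustrative strings; real ones would come from each device's platform,
# e.g. [dev.platform.version for dev in context.devices] in PyOpenCL.
print(can_use_cl_create_image(["OpenCL 1.1 CUDA 4.2.1"]))
print(can_use_cl_create_image(["OpenCL 1.2 AMD-APP (923.1)"]))
```

The check is conservative: a platform that advertises 1.1 is sent down the clCreateImage2D/clCreateImage3D path even if it happens to implement parts of 1.2.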
Best regards.
--
Tomasz Rybak <bogomips(a)post.pl> GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A 488E C654 FB33 2AD5 9860
http://member.acm.org/~tomaszrybak