Dear all,
we are trying to implement a k-nearest-neighbor search on GPUs with
PyOpenCL. The goal of the algorithm is: for a given target point,
find the k nearest points from a given set (the training data). The distance
between two points is measured by the squared Euclidean distance.
One of our implementations is a brute-force approach, which aims
at processing big data sets in parallel, e.g. 1 million training points and
several million targets (test data). For every target point, one kernel
instance is created, which finds the k nearest points among the
training points.
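For reference, the brute-force search described above can be sketched on the CPU with NumPy. This is not the attached kernel, only a minimal sketch for checking GPU results on small inputs; the function name `knn_brute_force` and the array layout (one point per row) are assumptions.

```python
import numpy as np

def knn_brute_force(train, targets, k):
    """Return (indices, squared distances) of the k nearest training
    points for every target point, nearest first."""
    # Pairwise differences, shape (n_targets, n_train, n_dims).
    diff = targets[:, None, :] - train[None, :, :]
    # Squared Euclidean distances, shape (n_targets, n_train).
    d2 = np.einsum("ijk,ijk->ij", diff, diff)
    # argpartition finds the k smallest per row in arbitrary order.
    idx = np.argpartition(d2, k - 1, axis=1)[:, :k]
    # Sort only those k candidates by distance.
    order = np.argsort(np.take_along_axis(d2, idx, axis=1), axis=1)
    idx = np.take_along_axis(idx, order, axis=1)
    return idx, np.take_along_axis(d2, idx, axis=1)

train = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
targets = np.array([[0.9, 0.0]])
indices, sq_dists = knn_brute_force(train, targets, k=2)
```

The intermediate `diff` array grows as n_targets * n_train * n_dims, so this sketch only suits small comparison runs, not the million-point case.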
Our problem is the following. Everything works fine for small data sets,
and the results are as expected on both the GPU (GeForce GTX 650 with
NVIDIA driver 313.09) and the CPU (Intel Core i5-3450 with AMD APP SDK),
running Ubuntu 12.10 and PyOpenCL 2013.1-py2.7-linux-x86_64.
But if we increase the size of the data sets, the GPU version crashes
with the following error:
> File "brutegpu.py", line 65, in query
> cl.enqueue_copy(self.queue, d_min, self.d_min_buf).wait()
> File "/usr/local/lib/python2.7/dist-packages/
> pyopencl-2013.1-py2.7-linux-x86_64.egg/pyopencl/__init__.py",
> line 935, in enqueue_copy
> return _cl._enqueue_read_buffer(queue, src, dest, **kwargs)
> pyopencl.LogicError: clEnqueueReadBuffer failed: invalid command queue
The CPU version still works fine with 1 million training points
and 1 million test points. Attached you can find the corresponding
source code as a working minimal example, which consists of one
host Python file and one OpenCL kernel file.
We would highly appreciate any help. Maybe we made a
mistake that is already known to you.
So the big question for us is: why does it work on the CPU but not
on the GPU?
Are there NVIDIA-specific pitfalls for such big data sets?
The compiler says:
> ptxas info : Compiling entry function 'find_knn' for 'sm_30'
> ptxas info : Function properties for find_knn
> 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
> ptxas info : Used 17 registers, 336 bytes cmem[0], 4 bytes cmem[3]
Or are there any rules for using a kernel for big data sets such as setting
the work group sizes or maximum memory usage?
The error message "invalid command queue" is confusing, and I wasn't able
to find any helpful information (except that "invalid command queue"
often means a segfault, but I could not find any wrong array address yet).
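One experiment that might narrow this down, offered as a sketch and not as a diagnosis: process the test points in batches, so that each kernel launch covers only part of the targets and a failure can be tied to a specific launch. A possible (unconfirmed) cause is that a single launch over all targets runs long enough for the driver to abort it on a GPU that also drives a display. The helper name `batch_ranges` and the batch size are assumptions.

```python
def batch_ranges(n_items, batch_size):
    """Yield half-open (start, stop) ranges that together cover range(n_items)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, n_items, batch_size):
        yield start, min(start + batch_size, n_items)

# Example: 10 targets in batches of 4.
# Each (start, stop) pair would correspond to one kernel launch over
# stop - start work-items, followed by a wait() before the next launch.
ranges = list(batch_ranges(10, 4))
```

It may also be worth comparing the buffer sizes against the `max_mem_alloc_size` and `global_mem_size` attributes of the PyOpenCL device, since the CPU and GPU devices can report very different limits.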
Maybe one of you could have a look at our code and spot some stupid
mistake.
We would be very grateful for every hint.
Best regards,
Justin Heinermann,
University Oldenburg
Bogdan Opanchuk <mantihor(a)gmail.com> writes:
> Hi Andreas,
>
> On Wed, May 28, 2014 at 4:04 PM, Andreas Kloeckner
> <lists(a)informa.tiker.net> wrote:
>> - PyOpenCL's array and algorithms functionality will be spun off into
>> separate packages ("clarray" and "clalgorithms"?).
>
> Isn't `compyte` supposed to be the array package (and I remember it
> containing some algorithms as well)? Will its status change somehow?
Compyte has a role across PyCUDA and PyOpenCL. The things I'm describing
here are CL-only.
Andreas
Hello.
The PyOpenCL packages in Debian have been moved from contrib
to main. For those less familiar with Debian taxonomy,
this means that PyOpenCL is now in "True Debian(TM)",
the fully supported section. It also means that it is
visible in a default install: users do not need
to enable additional sections to have it available.
Debian currently contains 4 OpenCL implementations:
1. AMD
2. Beignet
3. Mesa
4. NVIDIA
so there is room to play.
More details of this change are in this post:
http://tomaszrybak.wordpress.com/2014/05/27/pyopencl-in-main/
Best regards.
--
Tomasz Rybak GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A 488E C654 FB33 2AD5 9860
http://member.acm.org/~tomaszrybak
Hi all,
Recently I have been playing around with OpenCL on the Intel Xeon Phi.
A quirk of the OpenCL stack (which also applies to the Intel CPU
stack) is that the compiler is extremely chatty. Running a simple
axpy kernel gives:
/usr/local/lib/python2.7/site-packages/pyopencl/__init__.py:57:
Build on <pyopencl.Device 'Intel(R) Many Integrated Core Acceleration
Card' on 'Intel(R) OpenCL' at 0x1a207d8> succeeded, but said:
Compilation started
Compilation done
Linking started
Linking done
Build started
Kernel <sum> was successfully vectorized
Done.
warn(text, CompilerWarning)
/usr/local/lib/python2.7/site-packages/pyopencl/__init__.py:57:
CompilerWarning: From-binary build succeeded, but resulted in
non-empty logs:
Build on <pyopencl.Device 'Intel(R) Many Integrated Core Acceleration
Card' on 'Intel(R) OpenCL' at 0x1a207d8> succeeded, but said:
Linking started
Linking done
Build started
Kernel <sum> was successfully vectorized
Done.
warn(text, CompilerWarning)
I am not sure of the best way to handle this; PyOpenCL does not appear to
expose an API for programmatic control over whether these 'warnings' are
generated. It is of course possible to sink the output, but this does
not help us decide whether a genuine warning has been generated
(genuine warnings are absolutely of interest and should be displayed).
How have other people handled this?
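One possible approach, sketched here with the standard `warnings` module only: record the warnings raised during a build, drop those whose log consists solely of known-benign lines, and re-emit everything else. The `CompilerWarning` class below is a stand-in so the sketch runs without an OpenCL device (in real use it would be `pyopencl.CompilerWarning`); the list of benign lines is taken from the output quoted above, and the helper names are hypothetical.

```python
import warnings

class CompilerWarning(UserWarning):
    """Stand-in for pyopencl.CompilerWarning (assumption for this sketch)."""

# Log lines considered harmless, copied from the compiler output above.
BENIGN_LINES = {
    "Compilation started", "Compilation done",
    "Linking started", "Linking done",
    "Build started", "Done.",
}

def noteworthy_lines(message):
    """Return the log lines that are not known build chatter."""
    kept = []
    for raw in str(message).splitlines():
        line = raw.strip()
        if not line or line in BENIGN_LINES:
            continue
        # Header lines added around the log (assumed to be single lines).
        if line.endswith(("succeeded, but said:", "non-empty logs:")):
            continue
        if line.startswith("Kernel <") and line.endswith("> was successfully vectorized"):
            continue
        kept.append(line)
    return kept

def call_filtering_build_chatter(fn, *args, **kwargs):
    """Call fn, suppressing CompilerWarnings that contain only benign lines."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = fn(*args, **kwargs)
    for w in caught:
        if issubclass(w.category, CompilerWarning) and not noteworthy_lines(w.message):
            continue  # pure chatter: drop it
        # Anything else is passed on to the caller's warning filters.
        warnings.warn_explicit(w.message, w.category, w.filename, w.lineno)
    return result
```

A build call would then be wrapped as `call_filtering_build_chatter(program.build)`, where `program` is whatever object performs the build. The benign list has to be maintained by hand, which is the main weakness of this approach.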
Regards, Freddie.
Hi Guilherme,
Guilherme Gonçalves Ferrari <gg.ferrari(a)gmail.com> writes:
> So, I think the problem is with the pytools' wheel.
>
> using "pip install --no-use-wheel pytools" everything works just fine.
>
> however, here is what happened when I used just "pip install pytools":
Thanks for investigating!
I've deleted the wheels from my 2to3-using packages for the time
being--it'd be great if you could verify that this addresses the
issue. It does for me.
Thanks again,
Andreas
Guilherme Gonçalves Ferrari <gg.ferrari(a)gmail.com> writes:
> import pyopencl fails because pytools seems to be broken on py3k. I'm using
> the latest version of both packages.
>
> -------------------------------
> import pyopencl as cl
>   File "/home/travis/miniconda/envs/testenv3.4/lib/python3.4/site-packages/pyopencl/__init__.py", line 716, in <module>
>     _add_functionality()
>   File "/home/travis/miniconda/envs/testenv3.4/lib/python3.4/site-packages/pyopencl/__init__.py", line 333, in _add_functionality
>     from pytools import memoize_method
>   File "/home/travis/miniconda/envs/testenv3.4/lib/python3.4/site-packages/pytools/__init__.py", line 1313
>     print value, bin_nr, bin_starts
>     ^
> SyntaxError: invalid syntax
> -------------------------------
How did you install pytools? It uses 2to3, so that print statement would
have gotten translated if 2to3 had run as intended. Apparently it has
not, and I'm trying to figure out why.
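A quick way to check whether an installed source file was actually translated is to try parsing it with the running interpreter. The following is only a sketch using the standard `ast` module; the helper name is hypothetical, and the two source strings are the print statement from the traceback and its translated form.

```python
import ast

def parses_under_this_python(source):
    """Return True if the source text is valid syntax for the running interpreter."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True

# Python 2 print statement, as found in the untranslated file.
untranslated = "print value, bin_nr, bin_starts\n"
# What 2to3 would be expected to produce.
translated = "print(value, bin_nr, bin_starts)\n"

untranslated_ok = parses_under_this_python(untranslated)
translated_ok = parses_under_this_python(translated)
```

Under Python 3, `untranslated_ok` is False and `translated_ok` is True. Reading the installed `pytools/__init__.py` and passing its contents to the same function would show whether 2to3 ran at install time.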
Andreas
Hi all,
thanks to the tireless efforts of Marko Bencun and Yichao Yu, PyOpenCL
now has a reasonably functional branch [1] based on cffi instead of Boost.Python.
[1] https://github.com/pyopencl/pyopencl/tree/cffi
What this means is that there's real hope that at some point in the
not-too-distant future there will be a fully functional PyOpenCL that
runs on PyPy, which I personally find very exciting. This would amount
to a frighteningly nice environment within which to do high-performance
work.
Now, we're not quite there yet. Let me quickly summarize the state of
things:
- I can go grab PyPy 2.3 with apt-get (Debian has it, for example), make
a virtualenv, install numpy following the instructions [2], install
PyOpenCL as usual, and run the tests, of which about 85% pass (79/93 on my
machine).
I'll admit that I had never really played much with PyPy. The extent
to which all of this 'just works' with it is really impressive.
- That said, important bits are still missing. It's not ready for
production use. If you'd like to help out, see this file [3].
We're tantalizingly close.
[2] https://bitbucket.org/pypy/numpy
[3] https://github.com/pyopencl/pyopencl/blob/cffi/TODOs
This has important implications for the future of PyOpenCL. Here's where
I imagine we might be headed:
- For a while, there will be two equivalent implementations of PyOpenCL,
pyopencl_bpl and pyopencl_cffi. Eventually, pyopencl_bpl may be
deprecated. Both will install as "pyopencl". The "pyopencl" package
index entry will be an alias of one of the two--likely "pyopencl_bpl"
initially.
- PyOpenCL's array and algorithms functionality will be spun off into
separate packages ("clarray" and "clalgorithms"?). To ease the
transition, both pyopencl versions will depend on these packages, but
using them through their old names ("pyopencl.array" etc) will be
deprecated. Naturally, this will come with a long transition period to
give dependencies time to adapt. Interfaces will stay the same, so
that all that's needed in downstream software is a change of the
import name.
This step not only eases maintenance of both base wrappers, it is also
fairer to alternative CL algorithms packages like Bogdan's reikna and
potential alternative array libraries, which now don't have to face
the question of why they're replacing something that already comes
bundled with the wrapper. In addition, I've found that much of the
algorithms functionality often gets overlooked, because the name
'PyOpenCL' doesn't indicate that there's quite a bit more in the
package.
- To make all of this easier, I will make one more release (2014.1) that
supports Python 2.4 and later, and for all releases after that, Python
2.6 will be required.
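For downstream software, the change of import name during the transition could look roughly like the sketch below. The helper `import_first_available` is hypothetical, and "clarray" is only the tentative package name floated above, so treat both as assumptions.

```python
import importlib

def import_first_available(*module_names):
    """Import and return the first module in module_names that can be imported."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(
        "none of these modules could be imported: " + ", ".join(module_names))

# Hypothetical downstream usage: prefer the new name, fall back to the old one.
# cl_array = import_first_available("clarray", "pyopencl.array")
```

Since the interfaces are meant to stay the same, the rest of the downstream code would keep using `cl_array` unchanged.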
Comments?
Andreas
import pyopencl fails because pytools seems to be broken on py3k. I'm using
the latest version of both packages.
-------------------------------
import pyopencl as cl
  File "/home/travis/miniconda/envs/testenv3.4/lib/python3.4/site-packages/pyopencl/__init__.py", line 716, in <module>
    _add_functionality()
  File "/home/travis/miniconda/envs/testenv3.4/lib/python3.4/site-packages/pyopencl/__init__.py", line 333, in _add_functionality
    from pytools import memoize_method
  File "/home/travis/miniconda/envs/testenv3.4/lib/python3.4/site-packages/pytools/__init__.py", line 1313
    print value, bin_nr, bin_starts
    ^
SyntaxError: invalid syntax
-------------------------------
I am having (what I hope is) a basic OpenGL user-interface issue with a
sample program I am working on.
This glut program has a visualized OpenCL data structure whose points are
calculated via an OpenCL kernel. The scene is then combined with a standard
Utah Teapot and dodecahedron straight out of the glut library.
The teapot and polyhedron are rendered using shaders--so if anyone is looking
for an example using OpenCL with shader visualizations this example might be
useful.
The problem I have is that if you manipulate the scene with the mouse you can
tell that the OpenCL particle fountain is always moving according to the mouse
input, but the teapot and polyhedron sometimes seem to move in the opposite
direction.
I know this isn't an OpenCL question per se, but it's almost impossible to find
OpenGL people who are set up to also run and evaluate OpenCL code, whereas people
who understand and use OpenCL are oftentimes also able to run OpenGL samples.
Can someone help me figure out what I'm doing wrong?
https://gist.github.com/kbrafford/18665bdafa2b0a822f4f
Thanks!
--Keith