Hi all,
I'd like to ask one thing. Is there a way to "nicely" do row-wise and
column-wise reduction or scan in pyopencl? What I mean is that if
I have
1 1 1
1 1 1
1 1 1
then a row-wise scan would return
1 2 3
1 2 3
1 2 3
and a column-wise one
1 1 1
2 2 2
3 3 3
Is there a way to do this? I can of course implement it with my own
kernel (though I have a feeling that my implementation is in no way
efficient), but I need this for a project I wanted to use as a showcase
of pyopencl's abilities, so I want as simple and "canonical" a way to do
this as possible.
Thanks,
Jake
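
For reference, the two scans described above can be written in plain Python. This is only a sketch of the intended semantics (an inclusive prefix sum along each row or down each column), not PyOpenCL code; the names `row_scan` and `col_scan` are made up for illustration.

```python
from itertools import accumulate


def row_scan(matrix):
    """Inclusive prefix sum along each row (hypothetical helper)."""
    return [list(accumulate(row)) for row in matrix]


def col_scan(matrix):
    """Inclusive prefix sum down each column (hypothetical helper)."""
    # Transpose, scan each former column as a row, then transpose back.
    scanned = [list(accumulate(col)) for col in zip(*matrix)]
    return [list(row) for row in zip(*scanned)]


ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
print(row_scan(ones))  # [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
print(col_scan(ones))  # [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
```

A GPU implementation could be checked against these results on small inputs.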
Hi,
is there some more documentation or examples for using vector types?
I have tried to do the naive way: https://gist.github.com/2236138 but
that doesn't seem to work. What am I doing wrong?
J
On Tue, 27 Mar 2012 12:06:31 -0600, Ryan Haynes <rhaynesak(a)gmail.com> wrote:
> I have 4 54 megabyte buffers which I want to perform byte by byte
> analysis on. I can copy the data in roughly 100 ms, which seems like
> a decent transfer rate (~2 GB/s). However, when I go to execute my
> kernel the overhead passing in my device pointers is huge. Something
> like 500msec even on a no-op kernel.
What implementation are you using?
Andreas
I have 4 54 megabyte buffers which I want to perform byte by byte
analysis on. I can copy the data in roughly 100 ms, which seems like
a decent transfer rate (~2 GB/s). However, when I go to execute my
kernel the overhead passing in my device pointers is huge. Something
like 500msec even on a no-op kernel.
Data transfer and call code:
    ueblock_d = cl.Buffer(self.ctx, mf.READ_ONLY | mf.COPY_HOST_PTR,
                          hostbuf=ueb)
    leblock_d = cl.Buffer(self.ctx, mf.READ_ONLY | mf.COPY_HOST_PTR,
                          hostbuf=leb)
    urblock_d = cl.Buffer(self.ctx, mf.READ_ONLY | mf.COPY_HOST_PTR,
                          hostbuf=urb)
    lrblock_d = cl.Buffer(self.ctx, mf.READ_ONLY | mf.COPY_HOST_PTR,
                          hostbuf=lrb)
    eb_errors_d = cl.Buffer(self.ctx, mf.WRITE_ONLY | mf.COPY_HOST_PTR,
                            hostbuf=eb_errors)
    num_bytes = self.geom.block_max_bytes / 2
    self.cl_funcs.calc_charge_lvl(self.queue, (num_bytes,), None,
                                  ueblock_d, leblock_d, urblock_d,
                                  lrblock_d, eb_errors_d)
no-op kernel:
    __kernel void calc_charge_lvl(__global uchar* ueblock,
                                  __global uchar* leblock,
                                  __global uchar* urblock,
                                  __global uchar* lrblock,
                                  __global uint* errors)
    {
        uint i = get_global_id(0);
    }
Am I doing something absurdly broken? If I pass in only errors, which
is a 393 KB buffer, the overhead is almost nil.
Thanks,
Ryan
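
A quick check of the figures quoted above, assuming "megabyte" means 10^6 bytes (the message does not say which convention is used): four 54 MB buffers copied in about 100 ms works out to roughly 2 GB/s, consistent with the stated transfer rate.

```python
# Figures are taken from the message; the 10**6 bytes-per-megabyte
# convention is an assumption.
num_buffers = 4
buffer_bytes = 54 * 10**6
copy_seconds = 0.100

total_bytes = num_buffers * buffer_bytes
bandwidth_gb_per_s = total_bytes / copy_seconds / 10**9
print(round(bandwidth_gb_per_s, 2))
```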
I found a frustrating problem in pyopencl: after each kernel execution,
host memory consumption increases by approx. 1.5 MB. Given the program's
workflow (model some number of iterations on the card, finish the
kernel, read the data via enqueue_copy, write it to a file, then start
the kernel again), I run out of my 4 GB of RAM after a few thousand such
calls.
In a previous project, written in C++, I solved this problem as
follows. An event object was dynamically created and passed to
enqueueNDRangeKernel(). After reading the data from the GPU, the event
object was deleted. Obviously it's impossible to use this method in Python.
It's also worth noting that the memory leak occurs on both the CPU
and the GPU. I'm using an Intel CPU and an nVidia GPU.
What can be done to fix the problem?
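
One way to confirm growth of this kind is to sample the process's peak resident memory after every iteration. The sketch below is a generic, Unix-only diagnostic using the standard `resource` module. `peak_rss_growth` and the dummy `step` are hypothetical stand-ins for one kernel launch plus readback, not PyOpenCL calls.

```python
import resource


def peak_rss_growth(step, iterations):
    """Call `step` repeatedly and record peak resident memory after each call.

    `step` is a hypothetical stand-in for one kernel launch plus readback.
    Units follow `ru_maxrss`: kilobytes on Linux, bytes on macOS.
    """
    samples = []
    for _ in range(iterations):
        step()
        samples.append(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
    return samples


# Dummy step that deliberately keeps about 1 MB alive per call,
# mimicking a leak.
leaked = []
samples = peak_rss_growth(lambda: leaked.append(bytearray(b"x" * 10**6)), 50)
print(samples[-1] - samples[0])
```

A steadily rising series over many iterations suggests that something is being retained on the host side.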
On Mar 19, 2012, at 3:05 PM, Vincent Favre-Nicolin <vincent.favre-nicolin(a)cea.fr> wrote:
> Hi,
>
> I just tried current git and it worked on OSX 10.7 & XCode 4.3.1.
> Thanks for the quick fix !
>
> Cheers,
> --
> Vincent Favre-Nicolin http://inac.cea.fr
>
> CEA/Grenoble Institut Nanosciences & Cryogénie
> Laboratoire SP2M/Nano-structures et Rayonnement Synchrotron
> 17, rue des Martyrs
> 38054 Grenoble Cedex 9 - France
>
> Université Joseph Fourier http://www.ujf-grenoble.fr
>
> tél: (+33) 4 38 78 95 40 fax: (+33) 4 38 78 51 38
>
I'm still getting one error:
src/wrapper/wrap_constants.cpp:776:5: error: use of undeclared identifier
'CL_PARTITION_BY_COUNTS_LIST_END'
ADD_ATTR(, PARTITION_BY_COUNTS_LIST_END);
^
thanks
karsten
Le 19/03/2012 21:12, Karsten Anderson a écrit :
> I'm still getting one error:
>
> src/wrapper/wrap_constants.cpp:776:5: error: use of undeclared identifier
> 'CL_PARTITION_BY_COUNTS_LIST_END'
> ADD_ATTR(, PARTITION_BY_COUNTS_LIST_END);
> ^
I don't see this. For the record, I used the following commands:
mkdir pyopencl-TEST
cd pyopencl-TEST/
git clone http://git.tiker.net/trees/pyopencl.git
cd pyopencl
git submodule init
git submodule update
python2.7 configure.py
make
sudo make install
And the python2.7 I use is the one from macports.
Vincent
--
Vincent Favre-Nicolin http://inac.cea.fr
Hi,
I just tried current git and it worked on OSX 10.7 & XCode 4.3.1.
Thanks for the quick fix !
Cheers,
--
Vincent Favre-Nicolin http://inac.cea.fr
On Mar 19, 2012, at 3:02 AM, Vincent Favre-Nicolin <vincent.favre-nicolin(a)cea.fr> wrote:
> ...I replaced this path in siteconfig.py by "/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.7.sdk", and it seems to be working !
>
> Cheers,
> Vincent
> --
> Vincent Favre-Nicolin http://inac.cea.fr
That does seem to get make working. I followed his steps, but make now fails with 4 errors:
src/wrapper/wrap_constants.cpp:773:5: error: use of undeclared identifier
'CL_DEVICE_PARTITION_EQUALLY_EXT'
ADD_ATTR_SUFFIX(DEVICE_PARTITION_, EQUALLY, _EXT);
^
src/wrapper/wrap_constants.cpp:117:21: note: expanded from macro
'ADD_ATTR_SUFFIX'
cls.attr(#NAME) = CL_##PREFIX##NAME##SUFFIX
^
<scratch space>:55:1: note: expanded from macro 'CL_'
CL_DEVICE_PARTITION_EQUALLY_EXT
^
src/wrapper/wrap_constants.cpp:774:5: error: use of undeclared identifier
'CL_DEVICE_PARTITION_BY_COUNTS_EXT'
ADD_ATTR_SUFFIX(DEVICE_PARTITION_, BY_COUNTS, _EXT);
^
src/wrapper/wrap_constants.cpp:117:21: note: expanded from macro
'ADD_ATTR_SUFFIX'
cls.attr(#NAME) = CL_##PREFIX##NAME##SUFFIX
^
<scratch space>:55:1: note: expanded from macro 'CL_'
CL_DEVICE_PARTITION_BY_COUNTS_EXT
^
src/wrapper/wrap_constants.cpp:775:5: error: use of undeclared identifier
'CL_DEVICE_PARTITION_BY_AFFINITY_DOMAIN_EXT'
ADD_ATTR_SUFFIX(DEVICE_PARTITION_, BY_AFFINITY_DOMAIN, _EXT);
^
src/wrapper/wrap_constants.cpp:117:21: note: expanded from macro
'ADD_ATTR_SUFFIX'
cls.attr(#NAME) = CL_##PREFIX##NAME##SUFFIX
^
<scratch space>:55:1: note: expanded from macro 'CL_'
CL_DEVICE_PARTITION_BY_AFFINITY_DOMAIN_EXT
^
src/wrapper/wrap_constants.cpp:776:5: error: use of undeclared identifier
'CL_PARTITION_BY_COUNTS_LIST_END_EXT'
ADD_ATTR_SUFFIX(, PARTITION_BY_COUNTS_LIST_END, _EXT);
^
src/wrapper/wrap_constants.cpp:117:21: note: expanded from macro
'ADD_ATTR_SUFFIX'
cls.attr(#NAME) = CL_##PREFIX##NAME##SUFFIX
^
<scratch space>:55:1: note: expanded from macro 'CL_'
CL_PARTITION_BY_COUNTS_LIST_END_EXT
^
8 warnings and 4 errors generated.
error: command 'clang' failed with exit status 1
make: *** [all] Error 1
Hi Vincent,
On Mon, 19 Mar 2012 09:02:35 +0100 (CET), Vincent Favre-Nicolin <vincent.favre-nicolin(a)cea.fr> wrote:
> I replaced this path in siteconfig.py by
> "/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.7.sdk",
> and it seems to be working !
I've tried to fix PyOpenCL based on your feedback. Can you (and Karsten,
Steve, or Lewis) please try to build current git and report back whether
it works out of the box (and if not, what needs to be fixed or what the
error is)?
Thanks!
Andreas