Andreas Klöckner wrote:
On Montag 02 November 2009, Neil Pilgrim wrote:
You can easily mimic the vector types' memory layout in numpy.
Just make sure
that the stride of the relevant dimension (see myarray.strides, in bytes) is
the smallest one. The example you give:
> I ask since in python I have something like:
> self.r = zeros((num_particles,3))
should already work perfectly for a double3 (if such a thing existed in OpenCL).
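A quick check of that layout in numpy (the (num_particles, 3) array above defaults to float64, so each row is three adjacent 8-byte components):

```python
import numpy as np

num_particles = 1000
r = np.zeros((num_particles, 3))  # dtype defaults to float64

# strides are in bytes: (3*8, 8). The last axis has the smallest
# stride, so the x, y, z components of each particle sit next to
# each other in memory -- the same packed layout a vector type uses.
print(r.strides)  # prints (24, 8)
```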
So to clarify, a numpy array of:
arr = zeros((N,4),dtype=float32)
should be convertible to a buffer buf and then accessed in a kernel via
something like:
__global float *off = buf + get_global_id(0)*4;
float4 v = (float4)(off[0], off[1], off[2], off[3]);
...followed by various vector operations on v.
Perhaps there is some better way, or is this the expectation of how
vectors are going to be used/coded?
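For what it's worth, OpenCL has a built-in for exactly this pattern: vload4/vstore4 read and write four consecutive floats at a given element offset, so no per-component construction is needed. A sketch (my own, untested; it assumes buf has the (N,4) float32 layout above and a hypothetical out buffer of the same size):

```
__kernel void scale(__global const float *buf,
                    __global float *out,
                    float factor)
{
    size_t gid = get_global_id(0);

    /* vload4(gid, buf) reads buf[gid*4 .. gid*4+3], i.e. one
       row of the (N, 4) float32 host array per work-item. */
    float4 v = vload4(gid, buf);
    v *= factor;
    vstore4(v, gid, out);
}
```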
I assume that a kernel function could be called from the host to declare
__local arrays in a particular format (vector or otherwise), initialized
from host memory, i.e. so that the data on the card is "known" to be in
float4. Or perhaps the assumption is that constructing vector variables
on the fly, as above, is sufficient.
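On the host side there may be nothing extra to declare at all: a C-contiguous (N, 4) float32 array is already the packed byte layout a float4-per-work-item kernel would see. A small check (pure numpy, no device needed; the 16-byte row size matching sizeof(float4) is my reading, not something from the spec):

```python
import numpy as np

N = 8
arr = np.arange(N * 4, dtype=np.float32).reshape(N, 4)

# C-contiguous: rows are packed back to back, 16 bytes per row,
# the same size as one float4 per work-item
assert arr.flags.c_contiguous
assert arr.strides == (16, 4)

# the raw buffer that would be copied to the device is just the
# same floats, flattened
flat = np.frombuffer(arr.tobytes(), dtype=np.float32)
assert (flat == arr.ravel()).all()
```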
There doesn't seem to be much guidance on this, from what I've seen so
far - and CUDA doesn't seem to focus on vector operation support as much
in comparison, though I've not read anywhere near as much about CUDA as
about OpenCL.
> Of course, this partly assumes that vector operations are even
> worthwhile at the current stage of development...any anecdotal or other
> evidence, anyone?
None, sorry. In CUDA, the answer depends on a number of factors, too.
Eventually I might manage to investigate this... [sigh]