Hi Zac,

You could try the following approach: for each element of the source array, calculate its position in the destination array and copy it there. As long as the last (fastest-varying) dimension is large enough, the memory accesses will be coalesced and it should be reasonably fast. The borders can be filled with zeros in two ways:
1) fill the destination array with zeros beforehand, then launch one thread per source element;
2) use the destination array as the "guiding shape": launch one thread per destination element and check in every thread whether that element maps to the source array; if it does, transfer the value from the source array, otherwise write zero.
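
For what it's worth, here is a minimal sketch of option 2. The kernel name `pad2d`, the helpers `pad2d_reference` and `pad2d_opencl`, and the argument order are all made up for illustration, and it assumes a C-contiguous float32 array, so treat it as a starting point rather than a finished implementation:

```python
# Sketch only: `pad2d`, `pad2d_reference` and `pad2d_opencl` are illustrative
# names, not part of PyOpenCL. Assumes C-contiguous float32 arrays.

KERNEL_SRC = """
__kernel void pad2d(__global const float *src,
                    __global float *dst,
                    const int src_h, const int src_w,
                    const int pad_top, const int pad_left)
{
    // One work-item per destination element. x is the fastest-varying
    // index, so neighbouring work-items touch neighbouring memory.
    const int x = get_global_id(0);
    const int y = get_global_id(1);
    const int dst_w = get_global_size(0);

    // Position this destination element would have in the source array.
    const int sx = x - pad_left;
    const int sy = y - pad_top;
    const bool inside = sx >= 0 && sx < src_w && sy >= 0 && sy < src_h;

    dst[y * dst_w + x] = inside ? src[sy * src_w + sx] : 0.0f;
}
"""


def pad2d_reference(src, pad_top, pad_bottom, pad_left, pad_right):
    """Pure-Python model of the kernel's index mapping (list of rows)."""
    src_h, src_w = len(src), len(src[0])
    dst_h = src_h + pad_top + pad_bottom
    dst_w = src_w + pad_left + pad_right
    dst = []
    for y in range(dst_h):
        row = []
        for x in range(dst_w):
            sy, sx = y - pad_top, x - pad_left
            inside = 0 <= sy < src_h and 0 <= sx < src_w
            row.append(src[sy][sx] if inside else 0.0)
        dst.append(row)
    return dst


def pad2d_opencl(queue, src_dev, pad_top, pad_bottom, pad_left, pad_right):
    """Run the kernel on a 2D pyopencl.array.Array and return the padded array."""
    import numpy as np
    import pyopencl as cl
    import pyopencl.array as cl_array

    src_h, src_w = src_dev.shape
    dst_shape = (src_h + pad_top + pad_bottom, src_w + pad_left + pad_right)
    dst_dev = cl_array.empty(queue, dst_shape, dtype=np.float32)

    prg = cl.Program(queue.context, KERNEL_SRC).build()
    # Global size is (width, height) so that get_global_id(0) runs along rows.
    prg.pad2d(queue, (dst_shape[1], dst_shape[0]), None,
              src_dev.data, dst_dev.data,
              np.int32(src_h), np.int32(src_w),
              np.int32(pad_top), np.int32(pad_left))
    return dst_dev
```

For symmetric padding, pass the same value for `pad_top`/`pad_bottom` and for `pad_left`/`pad_right`. The result should agree with `pad2d_reference` (and with numpy.pad() in its default constant mode), which makes it easy to check against a small array before using it on real data.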

On Thu, Jan 14, 2016 at 2:53 AM, Zac Diggum <Diggum@gmx.de> wrote:
Hi all,

recently I needed a way to zeropad a 2D array with pyopencl like
numpy.pad() does. Since changing array size only works in the first
dimension I came up with this:

concatenate old array with zeros <- in 1st dim
transpose new array <- flip dims
concatenate new array with zeros <- again in 1st dim
transpose again <- flip back dims

That works well with the Transposition() in Bogdan Opanchuk's Reikna
package. I only need to do a reshape() after each transposition.
With pyopencl's transpose() strides are also flipped so I can't carry
out the 2nd transposition (strides don't match). So here is my question:
How can I make use of transpose() when the result always has flipped strides?
Also, can you think of a simpler way to zeropad in 2D? The array should
be padded symmetrically so that the old array is embedded in the center
of the padded array.

Thanks in advance...

PyOpenCL mailing list