Hello,

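I think two small changes would fix the problem. The lines in your kernel are written as function declarations (with a return type and parameter types) rather than as calls, which is why the compiler complains about overloading atom_add. The first atomic_add can be written as a call like this: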

atomic_add(&l_total, t);

The other one, atom_add, becomes:

atom_add(&result[0], l_total);

The point is that atomic_add needs its first parameter (the location to add to) as a pointer, which is easily obtained with &. I hope this solves your problem.
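
In case it helps, here is an untested sketch of the whole kernel with both calls fixed, kept in a Python string as PyOpenCL expects. The helper build_kernel and the "-D Y=..." build option are my own assumptions (you said Y is declared elsewhere), so adapt them to your code. I used atomic_add for both calls; according to your build log, NVIDIA defines it as a macro for atom_add, so either spelling should work there.

```python
# Hypothetical sketch: the kernel from the question with the two atomic
# lines turned into calls. Y is assumed to be supplied at build time.
KERNEL_SRC = r"""
__kernel void k(__global float4 *spins,
                __global uint *result)
{
    ushort lid = 2 * get_local_id(0);
    ushort gid = 2 * get_group_id(1);

    float total = 0;
    __local uint l_total;

    if (lid == 0)
        l_total = 0;

    barrier(CLK_LOCAL_MEM_FENCE);

    total += spins[lid * Y + gid].w;
    total += spins[lid * Y + gid + 1].w;
    total += spins[(lid + 1) * Y + gid].w;
    total += spins[(lid + 1) * Y + gid + 1].w;

    uint t = (uint) total;

    /* A call, not a declaration: pass the address of the accumulator. */
    atomic_add(&l_total, t);

    barrier(CLK_LOCAL_MEM_FENCE);

    if (lid == 0)
        atomic_add(&result[0], l_total);
}
"""


def build_kernel(ctx, y):
    """Build KERNEL_SRC for an existing pyopencl context (hypothetical helper).

    y is the value substituted for the Y macro used in the kernel.
    """
    # Imported here so the module can be loaded without an OpenCL device.
    import pyopencl as cl
    return cl.Program(ctx, KERNEL_SRC).build(options=["-D", "Y=%d" % y])
```

You would then call something like build_kernel(ctx, 64).k(queue, global_size, local_size, spins_buf, result_buf), with the sizes and buffers taken from your own setup.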

Regards,
Alex.

On Mon, Sep 17, 2012 at 6:44 PM, J Diviney <justdivs@gmail.com> wrote:
Hi,

I'm writing a relatively simple function to sum up a set of results from a simulation I'm running, and want to do it on the GPU, since it involves summing a large dataset. The function is as follows:


__kernel void k(__global float4 *spins,
                __global uint *result)
{   
    ushort lid = 2 * get_local_id(0);
    ushort gid = 2 * get_group_id(1);
   
    float total = 0;
    local uint l_total;
   
    if(lid == 0)
    {l_total = 0;}
   
    barrier(CLK_LOCAL_MEM_FENCE);
   
    total += spins[lid * Y + gid].w;
    total += spins[lid * Y + gid + 1].w;
    total += spins[(lid + 1) * Y + gid].w;
    total += spins[(lid + 1) * Y + gid + 1].w;
   
    uint t = (uint) total;
   
    uint atomic_add(uint l_total, uint t);
   
    barrier(CLK_LOCAL_MEM_FENCE);
   
    if(lid == 0)
    {uint atom_add(uint result[0], uint l_total);}
}
"""

Note that this isn't my original code; I've been modifying it loads to try and get it to work, to no avail. (Y is a uint that I declare elsewhere.) I've also tried adding __OVERLOADABLE__, __local and __global to the additions, but that didn't help either.

The error message I get when compiling is:


pyopencl.RuntimeError: clBuildProgram failed: build program failure -

Build on <pyopencl.Device 'GeForce GTX 560 Ti' on 'NVIDIA CUDA' at 0x29f4d80>:

:36:10: error: overloaded function 'atom_add' must have the 'overloadable' attribute
    uint atomic_add(uint l_total, uint t);
         ^
<built-in>:3942:20: note: instantiated from:
#define atomic_add atom_add
                   ^
<built-in>:3845:24: note: previous overload of function is here
ulong __OVERLOADABLE__ atom_add(__local volatile ulong *ptr, ulong val);
                       ^
:41:11: error: overloaded function 'atom_add' must have the 'overloadable' attribute
    {uint atom_add(uint result[0], uint l_total);}
          ^
:36:10: note: previous overload of function is here
    uint atomic_add(uint l_total, uint t);
         ^
<built-in>:3942:20: note: instantiated from:
#define atomic_add atom_add
                   ^

Any help would be greatly appreciated.

Thanks,
Justin

_______________________________________________
PyOpenCL mailing list
PyOpenCL@tiker.net
http://lists.tiker.net/listinfo/pyopencl