Hi,
I'm trying to access GL textures from pyopencl. Here is my test program:
import sys, os, pygame
from OpenGL.GL import *
sys.path.append("extern/pyopencl/build/lib.linux-x86_64-2.6")
import pyopencl
pygame.init()
screen = pygame.display.set_mode((1024, 768), pygame.HWSURFACE |
                                 pygame.OPENGL | pygame.DOUBLEBUF)
if pyopencl.have_gl():
    context = pyopencl.create_some_context()
    tex = glGenTextures(1)
    glBindTexture(GL_TEXTURE_2D, tex)
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 256, 256, 0, GL_RGBA,
                 GL_UNSIGNED_BYTE, None)
    cltex = pyopencl.GLTexture(context, pyopencl.mem_flags.READ_ONLY,
                               GL_TEXTURE_2D, 0, tex, 2)
It fails with error:
Traceback (most recent call last):
  File "cl.py", line 14, in <module>
    cltex = pyopencl.GLTexture(context, pyopencl.mem_flags.READ_ONLY,
    GL_TEXTURE_2D, 0, tex, 2)
pyopencl.LogicError: clCreateFromGLTexture2D failed: invalid context
I thought that the problem might be in pyopencl's context creation,
which doesn't take the GL context into account. I tried to fix it by
adding appropriate CL_GL_CONTEXT_KHR, CL_GLX_DISPLAY_KHR and
CL_CONTEXT_PLATFORM props to the context, but then I got another error
"pyopencl.LogicError: clCreateFromGLTexture2D failed: invalid value". I
can run kernels just fine with my setup, but this GL stuff won't work.
What am I doing wrong?
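A possible direction, sketched under assumptions and not verified against this setup: PyOpenCL's GL interop demo builds the context from the platform plus the properties returned by pyopencl.tools.get_gl_sharing_context_properties(), a helper that may be missing from older releases. The function name create_gl_shared_context and the choice of the first platform are illustrative assumptions, and the call has to happen after the GL context exists (here, after pygame.display.set_mode). Whether this also clears the "invalid value" error is not known.

```python
def create_gl_shared_context():
    """Create an OpenCL context that shares objects with the current GL context.

    Call this only after the GL context exists, e.g. after
    pygame.display.set_mode(..., pygame.OPENGL).
    """
    # Imported lazily so the sketch can be read without an OpenCL runtime.
    import pyopencl as cl
    from pyopencl.tools import get_gl_sharing_context_properties

    # Assumption: the first platform is the one that drives the GL device.
    platform = cl.get_platforms()[0]
    props = [(cl.context_properties.PLATFORM, platform)]
    props += get_gl_sharing_context_properties()
    return cl.Context(properties=props)
```

The returned context would then take the place of create_some_context() when constructing pyopencl.GLTexture.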
Hi Jan,
On Mon, 22 Nov 2010 09:04:03 +0100, "Meinke, Jan" <j.meinke(a)fz-juelich.de> wrote:
> I have a problem with PyOpenCL using the ATI Stream SDK 2.2. I have a
> double precision ndarray a. I'm casting it to single precision using
>
> a.astype(np.float32) before I transfer it to the device. If I allocate
> the buffer a_dev using the COPY_HOST_PTR flag and
> hostbuf=a.astype(np.float32) everything works fine, but if I first
> allocate memory and then use enqueue_write_buffer to transfer the
> data, the first couple of entries on the device contain junk. Starting
> with a_dev[4] (32 bit) or a_dev[8] (64 bit) everything is correct
> again. The problem occurs no matter if the device is a CPU or GPU. It
> does not show up with the NVIDIA SDK.
Are both the computation and the transfer in the same command queue? Are
you passing 'True' for is_blocking? (Alternatively--do you wait for the
write to complete?) Can you post sample code to reproduce?
HTH,
Andreas
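The questions above point at one plausible cause, not a confirmed diagnosis: with a non-blocking write, a temporary such as orig.astype(np.float32) may be freed before the transfer has finished. A minimal sketch of the defensive pattern, assuming PyOpenCL 2011.1 or newer, where enqueue_copy takes the place of enqueue_write_buffer; upload_blocking is a hypothetical helper name:

```python
def upload_blocking(queue, dev_buf, host_array):
    """Copy host_array into dev_buf and return only once the copy is done."""
    # Imported lazily so the sketch can be read without an OpenCL runtime.
    import pyopencl as cl

    # is_blocking=True makes the call wait until the host memory has been read,
    # so the array cannot be freed or modified while the transfer is in flight.
    cl.enqueue_copy(queue, dev_buf, host_array, is_blocking=True)
    return host_array

# Usage, mirroring the names in the report:
#   a = orig.astype(np.float32)   # bound to a name, so it outlives the call
#   upload_blocking(queue, a_dev, a)
```

Binding the converted array to a name before the transfer is the other half of the pattern; passing the astype() result inline leaves nothing on the Python side holding it.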
On Tue, Nov 23, 2010 at 4:03 PM, Andreas Kloeckner
<kloeckner(a)cims.nyu.edu> wrote:
> On Tue, 23 Nov 2010 10:42:20 +0100, christian schilling <initcrash(a)googlemail.com> wrote:
>> Hi Andreas,
>> thank you for the clarification. I suspected this, but was not sure
>> whether this is the expected behaviour or some kind
>> of bug.
>> The sad thing is that this does not seem to be documented anywhere. It
>> makes something that should be totally straightforward, such as loading
>> an image from a file (into an np.array), scaling it using the GPU's
>> texture unit, and writing it back to a file, a bit difficult.
>> What I am saying is that there should be a note in the docs about the
>> potential problem and/or an example (I'd provide this)
>> to prevent others from wasting hours on this like I did.
>
> The relevant documentation is in doc/source/runtime.rst. I'd be happy to
> take a patch.
>
I added a note to the docs:
https://github.com/initcrash/pyopencl
Dear devs,
I'm trying to profile some PyOpenCL scripts using NVIDIA's Compute
Visual Profiler. However, I always receive an error suggesting I might
not have released some resources properly.
Some people at NVIDIA's OpenCL forum have run into the same error and
reported that including clReleaseEvent(event-name) at the end of their
code solved the problem.
I can't find a binding for this function in the PyOpenCL documentation,
and I'm curious if this feature is implemented or if anyone could send
me a script that currently works for their profiler.
I have included a condensed version of my code plus a file that can be
used to run the script through the profiler (the script was made to
investigate how strided access of global memory affects memory bandwidth).
PS:
The exact error message reads: "Compute Visual Profiler Error. Profiler
data file "path".csv does not contain profiler output. This can happen
when: a) Profiling is disabled... b) The application does not invoke any
kernels or memory transfers. c) The application does not release
resources (contexts, events, etc.). The program needs to be modified to
properly free up all resources before termination."
PPS:
The PyOpenCL examples included in the download give similar errors.
Kind regards
---
- Andreas Reiten
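According to its documentation, PyOpenCL ties the release of OpenCL objects to the lifetime of the Python wrapper objects, so a release happens when a wrapper is destroyed. One thing to try, with no guarantee that it satisfies the profiler, is to keep all OpenCL objects local to a function and force a garbage collection before the script exits. The sketch below shows only the mechanism, using a hypothetical stand-in class FakeEvent in place of real PyOpenCL objects:

```python
import gc

class FakeEvent(object):
    """Hypothetical stand-in for a PyOpenCL wrapper (event, buffer, context)."""
    released = []  # records names in the order the objects are destroyed

    def __init__(self, name):
        self.name = name

    def __del__(self):
        # A real wrapper would call the matching clRelease* function here.
        FakeEvent.released.append(self.name)

def run_benchmark():
    # In a real script, ctx, queue, buffers and kernel events would live here.
    evt = FakeEvent("kernel-event")
    return 42  # only plain Python values leave the function

result = run_benchmark()  # locals, including evt, are dropped on return
gc.collect()              # also clears anything kept alive by reference cycles
print(FakeEvent.released)
```

Module-level variables holding contexts, queues or events would defeat this, since they stay alive until interpreter shutdown.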
Hi,
I have a problem with PyOpenCL using the ATI Stream SDK 2.2. I have a double precision ndarray a. I'm casting it to single precision using
a.astype(np.float32) before I transfer it to the device. If I allocate the buffer a_dev using the COPY_HOST_PTR flag and hostbuf=a.astype(np.float32) everything works fine, but if I first allocate memory and then use enqueue_write_buffer to transfer the data, the first couple of entries on the device contain junk. Starting with a_dev[4] (32 bit) or a_dev[8] (64 bit) everything is correct again. The problem occurs no matter if the device is a CPU or GPU. It does not show up with the NVIDIA SDK.
Here's a code example that shows the problem.
# -*- coding: utf-8 -*-
import pyopencl as cl
import numpy as np
kernel = """/// Simple copy kernel
__kernel void copy(__global float* a, __global float* b){
    int i = get_global_id(0);
    b[i] = a[i];
}
"""
def testCopy():
    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx, properties =
                            cl.command_queue_properties.PROFILING_ENABLE)
    prg = cl.Program(ctx, kernel).build(options="-g")
    mf = cl.mem_flags
    orig = np.random.random(1024)
    a = orig.astype(np.float32)
    b = np.zeros_like(a)
    b2 = np.zeros_like(a)
    a_dev = cl.Buffer(ctx, mf.READ_ONLY, a.nbytes)
    a2_dev = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR,
                       hostbuf = orig.astype(np.float32)) # This works
    b_dev = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)
    b2_dev = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)
    cl.enqueue_write_buffer(queue, a_dev, orig.astype(np.float32))
    # cl.enqueue_write_buffer(queue, a_dev, a) # This works, too
    prg.copy(queue, a.shape, None, a_dev, b_dev)
    prg.copy(queue, a.shape, None, a2_dev, b2_dev)
    cl.enqueue_read_buffer(queue, b_dev, b).wait()
    cl.enqueue_read_buffer(queue, b2_dev, b2).wait()
    print np.sum(a), np.sum(b), np.sum(b2) # Results differ
if __name__ == "__main__":
    testCopy()
Any idea what's going on?
Jan
Hi,
I have successfully installed pyopencl on an i7 laptop with a Mobile
5850 ATI video card.
My software configuration is:
- Ubuntu 9.10 64 bit
- Stream 2.2 (AMD/ATI)
- latest AMD/ATI mobile video driver (10.10)
Playing with the various demos included in pyopencl:
1./ if the code runs on the CPU (using OpenCL), all 8 "cores" of
the i7 go to 100%, but the machine is still usable; the mouse reacts and
the other windows are active;
2./ if the same code runs on the GPU (using OpenCL), the machine
"freezes" completely during the execution of the kernel.
N.B.: "freeze" means that the PC does not react during kernel execution
but, at the end, my laptop correctly resumes normal operation: no
stability problems, sudden crashes or memory leaks. Numerical results are
OK in both cases. I suspect that this is not a stability-related issue but
only a matter of how "wait" is implemented (probably there is a "busy
wait" inside the GPU code).
My "newbie" questions are:
- is situation 2./ normal? Can I avoid the "freeze" of the machine?
- can I use the CPU _and_ the GPU at the same time? How do I do that from
pyopencl?
Another observation is about the amount of available memory reported:
- pyopencl reports 3 GB for the CPU and 512 MB for the GPU, but
I have 6 GB of RAM and 1 GB of video RAM ....
Thanks in advance for your help
Simone Mannori
www.scicos.org ; www.scicoslab.org ;
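On the question of using the CPU and the GPU together, a hedged sketch: when both devices belong to the same platform (which appears to be the case with the Stream SDK, though that is an assumption here), one context can hold all of them, with one command queue per device. make_queues_per_device, describe and platform_index are illustrative names. describe prints global_mem_size and max_mem_alloc_size, the values the driver chooses to expose, which may be lower than the installed memory.

```python
def make_queues_per_device(platform_index=0):
    """Return (context, [(device, queue), ...]) for every device on one platform."""
    # Imported lazily so the sketch can be read without an OpenCL runtime.
    import pyopencl as cl

    platform = cl.get_platforms()[platform_index]
    devices = platform.get_devices()  # CPU and GPU devices alike
    ctx = cl.Context(devices=devices)
    pairs = [(dev, cl.CommandQueue(ctx, device=dev)) for dev in devices]
    return ctx, pairs

def describe(device):
    """Format the memory sizes a device reports, in MiB."""
    mib = 1024 * 1024
    return "%s: global %d MiB, max single allocation %d MiB" % (
        device.name,
        device.global_mem_size // mib,
        device.max_mem_alloc_size // mib)
```

Kernels enqueued on the two queues can then be in flight at the same time; how the work is split between the devices is left to the application.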