> Indeed. In the future, if OpenCL is the way to go, it may even be
> helpful to have Numpy using OpenCL directly, as AMD provides an SDK
> for OpenCL, and with Larrabee approaching, Intel will surely provide
> one of its own.
I was just in a lecture about OpenCL by one of the Intel people: he
offered no schedule for an Intel OpenCL implementation, but said that
they were committed to it.
The lectures in general were effective in pointing out what a
time-consuming effort it can be to move algorithms into the parallel
world - including GPUs. The most recent lecture cited the example of
a CUDA-based BLAS implementation on the GPU that was slower than the
CPU version. Making BLAS go faster required a lot of work to find
optimal strategies for blocking, for transfers between CPU memory,
GPU shared memory, and GPU registers, for vector sizes, and so on -
and all of this on one specific NVIDIA architecture.
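The kind of parameter search described above can be sketched even in
plain NumPy on the CPU side - this is only an illustrative toy, not the
BLAS work from the lecture: time a blocked matrix multiply for several
candidate block sizes and keep the fastest.

```python
import time
import numpy as np

def blocked_matmul(a, b, bs):
    """Multiply a @ b one bs-by-bs block at a time."""
    n = a.shape[0]
    c = np.zeros((n, n))
    for i in range(0, n, bs):
        for k in range(0, n, bs):
            for j in range(0, n, bs):
                c[i:i+bs, j:j+bs] += a[i:i+bs, k:k+bs] @ b[k:k+bs, j:j+bs]
    return c

def best_block_size(n=256, candidates=(16, 32, 64, 128)):
    """Empirically pick the fastest block size for this machine."""
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)
    timings = {}
    for bs in candidates:
        t0 = time.perf_counter()
        blocked_matmul(a, b, bs)
        timings[bs] = time.perf_counter() - t0
    return min(timings, key=timings.get)
```

On a real GPU the search space also includes transfer granularity and
register usage, which is exactly why the hand-tuning took so long.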
I can imagine Numpy being useful for scripting in this
C-and-assembler-centric world, making it easier to write automated
testers, or even generate C code.
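As a toy illustration of that last point - this is just string
templating from Python, not PyCUDA's actual API - one can stamp out C
kernel variants with a tuning parameter baked in as a compile-time
constant, then benchmark each variant:

```python
# Hypothetical sketch: generate C source for a SAXPY-style kernel,
# specializing the block size as a #define so the C compiler can
# unroll and optimize for it.
KERNEL_TEMPLATE = """\
#define BLOCK_SIZE {block_size}

void saxpy_block(float alpha, const float *x, float *y)
{{
    for (int i = 0; i < BLOCK_SIZE; ++i)
        y[i] += alpha * x[i];
}}
"""

def generate_kernel(block_size):
    """Return C source specialized for one candidate block size."""
    return KERNEL_TEMPLATE.format(block_size=block_size)

# Emit one variant per candidate size; a test harness would compile
# and time each one to pick a winner.
variants = {bs: generate_kernel(bs) for bs in (64, 128, 256)}
```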
Is anyone out there working on this kind of stuff? I ask only because
there seems to be considerable interest here on the Berkeley campus.
This is exactly the sort of thing you can do with PyCUDA, which makes it
straightforward to generate, compile, and run CUDA code from Python at
run time. The metaprogramming section of the slides and source code from
Nicolas Pinto and Andreas Klöckner's *excellent* SciPy 2009 tutorials is
even more thorough: