Leandro Demarco Vedelago <leandrodemarco(a)gmail.com> writes:
Hello: I've been reading about and learning CUDA over the last few
weeks, and last week I started writing (translating from CUDA C to
PyCUDA) some examples taken from the book "CUDA by Example".
I started coding on a laptop with just one NVIDIA GPU (a GTX 560M, if
I remember correctly) running Windows 7.
But in the project I'm currently working on, we intend to run (Py)CUDA
on a multi-GPU server that has three Tesla C2075 cards.
So I installed Ubuntu Server 10.10 (with no GUI) and managed to
install and run the very same examples I ran on the single-GPU
laptop. However, they run really slowly; in some cases they take three
times longer than on the laptop. And this happens with most, if not
all, of the examples I wrote.

How do you control the multiple GPUs? By threading? How many CPU cores do
you have in the machine? (There should be at least three, one per GPU.)
Also, try switching away from the busy-wait sync:
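
One way to do that, as a rough sketch rather than a tested recipe: create
the context yourself with a scheduling flag instead of relying on
pycuda.autoinit. The helper name run_on_device and its arguments are made
up for illustration. It assumes PyCUDA's Device.make_context() and
ctx_flags.SCHED_YIELD, and that one context per host thread is wanted.

```python
def run_on_device(device_index, work):
    """Run work() with a context on one GPU that does not busy-wait.

    run_on_device, device_index and work are hypothetical names used
    only for this sketch.
    """
    # Imported lazily so the module also loads on a machine without PyCUDA.
    import pycuda.driver as cuda

    cuda.init()
    device = cuda.Device(device_index)
    # SCHED_YIELD asks the driver to yield the CPU thread while waiting
    # for the GPU instead of spinning on it. ctx_flags.BLOCKING_SYNC is
    # an alternative that puts the waiting thread to sleep.
    context = device.make_context(flags=cuda.ctx_flags.SCHED_YIELD)
    try:
        return work()
    finally:
        # Deactivate the context created above once the work is done.
        context.pop()
```

With three cards, each host thread would call something like
run_on_device(i, my_kernel_launcher) for i in 0, 1, 2, where
my_kernel_launcher stands for whatever function compiles and launches
your kernels. Whether this helps depends on how many cores the busy-wait
was actually starving.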