Yiming Peng <ypeng(a)u.northwestern.edu> writes:
> I am a former student of your CS 450 and now I am an incoming PhD student in
> operations research at Northwestern.
>
> Since I am interested in applying parallel computing, preferably using
> Python, to my future research, I have been looking for software which
> combines Python with CUDA. Then I found PyCUDA on your website. And I found
> NumbaPro. It seems that these two are the most popular choices for people
> with needs like mine.
>
> So my question is: which one do I begin to learn and use first? Could you
> give some comments on pros and cons about the two?

Cc'ing the PyCUDA list for archival/searchability.
- PyCUDA lets you/forces you to write CUDA C for your kernels.
- Numba lets you write (a narrow subset of) Python for your kernels,
including arrays I believe.
- The code you write for both will be roughly equivalent modulo
spelling, since you'll have to
- PyCUDA exposes (nearly) the entire CUDA driver API, including streams,
profiling, textures, ... Numba is more restricted.
- PyCUDA comes with an on-device array type. I'm not sure if Numba's
arrays stay on-device after the computation finishes--i.e. you may
have some implicit copying.
- PyCUDA comes with some pre-made parallel algorithms such as scans
- You may also want to take a look at
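
To make the first few points concrete, here is a minimal sketch of the same
element-wise scaling kernel written both ways. It is an illustration under
assumptions, not a recommendation of either style: the names KERNEL_SOURCE,
blocks_for, scale_pycuda and scale_numba are made up for this example, the
Numba half assumes the numba.cuda interface of current open-source Numba
(rather than NumbaPro), and both functions need a CUDA-capable GPU to run.

```python
# Third-party imports are deferred into the functions so that this file can
# be imported (and the helper below used) on a machine without a GPU stack.

# PyCUDA variant: the kernel itself is CUDA C, kept in a Python string.
KERNEL_SOURCE = """
__global__ void scale(float *a, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        a[i] *= factor;
}
"""

BLOCK = 256  # threads per block (an arbitrary choice for this sketch)


def blocks_for(n, block=BLOCK):
    """Number of thread blocks needed to cover n elements."""
    return (n + block - 1) // block


def scale_pycuda(host_array, factor):
    """Scale an array on the GPU using a CUDA C kernel compiled by PyCUDA."""
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context on import)
    import pycuda.gpuarray as gpuarray
    from pycuda.compiler import SourceModule

    scale = SourceModule(KERNEL_SOURCE).get_function("scale")

    # GPUArray is PyCUDA's on-device array type; the data stays on the
    # device until .get() is called explicitly.
    a_gpu = gpuarray.to_gpu(np.ascontiguousarray(host_array, dtype=np.float32))
    n = a_gpu.size
    scale(a_gpu.gpudata, np.float32(factor), np.int32(n),
          block=(BLOCK, 1, 1), grid=(blocks_for(n), 1))
    return a_gpu.get()  # explicit copy back to the host


def scale_numba(host_array, factor):
    """Scale an array on the GPU using a kernel written in Python for Numba."""
    import numpy as np
    from numba import cuda

    @cuda.jit
    def scale(a, factor):
        i = cuda.grid(1)  # global 1D thread index
        if i < a.size:
            a[i] *= factor

    # Copy to the device explicitly so that no implicit transfers happen.
    d_a = cuda.to_device(np.ascontiguousarray(host_array, dtype=np.float32))
    scale[blocks_for(d_a.size), BLOCK](d_a, np.float32(factor))
    return d_a.copy_to_host()  # explicit copy back to the host
```

Both versions have the same shape: compute a global thread index, guard
against running past the end of the array, and pick a grid size that covers
the input. The main difference is the language the kernel body is spelled in.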
Hope that helps,