On Wednesday 29 October 2008, you wrote:
I guess I was a version behind (0.90).
As for Windows, I am using MinGW compilers (version 3.4.5). The tricky
things about installing PyCuda were that MinGW does not have either
configure or ctags.
Instead of running configure, you can manually create siteconf.py. Here is mine:
BOOST_INC_DIR = ['C:\\Program Files\\boost\\boost_1_36_0']
BOOST_LIB_DIR = ['C:\\Program Files\\boost\\boost_1_36_0\\stage\\lib']
BOOST_PYTHON_LIBNAME = ['boost_python-mgw34']
CUDA_ROOT = 'C:\\CUDA'
CUDADRV_LIB_DIR = ['C:\\CUDA\\lib']
CUDADRV_LIBNAME = ['cuda']
CXXFLAGS = []
LDFLAGS = []
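A side note on the path strings above: doubled backslashes and raw strings spell the same Windows path in Python, so either form should work in siteconf.py. A minimal sketch, reusing the C:\CUDA location from the example (adjust for your own install):

```python
# Two equivalent spellings of the same Windows path.
escaped = 'C:\\CUDA\\lib'   # every backslash doubled
raw = r'C:\CUDA\lib'        # raw string, backslashes taken literally

assert escaped == raw
print(raw)
```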
Instead of running make, you can just call python setup.py install
This works out of the box with 0.91rc (the __stdcall wrappers work great).
Ok, I've added a blurb to that effect to the documentation.
I see this as more of a documentation issue than anything else. I wrote it
with the goal of having the same indexing order in NumPy as in the tex2D()
function call. The X coordinate varies fastest, which ends up being the row
index, which makes Fortran order correct. I've added a blurb to this effect
to the documentation. If there's a majority in favor, I'd be willing to add
a "transposed" flag or some such to this call.
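To make the indexing argument concrete, here is a small pure-Python sketch of where element [x, y] lands in memory under each order. The helper names and the 4x3 shape are made up for illustration and are not part of PyCuda:

```python
def c_order_offset(i, j, shape):
    """Linear offset of element [i, j] in row-major (C) order."""
    n_i, n_j = shape
    return i * n_j + j


def f_order_offset(i, j, shape):
    """Linear offset of element [i, j] in column-major (Fortran) order."""
    n_i, n_j = shape
    return i + j * n_i


# Hypothetical array indexed as [x, y]: 4 values along x, 3 along y.
shape = (4, 3)

# How far apart in memory are [0, 0] and [1, 0], i.e. one step in x?
f_step = f_order_offset(1, 0, shape) - f_order_offset(0, 0, shape)
c_step = c_order_offset(1, 0, shape) - c_order_offset(0, 0, shape)
print(f_step, c_step)
```

With the first index playing the role of x, only Fortran order keeps neighbors in x adjacent in memory (a step of one element); in C order they are a full run of y values apart.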
This is an interesting problem.
Textures are one of the few places (or maybe the only one) where it matters
how you represent an image.
I am used to the Matplotlib/MATLAB style, in which horizontal raster lines
on the image are rows in the matrix (corresponds to the same order you
write out a matrix). So horizontally contiguous pixels are contiguous in
memory, the opposite of what you describe.
I think it therefore makes sense to add an extra parameter (as you suggest)
which defaults to your current behavior but allows users like me to specify
the transposed format. Here are my code changes in driver.py:
def matrix_to_array(matrix, matrixOrder):
def matrix_to_texref(matrix, texref, order="F"):
    bind_array_to_texref(matrix_to_array(matrix, order), texref)
There may also be other functions like make_multichannel_2d_array that may
require similar modification.
I've applied this, with two changes: I've renamed matrixOrder to just order,
and I haven't defaulted it to "F". This breaks existing code, but in an
obvious way. Since it's non-obvious, people should give this some thought, and
the extra parameter reminds them that there is something to be thought about.
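As a rough illustration of "breaks in an obvious way": with no default for the order argument, an old-style call fails immediately with a TypeError instead of silently picking a layout. The flatten() helper below is a hypothetical stand-in working on nested lists, not the actual driver.py code:

```python
def flatten(matrix, order):
    """Flatten a nested list in C (row-major) or F (column-major) order.

    Hypothetical stand-in for illustration; `order` has no default on purpose.
    """
    if order == "C":
        return [value for row in matrix for value in row]
    if order == "F":
        return [value for column in zip(*matrix) for value in column]
    raise ValueError("order must be 'C' or 'F', got %r" % (order,))


m = [[1, 2, 3],
     [4, 5, 6]]

print(flatten(m, "C"))  # [1, 2, 3, 4, 5, 6]
print(flatten(m, "F"))  # [1, 4, 2, 5, 3, 6]

try:
    flatten(m)  # old-style call without the new argument
except TypeError as exc:
    print("TypeError:", exc)
```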
Your DiskDict didn't work for me.
What was the error message? I wrote it to fail over to a memory dict if
something doesn't work out; apparently that bit's broken, too. Since I use
that code elsewhere as well, I'd be happy to know what broke.
I changed my disk-based cache code to make it more efficient. Instead of
creating temp files, if a cache directory is specified, it directly
compiles into that directory. Here is the new code (at the bottom of
I've merged this, too. Couple changes, though:
- cache_dir defaults to ~/.pycuda-compiler-cache. Having to compute that in
every user program is not so nice. Having it overridable is nice, though.
Also, if you pass cache_dir=False, then caching is turned off.
- I took out the 'added efficiency' because that means `keep` and caching
aren't orthogonal any more. And in the presence of nvcc's garish slowness, an
extra temp directory is pretty much irrelevant.
One (hypothetical) concern:
- Should we add code to catch MD5 collisions?
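For discussion, here is a minimal sketch of how the cache_dir behavior and a collision check could fit together. It is not the code that was merged: cached_compile, compile_func, and the .src/.bin file layout are all assumptions made up for this example, and compile_func stands in for whatever actually invokes nvcc. The idea is to store the source next to the compiled result and only count a hit when the stored source matches byte for byte.

```python
import hashlib
import os


def cached_compile(source, compile_func, cache_dir=None):
    """Return compile_func(source) as bytes, caching the result on disk.

    cache_dir=None  -> use ~/.pycuda-compiler-cache
    cache_dir=False -> caching is turned off
    """
    if cache_dir is False:
        return compile_func(source)
    if cache_dir is None:
        cache_dir = os.path.join(os.path.expanduser("~"),
                                 ".pycuda-compiler-cache")
    os.makedirs(cache_dir, exist_ok=True)

    src_bytes = source.encode("utf-8")
    key = hashlib.md5(src_bytes).hexdigest()
    src_path = os.path.join(cache_dir, key + ".src")
    bin_path = os.path.join(cache_dir, key + ".bin")

    # Collision check: a hit counts only if the stored source is identical.
    if os.path.exists(src_path) and os.path.exists(bin_path):
        with open(src_path, "rb") as f:
            stored_source = f.read()
        if stored_source == src_bytes:
            with open(bin_path, "rb") as f:
                return f.read()

    # Miss (or colliding entry): compile and overwrite the cache entry.
    result = compile_func(source)
    with open(bin_path, "wb") as f:
        f.write(result)
    with open(src_path, "wb") as f:
        f.write(src_bytes)
    return result
```

The extra read of the stored source is cheap next to an nvcc run, so the check costs little even if a collision never happens in practice.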
Two nits for the future:
- Please stick to PEP 8 when naming variables.
- Please send patches as attached unified diffs, rather than inlined source.
Thanks for your input!
The results of this are in git now, and I'll release them as 0.91rc2. Please