I am testing the following Python script in a Miniconda shell on a Windows 10 notebook with CUDA 9.0 and Python 3.6:

import pycuda.autoinit
import pycuda.driver as drv
import numpy
from pycuda.compiler import SourceModule
mod = SourceModule("""
__global__ void multiply_them(float *dest, float *a, float *b)
{
  const int i = threadIdx.x;
  dest[i] = a[i] * b[i];
}
""")
multiply_them = mod.get_function("multiply_them")
a = numpy.random.randn(400).astype(numpy.float32)
b = numpy.random.randn(400).astype(numpy.float32)
dest = numpy.zeros_like(a)
multiply_them(
        drv.Out(dest), drv.In(a), drv.In(b),
        block=(400,1,1), grid=(1,1))
c = dest - a*b
print(c)

Unfortunately, I get the following error:

(C:\ProgramData\Miniconda3) C:\SciSoft>python hello_gpu.py

Traceback (most recent call last):
  File "hello_gpu.py", line 1, in <module>
    import pycuda.autoinit
  File "C:\ProgramData\Miniconda3\lib\site-packages\pycuda\autoinit.py", line 2, in <module>
    import pycuda.driver as cuda
  File "C:\ProgramData\Miniconda3\lib\site-packages\pycuda\driver.py", line 5, in <module>
    from pycuda._driver import *  # noqa
ImportError: DLL load failed: The specified module could not be found.
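
On Windows, this ImportError generally means that the extension module itself, or one of the DLLs it depends on, could not be located. As a rough diagnostic sketch (not a fix), the snippet below searches the directories on PATH for a given DLL name. The helper find_on_path is hypothetical, and the two DLL names are assumptions about what pycuda._driver may need on a CUDA 9.0 install: nvcuda.dll ships with the NVIDIA display driver, while curand64_90.dll is only relevant if PyCUDA was built with CURAND support.

```python
import os


def find_on_path(filename, search_path=None):
    """Return the directories in search_path that contain filename.

    search_path is an os.pathsep-separated string; it defaults to
    the PATH environment variable of the current process.
    """
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    hits = []
    for directory in search_path.split(os.pathsep):
        # PATH entries are sometimes wrapped in double quotes on Windows.
        directory = directory.strip('"')
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    # Assumed candidate DLLs; adjust the names to the installed CUDA version.
    for dll in ("nvcuda.dll", "curand64_90.dll"):
        hits = find_on_path(dll)
        print(dll, "->", hits if hits else "not found on PATH")
```

If a name is reported as not found, that points at a missing driver or at a CUDA bin directory that is not on PATH. If everything is found, the cause probably lies elsewhere, for example a PyCUDA build made for a different CUDA version.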

Please advise.