Not sure this can be sorted out without example code.
What I did notice starting with 2015.2 is that kernel object generation took a lot of
time, i.e. keeping a temporary copy of ‘prg.sum’ (where prg is the program and sum the
kernel) is essential, as stated in
(see § on
kernel_name) - always calling as ‘prg.sum’ was significantly slower.
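To illustrate what I mean without needing a GPU, here is a minimal sketch in plain Python. FakeProgram and FakeKernel are hypothetical stand-ins, not PyOpenCL classes; they only mimic the behaviour I am assuming here, namely that each ‘prg.sum’ attribute access builds a new kernel object. The cost inside FakeKernel is simulated, so the timings say nothing about real hardware.

```python
import timeit


class FakeKernel:
    """Hypothetical stand-in for a kernel object; construction is made costly."""

    def __init__(self, name):
        self.name = name
        # Simulate the expensive setup done when a kernel object is created.
        self.table = [i * i for i in range(2000)]

    def __call__(self, a, b):
        # Element-wise sum, like the "sum" kernel mentioned above.
        return [x + y for x, y in zip(a, b)]


class FakeProgram:
    """Hypothetical stand-in for a built program.

    Assumption: every attribute access (prg.sum) creates a new kernel object.
    """

    def __init__(self):
        self.created = 0  # number of kernel objects built so far

    def __getattr__(self, name):
        # Only reached for names not found normally, e.g. prg.sum.
        if name.startswith("_"):
            raise AttributeError(name)
        self.created += 1
        return FakeKernel(name)


def call_uncached(prg, a, b, n):
    """Look up prg.sum on every call: one new kernel object per iteration."""
    for _ in range(n):
        prg.sum(a, b)


def call_cached(prg, a, b, n):
    """Keep a temporary copy of prg.sum and reuse it."""
    knl = prg.sum
    for _ in range(n):
        knl(a, b)


if __name__ == "__main__":
    a, b = [1, 2, 3], [10, 20, 30]
    prg = FakeProgram()
    t_uncached = timeit.timeit(lambda: call_uncached(prg, a, b, 100), number=10)
    t_cached = timeit.timeit(lambda: call_cached(prg, a, b, 100), number=10)
    print(f"uncached: {t_uncached:.4f} s, cached: {t_cached:.4f} s")
```

With a real program the same pattern would be to store knl = prg.sum once and then call knl repeatedly instead of calling prg.sum each time.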
On 4 Sept. 2017 at 14:06, Jerome Kieffer
<Jerome.Kieffer(a)esrf.fr> wrote:
I recently upgraded my computer from Debian 8 to Debian 9 and noticed a
"slow-down" in an application I am developing.
The basic "unit" operation used to take <1 ms on a high-end GPU (NVIDIA
GTX1080Ti) and now takes about 8 ms.
On the same computer, with the same Python in two virtual env, I
measured with timeit for two versions of pyopencl:
(2015, 2) -> 832 µs / operation
(2016, 2) -> 7.69 ms / operation
This could be related to the cffi migration but I am not sure. I may
profile further, but maybe there is already some literature which
I was not aware of.
Did you notice anything similar?
PS: I can provide the "material" to reproduce the effect if needed.
Co-editor, J. Synchrotron Radiation http://journals.iucr.org/s/
Director, HERCULES school http://hercules-school.eu
ESRF-The European Synchrotron http://www.esrf.eu
71, Avenue des Martyrs
X-Ray NanoProbe (XNP) group
Tel: +33 4 76 88 28 11
On leave from Univ. Grenoble Alpes