Dear Peter,
On Thu, 19 Jan 2012 16:33:34 +0100, Peter17 <peter017(a)gmail.com> wrote:
> I am encountering a problem when trying to get the value of the fields
> at some nodes every time step.
>
> The code I am using is similar to the one in examples/maxwell/inhom-cavity.py:
>
> point_getter = discr.get_point_evaluator(numpy.array(<coordinates>))
> ...
> val = point_getter(fields)
>
> except that I am using several point_getters, in order to plot the
> curve (field value vs. time) for each point.
>
> My problem is that:
> * execution gets slower and slower in CUDA as I increase the number
> of points;
> * execution is slower on the GPU (1 Tesla C1060) than with MPI (4 cores
> of a Xeon X5650):
>
> * MPI-4
> ** No getter: ~ 109 s. for 400 steps
> ** 1 getter: ~ 117 s. for 400 steps
> ** 10 getters: ~ 117 s. for 400 steps
> ** 20 getters: ~ 117 s. for 400 steps
> * 1 GPU
> ** No getter: ~ 66 s. for 400 steps
> ** 1 getter: ~ 67 s. for 400 steps
> ** 10 getters: ~ 156 s. for 400 steps
> ** 20 getters: ~ 281 s. for 400 steps
>
> I made some tests and this issue seems related to
> hedge/discretization/__init__.py:54+:
> class _PointEvaluator(object):
>     ...
>     def __call__(self, field):
>         ...
>         result[i] = numpy.dot(self.interp_coeff, field[i][self.el_range])
>
> The numpy.dot() product seems to consume much of the time (when I
> remove this instruction, I get a constant time of ~ 66 s. in CUDA).
>
> The difference in speed between MPI and GPU might be due to the fact
> that MPI-4 will divide the mesh into 4 smaller parts. Alternatively, it
> could be related to the difference in data types, and so to this issue:
> [1], but I'm not totally sure about how all this works...
>
> Is there a simpler way to get the value of a point? I am only using
> nodes of the mesh, so Hedge should already know the value without
> interpolating anything...
Call discr.convert_volume(kind="numpy") once per time step and then give
that vector to your point evaluators. That should save you large amounts
of time, because it won't incur a GPU transfer for every evaluator. It's
possible to be even cleverer and save more time, but this might be good
enough.
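To illustrate why converting once helps, here is a minimal sketch. It is not hedge code: FakeDeviceVector and PointEvaluator are made-up stand-ins, get() merely models a device-to-host copy, and the evaluator handles a single field component only. It counts how many copies each approach triggers for 10 evaluators in one time step.

```python
import numpy


class FakeDeviceVector(object):
    """Hypothetical stand-in for a GPU-resident field; counts host copies."""

    def __init__(self, host_data):
        self._data = numpy.asarray(host_data, dtype=numpy.float64)
        self.transfers = 0

    def get(self):
        # Models a full device-to-host copy of the vector.
        self.transfers += 1
        return self._data.copy()


class PointEvaluator(object):
    """Simplified stand-in for the quoted _PointEvaluator (one component)."""

    def __init__(self, el_range, interp_coeff):
        self.el_range = el_range
        self.interp_coeff = numpy.asarray(interp_coeff, dtype=numpy.float64)

    def __call__(self, host_field):
        # Same dot product as in the quoted snippet, on a host array.
        return numpy.dot(self.interp_coeff, host_field[self.el_range])


def evaluate_naive(evaluators, dev_field):
    # One device-to-host copy per evaluator.
    return [ev(dev_field.get()) for ev in evaluators]


def evaluate_converted_once(evaluators, dev_field):
    # One copy per time step, shared by all evaluators.
    host_field = dev_field.get()
    return [ev(host_field) for ev in evaluators]


# Assumed toy layout: 10 elements with 4 nodes each, one evaluator per
# element, each picking out the first node of its element.
nodes_per_el = 4
field = FakeDeviceVector(numpy.arange(40.0))
evaluators = []
for el in range(10):
    coeff = numpy.zeros(nodes_per_el)
    coeff[0] = 1.0
    el_range = slice(el * nodes_per_el, (el + 1) * nodes_per_el)
    evaluators.append(PointEvaluator(el_range, coeff))

naive = evaluate_naive(evaluators, field)
naive_transfers = field.transfers

field.transfers = 0
once = evaluate_converted_once(evaluators, field)
once_transfers = field.transfers
```

Both variants return the same values, but the first copies the whole field once per evaluator and the second only once per step. In a real run, the host vector would come from the convert_volume call suggested above rather than from a get() method.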
HTH,
Andreas

Dear Andreas,
I am encountering a problem when trying to get the value of the fields
at some nodes every time step.
The code I am using is similar to the one in examples/maxwell/inhom-cavity.py:
point_getter = discr.get_point_evaluator(numpy.array(<coordinates>))
...
val = point_getter(fields)
except that I am using several point_getters, in order to plot the
curve (field value vs. time) for each point.
My problem is that:
* execution gets slower and slower in CUDA as I increase the number
of points;
* execution is slower on the GPU (1 Tesla C1060) than with MPI (4 cores
of a Xeon X5650):
* MPI-4
** No getter: ~ 109 s. for 400 steps
** 1 getter: ~ 117 s. for 400 steps
** 10 getters: ~ 117 s. for 400 steps
** 20 getters: ~ 117 s. for 400 steps
* 1 GPU
** No getter: ~ 66 s. for 400 steps
** 1 getter: ~ 67 s. for 400 steps
** 10 getters: ~ 156 s. for 400 steps
** 20 getters: ~ 281 s. for 400 steps
I made some tests and this issue seems related to
hedge/discretization/__init__.py:54+:
class _PointEvaluator(object):
    ...
    def __call__(self, field):
        ...
        result[i] = numpy.dot(self.interp_coeff, field[i][self.el_range])
The numpy.dot() product seems to consume much of the time (when I
remove this instruction, I get a constant time of ~ 66 s. in CUDA).
The difference in speed between MPI and GPU might be due to the fact
that MPI-4 will divide the mesh into 4 smaller parts. Alternatively, it
could be related to the difference in data types, and so to this issue:
[1], but I'm not totally sure about how all this works...
Is there a simpler way to get the value of a point? I am only using
nodes of the mesh, so Hedge should already know the value without
interpolating anything...
Thanks in advance
Best regards
[1] http://lists.tiker.net/pipermail/pycuda/2011-November/003471.html
--
Peter Potrowl

On Tue, 10 Jan 2012 09:59:09 +0100, Peter17 <peter017(a)gmail.com> wrote:
> Dear Andreas,
>
> Did you have time to look at my new fix?
Yep, just merged it. Once again, sorry for the delay. Thanks for the
patch.
> Do you have any update about the new loopy-based backend for Hedge?
No updates other than it's still going to happen. Hopefully sooner
rather than later, but only parts of my time are under my own control.
Andreas

Dear Andreas,
Did you have time to look at my new fix?
Do you have any update about the new loopy-based backend for Hedge?
Thanks in advance
Best regards
--
Peter Potrowl