Hi Guys-
I'm trying to write re-usable code for automatically tuning (CUDA)
kernels and their associated (python) calling functions for optimal
execution time.
My main goals:
1. Write re-usable code to handle the optimization of kernels (and
their associated support code), given templates that contain:
2. Inline definition of logic for conditional unrolling of loops,
disabling sections of code, etc...
3. Inline definition of the tunable parameters, along with the sets of
possible values they can take
I've found two templating languages that I think are promising for this goal:
Cheetah
http://www.cheetahtemplate.org
and
Mako
http://www.makotemplates.org
Both are pretty rich and Pythonic in terms of the logic you can
incorporate into your templates, so that covers 2. pretty well.
I'm stuck on 3.
To give you some idea what I'm talking about, here's a dummy template
that generates python code that just waits different amounts of time
based on the (after the template is rendered) hard coded values of x1,
x2, y1, y2.
__________contents of mainFileName__________________
## This is a mako template demonstrating how to make a template for
use with the AutoTunedFunction class
<%!
tuneableParameterSets = []
%>
import time
def main:
## Two tuneable parameters that must be optimized together
<%!
tuneableParameterSets.append({})
tuneableParameterSets[0]['x1'] = [0, 3, 2]
tuneableParameterSets[0]['x2'] = [0, 5, 4]
%>
time.sleep(.01 * ${x1} * ${x2})
## Another set of parameters that should be optimized together.
<%!
tuneableParameterSets.append({})
tuneableParameterSets[1]['y1'] = [True, False]
tuneableParameterSets[1]['y2'] = [False, True]
%>
if ${y1} or ${y2}:
time.sleep(.02)
____________________________
So there is a list of dictionaries, tuneableParameterSets, that I want
to be able to access from outside the template. When you do:
>> from mako.template import Template
>> t = Template(filename=mainTemplateFileName)
the template source gets compiled to template code. I was hoping
there'd be a nice way to inspect the resulting class instance t to get
the tuneableParameterSets variable, and use that to iterate through
the possible values for the tuned parameters. Unfortunately, there
doesn't seem to be a nice clean way to do this. One ugly hack would
be to scrape the code that sets the variables out of the compiled (but
not rendered/filled) template code, perhaps with the help of some
added delimiters. This is really ugly, but so far it's the only thing
that I've come up with that would work.
In Cheetah there is a syntax #attr that lets you set attributes of the
resulting class (after the template is compiled, before it is
rendered/filled). That almost worked, but unfortunately it doesn't
seem like you can set a dictionary, only string and numeric literals.
Anyone familiar with these templating engines have any clever ideas?
Is anyone else using generic tuning code, or is everyone writing a
separate auto-tuning script for each module?
I also looked a bit a codepy, but I don't quite understand what to do
with it; it doesn't seem suitable for manipulating chunks of code
written in another language. I may be misunderstanding it completely
though, and there's not much documentation.
Thanks!
Drew