Co-authored with Charith Mendis.
In short, we released a new package: metap, an easy-to-use, yet powerful, meta-programming layer for Python. The release is long overdue, as it is by far the tool I use the most! We will start by presenting the main ideas behind metap, followed by a tutorial. Then, we will provide a delineated view of meta-programming so that we can clearly understand what it is, what it isn't, what it can achieve, and how it differs from related concepts.
A meta-programming layer (MPL) provides a meta-programming interface through which programs—called meta-programs—manipulate other programs—called object-programs (check this for a nice overview). Meta-programming is useful because it can automate coding patterns, or transformations over coding patterns (many of which differ from programmer to programmer and from project to project). We highlight three particularly useful use cases of meta-programming, which we name: (a) program augmentation, (b) code generation, and (c) structural introspection. We briefly describe each below.
The idea in program augmentation is to enhance the program automatically in a predictable manner. As a concrete example, let us try to dynamically check type annotations.
Python accepts type annotations in, e.g., variable declarations, function parameters, and return types. These annotations are not checked statically or dynamically by default. It would be useful if we could automatically augment the program with code that performs the checks, i.e., code which checks whether the dynamic values agree with the annotations. For example, an MPL could augment this program:
def foo(s: str):
    pass
into the following program:
def foo(s: str):
    if not isinstance(s, str):
        print(s)
        print(type(s))
        assert False
    pass
This is what we dub program augmentation. The original program is a valid Python program on its own; an MPL just augments it.
A common pattern is to have functions which include many returns, particularly when a function tries to check multiple conditions (e.g., in compiler pattern matchers, like LLVM's InstCombine). In particular, these functions check whether the input has some characteristics X, Y, Z, etc. So, a lot of code ends up looking like:1
if not X:
    return None
if not Y:
    return None
if not Z:
    return None
...
There are two reasons to write such code: (1) it's readable, and (2) it's debuggable. Consider the two main alternatives. The first is nested ifs:
if X:
    if Y:
        if Z:
            ...
return None
However, this is very unreadable. The other alternative is to use Python's match statement. But that doesn't make it possible to know which match failed, and so it's not very debuggable.
It would be really ergonomic if we didn't have to write this all the time and could instead write something like: _ret_ifnn(X). This would essentially be a code-generation capability (e.g., a macro) that generates the code above. In this case, the meta-program uses a superset of Python (i.e., it's not a valid Python program) defined by the MPL. Such features are generally more powerful than program augmentation. The latter may complement an existing program, but it doesn't make writing the program in the first place any easier. This is where an MPL's code generation shines, as it lets us automatically generate code, which makes programs both easier to write and more readable. It is even better if such code-generation capabilities are extensible; for example, _ret_ifnn can be user-defined. This is what we call user-defined code generation.
Finally, meta-programming becomes more powerful with introspection, which occurs when a program inspects itself. In modern programming languages introspection is focused on types/semantics (e.g., if constexpr in C++);2 for example, when programs inspect generic types. While this is definitely useful, introspection is not limited to types and/or semantics. We argue that what we call structural introspection is both possible and important. This is when a program introspects structural elements. For example, it is sometimes useful to assert that a loop doesn't contain a continue. This can be useful if the logic of a (usually while) loop would become incorrect if a continue were added, and we want to prevent that without relying on users manually inspecting the loop body. Such an assertion needs to introspect the structure of the program.
Unfortunately, there is no general-purpose meta-programming layer (MPL) or library for Python which provides these capabilities: program augmentation, user-defined code generation, and structural introspection. The built-in meta-programming capabilities of Python don't provide these features either. Structural introspection seems to be completely unthought of. Decorators provide some code-generation capabilities, but they don't allow us to, e.g., define custom statements like _ret_ifnn(X). Finally, there is very limited support for program augmentation, and especially for the most important kind: logging.
In this article we present metap, an easy-to-use meta-programming layer for Python, which provides all the aforementioned features. metap provides user-defined code generation through a simple macro system. It also provides a rich program-augmentation API which allows users (among other things) to enable dynamic type-checking, expand asserts, and log all kinds of structures, such as: ifs, returns, breaks, continues, function entries, and calls. Finally, metap provides structural-introspection directives which allow users to check properties of the structure of the code (e.g., that a loop does not contain a continue).
We start with a simple logging example, which falls under program augmentation. Let's say we have the following meta-program, in a file named test_mp.py:
# test_mp.py
def foo():
    return 2

def bar():
    a = 2

    if a == 2:
        return 4

foo()
bar()
In this example, the meta-program has nothing metap-specific; we can run it with Python as-is. But we can still write a metap client that transforms it in various useful ways. For example, let's say we want to log all the returns. We can write a simple client as follows:
# client.py
from metap import MetaP
mp = MetaP(filename="test_mp.py")
mp.log_returns()
mp.dump(filename="test.py")
This communicates to metap (only) what is necessary and sufficient for this task: which file to load (test_mp.py), what to do (log the returns), and where to output the resulting (object) program (test.py). Now, we can first run:
> python client.py
to produce test.py, and then run it:
> python test.py
which outputs:
metap::Return(ln=3)
metap::Return(ln=9)
In general, metap allows the user to log all kinds of things, optionally supporting indentation and only logging within ranges. Indicatively, metap can log: returns, breaks, continues, call-sites, function entries, and conditionals (i.e., ifs).
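To make the idea tangible, here is a rough sketch (not metap's actual implementation) of how a log-the-returns transformation can be written with Python's standard ast module: every return statement is rewritten into a print-then-return pair.

```python
# Sketch of a return-logging AST transform (NOT metap's implementation).
import ast
import io
from contextlib import redirect_stdout

class LogReturns(ast.NodeTransformer):
    def visit_Return(self, node):
        # Emit a log line right before the return, tagged with the
        # return's line number in the meta-program.
        log = ast.parse(f'print("metap::Return(ln={node.lineno})")').body[0]
        return [log, node]

src = "def foo():\n    return 2\nfoo()\n"
tree = ast.fix_missing_locations(LogReturns().visit(ast.parse(src)))

buf = io.StringIO()
with redirect_stdout(buf):
    exec(compile(tree, "<object-program>", "exec"))
print(buf.getvalue().strip())  # metap::Return(ln=2)
```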
In the introduction we mentioned dynamic type-checking. metap provides that through the simple client API call dyn_typecheck() (similar to log_returns()). We note that metap supports pretty complex annotations like:
Optional[Tuple[List[str], List[int]]]
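To appreciate what such a check involves, here is a minimal sketch of a recursive checker for nested annotations; this is not metap's implementation, and `matches` is a hypothetical helper written only for illustration:

```python
# Hypothetical recursive annotation checker (NOT metap's implementation).
import typing

def matches(value, ann):
    """Recursively check `value` against the annotation `ann`."""
    origin = typing.get_origin(ann)
    if origin is None:                 # a plain class, e.g. str
        return isinstance(value, ann)
    args = typing.get_args(ann)
    if origin is typing.Union:         # also covers Optional[...]
        return any(matches(value, a) for a in args)
    if origin is list:
        return isinstance(value, list) and all(matches(v, args[0]) for v in value)
    if origin is tuple:
        return (isinstance(value, tuple) and len(value) == len(args)
                and all(matches(v, a) for v, a in zip(value, args)))
    return isinstance(value, origin)   # fallback for other generics

ann = typing.Optional[typing.Tuple[typing.List[str], typing.List[int]]]
assert matches((["a", "b"], [1, 2]), ann)
assert matches(None, ann)
assert not matches((["a"], ["not-an-int"]), ann)
```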
The full potential of metap is reached when the meta-program starts using the metap-superset of Python.
This example is taken straight from real-world code I (Stefanos) have written for the Markdown-to-HTML compiler I use to generate articles like the one you're reading. The purpose of this snippet is to parse a line and check if it's a Markdown heading (i.e., if it starts with “#”). But, we also want to identify whether it's a level-1 heading (i.e., a single leading “#”) or a level-2 heading (i.e., two leading “#”) because the compiler generates different code for each case. metap allows us to write the following (meta-program):
# md_to_html_mp.py
line = "# test"
if (_cvar(line.startswith('# '), hlvl, 1) or
    _cvar(line.startswith('## '), hlvl, 2)):
    # ... Common code which applies to both
    # level-1 and level-2
    body += self.parse_heading(line, hlvl)
_cvar() is a metap-specific feature whose name stands for "conditional variable". It allows us to assign a value to a variable while testing a condition. The first argument is a condition, the second a variable name, and the third any value. If the condition is True, then the third argument is assigned to the second (otherwise, nothing happens). This is akin to C++'s assignment inside a condition (which, by the way, is also possible in pure Python through the walrus operator, but it's much less common):
if (a = foo()) {
  // ...
}
in which a gets the value of foo() (unconditionally), and if that value is non-zero, we enter the if-body.
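The pure-Python analogue uses the walrus operator: the variable is assigned unconditionally, and the if-body runs only when the value is truthy (`foo` here is a hypothetical function, for illustration):

```python
# Pure-Python assignment-in-condition via the walrus operator.
def foo():
    return 42  # hypothetical; any value-producing call works

if (a := foo()):   # `a` is assigned unconditionally
    print(a)       # the body runs because 42 is truthy
```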
There is a two-argument version of _cvar() that does the same thing, but the version we showed above is more powerful because it allows us to specify what value the variable (here, hlvl) will get if the condition is satisfied. Furthermore, it's important to clarify that we don't want to do the following:
# md_to_html_mp.py
line = "# test"
if line.startswith('# '):
    # ... Common code which applies to both
    # level-1 and level-2
    body += self.parse_heading(line, 1)
elif line.startswith('## '):
    # ... The _same_ common code which applies to
    # both level-1 and level-2
    body += self.parse_heading(line, 2)
because, as is evident from the code, this introduces huge code duplication. There is some common code for both cases, and it's better to write it once for maintainability, readability, and consistency. We omit a pure-Python version of the above snippet for clarity.
Besides built-in code-generation capabilities, metap also allows the user to define their own macros. For example, we can define the _ret_ifnn(X) macro we saw earlier as follows:
def _ret_ifnn(x):
    stmt : NODE = {
        _tmp = <x>
        if _tmp is None:
            return None
    }
    return stmt
A macro is basically a function which, instead of returning a "normal" value, returns a code entity. A NODE variable denotes that its contents are to be treated as code. Everything in it is treated as code verbatim, except for <x>, through which we can refer to other code variables. In that way, we can compose code entities.
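To see what this buys us, here is a hand-written version of what a statement like `_ret_ifnn(d.get(key))` would expand into; the surrounding `lookup` function is hypothetical, written only for illustration:

```python
# Hand-written expansion of the macro (hypothetical surrounding code).
def lookup(d, key):
    # --- what `_ret_ifnn(d.get(key))` expands into ---
    _tmp = d.get(key)
    if _tmp is None:
        return None
    # --- end of expansion ---
    return _tmp.upper()

print(lookup({"k": "v"}, "k"))     # V
print(lookup({"k": "v"}, "miss"))  # None
```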
Structural introspection allows a program to inspect its own structure. In metap we provide structural introspection mainly to check certain properties that aid in maintainability and correctness. A classic example is a loop that must not have a continue. This often comes up with while loops, because in them the update statement that gets us to the next element is written by the user (e.g., i += 1), whereas in for loops it happens automatically (e.g., the i += 1 happens behind the scenes when we do for i in range(...)). As an example, consider the following (simplified) loop used in Huffman compression to construct a histogram:
# s: string
hist_len = 0
for i in range(len):
    hist_pos = char_map[s[i]]
    if hist_pos == -1:
        char_map[s[i]] = hist_pos = hist_len
        hist[hist_pos].sym = s[i]
        hist[hist_pos].freq = 0
        hist_len += 1
    curr_freq = hist[hist_pos].freq
    hist[hist_pos].freq = curr_freq + 1
Here, there is some code that we want to run in all cases (the last two lines), and some code that we want to run only when we find a new symbol (the code inside the if). To avoid the nesting introduced by the if (which can get deeper and deeper as the loop gets more complicated), one may try to write the loop as follows (a pattern used amply inside the LLVM source code to avoid extensive nesting):
...
if hist_pos != -1:
    continue
char_map[s[i]] = hist_pos = hist_len
...
But of course this is wrong, because this won't execute the last two lines if we get inside the if-body. To avoid such mistakes, the programmer can add a @no_continue annotation at the beginning of the loop:
...
for i in range(len):
    @no_continue
    ...
Now, if someone tries to add a continue (for example, by writing the second version), they will get the following error message from metap when they run the client (attention: not at runtime, but at compile time):
metap: Error: @no_continue directive used at line 3,
but there's a `continue` at line: 7.
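Such a check can be implemented before the program ever runs, using only the standard ast module; the sketch below is not metap's implementation, just an illustration of what structural introspection over a loop body looks like:

```python
# Sketch of a compile-time no-continue check (NOT metap's implementation).
import ast

def continues_in(loop):
    """Line numbers of `continue` statements belonging to `loop` itself.
    We don't descend into nested loops: their `continue`s are their own."""
    found = []
    def scan(stmts):
        for s in stmts:
            if isinstance(s, ast.Continue):
                found.append(s.lineno)
            elif not isinstance(s, (ast.For, ast.While)):
                # Recurse into compound statements (if/with/try bodies).
                for field in ("body", "orelse", "finalbody"):
                    scan(getattr(s, field, []))
    scan(loop.body)
    return found

src = """\
for i in range(10):
    if i % 2 == 0:
        continue
    print(i)
"""
loop = ast.parse(src).body[0]
print(continues_in(loop))  # [3]
```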
It is worth noting that what the original programmer wants is a defer statement (like the one provided by Go), but for loop bodies instead of the (usual) function bodies. We do not know of any language that supports such defer statements, and we plan it as future work for metap.
In this section, we aim to provide a view of meta-programming that allows us to: (1) distinguish it from similar terms (e.g., compilation), and (2) specify what meta-programming can (and cannot) solve. This develops a rationale for why we implemented metap as a meta-programming layer and not, for example, as a compiler plugin or a DSL.
In general, a meta-program is a program that specifies an object-program (or target program), and a meta-programming layer (MPL) is a tool that takes in a meta-program and produces the object-program. A trivial meta-program is any program (as a program trivially specifies itself). A trivial MPL is a pass-through, i.e., it spits out what it gets in. However, this definition is too broad to allow us to specify what sets meta-programming apart from other similar technology.
One may think that a conventional C compiler is an MPL, which takes as input a C program—which in this case we can think of as a meta-program that describes an assembly object-program—and produces an assembly program. But we argue that this is an inaccurate view of compilers (or of meta-programming; or both). This is because a C program is an inaccurate specification of an assembly program, which becomes even worse given how optimizing compilers treat C. More specifically, it's hard to predict what assembly the compiler will generate.
This, in turn, is because C, and basically every programming language, defines an abstract semantics for programs. In short, the C standard defines (abstractly) what a program does, but not how; the latter is left to the implementation. So, for example, if we assign 2 to a and then immediately read from a (and assuming a is not volatile), then we should get 2. This abstract behavior can be implemented in many different concrete ways. For example, a's value could be stored in a register, on the stack, or in the heap. Leaving the "how" unspecified gives the compiler the freedom to generate whatever assembly it deems profitable, as long as that assembly has the abstract behavior defined.
However, abstract behavior is not always useful. In general, if we want a new feature (e.g., return x if x is not None) we may implement it abstractly or concretely. In the former case, we define the abstract behavior, which is then implemented somehow by a compiler. This essentially creates a domain-specific language (DSL). In the latter case, we generate code predictably, which essentially constitutes an MPL. Now, we will consider the benefits and drawbacks of DSLs and MPLs.
One reason we may want a DSL is that we have different operators that interact. A DSL compiler that understands the semantics of the operators can perform optimizations like, e.g., operator fusion. In general, as in C, abstract semantics allows the compiler to optimize the code aggressively. Furthermore, a DSL is useful when we want to introduce a new programming model. More specifically, a DSL's value increases drastically in proportion to the difficulty of implementing a new programming model over an existing language. When that difficulty becomes unmanageable, it is hard for users to think about "how" something happens; a DSL allows them to focus only on the "what". Popular examples of DSLs that have all the features above are TensorFlow and Halide.
Furthermore, a DSL gives us some form of portability. In particular, the behavior of a DSL program is platform-independent; again, because it is abstract (and assuming it is not underspecified because of e.g., undefined behavior).
But, DSLs have some disadvantages too. Their main drawback is the same as their main advantage: abstract behavior. In particular, the user generally doesn't know how DSL features are implemented. Also, a user may not want to learn a whole new programming model and new semantics that a DSL introduces. This is not only time-consuming, but can also lead to bugs (until one really understands what the semantics is). That's where a meta-programming layer shines.
A meta-programming layer is a great solution when we don't want to: (1) have opaque operators that can interact (potentially through compiler optimizations), (2) learn a new programming model and semantics, or (3) translate the program into multiple languages. Basically, it fits when we want to program in a language, model, and semantics we already know. But what do we gain?
The main benefit is that we get predictable generated code, which in turn implies many other benefits. First, the meta-language is as well-defined as the object-language because a meta-program is just syntactic sugar over object-programs. The MPL generates e.g., Python code predictably, so there's no undefinedness originating from the meta-program or the MPL. This is not the case for languages (including DSLs) with abstract behaviors. For example, C has undefined behavior while x86 assembly doesn't.
Second, we don't have to teach the tools we use about our special features. The goal of a MPL is just to help us either analyze the program code, or generate new code. In the former case, we don't need any external tool because the analysis happens by the MPL. In the latter case, we can use whatever tools we can already use in the object language.
Third, we don't need to re-implement features over a new language. For example, with metap we just generate a Python program predictably. Suppose now we execute this Python program and we get an exception. Because we generated the code predictably, we can easily figure out where this exception came from in the meta-program. In other words, we don't need to re-implement exceptions for the meta-language.
Fourth, an MPL can easily interact with program-augmentation features. For example, suppose _ret_ifnn(x) becomes a new Python statement that somehow returns x if x is not None. If we don't know that this will generate a return statement, we can't compose it with, e.g., metap's log_returns(). In other words, we sacrifice transparency.
Finally, since a MPL just analyzes or generates code, it is extremely lightweight. A MPL is not a runtime, it's not a compiler, and it's not a standard library. Consequently, to run a meta-program, we don't need to link it against a heavy library or runtime, or use any special tooling.
In short, generating code instead of implementing abstract behaviors results in predictable, transparent, and debuggable code. In fact, exactly because of that, it is easy to create “towers of meta-programming layers” (to paraphrase an inspiring paper). These were top priorities for metap's use cases, while none of the DSL benefits were part of our desiderata. This is why metap is a MPL and not a DSL.
Over the last couple of years, more and more people seem to appreciate the benefits of meta-programming. For example, C++ has had templates for a long time, and Go recently added generics. The problem is that most of these implementations are, at least for me, not very useful.
C++ for some reason decided to introduce templates (a new language) on top of C++, which, to everyone's surprise (or not, if you knew anything about programming), turned out to produce abhorrent error messages and, coupled with C++'s compilation model, unimaginable compilation times.3 The cherry on top is that you get all these drawbacks without many of the basic features one would want from templates (e.g., C++ only recently got some introspection capabilities).
This has led many people to believe that meta-programming, and template meta-programming in particular, inherently has all these drawbacks. No. If you don't believe me, try compiling D's standard library, Phobos. It contains hundreds of thousands of lines of code and is full of templates, yet it compiles in 20 seconds or so. And generally, D got templates right: it essentially has none of the drawbacks of the C++ implementation.
As for Go, I've written a whole article about its weird generics.4
In short, even though modern languages have some meta-programming features, these features are usually weak, or poorly implemented (or both). For example, they don't provide out-of-the-box utilities that allow you to e.g., log all returns. Furthermore, because modern languages tend to be massive and complicated, it's very hard to implement a meta-programming layer yourself. Imagine writing an AST transformer for C++. Even using Clang's utilities, we're talking about a massive undertaking considering all the craziness you have to deal with. And good luck hooking that up with Clang's pipeline.
Montana was a compiler infrastructure that allowed users to easily write their own plugins, which could meta-program the source program, and hook them easily into Montana's pipeline. Montana was clearly our biggest inspiration for building metap (I learned about it while binge-watching Handmade Hero on YouTube). The Montana folks really understood, and partly realized, the potential of meta-programming and the role a compiler's assistance plays. Sadly, it is much less known than it deserves to be. As Casey Muratori said, Montana was part of the IBM VisualAge toolchain, which was absolutely terrible, and so the whole thing failed. Over the years I've heard people talk about VisualAge, and most seem to agree. In the years following Montana there was some academic work on similar technologies, like Xoc by Russ Cox,5 but it rarely cited Montana.
One problematic aspect of a meta-programming layer is that it introduces another step that has to run before you can execute your program. For C-like pipelines that's not a problem, because you can usually just add it to your build script. But in Python I've found it a bit annoying. Conveniently, I discovered File Watcher, which I've set up to run metap any time I save a meta-program.
The D folks introduced static if in Dlang early on, and around 2012 they co-authored a static if proposal for C++. Bjarne Stroustrup, in his infinite wisdom, co-authored a rebuttal that rejected it, only to realize, long overdue, that it's an incredibly useful feature and introduce it in C++17.↩