The How of Macros 5: Unquoting

As you now know, quote descends the syntax tree that is its argument and converts it into code that will, when executed, build the original syntax tree. quote is usually not useful unless the tree contains unquote or unquote_splicing nodes. This post explains how those are handled. The related function, escape, is explained later.

The series: Elixir compilation, syntax trees, literal data, quote, unquote, escape, hygiene and var, what's up with require?

What does the compiler's quote-handling code do when it discovers an unquote node?


The code to handle unquote is written in Erlang, but its Elixir equivalent would be:

def do_quote({:unquote, _metadata, [expression]}), do: expression
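Seen from the outside, that clause means the tree handed to unquote lands in the result untouched. Here is a minimal sketch of that effect; the module and function names are made up for illustration and are not part of the compiler:

```elixir
defmodule UnquoteDemo do
  # Builds the tree for `1 + a * 2` with the right-hand side unquoted.
  def with_unquote do
    inner = quote do: a * 2          # a syntax tree held in a variable
    quote do: 1 + unquote(inner)     # the value of `inner` is used as-is
  end

  # Builds the same tree by writing the whole expression out.
  def written_out do
    quote do: 1 + a * 2
  end
end

# Both lines print the same source text.
IO.puts(Macro.to_string(UnquoteDemo.with_unquote()))
IO.puts(Macro.to_string(UnquoteDemo.written_out()))
```

Comparing the printed source rather than the raw tuples keeps the check independent of whatever metadata a given Elixir version attaches.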

Let's look at an example of how that interacts with quote handling to produce the correct syntax tree.

Reading the Erlang code for do_quote, I was reminded that Erlang's and isn't short-circuiting; that is, it evaluates both its arguments even if the first is false. Erlang's equivalent of Elixir's and and && is andalso. If Elixir didn't already have short-circuiting operators, andalso would have to be written as a macro, because a function's arguments are all evaluated before it is called and only a macro can control whether the second one is evaluated. Here's a typical implementation:

  defmacro andalso(left_ast, right_ast) do
    quote do
      if unquote(left_ast) do     # left code always evaluated
        if unquote(right_ast), do: true, else: false  # right maybe evaluated
      else
        false
      end
    end
  end

Remember that, like all macro functions, andalso will be called with a syntax tree at compile time, to produce another syntax tree that will be compiled. So this:

andalso (IO.inspect false), (IO.inspect true)

... will produce this code:

  if IO.inspect false do
    if IO.inspect(true), do: true, else: false
  else
    false
  end
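To watch the short-circuiting happen, here is a small runnable sketch. The macro body spells out the true/false branches, and the module names and the note/2 and evaluated/0 helpers are hypothetical, added only to record which arguments actually ran:

```elixir
defmodule ShortCircuit do
  defmacro andalso(left_ast, right_ast) do
    quote do
      if unquote(left_ast) do
        if unquote(right_ast), do: true, else: false
      else
        false
      end
    end
  end
end

defmodule ShortCircuitDemo do
  require ShortCircuit

  # Hypothetical helper: records that it ran, then returns `value`.
  def note(tag, value) do
    send(self(), {:evaluated, tag})
    value
  end

  def run(left, right) do
    ShortCircuit.andalso(note(:left, left), note(:right, right))
  end

  # Drains the recorded tags, in the order they were evaluated.
  def evaluated do
    receive do
      {:evaluated, tag} -> [tag | evaluated()]
    after
      0 -> []
    end
  end
end

IO.inspect(ShortCircuitDemo.run(false, true))
IO.inspect(ShortCircuitDemo.evaluated())
```

The use of the macro sits inside a second module so that ShortCircuit is already compiled by the time require runs.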

For variety, let's also implement andalso without using quote and call it andalso_literal:

  defmacro andalso_literal(left_ast, right_ast) do
    {:if, [context: M, import: Kernel],
     [left_ast,                      # what `unquote` does
      [do: {:if, [context: M, import: Kernel],
            [right_ast,              # what `unquote` does
             [do: true, else: false]]},
       else: false]]}
  end

That works just fine. Not only that, the two versions are identical in the sense that they will hand the same syntax tree to the bytecode generator. It'd be tedious to walk through the transformations of the whole tree to show that equivalence, so I'll sketch it out in pictures and text.

First, let's look at these two pieces of code:

quote do: (if ...)       # andalso
{:if, [metadata], ...}   # andalso_literal

Here's a picture of the processing:

In the andalso case, quote's argument is parsed into the kind of 3-tuple the compiler uses for the if special form:

{:if, [metadata], ...}

That expression is, at this top level, the same as the literal 3-tuple used in andalso_literal. There's a difference, though: andalso's 3-tuple was created by the parser, but andalso_literal's 3-tuple is given to the parser. So they'll go through different processing routes despite ending up with the same transformation.

In the andalso case, the :if 3-tuple is processed by the quote context-expansion code (do_quote), which converts it into tuple-creation code like this:

{:{}, [...], [:if, [metadata], ...]}

In the andalso_literal case, the 3-tuple is parsed as a literal data structure, which... converts the top-level expression into the same :{} tuple-creation code.
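The literal half of that claim is easy to check by quoting a hand-written 3-tuple and looking at what comes back. This is only a sketch: the module name is made up, and the tuple is a cut-down stand-in for a real :if tree with empty metadata.

```elixir
defmodule TupleLiteralDemo do
  # Quoting a literal 3-tuple yields tuple-creation code: a `:{}` node
  # whose arguments are the three elements.
  def quoted_literal do
    quote do: {:if, [], [true, [do: 1]]}
  end

  # Executing that code rebuilds the original 3-tuple.
  def rebuilt do
    {value, _binding} = Code.eval_quoted(quoted_literal())
    value
  end
end

IO.inspect(TupleLiteralDemo.quoted_literal())
IO.inspect(TupleLiteralDemo.rebuilt())
```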

The two {:if, ..., ...} syntax trees aren't identical all the way down to their leaves.  andalso contains unquote expressions that wrap variable names, whereas andalso_literal contains just the variable names:

... unquote(left_ast)...     # andalso
...         left_ast ...     # andalso_literal

And so, again, there's different processing:

In the andalso case, the unquote expression is parsed into an :unquote syntax tree that wraps the typical 3-tuple for a variable dereference: {:left_ast, [...], atom}. Since the context-expansion of an :unquote 3-tuple is just the unquote's single argument, what gets used for bytecode generation is that tuple, an instruction to dereference left_ast.

In the andalso_literal case, a name is parsed, which just produces the same variable-dereference tuple without the need for an intermediate :unquote tree.
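The equivalence can also be checked mechanically. The sketch below expands one call to each macro with Macro.expand_once and compares the results after blanking out every node's metadata, since the two versions are only expected to agree on structure. The module names and the expand_bare helper are hypothetical, and the literal version uses TwoWays where the text uses M.

```elixir
defmodule TwoWays do
  defmacro andalso(left_ast, right_ast) do
    quote do
      if unquote(left_ast) do
        if unquote(right_ast), do: true, else: false
      else
        false
      end
    end
  end

  defmacro andalso_literal(left_ast, right_ast) do
    {:if, [context: TwoWays, import: Kernel],
     [left_ast,
      [do: {:if, [context: TwoWays, import: Kernel],
            [right_ast, [do: true, else: false]]},
       else: false]]}
  end
end

defmodule TwoWaysCheck do
  require TwoWays

  # Expands one macro call, then replaces every node's metadata with [].
  defp expand_bare(call) do
    call
    |> Macro.expand_once(__ENV__)
    |> Macro.prewalk(&Macro.update_meta(&1, fn _meta -> [] end))
  end

  def quoted do
    expand_bare(quote do: TwoWays.andalso(a(), b()))
  end

  def literal do
    expand_bare(quote do: TwoWays.andalso_literal(a(), b()))
  end
end

IO.inspect(TwoWaysCheck.quoted() == TwoWaysCheck.literal())
```

The calls a() and b() are never run; they are just placeholder trees to pass through the macros.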


unquote_splicing also does nothing to its AST argument. However, the do_quote pattern-matching has to happen up one level in the context-expansion because it has to capture both the list the unquoted tree is to be spliced into and also the value to splice in. As an example, consider the case where the unquote_splicing is at the head of a list:

quote do
  [unquote_splicing(left), 3, IO.inspect a]
end

A function to transform that list might look like this:

def do_list_tail([{:unquote_splicing, _meta, [to_splice]} | rest]),
  do: [to_splice | rest]

... except that there might be another unquote_splicing later in the list, so we need some recursion:

def do_list_tail([{:unquote_splicing, _meta, [to_splice]} | rest]),
  do: [to_splice | do_list_tail(rest)]

... which is good, because a day without recursion is like a day without sunshine.

We also need the case where the head of the list isn't an :unquote_splicing:

  def do_list_tail([head | rest]), 
    do: [head | do_list_tail(rest)]

And, as always with explicit recursion, we need a base (end of recursion) case:

  def do_list_tail([]), do: []
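Here are the three clauses gathered into one runnable module, with one departure that is my assumption rather than anything shown above: so that the spliced elements land inline instead of as a single nested element, the splicing clause emits code for `to_splice ++ rest`. Because the rest may then be a call rather than a plain list, the ordinary clause emits code for a cons cell. The module name and the demo/0 and builtin/0 functions are made up for illustration, and list elements are passed in already in syntax-tree form.

```elixir
defmodule SpliceSketch do
  # Each clause returns code (a syntax tree) that builds the list when run.
  def do_list_tail([{:unquote_splicing, _meta, [to_splice]} | rest]) do
    # Assumption: emit `to_splice ++ <rest>` so the elements land inline.
    quote do: unquote(to_splice) ++ unquote(do_list_tail(rest))
  end

  def do_list_tail([head | rest]) do
    # Emit `[head | <rest>]`, since <rest> may now be a `++` call.
    quote do: [unquote(head) | unquote(do_list_tail(rest))]
  end

  def do_list_tail([]), do: []

  # Runs the generated code with the variable `left` bound to [1, 2].
  def demo do
    left = Macro.var(:left, nil)
    tree = do_list_tail([{:unquote_splicing, [], [left]}, 3, 4])
    {value, _binding} = Code.eval_quoted(tree, left: [1, 2])
    value
  end

  # The same list built by the real quote, for comparison.
  def builtin do
    left = [1, 2]
    quote do: [unquote_splicing(left), 3, 4]
  end
end

IO.inspect(SpliceSketch.demo())
IO.inspect(SpliceSketch.builtin())
```

The splicing clause has to come before the general `[head | rest]` clause, or it would never match.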

That finishes up the basics of compiling macros. Programming language designers have been working with this style of compile-time syntax-tree transformation since 1963, the date it was first proposed for Lisp. The Elixir syntax tree and, consequently, quote and unquote handling are tricksier than in Lisps (at least the ones I've looked at). I don't know the reason. I suspect it has to do with compatibility with the Erlang syntax tree format (which was not designed to support macros), and perhaps with pattern matching.

There's more to say about macros, but I'm inclined to ask if people are interested in other topics before I put in the effort. Let me know via email or twitter (publicly with @marick or via my open DMs).

Previous: quote