Algebraic datatypes and pattern matching in Haskell lay a fertile ground to conveniently define and process abstract syntax trees (ASTs). However, in Haskell, trees often cannot grow: once a datatype is defined and compiled, it cannot be extended. Extensions to a datatype mainly appear as new fields to its existing data constructors, and/or new data constructors.
At the center of any metaprogramming system stand tall trees representing the abstract syntax of object terms. Metaprograms processing these trees often do so by decorating nodes with additional information. This additional information may appear as new fields to the existing data constructors, and/or new data constructors. Common practice is either post hoc, to define a new separate datatype representing the output decorated trees; or pre hoc, to use the same large datatype to represent both the non-decorated input and the decorated output trees. Both methods are often ad hoc; the former leads to duplication, and the latter forces the input trees to carry an unnecessary set of information making them inconvenient to work with.
In this talk, I introduce an encoding of datatypes that allows them to be extended in a post hoc manner, yet still argueably keeping them convenient to work with. It is done as a part of the the Summer of Haskell project titled “Native Metaprogramming in Haskell”, where we considered unifying the two most popular representations of Haskell’s syntax: the small and convenient AST in the popular library Haskell-Src-Exts (HSE), and the large decorated AST used inside GHC’s front-end (HsSyn).