A low-allocation template engine (Part 1). Objectos 0.4.3.1 released
Welcome to Objectos Weekly issue #015.
In this issue I want to talk a bit about how the Objectos template engines work under the hood; in particular, how they try to achieve one of their goals: minimizing allocation where possible. While I will talk about Objectos Code specifically, the same idea applies to Objectos HTML and CSS. So, in the long run, Objectos should be able to serve dynamically created web pages (or fragments) while trying to minimize allocation. At some point I might add JSON to the mix, but there are no plans at the moment.
Additionally, I have released Objectos 0.4.3.1. It should have been 0.4.3, but I messed up the previous release by naming it 0.4.3 instead of 0.4.2. Welp! At times there are advantages to not having users yet.
Let's begin.
Should I care about allocation?
Probably not. Write code that is clear and readable.
Having said that, as a library developer, I care about allocation and I think minimizing it is a problem worth solving. Granted, minimizing allocation is just one of the goals I am trying to achieve with Objectos.
What I am trying to say is: I am not trying to make up anyone's mind. But I do hope to find like-minded people who, perhaps, will find this article interesting.
How do text-based templates work?
I will try to give a quick overview of how some text-based template engines work.
Please note that, while I have read through some of the source code of said engines, I have not studied any of them in depth.
Reading and parsing
Let's revisit the code of our hypothetical engine from the previous issue:
var engine = HypotheticalLibrary.createEngine();
Template tmpl;
try {
  tmpl = engine.loadTemplate("iface.tmpl");
} catch (IOException e) {
  System.err.println("Failed to load template");
  return;
}
Now, how would one make this work?
The first step is to read the template from the disk. The second step is to parse the template.
The engines that I looked into do those two steps in one go: a buffered reader is used and the input is scanned for template language directives.
The result of this process is an abstract syntax tree (AST). In other words, an in-memory tree of allocated objects representing the loaded and parsed template.
This process involves both I/O and a number of objects being allocated. Therefore, the engines will usually cache the resulting template. While necessary, the cache itself allocates a number of other objects.
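To make the parsing step more concrete, here is a minimal sketch of a parser for a `{{key}}` style template language. All names (`MiniParser`, `Node`, `Text`, `Directive`) are hypothetical and do not come from any real engine. The sketch also assumes the template has already been read into a string, produces a flat list instead of a full tree, and leaves out caching.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical AST node types: parsing allocates one object per fragment.
interface Node {}

record Text(String value) implements Node {}

record Directive(String key) implements Node {}

final class MiniParser {
  // Splits the source into plain-text nodes and {{key}} directive nodes.
  static List<Node> parse(String source) {
    var nodes = new ArrayList<Node>();
    int pos = 0;
    while (pos < source.length()) {
      int open = source.indexOf("{{", pos);
      if (open < 0) {
        // No more directives: the rest is plain text.
        nodes.add(new Text(source.substring(pos)));
        break;
      }
      int close = source.indexOf("}}", open + 2);
      if (close < 0) {
        throw new IllegalArgumentException("Unclosed directive at index " + open);
      }
      if (open > pos) {
        nodes.add(new Text(source.substring(pos, open)));
      }
      nodes.add(new Directive(source.substring(open + 2, close).trim()));
      pos = close + 2;
    }
    return nodes;
  }
}
```

Even for a one-line template such as `package {{pkgName}};` this sketch allocates a list, three node objects and three substrings, which hints at why engines tend to cache the parsed result.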
Traversing and generating
Once again, we revisit our hypothetical engine example:
var tmpdir = System.getProperty("java.io.tmpdir");
var target = Path.of(tmpdir, "HelloWorld.java");
var data = Map.of(
  "pkgName", "com.example",
  "name", "HelloWorld"
);
try {
  tmpl.write(target, data);
} catch (IOException e) {
  System.err.println("Failed to write HelloWorld.java");
  return;
}
We now have our template in memory. It is a tree of objects, so we can traverse it.
When we encounter a node containing plain text we write it to the output directly.
When we encounter a node containing a language directive we dispatch it to the code responsible for executing such directive.
For example, let's suppose the node is for the directive {{name}}:
- in the provided data map, it would look for a mapping for the name key;
- if the map contains such a key, the directive would get the mapped value and write it to the output; and
- if the map does not contain such a key, then the behavior varies depending on the engine implementation.
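The traversal described above can be sketched as follows. The types `TemplateNode`, `TextNode`, `DirectiveNode` and `MiniRenderer` are hypothetical, and the choice to throw on a missing key is just one assumption among the behaviors an engine might pick.

```java
import java.util.List;
import java.util.Map;

// Hypothetical node type: each node knows how to write itself to the output.
interface TemplateNode {
  void render(Map<String, String> data, StringBuilder out);
}

// Plain text is written to the output directly.
record TextNode(String value) implements TemplateNode {
  public void render(Map<String, String> data, StringBuilder out) {
    out.append(value);
  }
}

// A {{key}} directive looks up its key in the data map.
record DirectiveNode(String key) implements TemplateNode {
  public void render(Map<String, String> data, StringBuilder out) {
    String value = data.get(key);
    if (value == null) {
      // Assumption: this sketch fails fast; real engines differ here.
      throw new IllegalStateException("No mapping for key: " + key);
    }
    out.append(value);
  }
}

final class MiniRenderer {
  // Visits the nodes in order and lets each one write its part of the output.
  static String render(List<TemplateNode> nodes, Map<String, String> data) {
    var out = new StringBuilder();
    for (TemplateNode node : nodes) {
      node.render(data, out);
    }
    return out.toString();
  }
}
```

Here the dispatch is done through an interface method; a visitor or a switch over node types would serve the same purpose.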
An alternative
So we are generating a Java file from the following text-based template:
package {{pkgName}};
public interface {{name}} {}
One possible minimum alternative would be:
var pkgName = "com.example";
var name = "HelloWorld";
var out = new StringBuilder();
out.append("package ");
out.append(pkgName);
out.append(";");
out.append(System.lineSeparator());
out.append("public interface ");
out.append(name);
out.append(" {}");
out.append(System.lineSeparator());
// write file
Apart from the strings such as "com.example" and "package " it only allocates:
- the StringBuilder itself (with its fields); and
- a new byte[] each time the StringBuilder resizes its storage.
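If the approximate size of the output is known in advance, the resize allocations can be avoided by giving the StringBuilder an initial capacity. The sketch below uses a hypothetical `PresizedBuilder` class; the capacity passed in is the caller's guess, not something computed by the code.

```java
final class PresizedBuilder {
  // Builds the same output as before, but with a caller-provided capacity
  // so that the builder does not need to grow while appending.
  static StringBuilder build(String pkgName, String name, int capacity) {
    var out = new StringBuilder(capacity);
    out.append("package ").append(pkgName).append(";");
    out.append(System.lineSeparator());
    out.append("public interface ").append(name).append(" {}");
    out.append(System.lineSeparator());
    return out;
  }
}
```

For the example values, a capacity of 64 is larger than the generated text, so the builder keeps its original storage.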
But I see two main problems with this approach:
- it is error-prone and not programmer friendly. For example, the code above does not emit an empty line between the package and interface declarations; and
- it is not a template engine.
Introducing an API
In order to make it easier to use we should introduce an API. One example could be:
var pkgName = "com.example";
var name = "HelloWorld";
var out = new StringBuilder();
var tmpl = new JavaTemplate(out);
tmpl.packageDeclaration(pkgName);
tmpl.interfaceDeclarationStart();
tmpl.addModifier(PUBLIC);
tmpl.setName(name);
tmpl.interfaceDeclarationEnd();
// write file
It is an improvement over the previous iteration.
But I still think it has some issues. The main problem with this approach is that this API is imperative; I feel that templates should be declarative.
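To show what "imperative" means here, this is a minimal sketch of how such an API could be implemented. The class name `ImperativeJavaTemplate` and the `Modifier` enum are hypothetical; only the method names come from the example above.

```java
// Hypothetical modifier constants; the example above only uses PUBLIC.
enum Modifier {
  PUBLIC("public");

  final String keyword;

  Modifier(String keyword) {
    this.keyword = keyword;
  }
}

// Every call writes straight to the output, so the caller is responsible
// for invoking the methods in the right order.
final class ImperativeJavaTemplate {
  private final StringBuilder out;

  ImperativeJavaTemplate(StringBuilder out) {
    this.out = out;
  }

  void packageDeclaration(String pkgName) {
    out.append("package ").append(pkgName).append(";\n\n");
  }

  void interfaceDeclarationStart() {
    // Nothing to write yet: modifiers precede the 'interface' keyword.
  }

  void addModifier(Modifier modifier) {
    out.append(modifier.keyword).append(' ');
  }

  void setName(String name) {
    out.append("interface ").append(name);
  }

  void interfaceDeclarationEnd() {
    out.append(" {}\n");
  }
}
```

In this sketch, calling setName before addModifier would silently produce invalid Java, which is the kind of ordering burden a declarative API is meant to remove.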
A declarative API
All right, let's try and evolve our API into a declarative one.
var pkgName = "com.example";
var name = "HelloWorld";
var tmpl = JavaTemplate.javaTemplate(
  JavaTemplate.packageDeclaration(pkgName),
  JavaTemplate.interfaceDeclaration(
    JavaTemplate.PUBLIC, JavaTemplate.name(name)
  )
);
// write file
In the code above, all of the methods and variables from the JavaTemplate class are static.
So we could rely on static imports like so:
var pkgName = "com.example";
var name = "HelloWorld";
var tmpl = javaTemplate(
  packageDeclaration(pkgName),
  interfaceDeclaration(
    PUBLIC, name(name)
  )
);
I think this is an improvement over the previous iteration.
Still, I see one issue here: the template is represented by an object graph. So, besides being written in pure Java, what advantage does it provide over the text-based template?
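To see where the object graph comes from, here is a minimal sketch of how those static methods could be implemented. The class name `GraphJavaTemplate`, the record types and the `render` method are assumptions made for illustration only.

```java
import java.util.List;

final class GraphJavaTemplate {
  // Hypothetical node types; each factory call below allocates one of them.
  interface Element {}

  record PackageDecl(String name) implements Element {}

  record Modifier(String keyword) implements Element {}

  record Name(String value) implements Element {}

  record InterfaceDecl(List<Element> parts) implements Element {}

  record Template(List<Element> elements) {}

  static final Modifier PUBLIC = new Modifier("public");

  static Template javaTemplate(Element... elements) {
    return new Template(List.of(elements));
  }

  static PackageDecl packageDeclaration(String name) {
    return new PackageDecl(name);
  }

  static Name name(String value) {
    return new Name(value);
  }

  static InterfaceDecl interfaceDeclaration(Element... parts) {
    return new InterfaceDecl(List.of(parts));
  }

  // Walks the graph and writes the Java source text.
  static String render(Template template) {
    var out = new StringBuilder();
    for (Element element : template.elements()) {
      if (element instanceof PackageDecl pkg) {
        out.append("package ").append(pkg.name()).append(";\n\n");
      } else if (element instanceof InterfaceDecl decl) {
        String name = "";
        for (Element part : decl.parts()) {
          if (part instanceof Modifier mod) {
            out.append(mod.keyword()).append(' ');
          } else if (part instanceof Name n) {
            name = n.value();
          }
        }
        out.append("interface ").append(name).append(" {}\n");
      }
    }
    return out.toString();
  }
}
```

In this sketch, building the two-declaration template allocates a node object per factory call plus the varargs arrays and the lists that hold them, all before a single character of output is written.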
A low allocation API
So, can we provide a declarative API like the previous one, but one that does not allocate as many objects?
One alternative could be the following:
class HelloWorld extends JavaTemplate {
  String pkgName = "com.example";
  String name = "HelloWorld";

  @Override
  protected void definition() {
    packageDeclaration(pkgName);
    interfaceDeclaration(
      PUBLIC, name(name)
    );
  }
}
Notice the invocation of the packageDeclaration and the interfaceDeclaration methods.
Let's assume they return a value; notice that the returned values are not being consumed.
At least not in the definition method.
This could be a first indication of not relying on object allocation.
An API that records and plays back method invocations
Let's focus on the definition method of our last iteration:
@Override
protected void definition() {
  packageDeclaration(pkgName);
  interfaceDeclaration(
    PUBLIC, name(name)
  );
}
What if we could somehow "record" all of the method invocations? In execution order, they are:
- packageDeclaration(pkgName)
- name(name)
- interfaceDeclaration(PUBLIC, name(name))
Let's assume such recording is possible.
Perhaps we could play back the recording and use an API similar to the imperative one we mentioned in a previous example:
// packageDeclaration(pkgName)
replayer.packageDeclaration(pkgName);
// name(name)
replayer.storeName(name);
// interfaceDeclaration(PUBLIC, name(name))
replayer.interfaceDeclarationStart();
replayer.addModifier(PUBLIC);
replayer.setName(replayer.loadName());
replayer.interfaceDeclarationEnd();
In a simplified way, this is what Objectos Code does.
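As a rough illustration of the idea only, and not a description of how Objectos Code is actually implemented, the recording could be stored as small integer opcodes in arrays that are allocated once and reused. Everything in the sketch below (`RecordingTemplate`, `Part`, `Marker`, the opcodes, the array sizes, the `render` method) is hypothetical.

```java
abstract class RecordingTemplate {
  // Hypothetical marker types. They are enum constants, so passing or
  // returning them allocates nothing.
  interface Part {}

  enum Modifier implements Part {
    PUBLIC("public");

    final String keyword;

    Modifier(String keyword) { this.keyword = keyword; }
  }

  enum Marker implements Part { PACKAGE, NAME, INTERFACE }

  protected static final Modifier PUBLIC = Modifier.PUBLIC;
  private static final Modifier[] MODIFIERS = Modifier.values();
  private static final int OP_PACKAGE = 1, OP_NAME = 2, OP_MODIFIER = 3, OP_INTERFACE = 4;

  // Fixed-size storage (sizes are arbitrary), allocated once per template.
  private final int[] codes = new int[64];
  private final String[] strings = new String[16];
  private int codeCount;
  private int stringCount;

  protected abstract void definition();

  protected final Part packageDeclaration(String pkgName) {
    emit(OP_PACKAGE, store(pkgName));
    return Marker.PACKAGE;
  }

  protected final Part name(String value) {
    emit(OP_NAME, store(value));
    return Marker.NAME;
  }

  protected final Part interfaceDeclaration(Part first, Part second) {
    emitIfModifier(first);
    emitIfModifier(second);
    emit(OP_INTERFACE, 0);
    return Marker.INTERFACE;
  }

  // Records the invocations made by definition(), then plays them back.
  public final void render(StringBuilder out) {
    codeCount = 0;
    stringCount = 0;
    definition();
    int modifiers = 0; // bit set of pending modifier ordinals
    String pendingName = null;
    for (int i = 0; i < codeCount; i += 2) {
      int arg = codes[i + 1];
      switch (codes[i]) {
        case OP_PACKAGE -> out.append("package ").append(strings[arg]).append(";\n\n");
        case OP_NAME -> pendingName = strings[arg];
        case OP_MODIFIER -> modifiers |= 1 << arg;
        case OP_INTERFACE -> {
          for (Modifier m : MODIFIERS) {
            if ((modifiers & (1 << m.ordinal())) != 0) {
              out.append(m.keyword).append(' ');
            }
          }
          out.append("interface ").append(pendingName).append(" {}\n");
          modifiers = 0;
          pendingName = null;
        }
        default -> throw new IllegalStateException("Unknown opcode " + codes[i]);
      }
    }
  }

  private void emitIfModifier(Part part) {
    if (part instanceof Modifier m) {
      emit(OP_MODIFIER, m.ordinal());
    }
  }

  // Each recorded invocation is an (opcode, argument) pair.
  private void emit(int opcode, int arg) {
    codes[codeCount++] = opcode;
    codes[codeCount++] = arg;
  }

  private int store(String s) {
    strings[stringCount] = s;
    return stringCount++;
  }
}

final class HelloWorldTemplate extends RecordingTemplate {
  String pkgName = "com.example";
  String name = "HelloWorld";

  @Override
  protected void definition() {
    packageDeclaration(pkgName);
    interfaceDeclaration(PUBLIC, name(name));
  }
}
```

Calling `new HelloWorldTemplate().render(out)` with a StringBuilder writes the package and interface declarations to it. The methods called from definition() only write into the preallocated arrays and return shared enum constants, so the same template instance can be rendered repeatedly without building an object graph each time.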
In the next issue we will take a closer look at how the "recording" and "playing back" is implemented.
Stay tuned.
Objectos 0.4.3.1 released
You can read the full release notes here.
Until the next issue of Objectos Weekly
So that's it for today. I hope you enjoyed reading.
Please send me an e-mail if you have comments, questions or corrections regarding this post.