Using the Objectos HTML Pseudo DOM API. Objectos 0.5.3 released
Welcome to Objectos Weekly issue #020.
I have released Objectos 0.5.3! It introduces the Pseudo DOM (pseudom) API for Objectos HTML. The pseudom API replaces the Visitor API shown in the previous issue of the newsletter. The Visitor API has been removed from Objectos HTML in this release. If you are interested, this is the full list of changes.
In this issue I will show you how to use the new pseudom API.
Let's begin.
Before we begin
I use Objectos in production. The Objectos website is generated using Objectos HTML and other Objectos libraries. The internal Objectos CI process also uses other Objectos libraries, such as Objectos GIT.
However, please know that Objectos 0.5.3 is an alpha release. In particular:
- it is not stable. I expect it to fail if you deviate slightly from the use case shown here;
- there may be breaking API changes between releases; and
- documentation is a work in progress.
Generating different representations of your template
Suppose you have a blog and wish to provide an Atom feed. This would allow the readers of your blog to be notified of the latest articles by using an RSS reader. Not only that: depending on how you set up your Atom feed, readers can access the full article without leaving their RSS reader.
Providing the full article
An Atom feed is an XML file. I won't go into the details of the Atom format. If you're interested, here is RFC 4287.
Just know that:
- the feed element is the root element;
- it may contain a number of entry elements. Each represents a distinct article of your blog; and
- entry elements may contain a content element. That is where the article content goes.
So it may look like the following:
<feed xmlns="http://www.w3.org/2005/Atom">
  ...
  <entry>
    ...
    <content type="html">content goes here</content>
    ...
  </entry>
  ...
</feed>
Notice that, in our example, the content element has the html value for its type attribute.
Let's look into that.
The type attribute
RFC 4287 says the following about the type attribute:
If the value of "type" is "html", the content of the Text construct MUST NOT contain child elements and SHOULD be suitable for handling as HTML [HTML]. Any markup within MUST be escaped; for example, "<br>" as "&lt;br>".
So if our blog post is something like:
<h1>Post title</h1>
<p>Intro paragraph</p>
The content element will be rendered like the following:
<content type="html">
&lt;h1&gt;Post title&lt;/h1&gt;
&lt;p&gt;Intro paragraph&lt;/p&gt;
</content>
Notice that, while the RFC does not require the '>' symbol to be escaped, we will escape it anyway.
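The escaping rule can be sketched in a few lines of plain Java. This is only a minimal sketch and not the code Objectos uses: the AtomEscape class and its escape method are hypothetical names introduced for illustration. It assumes that escaping '&', '<' and '>' is enough for markup placed inside a content element.

```java
// Hypothetical helper, NOT part of Objectos: escapes markup so that it can be
// embedded in an Atom <content type="html"> element.
final class AtomEscape {

  private AtomEscape() {}

  static String escape(String html) {
    var sb = new StringBuilder(html.length() + 16);

    for (int i = 0; i < html.length(); i++) {
      char c = html.charAt(i);

      switch (c) {
        case '&' -> sb.append("&amp;");
        case '<' -> sb.append("&lt;");
        // not required by the RFC, escaped anyway as in this article
        case '>' -> sb.append("&gt;");
        default -> sb.append(c);
      }
    }

    return sb.toString();
  }

  public static void main(String[] args) {
    // prints: &lt;h1&gt;Post title&lt;/h1&gt;
    System.out.println(escape("<h1>Post title</h1>"));
  }
}
```

The ampersand is handled in the same pass as the angle brackets, so an entity already present in the input, such as `&amp;`, is escaped once more and survives the round trip through the feed reader.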
Our example
We will generate the content element for the following Objectos HTML template:
import objectos.html.HtmlTemplate;

public class BlogPost extends HtmlTemplate {
  @Override
  protected final void definition() {
    doctype();

    html(
      head(
        title("A pseudom example")
      ),
      body(
        h1("Title"),
        p("Intro text"),
        h2("Subtitle"),
        p("More text"),
        pre(code(
          "class Foo {}"
        ))
      )
    );
  }
}
Everything that is inside the body tag must be included in our result.
Our entry content writer
We will create a feed entry content writer using the new pseudom API.
The pseudom API provides a DocumentProcessor interface, which gives you access to an HtmlDocument. The latter gives you access to the HTML elements defined in our template.
Let's have our writer implement the DocumentProcessor interface:
import static java.lang.System.out;

import objectos.html.pseudom.DocumentProcessor;
import objectos.html.pseudom.HtmlDocument;
...

final class EntryContentWriter implements DocumentProcessor {
  @Override
  public final void process(HtmlDocument document) {
    for (var node : document.nodes()) {
      if (node instanceof HtmlElement element) {
        findBody(element);
      }
    }
  }

  ...
}
Notice that we are statically importing the out member of java.lang.System. We will write our result directly to it.
Next, let's look at the process method.
The process method
The DocumentProcessor interface defines a single process method, which we implemented like so:
@Override
public final void process(HtmlDocument document) {
  for (var node : document.nodes()) {
    if (node instanceof HtmlElement element) {
      findBody(element);
    }
  }
}
We iterate over the nodes of our document.
If the node is an HtmlElement, then we have to look for the body element.
Remember, our result must contain all of the children of the body element.
Searching for the body element
The findBody method is implemented like so:
private void findBody(HtmlElement element) {
  if (element.hasName(StandardElementName.BODY)) {
    consumeBody(element);
  } else {
    for (var node : element.nodes()) {
      if (node instanceof HtmlElement child) {
        findBody(child);
      }
    }
  }
}
If the current element is the body element, then we consume it.
Otherwise we keep looking for the body element.
We do it by:
- iterating over the element's nodes; and
- recursively calling the findBody method.
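The same search can be tried out without Objectos on the classpath. The following is a minimal sketch of a depth-first search by element name. Node, Text and Element are hypothetical stand-ins and not the pseudom types. Unlike findBody, the sketch returns the element it finds, which is only safe here because the stand-in tree is immutable.

```java
import java.util.List;

// Self-contained sketch. Node, Text and Element are hypothetical stand-ins
// for illustration; they are NOT the Objectos pseudom types.
final class FindByName {

  interface Node {}

  record Text(String value) implements Node {}

  record Element(String name, List<Node> nodes) implements Node {}

  // Depth-first search: returns the first element with the given name,
  // or null when there is none.
  static Element find(Element element, String name) {
    if (element.name().equals(name)) {
      return element;
    }

    for (Node node : element.nodes()) {
      if (node instanceof Element child) {
        Element found = find(child, name);

        if (found != null) {
          return found;
        }
      }
    }

    return null;
  }

  public static void main(String[] args) {
    var body = new Element("body", List.of(new Element("h1", List.of(new Text("Title")))));
    var html = new Element("html", List.of(new Element("head", List.of()), body));

    // prints: true
    System.out.println(find(html, "body") == body);
  }
}
```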
Next, let's look at the consumeBody method.
Consuming the body element
Here is the implementation of the consumeBody method:
private void consumeBody(HtmlElement body) {
  for (var node : body.nodes()) {
    if (node instanceof HtmlElement element) {
      writeElement(element);
    }
  }
}
We know for sure that we are at the body element.
So we write all of the elements contained in the body.
Writing the element
The writeElement method is implemented as follows:
private void writeElement(HtmlElement element) {
  var name = element.name();

  writeStartTag(name);

  if (element.isVoid()) {
    return;
  }

  for (var node : element.nodes()) {
    consumeNode(node);
  }

  writeEndTag(name);
}
First, we write the start tag of the element.
If the element is void then it will not have any contents and we can exit the method early. If it is a normal element, then we:
- consume its nodes, i.e., any text or child element; and
- write the end tag.
Writing the element's contents
For completeness, here's the implementation of the consumeNode method:
private void consumeNode(HtmlNode node) {
  if (node instanceof HtmlElement element) {
    writeElement(element);
  } else if (node instanceof HtmlText text) {
    writeText(text.value());
  }
}
Therefore:
- if the node is a child element, we make a recursive call to writeElement; and
- if it is a text node, we write its value.
You can find the full source code of the processor here.
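The writeStartTag, writeEndTag and writeText helpers are not shown in this issue. To see the whole recursion run end to end, here is a self-contained sketch under stated assumptions: Node, Text and Element are hypothetical stand-ins for the pseudom types, element names are plain strings, attributes are ignored, and the helpers are assumed to emit markup already escaped for a content element with type="html". The real helpers may differ.

```java
import java.util.List;
import java.util.Set;

// Self-contained sketch. Node, Text and Element are hypothetical stand-ins
// for illustration; they are NOT the Objectos pseudom types.
final class EscapedWriterSketch {

  interface Node {}

  record Text(String value) implements Node {}

  record Element(String name, List<Node> nodes) implements Node {}

  // A subset of the HTML void elements, enough for this sketch.
  private static final Set<String> VOID_NAMES = Set.of("br", "hr", "img", "input", "link", "meta");

  private final StringBuilder out = new StringBuilder();

  void writeElement(Element element) {
    String name = element.name();

    writeStartTag(name);

    if (VOID_NAMES.contains(name)) {
      return; // void element: no contents and no end tag
    }

    for (Node node : element.nodes()) {
      if (node instanceof Element child) {
        writeElement(child);
      } else if (node instanceof Text text) {
        writeText(text.value());
      }
    }

    writeEndTag(name);
  }

  // Assumption: tags are written already escaped for the Atom content element.
  private void writeStartTag(String name) {
    out.append("&lt;").append(name).append("&gt;");
  }

  private void writeEndTag(String name) {
    out.append("&lt;/").append(name).append("&gt;");
  }

  // Text is escaped twice: once as HTML text, once more for the XML feed.
  private void writeText(String value) {
    out.append(escape(escape(value)));
  }

  private static String escape(String s) {
    return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
  }

  String result() {
    return out.toString();
  }

  public static void main(String[] args) {
    var p = new Element("p", List.of(new Text("Intro text"), new Element("br", List.of())));

    var writer = new EscapedWriterSketch();
    writer.writeElement(p);

    // prints: &lt;p&gt;Intro text&lt;br&gt;&lt;/p&gt;
    System.out.println(writer.result());
  }
}
```

The sketch appends to a StringBuilder instead of System.out only to make the result easy to inspect.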
Running our example
We write a small program to run our example:
public static void main(String... args) {
  var sink = new HtmlSink();

  var post = new BlogPost();

  var writer = new EntryContentWriter();

  sink.toProcessor(post, writer);
}
When executed, it prints:
&lt;h1&gt;Title&lt;/h1&gt;
&lt;p&gt;Intro text&lt;/p&gt;
&lt;h2&gt;Subtitle&lt;/h2&gt;
&lt;p&gt;More text&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;class Foo {}&lt;/code&gt;
&lt;/pre&gt;
Which then can be used in an Atom feed XML file, like so:
<content type="html">
&lt;h1&gt;Title&lt;/h1&gt;
&lt;p&gt;Intro text&lt;/p&gt;
&lt;h2&gt;Subtitle&lt;/h2&gt;
&lt;p&gt;More text&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;class Foo {}&lt;/code&gt;
&lt;/pre&gt;
</content>
This is how the Objectos site's Atom feed is generated.
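Since the writer in this issue prints to System.out, its output has to be captured before it can be placed in a feed file. One possible way, sketched with JDK classes only, is to swap System.out temporarily. CaptureStdout and capture are hypothetical names and not part of Objectos, and the sketch assumes a single thread is writing to System.out while the capture is active.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, NOT part of Objectos: runs an action and returns
// whatever it wrote to System.out.
final class CaptureStdout {

  private CaptureStdout() {}

  static String capture(Runnable action) {
    PrintStream original = System.out;

    var bytes = new ByteArrayOutputStream();

    try (var capture = new PrintStream(bytes, true, StandardCharsets.UTF_8)) {
      System.setOut(capture);

      action.run();
    } finally {
      // always restore the original stream, even if the action throws
      System.setOut(original);
    }

    return bytes.toString(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // stands in for running the entry content writer
    String escaped = capture(() -> System.out.print("&lt;h1&gt;Title&lt;/h1&gt;"));

    // prints: <content type="html">&lt;h1&gt;Title&lt;/h1&gt;</content>
    System.out.println("<content type=\"html\">" + escaped + "</content>");
  }
}
```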
Why is it called Pseudo DOM?
The Pseudo DOM (pseudom) API is named this way because, well, it is not a real DOM API:
- internally it is a forward-only event streamer (think StAX Iterator API); and
- at any given time, there's a single HtmlElement instance which is reused in iterations (recursive included).
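For readers who have not used it, the StAX Iterator API mentioned above ships with the JDK in the javax.xml.stream package. The following minimal sketch reads a small XML string in a single forward pass and collects the names of the start elements. StaxIteratorSketch and startElementNames are hypothetical names chosen for this example. It only illustrates the forward-only style and says nothing about how pseudom is implemented.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Illustration of the JDK's StAX Iterator API; unrelated to Objectos itself.
final class StaxIteratorSketch {

  private StaxIteratorSketch() {}

  static List<String> startElementNames(String xml) {
    var names = new ArrayList<String>();

    try {
      XMLEventReader reader = XMLInputFactory.newInstance()
          .createXMLEventReader(new StringReader(xml));

      try {
        // events are pulled one at a time; there is no way to go back
        while (reader.hasNext()) {
          XMLEvent event = reader.nextEvent();

          if (event.isStartElement()) {
            names.add(event.asStartElement().getName().getLocalPart());
          }
        }
      } finally {
        reader.close();
      }
    } catch (XMLStreamException e) {
      throw new IllegalStateException("Could not parse XML", e);
    }

    return names;
  }

  public static void main(String[] args) {
    // prints: [feed, entry, content]
    System.out.println(startElementNames("<feed><entry><content>x</content></entry></feed>"));
  }
}
```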
To illustrate, consider the following Objectos HTML template:
public class WhyPseudoDom extends HtmlTemplate {
  @Override
  protected final void definition() {
    h1("Why the pseudom name?");
    p("Just an example");
  }
}
And we write the following DocumentProcessor for it:
public class WhyPseudoDomProc implements DocumentProcessor {
  @Override
  public final void process(HtmlDocument document) {
    var nodes = document.nodes();
    var nodesIter = nodes.iterator();

    assertTrue(nodesIter.hasNext());
    var h1 = (HtmlElement) nodesIter.next();
    assertTrue(h1.hasName(StandardElementName.H1));

    assertTrue(nodesIter.hasNext());
    var p = (HtmlElement) nodesIter.next();
    assertTrue(p.hasName(StandardElementName.P));

    assertTrue(h1 == p);
    assertTrue(h1.hasName(StandardElementName.H1));
  }

  private void assertTrue(boolean expected) {
    if (!expected) {
      throw new AssertionError();
    }
  }
}
This program works fine until the last assertion:
assertTrue(h1.hasName(StandardElementName.H1));
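This assertion fails because the previous one, h1 == p, evaluates to true: both variables refer to the same reused HtmlElement instance, which by then represents the p element.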
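The effect is easy to reproduce outside Objectos. The following minimal sketch shows a forward-only iterator that hands out the same mutable object on every call to next. Cursor and ReusingIterator are hypothetical names invented for this illustration and are not the pseudom implementation. It also shows the usual workaround: copy whatever you need from the current element before advancing.

```java
import java.util.Iterator;
import java.util.List;

// Self-contained illustration of instance reuse. Cursor and ReusingIterator
// are hypothetical names; they are NOT part of the Objectos API.
final class InstanceReuseSketch {

  // Mutable view over "the current element".
  static final class Cursor {
    private String name;

    String name() {
      return name;
    }
  }

  // Forward-only iterator that returns the same Cursor on every call to next().
  static final class ReusingIterator implements Iterator<Cursor> {
    private final Iterator<String> names;

    private final Cursor cursor = new Cursor();

    ReusingIterator(List<String> names) {
      this.names = names.iterator();
    }

    @Override
    public boolean hasNext() {
      return names.hasNext();
    }

    @Override
    public Cursor next() {
      cursor.name = names.next(); // overwrite the state instead of allocating

      return cursor;
    }
  }

  public static void main(String[] args) {
    var iter = new ReusingIterator(List.of("h1", "p"));

    Cursor h1 = iter.next();
    String h1Name = h1.name(); // copy what you need before advancing

    Cursor p = iter.next();

    System.out.println(h1 == p);   // prints: true (same instance)
    System.out.println(h1.name()); // prints: p (the old reference sees the new state)
    System.out.println(h1Name);    // prints: h1 (the copy is unaffected)
  }
}
```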
Until the next issue of Objectos Weekly
So that's it for today. I hope you enjoyed reading.
The source code of all of the examples is in this GitHub repository.
Please send me an e-mail if you have comments, questions or corrections regarding this post.