goxml / goxpath / goxslt

How XSLT works

XSLT is a declarative language for transforming XML documents. If you come from an imperative background (Python, Go, JavaScript), the model takes some getting used to — but once it clicks, XSLT is remarkably concise for tree-shaped data.

The pipeline

Every transformation has three pieces:

A source document — the XML you want to transform.
A stylesheet — a collection of template rules that say what to produce for each kind of node.
A result tree — the output the processor builds by running the rules.

The processor walks the source tree, finds the best-matching rule for each node it visits, and runs that rule’s body, which builds nodes in the result tree. Which nodes are visited and which rule is chosen are both controlled by xsl:apply-templates, the central dispatch mechanism.
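As a rough illustration of the three pieces, here is a minimal sketch. The element names (menu, dish) and the data are invented for this example. First, a source document:

```xml
<menu>
  <dish>Soup</dish>
  <dish>Salad</dish>
</menu>
```

Then a stylesheet with one rule per kind of node:

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Drop whitespace-only text nodes from the source so the
       indentation between <dish> elements is not visited. -->
  <xsl:strip-space elements="*"/>

  <!-- Rule for <menu>: emit <ul>, then dispatch on the children. -->
  <xsl:template match="menu">
    <ul>
      <xsl:apply-templates/>
    </ul>
  </xsl:template>

  <!-- Rule for each <dish>: emit one list item holding its text. -->
  <xsl:template match="dish">
    <li><xsl:value-of select="."/></li>
  </xsl:template>

</xsl:stylesheet>
```

Assuming a conforming processor, the result tree is the following (indentation added for readability):

```xml
<ul>
  <li>Soup</li>
  <li>Salad</li>
</ul>
```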

Templates and patterns

A template rule is an xsl:template with a match attribute. The match is a pattern: an XPath-like expression that says which nodes the rule applies to.

<xsl:template match="book">
  <li><xsl:value-of select="title"/></li>
</xsl:template>

When xsl:apply-templates reaches a <book> node, this rule fires. Its body is a sequence constructor — a piece of XML mixed with XSLT instructions — that produces the output for that node.

If multiple rules match the same node, the processor first keeps only the rules with the highest import precedence; among those, the one with the highest priority wins. A pattern like book[@featured] is more specific than book and so has a higher default priority (0.5 versus 0).
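The following sketch shows priorities at work within a single stylesheet, so import precedence plays no role. The attributes featured and out-of-print and the class values are hypothetical, chosen only for illustration. Given this source:

```xml
<library>
  <book><title>Dune</title></book>
  <book featured="yes"><title>Emma</title></book>
  <book featured="yes" out-of-print="yes"><title>Ulysses</title></book>
</library>
```

and these rules:

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="library">
    <ul><xsl:apply-templates select="book"/></ul>
  </xsl:template>

  <!-- Plain element name: default priority 0. -->
  <xsl:template match="book">
    <li><xsl:value-of select="title"/></li>
  </xsl:template>

  <!-- Name plus predicate: default priority 0.5, so it beats "book". -->
  <xsl:template match="book[@featured]">
    <li class="featured"><xsl:value-of select="title"/></li>
  </xsl:template>

  <!-- An explicit priority overrides the default of 0.5. -->
  <xsl:template match="book[@out-of-print]" priority="2">
    <li class="unavailable"><xsl:value-of select="title"/></li>
  </xsl:template>

</xsl:stylesheet>
```

the result tree should be (indentation added):

```xml
<ul>
  <li>Dune</li>
  <li class="featured">Emma</li>
  <li class="unavailable">Ulysses</li>
</ul>
```

The third book matches all three book rules; the explicit priority="2" decides which one fires.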

The context node

When a template fires, the matched node becomes the context node. XPath expressions inside the body — like title or @year — are interpreted relative to it. So <xsl:value-of select="title"/> reads the <title> child of the current <book>.

xsl:apply-templates without a select attribute defaults to select="child::node()", so this very common pattern walks the children of the current node:

<xsl:template match="library">
  <ul>
    <xsl:apply-templates/>
  </ul>
</xsl:template>
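A small sketch of how the context node shifts as rules fire. The year attribute and the sample data are assumptions made for this example:

```xml
<library>
  <book year="1965"><title>Dune</title></book>
  <book year="1984"><title>Neuromancer</title></book>
</library>
```

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="library">
    <ul>
      <!-- Context node here is <library>. An explicit select visits
           only the <book> children and skips whitespace text nodes. -->
      <xsl:apply-templates select="book"/>
    </ul>
  </xsl:template>

  <xsl:template match="book">
    <!-- Context node is now the matched <book>, so "title" and
         "@year" are resolved relative to that book. -->
    <li><xsl:value-of select="title"/> (<xsl:value-of select="@year"/>)</li>
  </xsl:template>

</xsl:stylesheet>
```

The result tree (indentation added):

```xml
<ul>
  <li>Dune (1965)</li>
  <li>Neuromancer (1984)</li>
</ul>
```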

Push style and pull style

There are two styles of writing XSLT, and most stylesheets mix them:

Push style lets the processor drive: you write one rule per kind of node, use <xsl:apply-templates/> to recurse, and let pattern matching dispatch. Good for documents with mixed or recursive structure (HTML, DocBook, articles).

<xsl:template match="article">
  <article>
    <xsl:apply-templates/>
  </article>
</xsl:template>

<xsl:template match="para">
  <p><xsl:apply-templates/></p>
</xsl:template>

Pull style uses xsl:for-each and explicit XPath to grab exactly the nodes you want. Good for table-like, regular data (reports, exports).

<xsl:template match="/">
  <table>
    <xsl:for-each select="//order">
      <tr>
        <td><xsl:value-of select="@id"/></td>
        <td><xsl:value-of select="customer"/></td>
      </tr>
    </xsl:for-each>
  </table>
</xsl:template>

Push style scales better when the input structure is unpredictable; pull style is easier to follow when the structure is fixed.
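A sketch of mixing the two styles in one stylesheet: the regular table skeleton is pulled with xsl:for-each, while the free-form content of a note is pushed through template rules. The note and em elements are hypothetical additions to the order data used above.

```xml
<orders>
  <order id="A1">
    <customer>Ada</customer>
    <note>Ship <em>today</em> please.</note>
  </order>
</orders>
```

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Pull: the table layout is fixed, so select exactly what is needed. -->
  <xsl:template match="/">
    <table>
      <xsl:for-each select="orders/order">
        <tr>
          <td><xsl:value-of select="@id"/></td>
          <td><xsl:value-of select="customer"/></td>
          <!-- Push: the note has mixed content, so let rules dispatch. -->
          <td><xsl:apply-templates select="note/node()"/></td>
        </tr>
      </xsl:for-each>
    </table>
  </xsl:template>

  <!-- Fires for any <em> reached by apply-templates. Text nodes have
       no rule here, so the built-in rule copies their text. -->
  <xsl:template match="em">
    <b><xsl:apply-templates/></b>
  </xsl:template>

</xsl:stylesheet>
```

The result tree (indentation added):

```xml
<table>
  <tr>
    <td>A1</td>
    <td>Ada</td>
    <td>Ship <b>today</b> please.</td>
  </tr>
</table>
```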

The identity transform

The most useful idiom in XSLT is the identity transform — a template that copies any node it’s given, recursively. In XSLT 3.0 you get this for free with one declaration at the top of the stylesheet:

<xsl:mode on-no-match="shallow-copy"/>

This tells the processor: for any node that no template explicitly matches, copy it shallowly and recurse into its attributes and children. With nothing else in the stylesheet, the result is a copy of the input.

The real power comes when you add one template that overrides the handling of a specific element. Everything else is still copied verbatim, so the stylesheet describes only the exceptions:

<xsl:mode on-no-match="shallow-copy"/>

<xsl:template match="price">
  <xsl:copy>
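    <!-- format-number keeps two decimals and hides floating-point noise from the multiplication -->
    <xsl:value-of select="format-number(. * 1.19, '0.00')"/>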
  </xsl:copy>
</xsl:template>
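Two other common overrides are dropping an element and renaming one. The sketch below assumes a hypothetical catalog with name and internal-note elements:

```xml
<catalog>
  <item sku="42">
    <name>Lamp</name>
    <internal-note>check supplier</internal-note>
  </item>
</catalog>
```

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Empty body: a matched element produces nothing, so it disappears. -->
  <xsl:template match="internal-note"/>

  <!-- Rename <name> to <label>, still processing its attributes and content. -->
  <xsl:template match="name">
    <label>
      <xsl:apply-templates select="@*|node()"/>
    </label>
  </xsl:template>

</xsl:stylesheet>
```

Apart from whitespace, the result is:

```xml
<catalog>
  <item sku="42">
    <label>Lamp</label>
  </item>
</catalog>
```

Everything not mentioned in the stylesheet (catalog, item, the sku attribute, the text) is copied unchanged.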

Before XSLT 3.0, the same thing was written as an explicit rule that you had to remember:

<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

You will still see this idiom in older stylesheets and tutorials. For ordinary documents the two produce the same result; the XSLT 3.0 form is just shorter.

Sequences and types

Since XSLT 2.0, which is built on XPath 2.0, XSLT works on sequences of items, not just node sets. An item is a node, an atomic value (string, integer, boolean, date, …), or, from XPath 3.0 on, a function item. A sequence may be empty, contain one item, or many. There is no distinction between a single item and a sequence of length 1.

Many XSLT instructions accept and return sequences:

<xsl:variable name="primes" as="xs:integer*" select="2, 3, 5, 7, 11"/>
<xsl:value-of select="$primes" separator=", "/>

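Type declarations via as are optional but useful: they catch mistakes early and document the intent. The xs prefix used in type names must be bound to the namespace http://www.w3.org/2001/XMLSchema on the stylesheet element.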
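A self-contained sketch of a few sequence operations, including the xs namespace binding that the as attribute relies on. The result element names (report, count, and so on) are made up for this example:

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="xs">

  <xsl:template match="/">
    <xsl:variable name="primes" as="xs:integer*" select="2, 3, 5, 7, 11"/>
    <report>
      <!-- Number of items in the sequence. -->
      <count><xsl:value-of select="count($primes)"/></count>
      <!-- Aggregate over atomic values. -->
      <sum><xsl:value-of select="sum($primes)"/></sum>
      <!-- A predicate filters a sequence just as it filters nodes. -->
      <big><xsl:value-of select="$primes[. gt 4]" separator=" "/></big>
      <!-- Nested sequences flatten; the empty sequence adds nothing. -->
      <flat><xsl:value-of select="count((1, (2, 3), ()))"/></flat>
      <!-- A single item and a one-item sequence are the same thing. -->
      <same><xsl:value-of select="deep-equal(7, (7))"/></same>
    </report>
  </xsl:template>

</xsl:stylesheet>
```

Run against any source document, the result tree is (indentation added):

```xml
<report>
  <count>5</count>
  <sum>28</sum>
  <big>5 7 11</big>
  <flat>3</flat>
  <same>true</same>
</report>
```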

Modes

A template can belong to a mode, which is a named group of rules. Different modes can have completely different rule sets for the same nodes. This is how you produce two different views of the same input — for example, a table-of-contents and the body text:

<xsl:template match="/">
  <toc>  <xsl:apply-templates select="//section" mode="toc"/></toc>
  <body> <xsl:apply-templates select="//section"/></body>
</xsl:template>

<xsl:template match="section">
  <section><xsl:apply-templates/></section>
</xsl:template>

<xsl:template match="section" mode="toc">
  <li><xsl:value-of select="title"/></li>
</xsl:template>
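For a complete run, here is a variation on the same idea as a full stylesheet with sample input. The doc, para, page, and h2 names are assumptions for illustration, and the input has no nested sections:

```xml
<doc>
  <section><title>Intro</title><para>Hello.</para></section>
  <section><title>Usage</title><para>Run it.</para></section>
</doc>
```

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Entry point: visit the same sections twice, once per mode. -->
  <xsl:template match="/">
    <page>
      <toc><xsl:apply-templates select="doc/section" mode="toc"/></toc>
      <body><xsl:apply-templates select="doc/section"/></body>
    </page>
  </xsl:template>

  <!-- toc mode: one entry per section, title only. -->
  <xsl:template match="section" mode="toc">
    <li><xsl:value-of select="title"/></li>
  </xsl:template>

  <!-- Default (unnamed) mode: full rendering of the section. -->
  <xsl:template match="section">
    <section>
      <h2><xsl:value-of select="title"/></h2>
      <xsl:apply-templates select="para"/>
    </section>
  </xsl:template>

  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>

</xsl:stylesheet>
```

The result tree (indentation added):

```xml
<page>
  <toc>
    <li>Intro</li>
    <li>Usage</li>
  </toc>
  <body>
    <section><h2>Intro</h2><p>Hello.</p></section>
    <section><h2>Usage</h2><p>Run it.</p></section>
  </body>
</page>
```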

See Modes for the details.

Where to next

Instructions — reference for every supported XSLT instruction
XSLT functions — XSLT-specific XPath functions like current(), key(), regex-group()
XPath reference — the expression language inside XSLT
