XPath Expressions
What is XPath?
XPath is a language for navigating XML trees and querying data. Since the Publisher’s input data is in XML format, XPath is the central tool for accessing elements, attributes and text. Examples: “give me all articles with a certain attribute” or “how many articles are in this article group?”.
For a general introduction to XPath, see the tutorial at W3Schools.
Where are XPath expressions used?
The Publisher accepts XPath expressions in two places:
- In the attributes select and test: the value is interpreted directly as XPath.
- In all other attributes: curly braces ({ and }) force an XPath expression.
<Textblock width="{ $width }">
<Paragraph>
<Value select="."/>
</Paragraph>
</Textblock>
The width of the text block is taken from the variable $width; the paragraph content is the text value of the current data node.
Supported expressions
- Numbers and text: "5", 'hello world' (returned directly).
- Arithmetic operations: *, div, idiv, +, -, mod. Example: ( 6 + 4.5 ) * 2
- Variables: $column + 2
- Current node (dot operator): . + 2
- Child elements: productdata, *, foo/bar, node()
- Parent node: ../
- Attributes: @a, foo/@bar
- Filters: article[1] selects the first article child element.
- Comparisons: <, >, <=, >=, =, !=. Note: < must be written as &lt; in XML.
- If/then/else: if (...) then ... else ...
- For expressions: for $i in (1,2,3) return $i * 2 or for $i in 1 to 3 return $i * 2
- Axes: preceding-sibling, parent, descendant-or-self, following-sibling etc.
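Several of the expressions listed above can be tried out with a small layout. The following is only a sketch: the data file described in the comment, including the element name article and the attribute nr, is a hypothetical example and not part of the Publisher. It uses only the Message element with a select attribute, so each result ends up in the log file.

```xml
<Layout xmlns="urn:speedata.de:2009/publisher/en"
  xmlns:sd="urn:speedata:2009/publisher/functions/en">
  <!-- Assumed (hypothetical) data file:
       <data>
         <article nr="1">First</article>
         <article nr="2">Second</article>
       </data> -->
  <Record element="data">
    <!-- Arithmetic: evaluates to 21 -->
    <Message select="( 6 + 4.5 ) * 2" />
    <!-- Filter plus attribute: the nr attribute of the first article -->
    <Message select="article[1]/@nr" />
    <!-- Comparison inside if/then/else; the less-than sign is escaped -->
    <Message select="if ( count(article) &lt; 5 ) then 'few' else 'many'" />
    <!-- For expression: yields the sequence 2, 4, 6 -->
    <Message select="for $i in 1 to 3 return $i * 2" />
  </Record>
</Layout>
```

Because select is interpreted directly as XPath, no curly braces are needed in any of these attributes.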
XPath and namespaces
The speedata Publisher ignores XML namespaces by default. This is usually very helpful, as namespaces are not always easy to handle.
The XML file:
<bar:data xmlns:bar="somenamespace">
<bar:child>
<bar:sub>sub</bar:sub>
</bar:child>
</bar:data>
can be addressed in the layout as follows:
<Layout xmlns="urn:speedata.de:2009/publisher/en"
xmlns:sd="urn:speedata:2009/publisher/functions/en">
<Record element="data">
...
although the namespace of the root element is somenamespace.
With <Options namespaces="strict" /> you can change this so that the namespace must be specified for <Record> and <ProcessNode>:
<Layout xmlns="urn:speedata.de:2009/publisher/en"
xmlns:sd="urn:speedata:2009/publisher/functions/en"
xmlns:sn="somenamespace">
<Options namespaces="strict" />
<Record element="sn:data">
<PlaceObject>
<Textblock>
<Paragraph>
<Value select="local-name()"></Value>
</Paragraph>
</Textblock>
</PlaceObject>
</Record>
</Layout>
The namespace itself must match. The prefix (here: sn) can be chosen freely, as long as it is bound to that namespace.
<ProcessNode> works similarly:
<ProcessNode select="*:child" />
<ProcessNode select="sn:child" />
<ProcessNode select="sn:*" />
<ProcessNode select="*" />
All four select the child element bar:child, because the prefixes sn (from the layout) and bar (from the data) are bound to the same namespace. * is a ‘wildcard’ here, so it matches all names.
<ProcessNode select="child" /> does not work in the above case: an unprefixed name in the layout belongs to the layout's default namespace urn:speedata.de:2009/publisher/en, which does not match the namespace of the data.
You must also use namespaces in the XPath queries if they are specified in the data:
<Message select="count(sn:sub)" />
writes 1 to the log file. Without the namespace prefix, the result is 0, because no element sub is found in the default namespace.
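Putting the pieces together, a strict-mode layout for the data file above might look like the following sketch. It assumes that the message is evaluated while bar:child is the current node, which is why a second Record for sn:child is used here. That arrangement is an illustration, not something prescribed by the Publisher.

```xml
<Layout xmlns="urn:speedata.de:2009/publisher/en"
  xmlns:sd="urn:speedata:2009/publisher/functions/en"
  xmlns:sn="somenamespace">
  <Options namespaces="strict" />
  <!-- Root element of the data: bar:data, addressed through the prefix sn -->
  <Record element="sn:data">
    <!-- Hand over to the record for the child element -->
    <ProcessNode select="sn:child" />
  </Record>
  <!-- Assumption: this record runs with bar:child as the current node -->
  <Record element="sn:child">
    <!-- Counts the sub children in the namespace bound to sn -->
    <Message select="count(sn:sub)" />
  </Record>
</Layout>
```

The prefix sn appears in three places (Record, ProcessNode and the XPath query), and each time it refers to the same namespace that the data binds to bar.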
