XPath and Layout Functions (old XPath module)

This page describes the old XPath parser, called “luxor”. There is a new XPath module called “lxpath”. To choose between the two, set the xpath configuration option, for example on the command line with

sp --xpath luxor

(the old one)

or

sp --xpath lxpath

for the new, default one. You can also set this in the configuration file.

XPath expressions

The speedata Publisher accepts XPath expressions in some attributes. These attributes are called test or select, or are documented as accepting XPath. In all other attributes, XPath expressions can be used inside curly braces ({ and }). In the following example, XPath expressions are used in the attribute width and in the element Value. The width of the text block is taken from the variable $width; the content of the paragraph is the string value of the current node.

<Textblock width="{$width}">
  <Paragraph>
    <Value select="."/>
  </Paragraph>
</Textblock>

The following XPath expressions are handled by the software:

  • Number: Return the value without change: "5"
  • Text: Return the text without change: 'hello world'
  • Arithmetic operation (*, div, idiv, +, -, mod). Example: ( 6 + 4.5 ) * 2
  • Variables. Example: $column + 2
  • Access to the current node (dot operator). Example: . + 2
  • Access to subelements. Examples: productdata, node(), *, foo/bar
  • Attribute access in the current node. Example @a
  • Attribute access in subnodes, for example foo/@bar
  • Boolean expressions: <, >, <=, >=, =, !=. Attention: in XML the less-than symbol < must be written as &lt;; the symbol > may be written as &gt;. Example: $number > 6. Can be used in tests.
  • Simple if/then/else expressions: if (…​) then …​ else …​.
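
As a hedged illustration of these expression types, the following fragment combines arithmetic, a variable, an if/then/else expression and comparisons in test attributes. The variable names and message texts are made up for this sketch, and the fragment is assumed to sit in a layout where the usual commands are available. Note that the less-than sign is escaped as &lt; inside the attribute.

```xml
<!-- Sketch only: the variables "number" and "label" are illustrative names. -->
<Record element="data">
  <SetVariable variable="number" select="( 6 + 4.5 ) * 2"/>
  <SetVariable variable="label" select="if ($number > 6) then 'large' else 'small'"/>
  <Switch>
    <!-- The less-than sign must be escaped inside an XML attribute. -->
    <Case test="$number &lt; 10">
      <Message select="'number is below 10'"/>
    </Case>
    <Otherwise>
      <Message select="$label"/>
    </Otherwise>
  </Switch>
</Record>
```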

The following XPath functions are known to the system:

There are two classes of XPath functions: standard XPath functions and speedata Publisher specific ones. The specific functions are in the namespace urn:speedata:2009/publisher/functions/en (denoted by sd: below). The standard functions should behave as documented in the XPath 2.0 standard.

sd layout functions

sd:allocated(x,y,<areaname>,<framenumber>)

Return true if the grid cell is allocated, false otherwise (since 2.3.71).

sd:alternating(<type>, <text>,<text>,.. )

On each call the next element will be returned. You can define more alternating sequences by using distinct type values. Example: sd:alternating('tbl', 'White','Gray') can be used for alternating color of table rules. To reset the state, use sd:reset-alternating(<type>).
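
A minimal sketch of the table use case mentioned above, assuming the data contains child elements named article with a name attribute and that colors called white and lightgray are available. Both are assumptions for illustration, not part of this reference.

```xml
<!-- Sketch only: the element "article", its attribute "name" and the colors are assumed. -->
<PlaceObject>
  <Table>
    <ForAll select="article">
      <!-- Each row asks the 'tbl' sequence for its next value. -->
      <Tr background-color="{sd:alternating('tbl', 'white', 'lightgray')}">
        <Td><Paragraph><Value select="@name"/></Paragraph></Td>
      </Tr>
    </ForAll>
  </Table>
</PlaceObject>
```

If a later table should start with the first color again, call sd:reset-alternating('tbl') before it.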

sd:aspectratio(<imagename>,[pagenumber],[pdfbox])

Return the width of the given image divided by its height (< 1 for portrait images, > 1 for landscape images). The arguments can contain a page number and a PDF box, see Specifying the page for layout functions for details.

sd:attr(<name>, …​)

Is the same as @name, but can be used to dynamically construct the attribute name. See example at sd:variable().

sd:count-saved-pages(<name>)

Return the number of saved pages from <SavePages>.

sd:current-column(<name>)

Return the current column. If name is given, return the column of the given frame.

sd:current-framenumber(<name>)

Return the current frame number of given positioning area.

sd:current-page()

Return the current page number.

sd:current-row(<name>)

Return the current row. If name is given, return the row of the given frame.

sd:decode-base64(<contents>)

Expect a string encoded in base64 and return the binary contents.

sd:decode-html(<node>)

Change text such as &lt;i&gt;italic&lt;/i&gt; into HTML markup (<i>italic</i> in this case).

sd:dimexpr(<string>)

Deprecated. Evaluate the string as a dimension with a unit. Handy if you want to add or multiply lengths. Example: "2mm + 5cm". Since version 4.5.7 you can calculate with dimensions directly.

sd:dummytext([<count>])

Return a “Lorem ipsum…” dummy text. The count defaults to 1.

sd:even(<number>)

True if number is even. Example: sd:even(sd:current-page()).

sd:file-exists(<filename or URI schema>)

True if file exists in the current search path. Otherwise it returns false.

sd:filecontents(<binarycontent>)

Save the given contents into a file and return the file name.

sd:firstmark(pagenumber)

The first marker of the given page number. Useful for headings in dictionaries where the first and the last entry of a page are shown.

sd:first-free-row(<name>)

Return the first free row of this area (experimental).

sd:format-number(Number or string, thousands separator, comma separator)

Format the number: insert thousands separators and replace the decimal separator (the “comma separator” argument). Example: sd:format-number(12345.67, ',','.') returns the string 12,345.67.
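
The separators can be swapped for other locales. The following sketch assumes a price attribute on the current data node (a hypothetical name) and formats it with a dot between thousands and a comma before the decimals.

```xml
<!-- Sketch only: @price is an assumed attribute of the current data node. -->
<Paragraph>
  <!-- '.' separates thousands, ',' marks the decimals. -->
  <Value select="sd:format-number(@price, '.', ',')"/>
</Paragraph>
```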

sd:format-string(object, object, …​ ,formatting instructions)

Return a text string with the objects formatted as given by the formatting instructions. These instructions are the same as those of the C function printf().
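
A small sketch of the argument order, with the objects first and the formatting instructions last. The values are arbitrary, and the placeholders are the usual printf() ones; how the Publisher handles less common placeholders is not covered here.

```xml
<Paragraph>
  <!-- %s takes the first object as text, %.2f the second with two decimals. -->
  <Value select="sd:format-string('Total', 12.3456, '%s: %.2f')"/>
</Paragraph>
```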

sd:group-height(<string>[, <unit>])

Return the given group’s height in grid cells. See sd:group-width(…). If the optional second argument is given, the function returns the height of the group in multiples of this unit. For example sd:group-height('mygroup', 'in') returns the group height in inches.

sd:group-width(<string>[, <unit>])

Return the given group’s width in grid cells. The argument must be the name of an existing group. Example: sd:group-width('My group'). See sd:group-height() for a description of the second parameter.
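
The following sketch creates a group and then queries its size, once in grid cells and once in millimeters. The group name and its contents are placeholders chosen for this example.

```xml
<!-- Sketch only: group name and contents are illustrative. -->
<Record element="data">
  <Group name="mygroup">
    <Contents>
      <PlaceObject>
        <Textblock width="5">
          <Paragraph><Value>Some text</Value></Paragraph>
        </Textblock>
      </PlaceObject>
    </Contents>
  </Group>
  <!-- Height in grid cells, then width in millimeters. -->
  <Message select="sd:group-height('mygroup')"/>
  <Message select="sd:group-width('mygroup', 'mm')"/>
</Record>
```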

sd:imageheight(<filename or URI schema>,[pagenumber],[pdfbox],[<unit>])

Natural height of the image in grid cells. Attention: if the image is not found, the height of the file-not-found placeholder is returned, so check in advance whether the image exists. If the optional unit argument is given, the function returns the height of the image in multiples of this unit. For example sd:imageheight('myimage.pdf', 'in') returns the height of myimage.pdf in inches. The arguments can also contain a page number and a PDF box, see Specifying the page for layout functions for details.

sd:imagewidth(<filename or URI schema>,[pagenumber],[pdfbox],[<unit>])

Natural width of the image in grid cells. Attention: if the image is not found, the width of the file-not-found placeholder is returned, so check in advance whether the image exists. If the optional unit argument is given, the function returns the width of the image in multiples of this unit. For example sd:imagewidth('myimage.pdf', 'in') returns the width of myimage.pdf in inches. The arguments can also contain a page number and a PDF box, see Specifying the page for layout functions for details.
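
Because both image functions return the placeholder size for missing files, one way to guard against that is to test with sd:file-exists() first. The sketch below assumes the file name is stored in an image attribute of the current data node, which is a made-up name.

```xml
<!-- Sketch only: @image is an assumed attribute holding the file name. -->
<Switch>
  <Case test="sd:file-exists(@image)">
    <!-- Safe to ask for the size now, the file was found. -->
    <Message select="sd:imagewidth(@image, 'mm')"/>
    <PlaceObject>
      <Image file="{@image}"/>
    </PlaceObject>
  </Case>
  <Otherwise>
    <Message select="concat('image not found: ', @image)"/>
  </Otherwise>
</Switch>
```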

sd:keep-alternating(<type>)

Use the current value of sd:alternating(<type>) without changing the value.

sd:lastmark(pagenumber)

The last marker of the given page number. Useful for headings in dictionaries where the first and the last entry of a page are shown.
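
A dictionary-style running head could be sketched as follows. This assumes that markers have been set with <Mark> on the entries of the page and that the paragraph is output where the page header is built; neither is shown here.

```xml
<!-- Sketch only: assumes marks were set with <Mark> on the entries of each page. -->
<Paragraph>
  <Value select="sd:firstmark(sd:current-page())"/>
  <Value> - </Value>
  <Value select="sd:lastmark(sd:current-page())"/>
</Paragraph>
```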

sd:loremipsum()

Same as sd:dummytext().

sd:markdown(<text>)

Renders the text as markdown. See Markdown.

sd:md5(<value>,<value>, …)

Return the MD5 sum of the concatenation of each value as a hex string. Example: sd:md5('hello ', 'world') gives the string 5eb63bbbe01eeed093cb22bb8f5acdc3.

sd:merge-pagenumbers(<pagenumbers>,<separator for range>,<separator for space>, [hyperlinks])

Merge page numbers. For example the numbers "1, 3, 4, 5" are merged into 1, 3–5. The default separator for ranges is an en dash (–); the default separator between entries is ', ' (comma, space). This function sorts the page numbers and removes duplicates. When the separator for ranges is empty, no ranges are built and all page numbers are separated by the separator for the space. If hyperlinks is set to true(), the page numbers become active links. The default is false(). The function shows the user visible page numbers, which correspond to the logical page numbers by default.
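
A minimal sketch with both separators given explicitly. It assumes the page numbers are passed as one comma-separated string, as in the example above; according to the description, the result should be 1, 3–5.

```xml
<Paragraph>
  <!-- Range separator is an en dash, entries are separated by comma and space. -->
  <Value select="sd:merge-pagenumbers('1, 3, 4, 5', '–', ', ')"/>
</Paragraph>
```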

sd:mode(<string>[,<string>…​])

Return true() if one of the specified modes is set. A mode can be set from the command line or from the configuration file. See Control of the layout when calling the Publisher.
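
As a sketch, a layout could branch on a mode like this. The mode name print is hypothetical and has to match whatever name is set when calling the Publisher.

```xml
<!-- Sketch only: 'print' is an assumed mode name. -->
<Switch>
  <Case test="sd:mode('print')">
    <Message select="'print mode is active'"/>
  </Case>
  <Otherwise>
    <Message select="'print mode is not active'"/>
  </Otherwise>
</Switch>
```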

sd:number-of-columns()

Number of columns in the current grid.

sd:number-of-pages(<filename or URI schema>)

Return the number of pages of a (PDF) file.

sd:number-of-rows()

Number of rows in the current grid.

sd:odd(<number>)

True if number is odd.

sd:pagenumber(<string>)

Get the number of the page on which the given mark is placed. See the command <Mark>.

sd:pageheight(<unit>)

Similar to sd:pagewidth(), just for the height.

sd:pagewidth(<unit>)

Get the width of the page in the given unit, as a number without the unit. For example, for a page that is 210mm wide, sd:pagewidth("mm") returns 210. This function initializes a page. (Since version 4.13.8.)
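
Since the result is a plain number, it can be used directly in arithmetic. The following sketch stores half of the page width in millimeters; the variable name is illustrative.

```xml
<!-- Sketch only: the variable name "halfwidth" is illustrative. -->
<Record element="data">
  <SetVariable variable="halfwidth" select="sd:pagewidth('mm') div 2"/>
  <Message select="$halfwidth"/>
</Record>
```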

sd:romannumeral(<number>)

Convert the number into a lowercase Roman numeral.

sd:randomitem(<Value>,<Value>, …)

Return one of the values.

sd:reset-alternating(<type>)

Reset alternating so the next sd:alternating() starts again from the first element.

sd:sha1(<value>,<value>, …)

Return the SHA-1 sum of the concatenation of each value as a hex string. Example: sd:sha1('hello ', 'world') gives the string 2aae6c35c94fcfb415dbe95f408b9ce91ee846ed.

sd:sha256(<value>,<value>, …)

Return the SHA-256 sum of the concatenation of each value as a hex string. Example: sd:sha256('hello ', 'world') gives the string b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9.

sd:sha512(<value>,<value>, …)

Return the SHA-512 sum of the concatenation of each value as a hex string. Example: sd:sha512('hello ', 'world') gives the string 309ecc489c12d6eb4cc40f50c902f2b4d0ed77ee511a7c7a9bcd3ca86d4cd86f989dd35bc5ff499670da34255b45b0cfd830e81f605dcf7dc5542e93ae9cd76f.

sd:tounit(<string>,<string>[,<number>])

Convert the length given in the second argument to the unit given in the first argument and return the result as a number without a unit, rounded to the number of digits given in the third argument (default 0, which returns integer values). Example: sd:tounit('pt','1pc') returns 12, because there are 12pt in 1pc (pica).
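
The third argument matters as soon as the result is not a whole number. A sketch, assuming the unit names mm and in are accepted like pt and pc above:

```xml
<!-- Sketch only: the variable name "inch_in_mm" is illustrative. -->
<Record element="data">
  <!-- 1in is 25.4mm, so one decimal digit is requested to keep the fraction. -->
  <SetVariable variable="inch_in_mm" select="sd:tounit('mm', '1in', 1)"/>
  <Message select="$inch_in_mm"/>
</Record>
```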

sd:variable(<name>, …​)

The same as $name. This function allows variable names to be constructed dynamically. Example: sd:variable('myvar',$num) – if $num contains the number 3, the resulting variable name is myvar3.
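
The example from the description, written out as a sketch. The variable names and the stored text are arbitrary.

```xml
<Record element="data">
  <SetVariable variable="myvar3" select="'third value'"/>
  <SetVariable variable="num" select="3"/>
  <!-- 'myvar' and the value of $num are joined, so $myvar3 is looked up. -->
  <Message select="sd:variable('myvar', $num)"/>
</Record>
```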

sd:variable-exists(<name>)

True if variable name exists. Example: sd:variable-exists('my_bar') checks whether $my_bar is defined (variable names in this function have to be enclosed in single quotation marks if double quotation marks are used to delimit the XPath attribute).
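
One possible use is to give a variable a default value only when it has not been set elsewhere, for example from the command line. The name my_bar and the default text are placeholders.

```xml
<!-- Sketch only: sets a fallback value if $my_bar is still undefined. -->
<Switch>
  <Case test="not(sd:variable-exists('my_bar'))">
    <SetVariable variable="my_bar" select="'default value'"/>
  </Case>
</Switch>
```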

sd:visible-pagenumber(<number>)

Return the user visible page number (as defined by matters) for the given real page number.

XPath functions

abs(<number>)

Return the positive value of the number.

ceiling()

Round up to the next integer, that is, return the smallest integer that is not less than the argument. ceiling(-1.34) returns -1, ceiling(1.34) returns 2.

concat( <value>,<value>, … )

Create a new text value by concatenating the arguments.

contains(<haystack>,<needle>)

True if haystack contains needle. contains('bana','na') returns true().

count(<text>)

Counts all child elements with the given name. Example: count(article) counts how many child elements with the name article exist.

doc(<string>)

Open the file with the given file name and return its contents.

empty(<attribute>)

Check whether an attribute is missing. Returns true if the attribute is not available.

false()

Return false.

floor()

Returns the largest number with no fractional part that is not greater than the value of the argument.

last()

Return the number of sibling elements that have the same name. This does not yet conform to the XPath standard.

local-name()

Return the local name (without namespace) of the current element.

lower-case(<text>)

Return the text in lowercase letters.

matches(<text>,<regexp>[,<flags>])

Return true if the regexp matches the text. Flags can be any combination of the letters s, i and m and are described in the spec: https://www.w3.org/TR/xpath-functions-31/#flags. Example: matches("banana", "^(.a)+$") returns true.

max()

Return the maximum value. max(1.1,2.2,3.3,4.4) returns 4.4.

min()

Return the minimum value. min(1.1,2.2,3.3,4.4) returns 1.1.

not()

Negates the value of the argument. Example: not(true()) returns false().

normalize-space(<text>)

Return the text without leading and trailing spaces. All newlines will be changed to spaces. Multiple spaces/newlines will be changed to a single space.

position()

Return the position of the current node.

replace(<input>,<regexp>, <replacement>)

Replace the input using the regular expression with the given replacement text. Example: replace('banana', 'a', 'o') yields bonono.

round(<number>,<number>)

Return the argument in the first parameter rounded to number of decimal places in the second parameter. The second parameter defaults to 0.

string(<sequence>)

Return the text value of the sequence e.g. the contents of the elements.

string-join(<sequence>,separator)

Return the string value of the sequence, where each element is separated by the separator.

substring(<input>,<start>,<length>)

Return the part of the string input that starts at position start and, if length is given, has that length. In contrast to the XPath specification, start can be negative, in which case the position is counted from the end of the input.
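
A short sketch of both forms. The first call follows standard XPath, where positions start at 1, so it should give the first five characters. The second call relies on the Publisher-specific negative start described above, and its exact result is not stated here.

```xml
<Record element="data">
  <!-- Standard XPath: five characters starting at position 1. -->
  <Message select="substring('speedata', 1, 5)"/>
  <!-- Publisher extension: a negative start counts from the end of the input. -->
  <Message select="substring('speedata', -4)"/>
</Record>
```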

string-length(<string>)

Return the length of the string in characters. Multi-byte UTF-8 sequences are counted as 1.

tokenize(<input>,<regexp>)

Return a sequence of strings. The input text is read from left to right. Whenever the regular expression matches at the current position, the text read since the previous match is returned as one item. Example (from the great XPath / XSLT book by M. Kay): tokenize("Go home, Jack!", "\W+") returns the sequence "Go", "home", "Jack", "".

true()

Return true.

upper-case()

Converts the text to capital letters: upper-case('text') results in TEXT.