XSLT Recipes
Introduction
I will add new recipes here from time to time, until some software for handling recipes gets written.
Useful links
- W3Schools tutorial - a place for absolute beginners
- Zwon tutorial - an example-based tutorial
- XSLT FAQ - a great FAQ on XSLT; if you can't find the answer there, you should actually read this article :)
- XSLT recommendation - if you can't find the answer here, check the recommendation. Keep in mind that every implementation differs slightly from the others and that not 100% of this document is actually implemented.
There is much more on W3Schools than first appears. Make sure you have clicked every link on each page. It is more likely that the user interface is crappy than that the information is missing.
If you are the kind of person who learns from complex examples (like me), check the XSLT files used on this site. They are fairly advanced and work in all browsers (at least I'm doing everything I can to keep it that way).
Ensuring every browser can handle site
Choosing XSLT as the main templating engine forces us to ensure that every possible browser can handle it. Unfortunately, there are browsers without XSL support, and for them we need to transform documents server-side. The most important case where we MUST do this is search engine crawlers such as Googlebot. If you send them XML, the page will end up not indexed, or not indexed properly.
Browsers with no XSLT support hold less than 5% of the market, but this includes every cell phone browser, which is a fast-growing market, so we can't tell how things will look in the near future.
There are several server-side XSLT processors around. Currently, because the other 95% of browsers do the work themselves, any free solution is sufficient.
On the server we should distinguish browsers by the HTTP_USER_AGENT header. In Python it looks like this:
import re

# env is the request environment; sendXML and processXSLT are defined elsewhere
if re.search("Gecko|IE|Opera", env.get("HTTP_USER_AGENT", "")):
    sendXML()        # the browser runs the transformation itself
else:
    processXSLT()    # transform on the server and send the result

These three browsers are XSLT compatible (we assume here that versions older than IE 6, Firefox 1.5 and Opera 8.50 are obsolete; you can write more detailed checks if you need to), and all others are considered incompatible or unknown. For the latter we use server-side processing.
Every time you find another compatible browser, add its name to the list, because this is the simplest way to lower server load.
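The check above can be wrapped in a small function so that it is easy to test and to extend. This is only a sketch: the names supports_xslt and choose_response are made up for illustration, env is assumed to be a CGI/WSGI-style dictionary, and the pattern is the same coarse one used above.

```python
import re

# Coarse pattern from the text; extend it when another XSLT-capable browser turns up.
XSLT_CAPABLE = re.compile(r"Gecko|IE|Opera")


def supports_xslt(user_agent):
    """Return True when the User-Agent looks like a browser that can run XSLT itself."""
    return bool(user_agent) and XSLT_CAPABLE.search(user_agent) is not None


def choose_response(env):
    """Decide what to send: raw XML for capable browsers, server-side HTML otherwise."""
    # A missing header is treated as an unknown client and gets server-side output.
    if supports_xslt(env.get("HTTP_USER_AGENT", "")):
        return "xml"
    return "html"
```

Keeping the pattern in one place means that adding a newly verified browser is a one-line change.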
Cross-browser compatibility
Now we are limited to four processors: MSXML (IE), Transformiix (Firefox), Opera, and one of your choice (mine is libxslt). It seems that all of them process templates in the same way, but there are places where you can get a headache figuring out what is wrong.
Paths
The most annoying issue is the handling of paths to external files. A server-side processor resolves relative paths against the file system, so if we have:
- directory
  - xml
    - file.xml
  - xslt
    - index.xsl
    - foo.xsl
and the directory is in /home/adrian, then the full path to file.xml is /home/adrian/directory/xml/file.xml
The same file in a browser is http://example.com/xml/file.xml, so we need relative paths to process the same file with both kinds of processor. Opera and Transformiix resolve relative paths to external files against the XSLT file being processed, so writing ../xml/file.xml should be sufficient. And it is! But IE resolves them differently: against the page URL. So when we are on http://example.com/something it works, but if the URL is http://example.com/something/1/2 it looks for file.xml under http://example.com/something/xml/file.xml.
One solution is to make all paths absolute URIs. The disadvantage is that the server-side processor is forced to download the file over HTTP, which is significantly slower than reading it from disk (or from some cache), but... it works.
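The effect of the two base URLs can be reproduced with Python's standard URL resolution. This is only an illustration of the arithmetic, not of any browser's internals, and the stylesheet address http://example.com/xslt/index.xsl is an assumption based on the directory layout above.

```python
from urllib.parse import urljoin

RELATIVE = "../xml/file.xml"

# Firefox and Opera: the base is the stylesheet URL (assumed location).
from_stylesheet = urljoin("http://example.com/xslt/index.xsl", RELATIVE)

# IE: the base is the URL of the page being viewed.
from_shallow_page = urljoin("http://example.com/something", RELATIVE)
from_deep_page = urljoin("http://example.com/something/1/2", RELATIVE)

print(from_stylesheet)
print(from_shallow_page)
print(from_deep_page)
```

The first two results point at the same file, while the third picks up the /something/ prefix, which is why the relative path only breaks on deeper page URLs.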
Checking if file exists
It appears that IE6 (that is, one of the MSXML versions) does not support the document() function properly, which makes a cross-browser check for file existence impossible. All other processors return an empty node-set if the file does not exist, so:
boolean(document($path))
returns true or false as expected.
Try to replace many files with one file. If they are small, you can benefit from fewer requests and better caching.
If you really need to check for file existence, you can redirect any request that results in an HTTP 404 error to a proper XML file with this content:
<?xml version="1.0" encoding="UTF-8"?>
<notFound/>
and then check the root node.
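The same root-node check can be sketched outside XSLT, for example in a server-side script that fetches the file. The helper name is_missing is hypothetical, and the documents are passed in as bytes rather than fetched, to keep the example self-contained.

```python
import xml.etree.ElementTree as ET


def is_missing(xml_bytes):
    """True when the document is the <notFound/> placeholder served instead of a 404."""
    return ET.fromstring(xml_bytes).tag == "notFound"


placeholder = b'<?xml version="1.0" encoding="UTF-8"?>\n<notFound/>'
real_file = b'<?xml version="1.0" encoding="UTF-8"?>\n<texts><pageTitle>Hello world!</pageTitle></texts>'

print(is_missing(placeholder))
print(is_missing(real_file))
```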
XPath in predicates
If you want to match nodes by a dynamically evaluated property, you can use either a variable or a function inside the predicate.
Unfortunately, not every expression behaves inside a predicate the way you might expect. You can't do something like:
<x:value-of select="//*[local-name()=@name]"/>
This does not give the intended result: inside the predicate, @name is evaluated against the node being tested, not against the node the template is currently processing. On the right side of the comparison you can, however, use functions and variables. A variable can be used in the following way:
<x:variable name="name" select="@name"/>
<x:value-of select="//*[local-name()=$name]"/>
This selects the nodes whose name matches the name attribute. But when you want to save bandwidth, and every byte matters more than evaluation speed (which is always true when the browser's XSLT processor is in use!), there is a better solution:
<x:value-of select="//*[local-name()=current()/@name]"/>
As mentioned above, you can use a function, in this case current(), which returns the current node of the context from which the XPath expression was invoked (. also returns a current node, but inside the predicate it refers to the node being tested by the XPath expression).
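The difference between the two context nodes can be mimicked in plain Python. This is a rough analogy, not an XPath engine: the sample document and the helper names are invented for illustration, and the two functions only imitate what the predicates compare.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<root>'
    '<ref name="item"/>'
    '<item>first</item>'
    '<item name="item">second</item>'
    '<other/>'
    '</root>'
)


def match_by_current(root, current):
    """Imitates //*[local-name()=current()/@name]: compare against the current node's attribute."""
    wanted = current.get("name")
    return [el for el in root.iter() if el.tag == wanted]


def match_by_self(root):
    """Imitates //*[local-name()=@name]: each element is compared with its own attribute."""
    return [el for el in root.iter() if el.tag == el.get("name")]


via_current = match_by_current(doc, doc.find("ref"))
via_self = match_by_self(doc)

print([el.text for el in via_current])
print([el.text for el in via_self])
```

With ref as the current node, the first query finds both item elements, while the second finds only the element that happens to carry its own name in its name attribute.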
Multilingual UI
There are many ways to build a multilingual UI, but as long as you are creating a web site and XSLT can pull data from external XML documents, it is good to do it browser-side. The path passed to the document() function can be evaluated dynamically, so you can determine the current language at run time and do something like:
<x:variable name="path">http://example.com/translations/<x:value-of select="$lang"/>.xml</x:variable>
<x:variable name="langdoc" select="document($path)/*"/>
Here $lang is a string with the language name (or an abbreviation, e.g. "en"). We keep the document's root element in langdoc to gain a small performance boost and to save our fingers from repeatedly typing the name of the root element of the XML file.
Language file is simply XML file with nodes with unique names:
<texts>
  <pageTitle>Hello world!</pageTitle>
  <language>english</language>
</texts>
Or in Polish:
<texts>
  <pageTitle>Witaj świecie!</pageTitle>
  <language>angielski</language>
</texts>
You can use langdoc as follows:
<x:copy-of select="$langdoc/pageTitle/node()"/>

<x:variable name="element">pageTitle</x:variable>
<x:copy-of select="$langdoc/*[local-name()=$element]/node()"/>

<x:for-each select="*">
  <x:copy-of select="$langdoc/*[local-name()=current()/@langElement]/node()"/>
</x:for-each>
First of all, the call is really compact. You don't need any external templates or server functionality. I'm using this on this site, so you can see how it works in a real environment.
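For comparison, the same lookup can be sketched server-side in Python. This is a minimal sketch: the TRANSLATIONS dictionary stands in for the files under http://example.com/translations/, and the helper names load_langdoc and text_for are hypothetical.

```python
import xml.etree.ElementTree as ET

# In-memory stand-ins for the per-language XML files shown above.
TRANSLATIONS = {
    "en": "<texts><pageTitle>Hello world!</pageTitle><language>english</language></texts>",
    "pl": "<texts><pageTitle>Witaj świecie!</pageTitle><language>angielski</language></texts>",
}


def load_langdoc(lang):
    """Parse the language file once and return its root element (the langdoc)."""
    return ET.fromstring(TRANSLATIONS[lang])


def text_for(langdoc, element):
    """Look up a UI string by element name; None when the key is missing."""
    node = langdoc.find(element)
    return node.text if node is not None else None


langdoc = load_langdoc("pl")
print(text_for(langdoc, "pageTitle"))
```

As in the XSLT version, the root element is parsed once and reused for every lookup.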
Embedding XML in XML
If you want to embed an XML document in another XML document, you should be aware that CDATA sections can't be nested. A common way to do the embedding is to wrap the inner document in a CDATA section, like this:
<document>
<xml>
<![CDATA[<innerDocument><node/></innerDocument>]]>
</xml>
</document>
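There is a problem when the inner document contains its own CDATA sections. Since they cannot be nested, the first ]]> of the inner document ends the outer section and the parser reports an error such as mismatched tags. The solution is to replace every occurrence of ]]> in the inner document so that the terminator is split across two CDATA sections; the widely used replacement is ]]]]><![CDATA[>. Which exact form you need depends on how the result is processed afterwards, so if you are unsure, test it and pick the one that works.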
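The replacement can be checked with a short round trip. This sketch uses the common ]]]]><![CDATA[> form of the split; the function name wrap_in_cdata is hypothetical, and the outer element names follow the example above.

```python
import xml.etree.ElementTree as ET


def wrap_in_cdata(text):
    """Wrap text in a CDATA section, splitting any ]]> so it cannot end the section early."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


# The inner document has a CDATA section of its own.
inner = "<innerDocument><![CDATA[a < b]]></innerDocument>"
outer = "<document><xml>" + wrap_in_cdata(inner) + "</xml></document>"

# Parsing the outer document gives back the inner document as plain text.
recovered = ET.fromstring(outer).find("xml").text
print(recovered == inner)
```

The parser joins the adjacent CDATA sections back into one text node, so the consumer sees the inner document unchanged.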
Iterate exactly n times
If the number is not large and you don't want to use recursion, you can do something like this (here the loop runs five times, provided the document contains at least five nodes):
<x:for-each select="(//node()|//*/@*)[position()&lt;6]">
  ...
</x:for-each>
You can pull in other XML documents to enlarge the set of nodes, but remember that this is a slow technique. It has its benefits on the browser side, but not on the server. Consider recursion in cases where speed is more important.

