Recipe 8.8. Deepening an XML Hierarchy

Problem

You have a poorly designed document that can use extra structure.[5]

Solution

This is the opposite problem from that solved in Recipe 8.7. Here you need to add structure to a document, possibly to organize its elements by some additional criteria.

Add structure based on existing data

This deepening transformation undoes the flattening transformation performed in Recipe 8.7:

    <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:import href="copy.xslt"/>

      <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
      <xsl:strip-space elements="*"/>

      <xsl:template match="people">
        <union>
          <xsl:apply-templates select="person[@class = 'union']"/>
        </union>
        <salaried>
          <xsl:apply-templates select="person[@class = 'salaried']"/>
        </salaried>
      </xsl:template>

    </xsl:stylesheet>

Add structure to correct a poorly designed document

In a misguided effort to streamline XML, some people attempt to encode information by inserting sibling elements rather than parent elements.[6]
For example, suppose someone distinguished between union and salaried employees in the following way:

    <people>
      <class name="union"/>
      <person>
        <firstname>Warren</firstname>
        <lastname>Rosenbaum</lastname>
        <age>37</age>
        <height>5.75</height>
      </person>
      ...
      <person>
        <firstname>Theresa</firstname>
        <lastname>Archul</lastname>
        <age>37</age>
        <height>5.5</height>
      </person>
      <class name="salaried"/>
      <person>
        <firstname>Sal</firstname>
        <lastname>Mangano</lastname>
        <age>37</age>
        <height>5.75</height>
      </person>
      ...
      <person>
        <firstname>James</firstname>
        <lastname>O'Riely</lastname>
        <age>33</age>
        <height>5.5</height>
      </person>
    </people>

Notice that the class elements signifying union and salaried are now empty. The intent is that all following siblings of a class element belong to that class until another class element is encountered or there are no more siblings. This type of encoding is easy to grasp, but more difficult for an XSLT program to process.

To correct this representation, you need to create a stylesheet that computes the set difference between the person elements following a given class element and the person elements following the next class element. XSLT 1.0 does not have an explicit set difference function. You can get essentially the same effect more efficiently by selecting only those person elements following a class element whose position precedes the first person following the next class element:

    <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:import href="copy.xslt"/>

      <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
      <xsl:strip-space elements="*"/>

      <xsl:template match="class">
        <!-- The last position we want to consider: the people following
             this class minus the people following any later class.
             Counting from this class (rather than from the total number
             of people) keeps the result correct when there are more
             than two classes. -->
        <xsl:variable name="pos"
          select="count(following-sibling::person) -
                  count(following-sibling::class/following-sibling::person)"/>
        <xsl:element name="{@name}">
          <!-- Copy people that follow this class but whose position is
               less than or equal to $pos. -->
          <xsl:copy-of select="following-sibling::person[position() &lt;= $pos]"/>
        </xsl:element>
      </xsl:template>

      <!-- Ignore person elements. They were copied above. -->
      <xsl:template match="person"/>

    </xsl:stylesheet>

More subtly, a key can be used as follows:

    <xsl:key name="people" match="person"
             use="preceding-sibling::class[1]/@name"/>

    <xsl:template match="people">
      <people>
        <xsl:apply-templates select="class"/>
      </people>
    </xsl:template>

    <xsl:template match="class">
      <xsl:element name="{@name}">
        <xsl:copy-of select="key('people', @name)"/>
      </xsl:element>
    </xsl:template>

A step-by-step approach is another alternative:

    <xsl:template match="people">
      <people>
        <xsl:apply-templates select="class[1]"/>
      </people>
    </xsl:template>

    <xsl:template match="class">
      <xsl:element name="{@name}">
        <xsl:apply-templates select="following-sibling::*[1][self::person]"/>
      </xsl:element>
      <xsl:apply-templates select="following-sibling::class[1]"/>
    </xsl:template>

    <xsl:template match="person">
      <xsl:copy-of select="."/>
      <xsl:apply-templates select="following-sibling::*[1][self::person]"/>
    </xsl:template>

XSLT 2.0

Add structure based on existing data

Using XSLT 2.0's xsl:for-each-group allows you to achieve a more generic solution than we did in the 1.0 solution.
Although there are 1.0 solutions that are generic (see Discussion), none is quite as simple:

    <xsl:template match="people">
      <!-- Group on the class attribute carried by each person, as in the
           output of Recipe 8.7. -->
      <xsl:for-each-group select="person" group-by="@class">
        <xsl:element name="{current-grouping-key()}">
          <xsl:apply-templates select="current-group()"/>
        </xsl:element>
      </xsl:for-each-group>
    </xsl:template>

Add structure to correct a poorly designed document

You can exploit xsl:for-each-group with the group-starting-with option to solve this problem:

    <xsl:stylesheet version="2.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:import href="copy.xslt"/>

      <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

      <xsl:template match="people">
        <xsl:copy>
          <xsl:for-each-group select="*" group-starting-with="class">
            <!-- The context item is the class element that starts the group. -->
            <xsl:element name="{@name}">
              <xsl:apply-templates select="current-group()[not(self::class)]"/>
            </xsl:element>
          </xsl:for-each-group>
        </xsl:copy>
      </xsl:template>

    </xsl:stylesheet>

Discussion

Add structure based on existing data

When you added structure based on existing data, you explicitly referred to the criteria that formed the categories of interest (e.g., union and salaried). It would be better if the stylesheet figured out these categories by itself. This makes the stylesheet more generic at the cost of added complexity:

    <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:import href="copy.xslt"/>

      <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
      <xsl:strip-space elements="*"/>

      <!-- Build a unique list of all classes -->
      <xsl:variable name="classes"
        select="/*/*/@class[not(. = ../preceding-sibling::*/@class)]"/>

      <xsl:template match="/*">
        <!-- For each class, create an element named after that class
             that contains the elements of that class -->
        <xsl:for-each select="$classes">
          <xsl:variable name="class-name" select="."/>
          <xsl:element name="{$class-name}">
            <xsl:for-each select="/*/*[@class=$class-name]">
              <xsl:copy>
                <xsl:apply-templates/>
              </xsl:copy>
            </xsl:for-each>
          </xsl:element>
        </xsl:for-each>
      </xsl:template>

    </xsl:stylesheet>

Although not 100% generic, this stylesheet avoids making assumptions about what kinds of classes exist in the document. The only application-specific information in this stylesheet is the fact that the categories are encoded in an attribute named class and that the attribute occurs in elements that are two levels down from the root.

Add structure to correct a poorly designed document

The solution can be implemented explicitly in terms of set difference. This solution is elegant, but impractical for large documents with many categories. The trick used here for computing set difference is explained in Recipe 9.1:

    <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:import href="copy.xslt"/>

      <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
      <xsl:strip-space elements="*"/>

      <xsl:template match="class">
        <!-- All people following this class element -->
        <xsl:variable name="nodes1" select="following-sibling::person"/>
        <!-- All people following the next class element -->
        <xsl:variable name="nodes2"
          select="following-sibling::class/following-sibling::person"/>
        <xsl:element name="{@name}">
          <!-- Keep the nodes of $nodes1 that are not in $nodes2 -->
          <xsl:copy-of select="$nodes1[count(. | $nodes2) != count($nodes2)]"/>
        </xsl:element>
      </xsl:template>

      <xsl:template match="person"/>

    </xsl:stylesheet>
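
For processors that support XSLT 2.0, the same set difference can be written directly with the XPath 2.0 except operator instead of the count-based idiom. The following is a minimal sketch rather than part of the original recipe. It assumes the sibling-encoded input shown in the Solution (a people element whose children are class markers and person elements), and it leaves out the copy.xslt import so that it stands alone:

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <!-- Assumption: class and person elements are direct children of people. -->
  <xsl:template match="people">
    <xsl:copy>
      <!-- Visit only the class markers; each one pulls in its own people. -->
      <xsl:apply-templates select="class"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="class">
    <xsl:element name="{@name}">
      <!-- People after this class, except those after any later class. -->
      <xsl:copy-of
        select="following-sibling::person
                except following-sibling::class/following-sibling::person"/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

Because except returns its result in document order without duplicates, the people inside each class keep their original order. The cost concern noted for the 1.0 version likely still applies, since every class element scans all of its following siblings.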