Alphabet Soup: XSL Stylesheets

Author: Baldric Little
Overview

The first tutorial in this series introduced the core Extensible Markup Language (XML) technologies. The second tutorial described the construction of a well-formed XML document. The third tutorial discussed the role of the XML schema, the primary elements of a schema, and the relationship between an XML document and an XML schema. In this tutorial, we describe the roles of XSL stylesheets and introduce the mark-up and transformation languages used in stylesheets.

Presentation and Transformation

We started this series by discussing the primary role of XML. XML is a set of technologies that simplify and enhance the distribution of data and information (Figure 1). XML does this by structuring data hierarchically using tags to convey the meaning of each unit of data (called an element in XML terminology).

Figure 1: Information Sharing Using XML (XML Document, XML Schema)

The core repository for data in an XML-based system is the XML document. An XML document is a text document that contains elements delimited using tags (Figure 2).


Figure 2: BankAcct3.xml
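Since the figure is not reproduced here, the following is a minimal sketch of what a document like BankAcct3.xml might contain. The element names (AccountID, Balance, AccountHolder, Name, TaxID) and all sample values other than the balance are assumptions chosen for illustration; the real file may differ.

```xml
<?xml version="1.0"?>
<!-- Hypothetical reconstruction: element names and values are illustrative -->
<BankAccount>
  <AccountID>12345</AccountID>
  <Balance>232.34</Balance>
  <AccountHolder>
    <Name>Pat Smith</Name>
    <TaxID>000-00-0001</TaxID>
  </AccountHolder>
  <AccountHolder>
    <Name>Chris Jones</Name>
    <TaxID>000-00-0002</TaxID>
  </AccountHolder>
</BankAccount>
```

Each piece of data sits between an opening tag and a matching closing tag, and the nesting of the tags expresses the hierarchy: one account, with one or more account holders inside it.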

Presentation: Presentation of information on the web is one common application of XML. By default, web browsers capable of processing XML display the XML document in a hierarchical, text format (Figure 3).

Figure 3: BankAcct3.xml in Internet Explorer


Although this format may be acceptable for simple XML documents, it is not appropriate for complex ones. XSL provides a means of formatting the XML data for easier use. With XSL, the web browser displays a more meaningful view of the data (Figure 4).

Figure 4: BankAcct3.xml Rendered Using an XSL Stylesheet

Data Transformation: In addition to presenting information on the web, the XSL stylesheet provides a means to transform data as needed so that different applications can examine the same data in whatever way they need the data. One XML document may have many XSL stylesheets, one stylesheet for each user (Figure 5). The user can be a person or an application.


Figure 5: Multiple XSL Stylesheets for One XML Document

As a simple illustration, a person interested in information about his or her bank account may prefer the view in Figure 4; however, a computer program would have a great deal of difficulty understanding this format. Instead, the computer program might prefer the data in a comma-separated format like Figure 6.

Figure 6: Comma-separated BankAcct3.xml Data
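As a rough sketch of how a stylesheet could produce comma-separated output like this, the example below switches the output method to text and writes one line per account holder. The element names (AccountID, Balance, AccountHolder, Name, TaxID) and the column order are assumptions for illustration, not taken from the tutorial's sample files.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Emit plain text rather than HTML or XML -->
  <xsl:output method="text"/>

  <xsl:template match="/BankAccount">
    <!-- One output line per account holder: id, balance, name, tax id -->
    <xsl:for-each select="AccountHolder">
      <xsl:value-of select="../AccountID"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="../Balance"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="Name"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="TaxID"/>
      <!-- &#10; is a line feed that ends the record -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

This sketch does not quote values, so a name that itself contains a comma would need extra handling before a program could parse the output reliably.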

A Simple XSL Stylesheet

XSL uses HyperText Markup Language (HTML) to format data for display by web browsers and other applications that understand HTML, XPath for navigating through the XML hierarchy, and XSL Transformations (XSLT) for manipulating the data. Figure 7 is an example of a basic XSL stylesheet.

Figure 7: A Simple XSL Stylesheet (BankAcct3.xsl), with callouts marking the XPath, HTML, and XSLT portions

To understand XSL, we will dissect this basic XSL stylesheet.

Line 1 declares the contents of the file as an XSL stylesheet based on the 1999 XSL standard.

Line 3 establishes the initial point in the XML document for processing nodes. The slash at the start of match="/BankAccount" establishes the root node as the starting point for processing the XML document. The remainder of the match attribute indicates this template manipulates the top user-defined node in the XML document tree (bank account). Generally, the match attribute for the initial template used to process the XML document starts with a reference to the root node (/) followed by the name of the top user-defined node.

Lines 5 and 6 integrate HTML and XSLT to display the values of the account id element and the balance element, respectively. Both lines start with standard HTML to display labels for the data values. Each non-breaking space character reference (for example, “&#160;”) inserts a blank space in the stylesheet’s output; this is needed when forcing space characters outside of HTML elements. The value-of transformation instruction displays the value of the element identified by the select attribute. The value-of instruction self-terminates to ensure a well-formed stylesheet. The <br/> at the end of each line is “XML” notation for an HTML line break; in general, single-tag HTML elements require a trailing slash to self-terminate.

Line 8 starts repetitively processing each account holder node. Line 9 specifies the sort order for account holder nodes, in this case sorting by the account holder’s name. Lines 11 through 17 display the account holder information using HTML and XSLT. Line 19 terminates this repetitive processing.

Line 21 closes the template element opened on line 3.

Line 23 terminates the XSL stylesheet.
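The walkthrough above can be sketched as a small stylesheet with the same overall shape. This is an illustrative approximation, not the tutorial's actual BankAcct3.xsl: the element names (AccountID, Balance, AccountHolder, Name, TaxID) and the html/body wrapper are assumptions, and the line numbers in the walkthrough refer to the original figure rather than to lines in this sketch.

```xml
<?xml version="1.0"?>
<!-- Declares an XSL stylesheet using the 1999 XSLT namespace -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Start at the top user-defined node of the document -->
  <xsl:template match="/BankAccount">
    <html>
      <body>
        <!-- HTML label, non-breaking space, element value, line break -->
        Account ID:&#160;<xsl:value-of select="AccountID"/><br/>
        Balance:&#160;<xsl:value-of select="Balance"/><br/>

        <!-- Repeat for each account holder, sorted by name -->
        <xsl:for-each select="AccountHolder">
          <xsl:sort select="Name"/>
          Name:&#160;<xsl:value-of select="Name"/><br/>
          Tax ID:&#160;<xsl:value-of select="TaxID"/><br/>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

The select expressions inside the template are relative XPath paths, so they are evaluated against the node the template matched (and, inside for-each, against the current account holder).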

Open MyAccount.html in Internet Explorer. MyAccount.html processes the BankAcct3.xml document using the BankAcct3.xsl stylesheet to produce data formatted using HTML (Figure 8). The resulting page looks similar to Figure 4.

Figure 8: XML Document/XSL Stylesheet Relationship (XML document data in BankAcct3.xml, processed with the XSL stylesheet BankAcct3.xsl, yields data formatted using HTML)

Modifications: Now we will modify the stylesheet to display the information in different ways.

Open BankAcct3.xsl in Crimson Editor. Line 6 currently displays the value of the balance element without any formatting. Modify line 6 as follows:


Balance:  


This change uses the format-number XSLT function to format the balance using commas, enclosing negative numbers in parentheses, and displaying one digit to the left of the decimal and two digits to the right of the decimal. Save the change.
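A minimal sketch of a format-number call matching this description is shown below. The element name Balance and the exact pattern string are assumptions; the tutorial's own line may use a different pattern that produces the same result.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/BankAccount">
    <html>
      <body>
        <!-- Pattern before the semicolon formats positive numbers;
             pattern after it wraps negative numbers in parentheses -->
        Balance:&#160;<xsl:value-of
            select="format-number(Balance, '#,##0.00;(#,##0.00)')"/><br/>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

In the pattern, the comma requests grouping separators, the 0 before the decimal point forces at least one digit to the left of the decimal, and the two zeros after it force exactly two digits to the right.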

Open BankAcct3.xml in Crimson Editor and change the balance from 232.34 to 1000232.34. Save the change. Switch to Internet Explorer and refresh the page (MyAccount.html). Verify the balance displays properly given the format specified (1,000,232.34).

Switch to BankAcct3.xml in Crimson Editor. Add a minus sign to the start of the balance. Save the change. Switch to Internet Explorer and refresh the page. The balance should display inside parentheses.

Switch to BankAcct3.xsl in Crimson Editor. Change the sort order of the account holders to use the holder’s tax id instead of the name. Save the change. Switch to Internet Explorer and refresh the page. The account holders should sort based on the tax id.

Default XSL Stylesheet: Currently, we are using a web page to transform the contents of the XML document using the XSL stylesheet. You may want to specify a default XSL stylesheet for the XML document. Switch to BankAcct3.xml in Crimson Editor. Add line 2 as shown in Figure 9. This provides an automatic link to the BankAcct3.xsl stylesheet.

Figure 9: Default Stylesheet Specification
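The link is made with an xml-stylesheet processing instruction placed after the XML declaration and before the top element. A sketch of how the start of BankAcct3.xml might then look (the body of the document is abbreviated here with a comment):

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="BankAcct3.xsl"?>
<BankAccount>
  <!-- existing account elements remain unchanged -->
</BankAccount>
```

The href value is resolved relative to the XML document, so this form assumes the stylesheet sits in the same folder as BankAcct3.xml.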

Close Internet Explorer. Open BankAcct3.xml in Internet Explorer (Figure 10).


Figure 10: XML Document Displayed Using a Default XSL Stylesheet

Overriding the Default XSL Stylesheet: After setting a default XSL stylesheet, if you need to associate other XSL stylesheets with the same XML document, use an HTML document (web page) to process the desired XSL stylesheet just as MyAccount.html does.

Figure 11: Process BankAcct3.xml Using the BankAcct3.xsl Stylesheet (MyAccount.html)


Figure 11 is a JavaScript function embedded in a web page that uses the BankAcct3.xsl stylesheet to process BankAcct3.xml and display the results. Line 12 defines the XML document. Line 16 defines the XSL stylesheet.

The sample files include a web page called Template.html. You can copy and modify Template.html to create additional web pages to process XML documents using XSL stylesheets; just change lines 12 and 16 as needed.

Make a copy of Template.html and rename the copy AcctInfo.html. Modify AcctInfo.html in Crimson Editor to process the BankAcct3.xml document using the AcctInfo.xsl stylesheet. Open AcctInfo.html in Internet Explorer. The web page should look similar to Figure 12. The XSL stylesheet designated in the web page overrides the default stylesheet specified in the XML document.

Figure 12: AcctInfo.html Output

Close Internet Explorer. Open BankAcct3.xml in Internet Explorer. Notice how the page displayed does not look exactly like Figure 12. Why not?

Close all files open in Crimson Editor and Internet Explorer.


Extending XSL Stylesheets

Up to this point, we have used an XML document with data about a single bank account. To demonstrate the full potential of XSL, we are going to extend this XML document to hold data about several bank accounts.

Open AllAccts.xml in Internet Explorer. This XML document contains information about multiple accounts. Figure 13 shows this XML document formatted using a default XSL stylesheet.

Figure 13: AllAccts.xml


Open AllAccts.xml in Crimson Editor. Scroll through the XML document paying attention to the structure of the document and comparing it to the document as displayed in Figure 13. Notice that this XML document contains multiple account nodes. This is the most significant difference between this XML document and the one we have been using. The top node (bank accounts) has multiple child nodes, one for each account. Any closed account has a close date element; however, accounts currently open do not have the close date element.
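To make the structure concrete, here is a hypothetical sketch of a document shaped like AllAccts.xml. The element names (BankAccounts, BankAccount, AccountID, Balance, CloseDate, AccountHolder, Name, TaxID) and every value are assumptions for illustration; only the overall shape follows the description above.

```xml
<?xml version="1.0"?>
<!-- Hypothetical structure: names and values are illustrative only -->
<BankAccounts>
  <BankAccount>
    <AccountID>10001</AccountID>
    <Balance>232.34</Balance>
    <AccountHolder>
      <Name>Pat Smith</Name>
      <TaxID>000-00-0001</TaxID>
    </AccountHolder>
  </BankAccount>
  <BankAccount>
    <AccountID>10002</AccountID>
    <Balance>0.00</Balance>
    <!-- Only closed accounts carry a close date element -->
    <CloseDate>2003-06-30</CloseDate>
    <AccountHolder>
      <Name>Chris Jones</Name>
      <TaxID>000-00-0002</TaxID>
    </AccountHolder>
  </BankAccount>
</BankAccounts>
```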

Figure 14 is the default XSL stylesheet for the AllAccts.xml document.

Figure 14: Initial AllAccts.xsl Stylesheet

We will examine some key parts of this XSL stylesheet.

Templates: Since the bank accounts node contains a child node for each account, we have to process each account node separately to produce account information. The XSLT template element on line 3 defines the starting point for processing the XML document. In this case, the XSL stylesheet processes each child node in the bank accounts node.

This is a good point to talk about how templates work in XSL stylesheets. The XML parser uses the match attribute of the XSLT template element to determine if a template applies to the current node. If the currently selected node matches the node defined by the match attribute, the XML parser uses the template. Initially the XML parser starts with the root node and tries to find a template that matches this node. If it finds one, the XML parser processes the node using this template. If not, the XML parser moves to the next lower node in the hierarchy and searches for a matching template. This continues until the XML parser finds a match or reaches the end of the hierarchy. In general, if multiple templates match a node, the XML parser implemented by Microsoft currently uses the last matching template in the XSL stylesheet. As a note, there is a way to assign priorities to templates within an XSL stylesheet; however, this is beyond the scope of this tutorial.

Since there are multiple account nodes, the XSLT apply-templates element on line 5 causes the XML parser to process each account node separately, in order based on the account id. The XML parser looks for a template that manipulates account nodes. The template selected is the one that starts on line 11. This template is almost identical to the BankAcct3.xsl stylesheet discussed earlier.
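The two-template arrangement described here might look roughly like the sketch below. It is not the tutorial's AllAccts.xsl: the element names (BankAccounts, BankAccount, AccountID, Balance, AccountHolder, Name, TaxID), the italic labels, and the horizontal rule between accounts are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Starting point: the top node, which holds one child per account -->
  <xsl:template match="/BankAccounts">
    <html>
      <body>
        <!-- Hand each account to a matching template, ordered by id -->
        <xsl:apply-templates select="BankAccount">
          <xsl:sort select="AccountID"/>
        </xsl:apply-templates>
      </body>
    </html>
  </xsl:template>

  <!-- Invoked once per account node selected above -->
  <xsl:template match="BankAccount">
    Account ID:&#160;<xsl:value-of select="AccountID"/><br/>
    Balance:&#160;<xsl:value-of select="Balance"/><br/>
    <xsl:for-each select="AccountHolder">
      <xsl:sort select="Name"/>
      <i>Name:</i>&#160;<xsl:value-of select="Name"/><br/>
      <i>Tax ID:</i>&#160;<xsl:value-of select="TaxID"/><br/>
    </xsl:for-each>
    <hr/>
  </xsl:template>

</xsl:stylesheet>
```

The first template decides which nodes to visit and in what order; the second decides what to output for each one.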

Close Internet Explorer and Crimson Editor.

Make a copy of AllAccts.xsl and name the copy AllAccts2.xsl.

Open AllAccts.xml in Crimson Editor. Change the default XSL stylesheet to reference AllAccts2.xsl (href="allaccts2.xsl"). Save the change and close Crimson Editor.

Open AllAccts2.xsl in Crimson Editor. Remove the opening and closing italic tags on lines 19 and 24. Save the changes.


Open AllAccts.xml in Internet Explorer. Notice that the AllAccts2.xsl stylesheet does not italicize the Name and Tax ID labels.

We are going to modify the XSL stylesheet to make better use of templates. Switch to AllAccts2.xsl in Crimson Editor. Cut lines 18 through 26 to the clipboard (select the lines and then Edit → Cut). Move the cursor to line 23 (immediately before the closing stylesheet tag). Paste the lines cut earlier (Edit → Paste).

Modify the XSL stylesheet so the text looks like the code in Figure 15. We have marked locations requiring changes. Save the changes.

Figure 15: AllAccts2.xsl Modifications

These modifications create a template to process account holder nodes and set up the template that processes account nodes to use the new template. This accomplishes the same outcome as the XSLT for-each element previously used; however, the XML parser appears to process XSL stylesheets faster using this method. This matters most when using large XML documents.


This also makes reusing XSL code easier since the XSL stylesheet organizes code for processing nodes in discrete blocks, which you can copy to new stylesheets more easily.
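A sketch of the restructured stylesheet follows, with the for-each loop replaced by a separate account holder template. As before, the element names (BankAccounts, BankAccount, AccountID, Balance, AccountHolder, Name, TaxID) are assumptions, and the layout will not match Figure 15 line for line.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/BankAccounts">
    <html>
      <body>
        <xsl:apply-templates select="BankAccount">
          <xsl:sort select="AccountID"/>
        </xsl:apply-templates>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="BankAccount">
    Account ID:&#160;<xsl:value-of select="AccountID"/><br/>
    Balance:&#160;<xsl:value-of select="Balance"/><br/>
    <!-- Replaces the for-each loop: delegate to the holder template -->
    <xsl:apply-templates select="AccountHolder">
      <xsl:sort select="Name"/>
    </xsl:apply-templates>
    <hr/>
  </xsl:template>

  <!-- Self-contained block that can be copied into other stylesheets -->
  <xsl:template match="AccountHolder">
    Name:&#160;<xsl:value-of select="Name"/><br/>
    Tax ID:&#160;<xsl:value-of select="TaxID"/><br/>
  </xsl:template>

</xsl:stylesheet>
```

Note that the sort instruction moves with the loop: it now sits inside apply-templates, where it controls the order in which account holder nodes reach their template.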

Switch to AllAccts.xml in Internet Explorer and refresh the page. The page should look the same. If not, check for typographical errors in your code.

To verify that the new structure actually does something, switch back to AllAccts2.xsl in Crimson Editor. Change the sort order for account holders (line 17 in Figure 15) to use the account holder’s tax id. Save the changes. Switch to Internet Explorer and refresh the page. Verify the sort order of account holders is the tax id.

Sorting Nodes: Switch to AllAccts2.xsl in Crimson Editor. Scroll to the top of the XSL stylesheet. The XSL stylesheet currently sorts account nodes in ascending order by the account id (Figure 16). We want to sort the account nodes in descending order by balance. Change the sort element to use the balance. Change the sort order to descending. Save the changes.

Figure 16: Initial Account Node Sort Order

Switch to Internet Explorer and refresh the page. Notice that the XML parser sorts the accounts by the first character of the balance, not the numeric value. This is because the XML parser treats the data as text. We must tell the parser that the balance is a number.

Switch to AllAccts2.xsl in Crimson Editor. Modify the sort line to look like the following:


The data-type attribute tells the XML parser the balance is a number. Save the changes. Switch to Internet Explorer and refresh the page. The sort should be correct now.
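A sketch of a numeric, descending sort on the balance is shown below in a cut-down stylesheet. The element names (BankAccounts, BankAccount, AccountID, Balance) are assumptions; the sort attributes select, order, and data-type are standard XSLT 1.0.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/BankAccounts">
    <html>
      <body>
        <!-- data-type="number" compares balances as numbers, not text -->
        <xsl:apply-templates select="BankAccount">
          <xsl:sort select="Balance" order="descending" data-type="number"/>
        </xsl:apply-templates>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="BankAccount">
    <xsl:value-of select="AccountID"/>:&#160;<xsl:value-of select="Balance"/><br/>
  </xsl:template>

</xsl:stylesheet>
```

Without data-type="number", the default text comparison places a balance such as 95.00 ahead of 1000232.34 in a descending sort, because the character 9 sorts after the character 1.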

Filtering Nodes: Switch to AllAccts2.xsl in Crimson Editor. Assume we only want to see accounts with a balance less than $100,000. XSL provides a filtering mechanism when applying templates. Modify the XSL stylesheet as shown in Figure 17. Save the changes. Switch to Internet Explorer and refresh the page. You should get an error similar to the one shown in Figure 18.

Figure 17: Filtering - First Attempt

Figure 18: Filter - First Attempt Error


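The reason for this error is that the less than sign (<) is the starting character for any tag, so it cannot appear literally inside the select expression. Use the entity reference &lt; instead of the less than sign (<). Likewise, you can use &gt; instead of the greater than sign (>); the greater than sign is the ending character for any tag. Figure 19 shows the completed AllAccts2.xsl stylesheet.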
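A sketch of the corrected filter is shown below: the condition goes in square brackets after the node name, with the comparison operator written as an entity reference. The element names (BankAccounts, BankAccount, AccountID, Balance) are assumptions for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/BankAccounts">
    <html>
      <body>
        <!-- Only accounts whose balance is below 100000 are selected;
             &lt; stands in for the less than sign -->
        <xsl:apply-templates select="BankAccount[Balance &lt; 100000]">
          <xsl:sort select="Balance" order="descending" data-type="number"/>
        </xsl:apply-templates>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="BankAccount">
    <xsl:value-of select="AccountID"/>:&#160;<xsl:value-of select="Balance"/><br/>
  </xsl:template>

</xsl:stylesheet>
```

Accounts that fail the test are never passed to the account template, so they produce no output at all.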

Figure 19: Final AllAccts2.xsl Stylesheet


Summary

This tutorial described the roles and parts of an XSL stylesheet, as well as a web page to process XML documents and XSL stylesheets. This tutorial does not provide comprehensive coverage of XSL stylesheets. For additional information about XSL stylesheets, consult the XSL specification developed by the World Wide Web Consortium (W3C) and the Microsoft XML 4.0 Parser software development kit (SDK) documentation available from Microsoft.

XML Resources

Crimson Editor, www.crimsoneditor.com.
Microsoft Internet Explorer, www.microsoft.com.
Microsoft XML 4.0 Parser Software Development Kit (SDK), www.microsoft.com.
Topologi P/L Schematron Validator, www.topologi.com.
World Wide Web Consortium, www.w3.org.
