Recipe 14.17. Using XSLT from Java

Problem

You want to invoke XSLT processing from within a Java application.

Solution

You can invoke XSLT functionality from Java in three basic ways:

Using the native interface of your favorite Java-based XSLT implementation
Using the more portable TrAX API
Using JAXP 1.2 or 1.3 (a superset of TrAX; see http://java.sun.com/xml/jaxp/index.jsp)

If you are familiar with the internals of a specific Java-based XSLT implementation, you might be tempted to use its API directly. However, this solution is not desirable, since your code will not be portable.

An alternative is Transformation API for XML (TrAX), an initiative initially sponsored by Apache.org (http://xml.apache.org/xalan-j/trax.html). The philosophy behind TrAX is best explained by quoting the TrAX site:

The Java community will greatly benefit from a common API that will allow them to understand and apply a single model, write to consistent interfaces, and apply the transformations polymorphically. TrAX attempts to define a model that is clean and generic, yet fills general application requirements across a wide variety of uses.

TrAX was subsumed into Java's JAXP 1.1 (and more recently 1.2 and 1.3 for J2SE 5.0) specification, so there are now only two ways to interface Java to XSLT: portably and nonportably. However, the choice is not simply a question of right and wrong. Each processor implementation has special features that are sometimes needed, and if portability is not a concern, you can take advantage of a particular facility that you require. Nevertheless, this section covers only the portable JAXP 1.2 API.

You can implement a simple XSLT command-line processor with the javax.xml.transform API, which was introduced in JAXP 1.1 and remains part of later JAXP versions. The following example is borrowed from Eric M. Burke's Java and XSLT (O'Reilly, 2001):

import java.io.File;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class Transform
{

  public static void main(String[] args) throws Exception
  {
    if (args.length != 2)
    {
      System.err.println(
        "Usage: java Transform [xmlfile] [xsltfile]");
      System.exit(1);
    }

    //Open the source and stylesheet files
    File xmlFile = new File(args[0]);
    File xsltFile = new File(args[1]);

    //JAXP uses a Source interface to read data
    Source xmlSource = new StreamSource(xmlFile);
    Source xsltSource = new StreamSource(xsltFile);

    //Factory classes allow the specific XSLT processor
    //to be hidden from the application by returning a
    //standard Transformer object
    TransformerFactory transFact =
      TransformerFactory.newInstance();
    Transformer trans = transFact.newTransformer(xsltSource);

    //Apply the stylesheet to the source document and
    //write the result to standard output
    trans.transform(xmlSource, new StreamResult(System.out));
  }
}

In addition to a StreamResult, a DOMResult can capture the result as a DOM tree for further processing, or a SAXResult can be specified to receive the results in an event-driven manner.

In the case of DOM, the result can be obtained as a DOM Document, DocumentFragment, or Element, depending on the type of node passed to the DOMResult constructor.
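
As a rough illustration of the DOM case, the sketch below captures the output of a transformation in a DOMResult and then reads it back through the DOM API. The class name DomResultSketch, the method transformToDom, and the inline XML and stylesheet strings are hypothetical and exist only to keep the example self-contained. It assumes the no-argument DOMResult constructor, in which case the processor creates the Document node itself.

```java
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomResultSketch {

    // Hypothetical inline inputs, used instead of files to keep the sketch self-contained
    static final String XML = "<greeting>hello</greeting>";
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/greeting'>"
      + "<message><xsl:value-of select='.'/></message>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the stylesheet over the XML and returns the result as a DOM Document
    static Document transformToDom(String xml, String xslt) throws Exception {
        Transformer trans = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));

        // No node is supplied, so the processor creates a Document to hold the result
        DOMResult result = new DOMResult();
        trans.transform(new StreamSource(new StringReader(xml)), result);
        return (Document) result.getNode();
    }

    public static void main(String[] args) throws Exception {
        // The result tree can now be processed further with ordinary DOM calls
        Element root = transformToDom(XML, XSLT).getDocumentElement();
        System.out.println(root.getTagName() + ": " + root.getTextContent());
    }
}
```

To append the result to an existing tree instead, you would pass the target node (for example, an Element you already hold) to the DOMResult constructor.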

In the case of SAXResult, a user-specified ContentHandler is passed to the SAXResult constructor and is the object that actually receives the SAX events. Recall that a SAX content handler receives callbacks for events such as startDocument(), startElement(), characters(), endElement(), and endDocument(). See http://www.saxproject.org/ for more information on SAX.
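
The SAX case might look something like the following sketch, in which a small handler records the callbacks it receives from the transformer. The names SaxResultSketch, EventLogger, and transformToSax, along with the inline inputs, are hypothetical. The handler extends org.xml.sax.helpers.DefaultHandler, which implements ContentHandler with empty methods, so only the callbacks of interest need to be overridden. Because a parser or processor may deliver text in several characters() calls, the sketch accumulates text in a buffer rather than treating each call as a complete string.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxResultSketch {

    // Hypothetical inline inputs, used instead of files to keep the sketch self-contained
    static final String XML = "<greeting>hello</greeting>";
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/greeting'>"
      + "<message><xsl:value-of select='.'/></message>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Hypothetical handler that records the SAX events produced by the transformation
    static class EventLogger extends DefaultHandler {
        final List<String> events = new ArrayList<>();
        final StringBuilder text = new StringBuilder();

        @Override public void startDocument() { events.add("startDocument"); }

        @Override public void startElement(String uri, String localName,
                                           String qName, Attributes atts) {
            events.add("startElement " + qName);
        }

        // Text may arrive in more than one call, so append rather than replace
        @Override public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override public void endElement(String uri, String localName, String qName) {
            events.add("endElement " + qName);
        }

        @Override public void endDocument() { events.add("endDocument"); }
    }

    // Runs the stylesheet and sends the result to the handler as SAX events
    static EventLogger transformToSax(String xml, String xslt) throws Exception {
        Transformer trans = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        EventLogger handler = new EventLogger();
        trans.transform(new StreamSource(new StringReader(xml)),
                        new SAXResult(handler));
        return handler;
    }

    public static void main(String[] args) throws Exception {
        EventLogger log = transformToSax(XML, XSLT);
        log.events.forEach(System.out::println);
        System.out.println("text: " + log.text);
    }
}
```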

Discussion

The beauty of accessing XSLT transformation capabilities from Java is not that you can write your own XSLT processor front end, as you did in the solution section, but that you can extend the already formidable capabilities of Java to include XSLT's transformational abilities.

Consider a server process written in Java that must deal with constantly changing XML files stored in an XML database or XML arriving in the form of SOAP messages. Perhaps this server needs to support multiple versions of document schema or multiple SOAP clients for backward compatibility. Thus the server must handle several schemas transparently. If data in an older schema can be transformed to newer ones, then the server code will be that much simpler.

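The nice thing about using XSLT via the JAXP interface is that a Transformer instance can be reused, so you need to parse the stylesheet only once, when the server loads. However, a Transformer must not be used by several threads at the same time. If your server is multithreaded and each thread must handle transformations, each thread needs its own instance. JAXP's Templates interface is designed for this situation: a Templates object holds the compiled stylesheet, can be shared safely across threads, and creates a fresh Transformer each time you call its newTransformer() method.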
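
One way to get both benefits, parsing the stylesheet once and staying thread-safe, is to compile the stylesheet into a javax.xml.transform.Templates object, which JAXP specifies as safe to share among threads, and to ask it for a new Transformer for each piece of work. The sketch below is a minimal illustration under those assumptions. The class TemplatesSketch, its methods compile and apply, the inline stylesheet, and the pool size of four are all hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TemplatesSketch {

    // Hypothetical inline stylesheet, used instead of a file to keep the sketch self-contained
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/greeting'>"
      + "<message><xsl:value-of select='.'/></message>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Parses and compiles the stylesheet once; the result can be shared across threads
    static Templates compile(String xslt) throws TransformerException {
        return TransformerFactory.newInstance()
            .newTemplates(new StreamSource(new StringReader(xslt)));
    }

    // Creates a new Transformer per call, so no Transformer is ever shared between threads
    static String apply(Templates templates, String xml) throws TransformerException {
        Transformer trans = templates.newTransformer();
        StringWriter out = new StringWriter();
        trans.transform(new StreamSource(new StringReader(xml)),
                        new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Templates templates = compile(XSLT);   // done once, for example at server startup
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            Callable<String> first = () -> apply(templates, "<greeting>hello</greeting>");
            Callable<String> second = () -> apply(templates, "<greeting>goodbye</greeting>");
            Future<String> a = pool.submit(first);
            Future<String> b = pool.submit(second);
            System.out.println(a.get());
            System.out.println(b.get());
        } finally {
            pool.shutdown();
        }
    }
}
```

Creating a Transformer from an existing Templates object is generally much cheaper than reparsing the stylesheet, so a server that must upgrade documents from several older schemas could hold one Templates object per schema version and select the appropriate one for each request.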