
View System Documentation - XML Parsers

This page discusses the XML technologies currently in use (the "XML Players") and which of them come standard with each Sun Java JVM. It then covers the Tomcat application server's XML requirements, the IBIS-PH View System's deployment, and finally XML performance.

Most of the following information is not needed for the IBIS-PH View System *IF* it is running on a standard Java 1.5+ JDK. The View System contains the correct XSLT engine and special XML libraries. It should work with any standard JAXP implementation and XML parser supplied by either the application server or the JDK.

XML Technologies:

XML Player Description
JAXP / JAXP-API Pluggable XML framework for the "open" Java XML parsers.
XML APIs Package that implements the specific vendor's JAXP XML APIs.
XALAN XML transformation engine. It supports XSLT 1.0 ONLY. As of 2/2006, it is not known whether it will ever support XSLT 2.0. This engine is needed because the IBIS XSLTs use the xalan:nodeset function call in several places. Saxon 8.x, however, supports XSLT 2.0 and would not need these special calls. XALAN also provides XPath 1.0 support, but it is not clear how to get DOM4J to use it instead of the JAXEN XPath processor. NOTE: the DOM4J benchmarking article states that XALAN's XPath should not be used anyway, for performance reasons.

Note that versions 2.5-2.6, when used with JDK 1.5+, had memory leaks. See "Memory Leak in XSL Transform leading to OutOfMemory Exception" or google it. An earlier note said that Xalan 2.7 did not work with the IBIS key generation. However, in light of the memory leak, and because 2.7 fixed some leaks, it was tried again and the problem with xsl:key could not be reproduced (3-15-07).
XT Simple and fast XSLT 1.0 Transformer. Not sure if it provides the nodeset extension that is needed by the IBIS XSLTs.
CRIMSON "Open" SAX parser (both v1 and v2 have fallen by the wayside in Sun's eyes in favor of Xerces).
XERCES "Open" SAX and DOM parser, with newer versions (2.7+) supporting xinclude. This package is NOT needed for basic transformations by ibisph, BUT it is needed for the more advanced DOM operations, such as those used with Batik objects for SVG to JPEG conversion, some chart data, and certain query system operations. Were it not for these DOM operations, the standard Crimson parser included with Java 1.4 and 1.5 would do, or even the Alfred parser that comes with DOM4J 1.5.1 and earlier. Note that using the XINCLUDE feature requires specific system settings. IBIS therefore does not rely on this parser and its settings, since the application has to be deployable to app servers where the sys admins might not want to change the system properties. Also, Xerces v2.6 has a known memory leak.
XINCLUDE As stated in the Xerces section, Xerces 2.7+ supports xinclude. However, that feature is not turned on by default. It can be enabled via system properties (which we might not always have access to; think Utah's "ITS" and their app servers that serve 50+ apps), by setting the class path with command line arguments, or by having the code implement its own custom JAXP setup in which our SAX factory call instantiates the parser with the xinclude flag set. Because of this, the ibis:xinclude XSLT is used. This allows includes to be performed independently of the XML parser and its configuration. See the Apache XERCES XInclude FAQ for more info.
JAXEN Provides xpath support and is the integrated engine within JDOM and DOM4j.
DOM4J Higher level XML package that allows for XML document creation and navigation. v1.5.1 needs at least jaxen 1.1 beta 4 if doing xpath (which ibisph utilizes). DOM4J also needs a SAX parser. Versions 1.5.1 and earlier contain a fast, non-validating SAX parser known as Alfred. Later versions have this removed, so Crimson or Xerces is required. The DOM4J docs say that Alfred can be skipped if using JAXP; however, I have not been able to get this to work. DOM4J is pluggable with both XALAN and SAXON via JAXP.
Saxon Provides XPath and XSLT 2.0 support. The IBIS-PH View System historically used the XALAN XSLT processor. However, in 2006/2007 the application started crashing intermittently, apparently because of a memory leak. At the time, XALAN 2.6 was known to have memory leaks, and there were no plans to implement XSLT 2.0 functionality (which is much more robust). So a decision was made to convert to XSLT 2.0 and Saxon. The downside is that XALAN is standard with Java and, because of JAXP, would always be used in place of SAXON. To get around this, the Spring framework was used to plug an instance of the SAXON XSLT processor object into the system's view object.
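
The XINCLUDE entry above mentions a custom JAXP setup in which the SAX factory call itself sets the xinclude flag. The following is a minimal sketch of that idea, assuming a JAXP 1.3+ parser (Java 1.5+) that supports XInclude. The class name XIncludeSketch and the file names main.xml and part.xml are hypothetical; this is not the View System's code, which uses the ibis:xinclude XSLT instead.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of the IBIS-PH code base.
final class XIncludeSketch {

    // Counts elements with the given local name, with XInclude processing
    // requested on the factory itself rather than through system properties.
    static int countElements(File xml, final String localName) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // XInclude processing needs namespaces
        factory.setXIncludeAware(true);  // JAXP 1.3+ (Java 1.5+) flag

        final int[] count = {0};
        factory.newSAXParser().parse(xml, new DefaultHandler() {
            @Override
            public void startElement(String uri, String name, String qName, Attributes atts) {
                if (localName.equals(name)) {
                    count[0]++;
                }
            }
        });
        return count[0];
    }

    // Writes a main file and an included part to a temp directory, then parses the main file.
    static int demo() {
        try {
            Path dir = Files.createTempDirectory("xinclude-sketch");
            Files.write(dir.resolve("part.xml"),
                    "<item>included</item>".getBytes(StandardCharsets.UTF_8));
            Path main = dir.resolve("main.xml");
            Files.write(main, ("<root xmlns:xi=\"http://www.w3.org/2001/XInclude\">"
                    + "<xi:include href=\"part.xml\"/></root>").getBytes(StandardCharsets.UTF_8));
            return countElements(main.toFile(), "item");
        } catch (Exception e) {
            throw new IllegalStateException("XInclude demo failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("item elements seen: " + demo());
    }
}
```

Setting the flag in code avoids touching system properties, but it only helps where the application creates the parser itself; a parser created elsewhere (by a library, for example) would not pick it up.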

JAVA 1.5.x and XML

Java 1.5 (rt.jar) comes bundled with basic XML:
  • JAXP 1.1+ (probably 1.3)
  • Crimson (SAX parser v2)
  • Xalan is NOT part of the runtime
  • Xerces is NOT part of java runtime
  • In reading, it looks like Sun is going to drop Crimson and replace it with Xerces at some future point.
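
Because the bundled implementations differ between Java versions, it can help to ask JAXP directly which classes it resolves to on a given JVM. The sketch below does this using only standard JAXP factory calls; the class name JaxpReport is hypothetical, and the class names it prints depend entirely on the JVM and class path in use.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

// Hypothetical diagnostic; prints the implementation classes JAXP resolves to.
final class JaxpReport {

    static String describe() {
        StringBuilder sb = new StringBuilder();
        sb.append("SAX parser factory: ")
          .append(SAXParserFactory.newInstance().getClass().getName()).append('\n');
        sb.append("DOM builder factory: ")
          .append(DocumentBuilderFactory.newInstance().getClass().getName()).append('\n');
        sb.append("XSLT engine factory: ")
          .append(TransformerFactory.newInstance().getClass().getName()).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Running this inside the app server (from a test page, for example) rather than from the command line shows what the web application actually sees, which may differ from the bare JVM.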

JAVA 1.4.x and XML

Java 1.4 (rt.jar) comes bundled with the following XML:
  • JAXP 1.1
  • Xalan 2.4.1
  • Crimson (SAX 2)
  • Xerces is NOT part of Java runtime

A system property, java.endorsed.dirs, can be set to tell Java which parser to load. For example:
c:\>java -Djava.endorsed.dirs=yourDirectoryPath MyApp
See: https://jaxp.dev.java.net/Updating.html#java-14

NOTE: Both Java and Tomcat have endorsed directories, and a different one can be specified for each independently using the system property above.

!!! IMPORTANT !!! It should be noted that this is not a good solution to rely on as different app servers and different system administrators might not want or be able to specify a different parser (xinclude enabled) and/or transformation engine.
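
When troubleshooting a server that someone else administers, it can be useful to see which of these overrides are actually set. The sketch below is a hypothetical helper that reads them. Only java.endorsed.dirs comes from the discussion above; the three javax.xml.* names are the standard JAXP factory lookup properties and are included here as an assumption about what an administrator may have set.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical diagnostic; reads XML related override properties without changing them.
final class XmlOverrideProperties {

    static final String[] NAMES = {
        "java.endorsed.dirs",                        // endorsed directory, as discussed above
        "javax.xml.parsers.SAXParserFactory",        // JAXP SAX factory override
        "javax.xml.parsers.DocumentBuilderFactory",  // JAXP DOM factory override
        "javax.xml.transform.TransformerFactory"     // JAXP XSLT factory override
    };

    // Returns each property name mapped to its value, or "(not set)".
    static Map<String, String> snapshot() {
        Map<String, String> values = new LinkedHashMap<String, String>();
        for (String name : NAMES) {
            values.put(name, System.getProperty(name, "(not set)"));
        }
        return values;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> entry : snapshot().entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```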

JAVA 1.3.x (and prior) with XML

No XML support is provided. That is, all XML packages must be provided in the Tomcat environment, set manually in the Java runtime, included within the application deployment package, or supplied by some other means.

TOMCAT

Tomcat 4.x+ has a directory named "endorsed". This directory is a mechanism to get around the CORBA and XML related packages that are loaded by the Java 1.4 rt.jar. So, this directory should only contain CORBA and XML related jars. Tomcat's bootstrap class loader will then load this directory first, then the JVM, then the common lib, then the webapp's classes, and lastly the webapp's lib directory.

Tomcat needs an XML parser. The endorsed directory contains Xerces 2.3, which overrides everything, even a parser specified in the common lib or the webapp's lib, and possibly even when the system properties are set. If using Java 1.4, these Xerces jars can be deleted from the endorsed directory, since the Crimson parser is in Java 1.4's rt.jar. If an earlier Java is used, an XML parser MUST be in the endorsed directory or in the class path for Tomcat (via its boot.jar and the way it searches for XML parsers). This parser must also be a version that supports whatever Tomcat needs. This is why Tomcat ships with Xerces 2.3 and puts it in the endorsed directory.
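
Given the load order described above, a practical question is which copy of a parser actually won. The sketch below is a hypothetical helper that walks the class loader chain for a class and reports where the class was loaded from, using only standard Java calls. The loader names and locations it prints depend on the JVM and app server in use.

```java
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical diagnostic; shows which class loader supplied a class and from where.
final class LoaderChain {

    // Lists the class loaders from the class's own loader up to the bootstrap loader.
    static List<String> chainFor(Class<?> type) {
        List<String> chain = new ArrayList<String>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            chain.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        chain.add("bootstrap"); // represented by null in the Java API
        return chain;
    }

    // Returns the jar or directory the class came from, if the JVM exposes it.
    static String sourceOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null ? "(JVM runtime)" : String.valueOf(source.getLocation());
    }

    public static void main(String[] args) {
        Class<?> parserFactory = SAXParserFactory.newInstance().getClass();
        System.out.println("Parser factory: " + parserFactory.getName());
        System.out.println("Loaded from:    " + sourceOf(parserFactory));
        System.out.println("Loader chain:   " + chainFor(parserFactory));
    }
}
```

If the reported location is a jar in the endorsed directory, that copy is overriding whatever is in the webapp's lib directory.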

BOTTOM LINE IF RUNNING JAVA 1.4.x - DON'T AS THE APP REQUIRES JAVA 1.5+
  • JAXP: Version 1.1 is good to go with Java 1.4
  • XALAN: You can't specify which XALAN to use by including the jar within the webapp/lib. You either get the one in the Java rt.jar or you MUST put a new one in the Tomcat endorsed directory. It might be possible to override Java's XALAN with a system property, but that would still apply to ALL apps using that JVM!!!
  • XERCES: This parser is not needed by IBIS but could be needed by other apps. A new Xerces can be put in place by doing one of the following:
    • Put the updated Xerces into the endorsed directory, where all apps will be forced to use it.
    • Remove the Tomcat default from the endorsed directory and include the desired Xerces jar in each webapp's lib folder. It could also be put in the common lib, but this is the same as putting it in the endorsed directory.

NOTES:
  • The information provided above is for historical purposes and is left for those who may have to deal with these issues -OR- for those "open" Sun standard pluggable XML parsers that are packaged based on JAXP. Other options include the use of some other XML package that does not share the same package structure, etc.
  • In any case, you will need to test that Tomcat starts and test any app that uses Xerces.
  • jaxp-api.jar can NOT be placed in the endorsed directory, but it can be placed in, and picked up from, the common lib directory.

IBIS-PH DEPLOYMENTS

In light of the above, the IBIS-PH View System is compiled with Java 1.5+ and requires Java 1.5+. Since Java 1.5+ contains JAXP and an XML parser, the application is deployed with Saxon, dom4j, and Jaxen. The Saxon XSLT engine is plugged directly into the view code so that it will always be used (the XSLT engine that comes with Java 1.5 does not support XSLT 2.0 as of 2/2008). A safe way to make sure that the correct XML libraries are used is to always verify or put the most recent XML related packages in the app server's endorsed directory and/or make sure that the app server is running Java 1.5+.
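
Plugging the XSLT engine directly into the view code can be sketched as follows. This is a minimal illustration, not the View System's actual classes: XsltViewSketch and its methods are hypothetical names. The point is that the view receives a TransformerFactory instance from outside (a Spring bean definition, for example) instead of calling TransformerFactory.newInstance() and accepting whatever JAXP finds first. So that the sketch runs without extra jars, the demo passes in the JDK's default factory; a Saxon deployment would pass an instance of net.sf.saxon.TransformerFactoryImpl instead.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical view class; the XSLT engine is supplied by the caller, not looked up via JAXP.
final class XsltViewSketch {

    private final TransformerFactory factory;

    XsltViewSketch(TransformerFactory factory) {
        this.factory = factory;
    }

    // Compiles the stylesheet with the injected engine and applies it to the XML.
    String render(String xslt, String xml) throws TransformerException {
        Templates templates = factory.newTemplates(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        templates.newTransformer().transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    static String demo() {
        // Stand-in engine: the JDK default. A container would inject the Saxon factory here.
        XsltViewSketch view = new XsltViewSketch(TransformerFactory.newInstance());
        String xslt = "<xsl:stylesheet version='1.0' "
                + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
                + "<xsl:output method='text'/>"
                + "<xsl:template match='/'><xsl:value-of select='/greeting'/></xsl:template>"
                + "</xsl:stylesheet>";
        try {
            return view.render(xslt, "<greeting>hello</greeting>");
        } catch (TransformerException e) {
            throw new IllegalStateException("transform failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the engine is an explicit dependency, other applications on the same JVM are unaffected, and no endorsed directory or system property change is needed.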

XML PERFORMANCE

XML performance can be helped by:
  • Specifying the "US-ASCII" or at least UTF-8 encoding instead of UTF-16 etc.
  • Not using the DOCTYPE statement.
  • Not formatting the document with lots of white space, which simply gets in the way of parsing and makes the file larger.
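
Where the XML documents are generated by Java code, the tips above can be applied at the point of serialization. The following is a rough sketch using the standard JAXP identity transformer; CompactXmlSketch is a hypothetical name and the small document is made up for the example.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper; writes XML with UTF-8 declared, no DOCTYPE, and no indentation.
final class CompactXmlSketch {

    static String serialize(Document doc) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.ENCODING, "UTF-8"); // not UTF-16
        identity.setOutputProperty(OutputKeys.INDENT, "no");      // no added white space
        // No DOCTYPE is written because the DOCTYPE output properties are left unset.
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    static String demo() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            doc.appendChild(root);
            for (String name : new String[] {"a", "b"}) {
                Element child = doc.createElement(name);
                child.setTextContent(name.toUpperCase());
                root.appendChild(child);
            }
            return serialize(doc);
        } catch (Exception e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```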
The information provided was retrieved on: Sat, 25 May 2019 12:26:38.

Content updated: Wed, 4 Nov 2015 09:26:28 MST