Java(TM) API for XML Processing Release Notes

Specification Version: 1.2
Reference Implementation (RI) Version: 1.2.4

This document contains installation instructions and other notes that may help you use this software library more effectively. See also the JAXP FAQ for more information.

Note:
If you are reading this page online, this is the most current version of the release notes. If this page was downloaded as part of the release bundle, please see the JAXP Documentation page for the most current version of the release notes.

Contents

Component Versions

These versions of the relevant technologies have been incorporated into the reference implementation.

Identifying the JAR Files

This release is contained in six JAR files:

jaxp-api.jar
The javax.xml.parsers and javax.xml.transform components of JAXP. These packages contain the APIs that give applications a consistent way to obtain instances of XML processing implementations.
sax.jar
The APIs and helper classes for the Simple API for XML (SAX), used for serial access to XML data.
dom.jar
The APIs and helper classes for the Document Object Model (DOM), used to create an in-memory tree structure from the XML data.
xercesImpl.jar
The implementation classes for the SAX and DOM parsers, as well as Xerces-specific implementations of the JAXP APIs.
xalan.jar
The "classic" (interpreting) XSLT processor.
xsltc.jar
The compiling XSLT processor.

XML Parsing

The information in this section pertains to the Xerces technology:

More information on known bugs and recent fixes can be found at the Apache Xerces site. (The release notes for the latest Xerces version are at http://xml.apache.org/xerces2-j/releases.html.)

Known Schema Processing Limitations

This section discusses known schema processing bugs, limitations, and implementation-dependent operations.

Limitations

These limitations specify known upper bounds on values.

Problem Areas / Known Bugs

The following problems are known to exist:

Implementation-Dependent Operations

This implementation-dependent operation is not fully clarified by the W3C XML Schema specification (http://www.w3.org/2001/XMLSchema). As a result, differing implementations exist.

Known Migration Issues from JAXP 1.1

JAXP 1.1 is built into J2EE 1.3 and J2SE 1.4. Since JAXP 1.2 contains a different parser, there are some differences in functionality that is not specified by the standard. This section highlights the major differences.

Note:
JAXP is intended as an implementation-independent API layer. However, it is still possible to make use of parser-specific features, either intentionally or unintentionally. Portable applications (those that do not rely on parser-specific features) will not be affected by these differences.

Use of Java Encoding Names

The JAXP 1.1 parser recognizes Java encoding names in an XML header. For example, in this header:

<?xml version="1.0" encoding="UTF8"?>

the JAXP 1.1 parser recognizes the Java encoding name UTF8 as valid. However, in XML the standard encoding name uses a hyphen, as in UTF-8.

Since XML documents are intended for maximum portability, the JAXP 1.2 parser diagnoses the use of UTF8 as an error.

Note:
The Java encoding name is still UTF8, so you continue to use that value when invoking APIs in the java.io and java.lang packages -- as, for example, when writing:

out = new OutputStreamWriter(System.out, "UTF8");

On the other hand, the java.nio package more properly recognizes UTF-8. (And all of Java's core packages recognize UTF-16.)
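The split between the two naming conventions can be illustrated with a minimal sketch: the java.io writer is opened with the Java name UTF8, while the XML declaration it writes carries the standard name UTF-8. The class and method names are hypothetical, and the sketch uses language features (try-with-resources, UncheckedIOException) that are newer than the platforms discussed in these notes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical helper class, for illustration only.
final class EncodingNameSketch {

    // Writes an XML document to memory. The writer uses the Java encoding
    // name (UTF8); the XML declaration uses the standard XML name (UTF-8).
    static byte[] writeXml(String body) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, "UTF8")) {
            out.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
            out.write(body);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The writer is closed (and flushed) before the bytes are read.
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] xml = writeXml("<greeting>caf\u00e9</greeting>");
        System.out.println(xml.length + " bytes written");
    }
}
```

A document produced this way declares UTF-8 in its header, so it should not trigger the encoding-name error described above.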

Security Issue

While XML does not allow recursive entity definitions, it does permit nested entity definitions, which produces the potential for Denial of Service attacks on a server which accepts XML data from external sources. For example, a SOAP document like the following that has extremely deeply nested entity definitions can consume 100% of CPU time and a lot of memory in entity expansions.

<?xml version="1.0" encoding ="UTF-8"?>
<!DOCTYPE foobar[
<!ENTITY x100 "foobar">
<!ENTITY x99 "&x100;&x100;">
<!ENTITY x98 "&x99;&x99;">
...
<!ENTITY x2 "&x3;&x3;">
<!ENTITY x1 "&x2;&x2;">
]>
<SOAP-ENV:Envelope xmlns:SOAP-ENV=...>
<SOAP-ENV:Body>
<ns1:aaa xmlns:ns1="urn:aaa" SOAP-ENV:encodingStyle="...">
<foobar xsi:type="xsd:string">&x1;</foobar>
</ns1:aaa>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

A system that doesn't take in external XML data need not be concerned with this issue, but one that does can use one of the following safeguards to prevent the problem:

New system property to limit entity expansion
The entityExpansionLimit system property lets existing applications constrain the total number of entity expansions without recompiling the code. The parser throws a fatal error once it has reached the entity expansion limit. (By default, no limit is set.)

To set the entity expansion limit using the system property, use an option like the following on the java command line: -DentityExpansionLimit=100000
 
New parser property to limit entity expansion
The http://apache.org/xml/properties/entity-expansion-limit parser property lets an application set a limit on total entity expansions without having to use the command line. It accepts a value of java.lang.Integer type. The parser throws a fatal error once it has reached the entity expansion limit. (By default, the value is null, which means that no limit is set.)

To set the entity expansion limit with this property, the application can use code like the following:

DocumentBuilderFactory dfactory = DocumentBuilderFactory.newInstance();
dfactory.setAttribute(
  "http://apache.org/xml/properties/entity-expansion-limit",
  new Integer("100000")
);
 
New parser property to disallow DTDs
The application can also set the http://apache.org/xml/features/disallow-doctype-decl parser property to true. A fatal error is then thrown if the incoming XML document contains a DOCTYPE declaration. (The default value for this property is false.) This property is typically useful for SOAP-based applications, where a SOAP message must not contain a Document Type Declaration.
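As a rough illustration of the last safeguard, the sketch below turns on disallow-doctype-decl and reports whether a document is refused. It is written under stated assumptions: it calls DocumentBuilderFactory.setFeature, which belongs to later JAXP versions (1.3 and up) rather than the 1.2 API described here, and it assumes a Xerces-based parser that recognizes this identifier. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, for illustration only.
final class DisallowDoctypeSketch {

    static final String DISALLOW_DOCTYPE =
        "http://apache.org/xml/features/disallow-doctype-decl";

    // Returns true if the parser refuses the document with a SAX error,
    // false if the document parses successfully.
    static boolean isRejected(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Assumption: setFeature exists (JAXP 1.3+) and the underlying
            // parser is Xerces-based, so it understands this identifier.
            factory.setFeature(DISALLOW_DOCTYPE, true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXException e) {
            // Covers the fatal error raised for a DOCTYPE declaration
            // (and any other parse error).
            return true;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String withDtd =
            "<!DOCTYPE foobar [<!ENTITY x1 \"foobar\">]><foobar>&x1;</foobar>";
        System.out.println("DOCTYPE rejected: " + isRejected(withDtd));
    }
}
```

Rejecting the DOCTYPE outright is the bluntest of the three safeguards: it avoids entity expansion entirely, at the cost of refusing any document that legitimately carries a DTD.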

XSLT Processing

The JAXP RI contains two XSLT engines that are part of the Xalan implementation. This section of the Release Notes describes:

Note:
XSLT is supported by the JAXP transform package. See javax.xml.transform for details on accessing basic XSLT functionality in an implementation-independent manner.

The Interpreting XSLT Processor (Xalan)

Xalan is the default XSLT engine used by the JAXP transform package. More information on Xalan can be found at the Apache Xalan site.

For the latest information on other known bugs and recent fixes, see the Xalan "Read Me" at http://xml.apache.org/xalan-j/readme.html.

Known Limitations

Using -URIRESOLVER with Java 1.4
When using the -URIRESOLVER command line option to specify a custom resolver, the jar file that contains the URIResolver implementation class must be included in the endorsed directory, along with the JAXP jar files. Otherwise, a class not found error occurs.

The Compiling XSLT Processor (XSLTC)

The XSLTC transformer generates a transformation engine, or translet, from an XSL stylesheet. This approach separates the interpretation of stylesheet instructions from their runtime application to XML data.

XSLTC works by compiling a stylesheet into Java byte code (translets), which can then be used to perform XSLT transformations. This approach greatly improves the performance of XSLT transformations where a given stylesheet is compiled once and used many times. It also generates an extremely lightweight translet, because only the XSLT instructions that are actually used by the stylesheet are included.

Known Limitations in XSLTC

The known bugs and limitations are:

To check on the open bugs in the current Apache xml-xalan/java repository, follow the instructions below:

  1. Go to http://nagoya.apache.org/bugzilla.
  2. Select Query Existing Bug Reports.
  3. Choose:
    Program: XalanJ2
    Component: org.apache.xalan.xsltc
    (and) Xalan-Xsltc
  4. Submit the query.

Custom Class Loader Issue

In both Xalan and XSLTC, a problem can occur when using a custom class loader with a transformation factory.

Transformation factories in JAXP always prefer the use of the "context class loader" to the use of the "system class loader". Thus, if an application uses a custom class loader, it may need to set the custom class loader as the context class loader for the transformation factory to use it. Setting a custom class loader on the current thread can be done as follows:

try {
  Thread currentThread = Thread.currentThread();
  currentThread.setContextClassLoader(customClassLoader);
}
catch (SecurityException e) {
  // ...
}

If the application is multi-threaded, the custom class loader may need to be set in all threads (every time a new thread is created). A security exception is thrown if an application does not have permission to set the context class loader.

This issue applies to both Xalan and XSLTC.
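One way to keep the context class loader change contained is to install the custom loader only around the factory lookup and restore the previous loader afterwards. The following is a minimal sketch of that pattern, not part of the release itself; the class and method names are hypothetical, and the empty URLClassLoader in main merely stands in for a real custom loader.

```java
import java.net.URL;
import java.net.URLClassLoader;
import javax.xml.transform.TransformerFactory;

// Hypothetical helper class, for illustration only.
final class ContextClassLoaderSketch {

    // Looks up a TransformerFactory while the given loader is installed as
    // the context class loader, then restores the previous loader.
    static TransformerFactory newFactoryWith(ClassLoader customClassLoader) {
        Thread currentThread = Thread.currentThread();
        ClassLoader previous = currentThread.getContextClassLoader();
        // May throw SecurityException if the application lacks permission.
        currentThread.setContextClassLoader(customClassLoader);
        try {
            return TransformerFactory.newInstance();
        } finally {
            currentThread.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a real custom loader: no extra URLs, normal parent.
        ClassLoader custom = new URLClassLoader(
            new URL[0], ContextClassLoaderSketch.class.getClassLoader());
        TransformerFactory factory = newFactoryWith(custom);
        System.out.println("Factory: " + factory.getClass().getName());
    }
}
```

Restoring the previous loader in a finally block is a design choice of this sketch. An application that wants the custom loader to stay in effect for the life of the thread would skip the restore step.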

Manually Specifying which XSLT Implementation to Use

By default, JAXP transformations use the Xalan XSLT engine. To direct the application to use the XSLT engine in XSLTC, one way is to set the TransformerFactory property as follows:

javax.xml.transform.TransformerFactory=
    org.apache.xalan.xsltc.trax.TransformerFactoryImpl

This mechanism lets you determine which transformer you use when you start the app. However, changing this property in a servlet container, for example, affects every other servlet in the container, so it may be unwise to use that option. (To prevent the problems that can attend such global overrides, future implementations of Tomcat in the Java Web Services Developer Pack may well preclude such property settings.)

When you can't use a system property to select the transformation engine, you can either instantiate the factory in your program directly, with code like this:

new org.apache.xalan.xsltc.trax.TransformerFactoryImpl(..)

Or, to get back runtime control, you can pass the name of the factory as an argument to the application, and use the ClassLoader to create a new instance of it.

Similarly, you can ensure you are using the Xalan implementation with this setting (or else direct the application to instantiate the factory class, as above):

javax.xml.transform.TransformerFactory=
     org.apache.xalan.processor.TransformerFactoryImpl
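The runtime-selection approach (passing the factory class name to the application as an argument) might look like the sketch below. It relies on TransformerFactory.newInstance(String, ClassLoader), which was added in a later JAXP version than the one described here, so treat it as an assumption about the platform; with the 1.2 API, the same idea would need explicit reflection through the class loader. The class and method names are hypothetical.

```java
import javax.xml.transform.TransformerFactory;

// Hypothetical helper class, for illustration only.
final class FactoryByNameSketch {

    // Instantiates the named factory class through the context class loader.
    // Assumption: TransformerFactory.newInstance(String, ClassLoader) is
    // available on the running platform (it is not part of JAXP 1.2).
    static TransformerFactory load(String factoryClassName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        return TransformerFactory.newInstance(factoryClassName, loader);
    }

    public static void main(String[] args) {
        // With no argument, fall back to whatever the platform default is.
        String name = args.length > 0
            ? args[0]
            : TransformerFactory.newInstance().getClass().getName();
        TransformerFactory factory = load(name);
        System.out.println("Using " + factory.getClass().getName());
    }
}
```

Because the choice is made per call rather than through a global system property, other code running in the same container is not affected.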

Automatically Choosing Implementations with the "Smart Transformer"

The JAXP transformation API includes a "Smart Transformer" which automatically switches between Xalan and XSLTC processors within your application. It uses Xalan to create your Transformer objects, and XSLTC to create your Templates objects.

To use the switch, you use this setting for the factory system property:

javax.xml.transform.TransformerFactory=
     org.apache.xalan.xsltc.trax.SmartTransformerImpl

For one-time transformations or transformations that require extensions supported by Xalan, and not XSLTC, you would use Transformer objects. For a repeated transformation where performance is critical, you would use Templates objects.
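The repeated-transformation case can be sketched with the standard javax.xml.transform API: the stylesheet is compiled once into a Templates object, and each transformation obtains a fresh Transformer from it. The stylesheet, class, and method names below are hypothetical, and the sketch uses whichever factory the platform selects rather than assuming the Smart Transformer is configured.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, for illustration only.
final class TemplatesReuseSketch {

    // Hypothetical stylesheet: wraps the text of a root <name> element
    // in a greeting and emits plain text.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/name\">"
      + "Hello, <xsl:value-of select=\".\"/>!"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Compiles the stylesheet once; the result can be reused many times.
    static Templates compile() {
        try {
            return TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(STYLESHEET)));
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    // Applies the compiled stylesheet to one input document.
    static String apply(Templates templates, String xml) {
        try {
            Transformer transformer = templates.newTransformer();
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Templates templates = compile();
        System.out.println(apply(templates, "<name>Ada</name>"));
        System.out.println(apply(templates, "<name>Grace</name>"));
    }
}
```

For a one-time transformation, calling newTransformer(Source) on the factory directly would be the simpler route, since nothing is gained by keeping the compiled form around.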

Note:
Again, it may or may not be wise (or possible) to control the factory setting with a system property. See the previous section for ideas on directing the application to instantiate a specific factory class.


JWSDP Security Considerations

When an application is running on a web server, such as the Java Web Services Developer Pack (JWSDP), with security enabled, the following permissions must be set:

permission java.io.FilePermission
    "/${webserver.home}/common/endorsed/xercesImpl.jar", "read";
permission java.io.FilePermission
    "/${webserver.home}/common/endorsed/xalan.jar", "read";

permission java.util.PropertyPermission
    "javax.xml.parsers.SAXParserFactory", "read, write";
permission java.util.PropertyPermission
    "javax.xml.transform.TransformerFactory", "read, write";

permission java.util.PropertyPermission "user.dir",        "read";
permission java.util.PropertyPermission "file.separator",  "read";
permission java.util.PropertyPermission "line.separator",  "read";
permission java.util.PropertyPermission "JavaClass.debug", "read";

permission java.lang.RuntimePermission "createClassLoader";
permission java.lang.RuntimePermission "accessDeclaredMembers";

Note:
If read permission is not set for xercesImpl.jar, a potentially misleading error message is reported. A FactoryConfigurationError is thrown that says
   "Provider org.apache.crimson.jaxp.SAXParserFactoryImpl not found",
instead of
   "Provider org.apache.xerces.jaxp.SAXParserFactoryImpl not found".


Changes in JAXP RI Versions

Changes in JAXP RI version 1.2.4

Changes in JAXP RI version 1.2.3

Changes in JAXP RI version 1.2.2

Changes in JAXP RI version 1.2.1

Performance of Xerces parser improved significantly.

XSLTC was not included as part of this release, which was destined solely for the J2EE platform. 

Changes in JAXP RI version 1.2.0-FCS

The parser implementation changed from Xerces 2.0.0_01 to Xerces-J 2.0.1_01
(Xerces 2.0.1 final with controlled bug fixes). The Xalan XSLT processor
implementation was updated to xalan-j 2.3.1_01 (Xalan version 2.3.1 with
controlled bug fixes).

Finally, this release fully supports the proposed 1.2 JAXP specification,
which implements document validation using W3C XML Schema.

Changes in JAXP RI version 1.2.0-EA2

The parser implementation changed from Xerces 2.0.0 beta3 to Xerces-J 2.0.0_01 (Xerces 2.0.0 final with controlled bug fixes). The Xalan XSLT processor implementation was updated to xalan-j 2.3.0_01 (Xalan version 2.3.0 with controlled bug fixes).

The Xalan XSLTC processor was also added in this release. It compiles a stylesheet into a transformation engine (translet) that is ready to run.

This release fully supports the proposed 1.2 JAXP specification, which implements document validation using W3C XML Schema.

Changes in JAXP RI version 1.2.0-EA1

The parser implementation changed from Apache Crimson to Xerces 2 version 2.0.0 beta3. The XSLT processor implementation was updated to Xalan classic version 2.2.D14.

The parser supports W3C XML Schema but does not support all aspects of the proposed JAXP 1.2 specification. In particular, the ability to enforce that an instance document conforms to a particular schema has not been implemented. However, the validation portions of the specification can be used along with schema hints in the instance document.