Performance Highlights

z/OS XML System Services is a high-performance validating and non-validating parser. Its unique buffer-in/buffer-out design minimizes application-to-parser API overhead and gives the application complete control over post-parse processing.

z/OS XML System Services continues to focus on performance from release to release, aiming to match or improve on the performance of the previous release. The z/OS V1.12 XML System Services validating parser provides an improvement of up to 30-50% over the z/OS V1.11 XML System Services validating parser. The performance of the z/OS V1.12 XML System Services non-validating parser is equivalent to that of the z/OS V1.11 non-validating parser.

z/OS XML System Services has the added benefit of running on z Application Assist Processors (zAAPs) for all task-mode parsing and on z Integrated Information Processors (zIIPs) for all enclave SRB mode parsing. These represent significant reductions in customer cost, reflecting IBM's strategy to provide price/performance benefits for new application workloads considered strategic for System z.

The chart below shows z/OS V1.12 validating parser performance improvements over z/OS V1.11. The results were generated by measuring performance when parsing a set of XML documents of various sizes and complexities. The reported cycles per byte parsed (cpbp) figure was determined by calculating the geometric mean of the per-byte costs for all benchmark documents.
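The aggregation described above can be sketched in a few lines. This is only an illustration of how a geometric mean of per-byte costs is computed; the cycle counts and document sizes are made-up numbers, not benchmark data, and the function name is an assumption.

```python
import math

def geometric_mean(values):
    """Geometric mean computed through logarithms."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("need at least one value, and all must be positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical measurements, one per benchmark document: (CPU cycles, bytes).
measurements = [(400_000, 10_000), (1_800_000, 60_000), (90_000, 1_500)]

# Per-byte cost of each document, then one summary figure for the whole set.
per_byte_costs = [cycles / size for cycles, size in measurements]
cpbp = geometric_mean(per_byte_costs)

print("per-document cycles per byte:", per_byte_costs)
print(f"geometric mean (cpbp): {cpbp:.2f}")
```

A geometric mean keeps one unusually expensive or unusually cheap document from dominating the summary figure, which is a common reason for choosing it over an arithmetic mean when averaging ratios.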

[Chart: validation comparison for z/OS XML, with V1.12 showing marked improvement over z/OS V1.11]
 

Because of the unique design of z/OS XML System Services, comparisons with other parsers using the SAX or DOM APIs should be judged in light of their different designs. A straight comparison among various parsers does not take into account the application costs of processing the output data for each parser model, which should be kept in mind whenever making an evaluation.

Note: The APIs are provided for C/C++ and assembler, so applications written in some other high-level languages will need a wrapper to access the parser.

With those caveats in mind, the following chart gives an idea of the capabilities of the z/OS XML System Services parser. The comparison is with XML4C 5.7, the parser included in the latest version of XML Toolkit for z/OS. The chart shows XML4C 5.7 numbers for non-validating parsing, for parsing when using XPLINK, and for parsing when using the well-formedness checking scanner WFXMLScanner, compared to z/OS XML System Services V1.11 and V1.12.

[Chart: comparison of XML4C 5.7 SAX non-validation and XMLSS V1.12 non-validation, with XMLSS V1.12 showing better results (note: SAX and DOM user interfaces were not used in the z/OS V1.12 test)]

The next chart shows XML4C 5.7 numbers for SAX validating parsing and parsing when using XPLINK.

[Chart: comparison of XML4C 5.7 SAX schema validation and XMLSS V1.12 validation, with z/OS XML performing better (note: SAX and DOM interfaces were not used in the z/OS XML test)]

The results were generated by measuring performance when parsing a set of XML documents of various sizes and complexities. The reported cpbp was determined by calculating the geometric mean of the per byte costs for all benchmark documents. Keep in mind that the z/OS XML System Services parser uses non-SAX APIs written in C/C++ and assembler. If you are converting an application from use of an alternative parser, significant application changes may be required to make use of the z/OS XML parser's data model output.


Performance Considerations

 

Fragment parsing mode:

Fragment parsing mode is a new function added to z/OS XML System Services V1.12. It allows validation of document fragments without obtaining and parsing the entire document.

 

 

Schema discovery:

The schema discovery feature added in z/OS XML System Services V1.12 enhances the usability of the validating parser. It lets the user query the XML document for the namespaces and schema locations given in the “schemaLocation” and “noNamespaceSchemaLocation” attributes, as well as the root element namespace and local name. The user then has the opportunity to load an OSR without having to reset the z/OS XML parser.

Enabling the schema discovery feature degrades performance when validating small documents. For larger documents (>50 KB), there was no effect on performance. As a geometric mean across a set of benchmark documents, this feature resulted in a 3.3% higher cycles per byte parsed (cpbp) cost.
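To make the information involved more concrete, the sketch below collects the same kinds of values (root element namespace and local name, plus the schema location hints) from a sample document. It uses Python's standard ElementTree parser purely as a stand-in and is not the z/OS XML System Services API; the function name, the sample document, and the urn:example:po namespace are all assumptions made for illustration.

```python
import io
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

def discover_schema_hints(xml_bytes):
    """Return the schema hints carried by the root element of a document."""
    for _event, root in ET.iterparse(io.BytesIO(xml_bytes), events=("start",)):
        # The first "start" event belongs to the root element, so stop there.
        namespace, _, local_name = root.tag.rpartition("}")
        # schemaLocation holds whitespace-separated (namespace, location) pairs.
        pairs = root.get(f"{{{XSI}}}schemaLocation", "").split()
        return {
            "root_namespace": namespace[1:],  # drop the leading "{"
            "root_local_name": local_name,
            "schema_locations": list(zip(pairs[0::2], pairs[1::2])),
            "no_namespace_schema_location":
                root.get(f"{{{XSI}}}noNamespaceSchemaLocation"),
        }
    return None

# Hypothetical sample document.
doc = (b'<po:order xmlns:po="urn:example:po" '
       b'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
       b'xsi:schemaLocation="urn:example:po po.xsd"><po:item/></po:order>')
hints = discover_schema_hints(doc)
print(hints)
```

An application could use values like these to decide which schema to load before validation continues.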

 

 

Schema sizes and document complexity:

The performance of the z/OS XML System Services validating parser depends greatly on the size and complexity of both the schema and the document being parsed.

 

 

Input/Output buffer sizes:

z/OS XML System Services allows the application to specify the input buffer size and output buffer size, for flexibility depending on the application requirements. Using a small input buffer size may require multiple buffers to hold a large XML document, requiring both additional application and parser processing. This is also true for a small output buffer size where the parsed fields of a large XML document may require multiple buffers. However, performance tests have shown that as long as a minimum size of 4K is specified for an input buffer, and a minimum size of 8K is specified for an output buffer, parser performance is nearly equal to that when specifying much larger buffer sizes (for example, both input and output buffers at 8M).
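As a rough illustration of why very small buffers add work, the sketch below counts how many input buffers a document needs at a given buffer size and shows a simple slicing loop. The function names and the 1 MB document are assumptions; the code models only the bookkeeping, not the parser.

```python
def buffers_needed(total_bytes, buffer_size):
    """Number of buffers of buffer_size required to hold total_bytes."""
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    return -(-total_bytes // buffer_size)  # ceiling division on integers

def split_into_buffers(document, buffer_size):
    """Yield successive buffer-sized slices of the document."""
    for offset in range(0, len(document), buffer_size):
        yield document[offset:offset + buffer_size]

KB, MB = 1024, 1024 * 1024
document = b"x" * MB  # hypothetical 1 MB document

for size in (512, 4 * KB, 8 * MB):
    print(f"{size:>8} byte buffer -> {buffers_needed(len(document), size)} buffer(s)")

# Every slice handed to a parser would be one more refill for the application.
chunks = list(split_into_buffers(document, 4 * KB))
```

Each additional buffer means another round of application and parser processing, which is the overhead the minimum sizes above are meant to keep small.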

 

 

Tokenize whitespace feature:

The objective of the tokenize whitespace feature is to enable an application to more easily ignore whitespace (spaces, tabs, line feed characters) that may be present in documents. However, since all document characters must be parsed in any event, unnecessary whitespace should be minimized when generating XML documents, in order to more efficiently transmit and parse them.

 

The amount of whitespace in an XML document strongly affects performance when this feature is used. As a geometric mean across a set of benchmark documents, this feature resulted in an 11% higher cycles per byte parsed (cpbp) cost. Individual documents with little whitespace were minimally affected, while documents with a great deal of whitespace had a significantly higher cost.
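One way to judge how exposed a document is to this cost is to measure how much of it is whitespace. The helper below is a minimal sketch; its name and the sample documents are assumptions, and it counts carriage returns as whitespace in addition to the spaces, tabs, and line feeds mentioned above.

```python
XML_WHITESPACE = frozenset(" \t\r\n")

def whitespace_fraction(text):
    """Fraction of the characters in text that are XML whitespace."""
    if not text:
        return 0.0
    return sum(ch in XML_WHITESPACE for ch in text) / len(text)

# Two hypothetical serializations of the same content.
compact = "<a><b>1</b></a>"
indented = "<a>\n    <b>1</b>\n</a>\n"

print(f"compact:  {whitespace_fraction(compact):.0%} whitespace")
print(f"indented: {whitespace_fraction(indented):.0%} whitespace")
```

Documents at the high end of this measure are the ones where the cost noted above is likely to be most visible.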

 

 

Different encodings:

z/OS XML System Services supports several character encodings. The more commonly used ones are UTF-8, UTF-16, IBM-1047, and IBM-037. The parser is most optimized for single-byte characters. UTF-8 and IBM-1047 encodings have roughly the same performance, provided the UTF-8 documents do not contain significant amounts of multibyte characters. UTF-16 encoding has about a 10% higher cost; this was measured in cycles per character parsed rather than cycles per byte parsed, using benchmark documents converted from single-byte characters to their equivalent double-byte forms. IBM-037 was not measured but is presumed to be roughly equivalent to IBM-1047.
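The difference between per-character and per-byte measures can be seen by encoding the same text several ways. The sketch below uses Python's built-in codecs; cp037 stands in for the EBCDIC code pages because IBM-1047 is not among Python's standard codecs, and the baseline cost of 100 cycles per character is a made-up number used only to show the arithmetic.

```python
text = '<order status="shipped">widget</order>'  # hypothetical sample

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-be")  # big-endian, no byte order mark
ebcdic = text.encode("cp037")     # IBM-037, a single-byte EBCDIC code page

for name, data in (("UTF-8", utf8), ("UTF-16", utf16), ("IBM-037", ebcdic)):
    print(f"{name:8} {len(text)} characters -> {len(data)} bytes")

# Characters outside ASCII take more than one byte in UTF-8.
accented = "naïve"
print(len(accented), "characters ->", len(accented.encode("utf-8")), "UTF-8 bytes")

# A 10% higher cost per character is a lower cost per byte when every
# character occupies two bytes, so the unit of measure matters.
baseline_cycles_per_char = 100.0                      # hypothetical baseline
utf16_cycles_per_char = baseline_cycles_per_char * 1.10
utf16_cycles_per_byte = utf16_cycles_per_char * len(text) / len(utf16)
print(f"UTF-16: {utf16_cycles_per_char:.1f} per character, "
      f"{utf16_cycles_per_byte:.1f} per byte")
```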

 

 

Running 31-bit vs. 64-bit:

The parser is optimized for 64-bit performance. Running in 31-bit mode shows about a 4% increase in cpbp.

 

Performance Recommendations

 

Load validating parser function into link pack area

Try loading the validating parser into the link pack area (LPA). This can provide a performance improvement.

 

 

Avoid multiple parses of the same document

There may be times when it would be convenient to parse the same document more than once in the course of a single transaction. Duplicate parses are expensive. Avoid the overhead of additional parses by storing the output buffers in a format that can be accessed efficiently later.
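A minimal sketch of this idea is a per-transaction cache keyed by a document identifier, so a second request for the same document returns the stored result instead of parsing again. The class, its method names, and the document identifier are assumptions for illustration, and Python's ElementTree parser stands in for the real parse step.

```python
import xml.etree.ElementTree as ET

class ParsedOutputCache:
    """Keeps parsed output for the duration of one transaction."""

    def __init__(self, parse):
        self._parse = parse      # the expensive parse function
        self._outputs = {}       # doc_id -> stored parse output
        self.parse_count = 0

    def get(self, doc_id, xml_bytes):
        # Parse only on the first request for this document.
        if doc_id not in self._outputs:
            self._outputs[doc_id] = self._parse(xml_bytes)
            self.parse_count += 1
        return self._outputs[doc_id]

    def end_transaction(self):
        # Release stored output once the transaction is complete.
        self._outputs.clear()

cache = ParsedOutputCache(ET.fromstring)
order = b'<order status="shipped"><item>widget</item></order>'

first = cache.get("order-1", order)
second = cache.get("order-1", order)   # served from the cache, no reparse
print("parses performed:", cache.parse_count)
cache.end_transaction()
```

The trade-off is memory held for the life of the transaction, so the stored form should contain only what later steps actually need.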

 

 

Reuse the initialized parser

When parsing a small document, most of the CPU time is spent on parser initialization. You should, therefore, avoid instantiating a new parser every time you parse. Instead, when possible, initialize the parser instance once and do a control reset after each parse, rather than a termination and reinitialization. With z/OS XML System Services, parser features (Strip Comments, Tokenize Whitespace, CDATA as CharData) can be changed with the control reset, which avoids the performance impact of termination and reinitialization.
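The lifecycle described above can be sketched with a small stand-in class that only counts calls. Everything here is hypothetical: SketchParser, its method names, and the feature string are not the z/OS XML System Services API, and the rule that a parser must be reset before reuse is a modeling assumption.

```python
from collections import Counter

class SketchParser:
    """Hypothetical stand-in for a parser instance; it only counts calls."""

    def __init__(self):
        self.calls = Counter()
        self.features = frozenset()
        self.ready = False

    def initialize(self, features=()):
        self.calls["initialize"] += 1   # the costly step for small documents
        self.features = frozenset(features)
        self.ready = True

    def parse(self, document):
        if not self.ready:
            raise RuntimeError("initialize or reset the parser before parsing")
        self.calls["parse"] += 1
        self.ready = False              # assumption: reset is needed before reuse
        return len(document)

    def reset(self, features=None):
        self.calls["reset"] += 1        # keeps the instance, may change features
        if features is not None:
            self.features = frozenset(features)
        self.ready = True

    def terminate(self):
        self.calls["terminate"] += 1
        self.ready = False

def parse_all_reusing_instance(documents, features=()):
    """Initialize once, reset between documents, terminate once."""
    parser = SketchParser()
    parser.initialize(features)
    sizes = []
    for document in documents:
        sizes.append(parser.parse(document))
        parser.reset()                  # instead of terminate + initialize
    parser.terminate()
    return sizes, parser.calls

docs = [b"<a/>", b"<b>1</b>", b"<c x='1'/>"]
sizes, calls = parse_all_reusing_instance(docs, features=("STRIP_COMMENTS",))
print(dict(calls))
```

With this pattern the number of initializations stays at one no matter how many documents are parsed, which matters most when the documents are small.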

 

 

Reduce character count

Try to reduce your character count; smaller documents are parsed more quickly. Avoid unnecessary white space, because spaces, tabs, and line feed characters must also be parsed.

 

 

Avoid using default attributes

Avoid the use of default attributes. Attributes are associated with elements within a document. A purchase order element might, for example, contain a status attribute set to "in process", "shipped", or "billed". If, in your DTD, you specify a default value for the attribute rather than explicitly assigning a value in your XML document, processing will be slower.
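The two forms being contrasted look like this. The sketch uses Python's ElementTree parser only to show that both documents present the same attribute value to the application; the element and attribute names follow the purchase order example above, the DTD itself is an assumption, and the code says nothing about the relative cost on z/OS.

```python
import xml.etree.ElementTree as ET

# Form to avoid: the value is supplied by a default declared in the DTD.
WITH_DEFAULT = """<!DOCTYPE order [
  <!ELEMENT order EMPTY>
  <!ATTLIST order status CDATA "in process">
]>
<order/>"""

# Preferred form: the value is assigned explicitly in the document.
EXPLICIT = '<order status="in process"/>'

defaulted = ET.fromstring(WITH_DEFAULT)
explicit = ET.fromstring(EXPLICIT)

# Both forms give the application the same value; the first one makes the
# parser look up the declared default for every such element.
print("from DTD default:", defaulted.get("status"))
print("explicit value:  ", explicit.get("status"))
```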

 

Follow z/OS e-business performance tuning recommendations
