This paper represents subject knowledge based on data available as of March 2000.
Introduction
This paper is intended to provide guidance in understanding the characteristics of the options available to optimize Java applications for execution on OS/390. The Java technology available on OS/390 is one of several technologies provided to enable the OS/390 system to be used for high-performance, scalable, reliable e-Business applications. When combined with the latest S/390 hardware, these technologies offer market leadership capabilities for servers in this highly competitive market.
Product Overview
IBM is currently offering three key products that, when used together, make it easy and cost effective to build e-Business applications using Java, whether they are Web-based, Web-enabled, stand-alone, general business, or transactional in nature. These products are:
- VisualAge for Java Enterprise Edition for OS/390, which includes Enterprise Toolkit.
VAJava is the tool set IBM recommends to customers to build, test, and deploy the applications.
- Websphere Enterprise Edition for OS/390
Websphere EE for OS/390 contains the execution and application management services for Web-based and Web-enabled applications. It provides the necessary network and security services, supporting HTTP and SSL, and IBM's Web Application Server (WAS).
- Java for OS/390
The Java for OS/390 product provides the Java compiler, classes, and execution environment for the applications.
Option Overview
Given all this technology, one is faced with understanding how best to leverage it to obtain optimal performance. System tuning is a basic first step. Advice on system tuning can be found at the OS/390 and Websphere web sites. The next step is to assess the options available with Java itself. There are fundamentally three Java optimization options. The Java for OS/390 product provides two of them: running the application in 'interpreted' mode or in 'JITed' mode. The VAJava product offers the third option: pre-compiling the application using the High Performance Compiler for Java (HPCJ).
Option Details
All three cases start with Java source code. The source code is compiled by a Java compiler into Java bytecode. This bytecode is platform independent and provides the basis for Java's complete application portability. For the first two options, the bytecode is input to a Java Virtual Machine (JVM), which converts it to executable machine instructions. Having produced bytecode, one must then choose which optimization option to use.
Dynamic Translation with the JVM
Within the JVM, two different means exist for executing programs in Java bytecode form. Both depend on dynamic compilation to machine instructions; that is, at application execution time the bytecode is translated (or compiled) into native machine instructions, which are then executed. The first and most basic means of doing this is the Java interpreter. Bytecode execution by the interpreter is based on a rote implementation of JVM behavior as defined by the Java specification. With an interpreter, bytecode instructions are fetched sequentially and, one by one, translated into machine code and executed. The translation time to executable machine instructions is low, but the execution time is high because there is little opportunity to optimize the generated machine code. Once translated and executed, the generated machine instructions are not cached. If the application logic reuses the bytecode, the translation is done again.
An alternative approach, called just-in-time (JIT) compilation, can provide higher performance for some types of applications. A JIT uses a two-phase approach. It first compiles the program's bytecode, or a subset of it, into optimized machine language instructions and then executes those newly generated instructions. JIT compilation applies many of the same optimization techniques used by standard compilers, including, for example, inlining and the elimination of redundant code. Often the generated results are cached for reuse for the lifetime of the JVM. Repeated calls to a specific method, for example, will result in the cached instructions being used for execution. The Java for OS/390 product uses the JIT as its default execution mode.
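The difference in caching behavior between the two dynamic modes can be illustrated with a toy model. The sketch below is not how a JVM is implemented; the class, the method names, and the "translation" stand-in are all hypothetical, and it uses modern Java syntax rather than the JDK 1.1 level discussed in this paper. It only counts how often translation work is repeated under each policy.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

/** Toy model (hypothetical): re-translating on every call versus caching the translation. */
class TranslationCacheSketch {
    int translations = 0;  // number of times the "translation" step ran
    private final Map<String, IntUnaryOperator> cache = new HashMap<>();

    /** Stand-in for translating a method's bytecode into executable form. */
    private IntUnaryOperator translate(String methodName) {
        translations++;
        return x -> x * 2;  // the "translated" method body
    }

    /** Interpreter-like policy: translate again on every invocation, keep nothing. */
    int invokeInterpreted(String methodName, int arg) {
        return translate(methodName).applyAsInt(arg);
    }

    /** JIT-like policy: translate once per method, then reuse the cached result. */
    int invokeJitted(String methodName, int arg) {
        return cache.computeIfAbsent(methodName, this::translate).applyAsInt(arg);
    }

    public static void main(String[] args) {
        TranslationCacheSketch interpreted = new TranslationCacheSketch();
        TranslationCacheSketch jitted = new TranslationCacheSketch();
        for (int i = 0; i < 1000; i++) {
            interpreted.invokeInterpreted("hotMethod", i);
            jitted.invokeJitted("hotMethod", i);
        }
        System.out.println("interpreted translations: " + interpreted.translations);
        System.out.println("jitted translations:      " + jitted.translations);
    }
}
```

In this model the interpreter-like path repeats the translation on every call, while the JIT-like path pays for it once per method, which is why method reuse favors the JIT.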
Pre-compilation
The third option, pre-compilation of the application, is provided by the HPCJ feature of VAJava. Again, bytecode is used as input, but in this case a static compiler is used to pre-compile the bytecode prior to execution. Since this is done outside the application's execution time window, more time is available to apply compilation optimizations. There is not the urgency, as with dynamic translation, to minimize translation time in order to provide runtime responsiveness. IBM applied over 30 years of compiler experience to building HPCJ. Included in this compiler is its best C-compiler optimization technology. The result is that HPCJ produces highly optimized object code. That code is typically stored in a standard OS/390 data set (PDS). For applications compiled with HPCJ, the translated object code is loaded and execution proceeds using LE (Language Environment) services. This is just the same as for other LE-enabled languages such as C and C++. There is no need for, or dependency on, a JVM for the execution of HPCJ-translated applications.
Each of the above optimization options has a place it fits best in the broad portfolio of Java applications. We provide some guidance here as to what application characteristics best fit each of the three options.
Interpreted Mode
The interpreted mode is best for applications that are run infrequently and whose code is executed only once per run. The application flow is typically largely sequential in nature, and methods are not executed repeatedly. Interpreted mode is also appropriate for cases where JVM startup time can dominate application execution time. The Java compiler (javac) tends toward this behavior, particularly when small applications are compiled into bytecode. Interpreted mode is conditionally available when executing any standard Java-compliant application transported to the OS/390 system. With the availability of Java for OS/390 at JDK level 1.1.8, the interpreter was rewritten in S/390 assembler code. Previously it was written in C. This rewrite has made the interpreter a significantly better performing option, with internal measurements often showing a thirty to fifty percent reduction in execution time.
Just-In-Time compilation
Just-In-Time compilation is appropriate for all standard-compliant Java applications, including servlets, that require high performance and need to scale. Often these applications are heavily multi-threaded. JIT compilation also suits long-running applications, for which JVM initialization costs can be amortized over a relatively long execution period, and applications in which methods are executed repeatedly, so that cached copies of highly optimized translated bytecode are reused. Finally, this mode is particularly appropriate for applications that meet the previous criteria and are transported to OS/390 from another platform. Over time, there have been continuing improvements to the optimization techniques used in JIT translation. These have typically resulted in application execution time reductions in the range of twenty to thirty percent for each new JDK level from 1.1.1 through 1.1.8.
A note on Mixed Mode Interpretation
With the delivery of the JDK 1.1.8 level of Java for OS/390, these two approaches to dynamic compilation were combined to take advantage of the best characteristics of each. A new technology that bridges the interpreter and the JIT, called Mixed Mode Interpretation (MMI), was introduced as part of the product. With MMI, a count is kept of the number of times each method of the Java application is executed. Bytecode corresponding to the method is interpreted until that count reaches a predetermined threshold value. In this way, for methods used only a small number of times, the large JIT translation cost is not incurred. For methods that are frequently reused, JIT translation is done when the threshold count is reached, and method execution time is then optimized. Thus, for short-running applications with limited reuse, the interpreter tends to be used and the relatively high cost of JIT compilation is avoided, whereas for longer-running applications with heavier code reuse, JIT compilation is done and execution time is minimized.
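The counting scheme described above can be sketched as follows. This is an illustrative model only, not the product's implementation: the class name, the threshold value passed in, and the choice that the invocation reaching the threshold is the first one to run compiled are all assumptions, and the code uses modern Java syntax.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model (hypothetical) of a per-method invocation counter with a JIT threshold. */
class MixedModeSketch {
    enum Mode { INTERPRETED, JIT_COMPILED }

    private final int threshold;                                   // assumed, not the product's value
    private final Map<String, Integer> counts = new HashMap<>();   // invocations per method
    private final Set<String> compiled = new HashSet<>();          // methods already JIT-translated

    MixedModeSketch(int threshold) {
        this.threshold = threshold;
    }

    /** Records one invocation and reports which mode would execute it. */
    Mode invoke(String method) {
        if (compiled.contains(method)) {
            return Mode.JIT_COMPILED;          // reuse the earlier translation
        }
        int n = counts.merge(method, 1, Integer::sum);
        if (n < threshold) {
            return Mode.INTERPRETED;           // still below the threshold
        }
        compiled.add(method);                  // one-time JIT translation cost is paid here
        return Mode.JIT_COMPILED;
    }

    public static void main(String[] args) {
        MixedModeSketch mmi = new MixedModeSketch(3);
        for (int call = 1; call <= 5; call++) {
            System.out.println("call " + call + ": " + mmi.invoke("hotMethod"));
        }
        System.out.println("rarely used: " + mmi.invoke("coldMethod"));
    }
}
```

A method invoked only once or twice never triggers translation in this model, while a frequently invoked method is translated once and runs compiled from then on.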
High Performance Compiler for Java (HPCJ)
The pre-compiled (HPCJ) option is most appropriate for applications having one or more of the following characteristics:
- use either one or a very low number of threads.
- are standalone or disconnected from other applications, that is, neither tied to nor run in parallel with other Java applications.
- have typically short execution times. Longer-running applications may also benefit from the optimization, but the benefit relative to the JIT often decreases as the execution time increases.
- are run frequently. Applications that are run over and over again gain a performance advantage over the JIT by not incurring the startup cost of a JVM. Short batch or batch-like applications are examples.
- are intended for OS/390 execution, as opposed to being frequently moved to different platforms. Given that this option executes pre-compiled code, there is a dependency on LE services specific to OS/390.
- are not dependent on full compliance with the Java specification. The use of HPCJ entails a number of functional restrictions. There is no support for servlets or the AWT classes, and the Java API level supported is JDK 1.1.
Transactional Environments
The HPCJ option is currently highly recommended as the way to use Java to build or re-engineer applications running in transactional environments, where many, if not all, of the above characteristics apply. These environments include:
- CICS transactions. This support is available now with CICS Transaction Server 1.3.
- DB2 Stored Procedures. This support will be available early in the first half of 2000 through PTFs for DB2 V5 and V6.
- IMS transactions. Support will be available later in 2000 with IMS V7.
Today, CICS Transaction Server V1.3 provides options for Java usage. CICS programs written in Java can be compiled with HPCJ for execution or run under the Java for OS/390 JVM. Using the JVM enables CICS programs to use all the function of the core Java classes, but there is a large performance overhead compared to the typical cost of a CICS transaction. Much of this cost is associated with the current necessity to bring up and tear down the full JVM as part of processing each transaction. The performance cost of doing this can far outweigh the cost of running the CICS application itself. Hence, using HPCJ is highly recommended for those cases where compilation to machine code is possible. VAJava provides a tool that can be used to determine whether a particular CICS Java program is a candidate for pre-compilation to object code. In almost all cases where existing programs are rewritten in Java, compilation to object code is possible. In addition, new CICS Java programs written using the standard CICS programming model, and using the JCICS classes for access to CICS services, will typically be compilable by HPCJ. Only in exceptional circumstances should it be necessary to use the JVM under CICS to run Java applications.
HPCJ will remain the preferred option for the specific releases of the transactional environments defined above and for applications using Java API level 1.1 that require increased performance but can accept the stated restrictions.
Sample Benchmark Comparison
A primary use of Java for OS/390 is with long-running, multithreaded server applications. The benchmark in the figure below is a sample of such an application. In this benchmark, transaction rate is plotted against the number of application threads. The two lines represent the results using HPCJ and the Java for OS/390 JVM. For the comparison, all hardware and software were held constant. The number of application threads was varied, and at each change in the number of threads a steady-state result was obtained. For the JVM, this meant that at each measurement point all executing method code had been translated by the JIT before the measurement was made.
The results show that, for the particular benchmark used, the HPCJ option provides slightly better performance for small numbers of threads. The percent difference between the two results is largest at one thread and declines until, at six threads, the two are equal. Beyond six threads the JVM results are better, with the margin increasing as threads are added. It should be emphasized that these results can vary widely depending on the application or benchmark. However, the trends demonstrated here should be general for applications of this type; that is, some advantage for HPCJ with small numbers of threads and some, often substantial, advantage for the JVM with large numbers of threads.
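A measurement of this shape, transaction rate as a function of thread count taken only after warm-up, can be sketched as below. This is not the benchmark used for the figure; the workload, the class and method names, and the transaction counts are hypothetical placeholders, and the code uses the modern java.util.concurrent API rather than JDK 1.1 facilities.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical harness: transactions per second versus number of threads. */
class ThreadScalingBenchmark {
    /** Placeholder for one transaction's work; a real benchmark would do application work. */
    static long transaction(long seed) {
        long h = seed;
        for (int i = 0; i < 1_000; i++) {
            h = h * 31 + i;
        }
        return h;
    }

    /** Runs txPerThread transactions on each of the given threads and returns the rate. */
    static double measure(int threads, int txPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            long start = System.nanoTime();
            for (int t = 0; t < threads; t++) {
                final long seed = t;
                futures.add(pool.submit(() -> {
                    long sink = 0;
                    for (int i = 0; i < txPerThread; i++) {
                        sink += transaction(seed + i);
                    }
                    return sink;  // returned so the work cannot be optimized away
                }));
            }
            for (Future<Long> f : futures) {
                f.get();          // wait for every worker to finish
            }
            long elapsedNanos = Math.max(1, System.nanoTime() - start);
            return (double) threads * txPerThread / (elapsedNanos / 1e9);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        measure(4, 20_000);  // warm-up pass so hot methods are compiled before measuring
        for (int threads = 1; threads <= 8; threads++) {
            System.out.printf("%d threads: %.0f tx/s%n", threads, measure(threads, 20_000));
        }
    }
}
```

The warm-up pass mirrors the steady-state condition described above: measurements are taken only after the executing methods have had the chance to be JIT-translated.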
Future Options
Beyond the Java API level of 1.1, future releases of CICS, DB2, and IMS will support new levels of Java APIs, in particular Java 2, using a restructured JVM that will address the current performance deficiencies for transactional environments for both the interpreted and JITed options. When this new JVM becomes available, it will negate the need for static compilation in any environment. It will serve the role for Java 2 that HPCJ currently serves for the 1.1 level of Java APIs.
Other Environments
Apart from traditional OS/390 transaction servers, the cost of creating and tearing down a JVM comes into play in web serving environments as well. For the dynamic options, interpreted or JITed code execution, the traditional client-oriented mode of operation, where a JVM is created and torn down for each execution of an application, is neither appropriate nor necessary. The optimal way of using a JVM is to create it, keep it in existence over time, and reuse it for successive applications. This can be done today. Using application server and load balancing technologies such as Web Application Server (WAS) and OS/390's Workload Manager (WLM), Java applications can be scheduled for execution in an existing "persistent" JVM, so in this environment the per-execution overhead of JVM initialization and teardown is avoided.
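The benefit of a persistent JVM can be made concrete with a simple cost model. The sketch below is arithmetic only; the class name and every timing value in it are invented for illustration and are not measurements from OS/390.

```java
/** Hypothetical cost model: one JVM per execution versus one persistent, reused JVM. */
class JvmReuseCostModel {
    /** Total time when a JVM is created and torn down for every execution. */
    static double perExecutionJvm(int runs, double startupMs, double workMs, double teardownMs) {
        return runs * (startupMs + workMs + teardownMs);
    }

    /** Total time when a single JVM is created once and reused for all executions. */
    static double persistentJvm(int runs, double startupMs, double workMs, double teardownMs) {
        return startupMs + runs * workMs + teardownMs;
    }

    public static void main(String[] args) {
        int runs = 1000;          // assumed number of application executions
        double startupMs = 500;   // assumed JVM initialization cost
        double workMs = 20;       // assumed cost of the application logic itself
        double teardownMs = 100;  // assumed JVM teardown cost

        System.out.println("JVM per execution: "
                + perExecutionJvm(runs, startupMs, workMs, teardownMs) + " ms");
        System.out.println("Persistent JVM:    "
                + persistentJvm(runs, startupMs, workMs, teardownMs) + " ms");
    }
}
```

When the application work is small relative to JVM startup and teardown, as in short transactions, almost all of the per-execution total is overhead that a reused JVM pays only once.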
Summary
OS/390 provides the technologies and associated tools to create and execute your e-Business applications. Included in this support is a spectrum of performance options that you must understand and use to achieve the optimum performance for a given application. Using the features of the VisualAge Java tool set, such as application profiling, remote debugging, and the performance analyzer, one can debug a Java application, analyze its characteristics, and tune it to make it as efficient as possible. Based on these application and environmental characteristics and other factors, the best execution option must be carefully weighed for each application to achieve optimum Java performance.