Porting Central

Performance tips

The performance of your application depends on many factors, such as the hardware, the information to be processed, the algorithms, and the data structures. You can often improve the run-time performance of your application with minimal changes to your source programs. The amount of improvement each tip provides depends on how your application is organized and on the functions and language constructs it uses.

Some tips may provide substantial performance improvement to your application, while others may offer almost none. Some tips contradict each other because they trade off one resource for another. For example, one tip is to reduce the size of the call stack by using static and global variables, while another is to improve startup performance by reducing the use of static and global variables. Use performance analysis tools to find out where your performance problems are, and try different tips to achieve the best overall performance for your application.

On this page, I'll explain several different techniques to help speed up your iSeries application and its development.

  • Interactive vs. batch compile and run
  • Function inlining
  • Compile-time optimization
  • Performance Explorer

Interactive vs. batch

In the beginning, AS/400 and iSeries models were designed for great interactive performance. With many thousands of attached terminals, this was an effective design. When IBM shifted its focus to e-business in 1995, this expensive capability was dropped from some iSeries models in favor of great server and batch performance. Today's 'full-range' iSeries models cover the spectrum and still offer amazing interactive performance to those who need it.

First, some clear definitions. In iSeries terminology, interactive refers only to text-based (green-screen) 5250 display applications. Server or batch processing refers to the server side of client/server and network computing, or e-business. Basically, any application that does the real work on the iSeries and places the end-user interface on a separate client (Web browser, Java client, PC, middle-tier app server, etc.) is considered server/batch processing.

Today's eServer i5 and iSeries are all server/batch models. Customers can purchase additional interactive performance features as needed with the iSeries Enterprise Edition.

Consequently, all modern significant workloads on a typical iSeries will perform best in batch mode. This is the iSeries equivalent of running as a 'service' on a Windows operating system or a background task on a UNIX operating system. How is this done? Read on...

To submit a compile to batch mode, use one of the following methods.

From an iSeries command line, submit the command to a new batch job:

  • SBMJOB CMD(CRTCMOD MODULE(hello) SRCSTMF('/home/bob/src/hello.c'))

From a PC or UNIX command line, submit a remote command through the remote execution server:

  • rexec MYAS400 CRTCMOD MODULE(hello) SRCSTMF('/home/bob/src/hello.c')

Also, when running a server application, always submit the job to run in the background (batch).

If you run 'make' or 'gmake' from within a shell, your compiles automatically run in batch mode, which makes this the preferable method.

Function inlining

The INLINE compile-time option requests that the compiler replace a function call with that function's code. If the compiler performs the inlining, the function call is replaced by the machine code that represents the source code in the function definition.

Inlining can improve the run-time performance of a C application in two ways. First, it gives the optimizer an expanded view of the application: exposing constants and flow constructs across function boundaries allows the optimizer to make better choices.

Second, inlining eliminates function call overhead. This overhead is small compared with that of an external dynamic call, but it can become significant when a function is called many times.

For ILE, the default behavior of the C and C++ compilers is not to inline functions, so you may want to request inlining explicitly.
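As a rough illustration of what makes a good inlining candidate, the sketch below uses a small helper that is called once per loop iteration. The function names (scale, sum_doubled) are hypothetical and not taken from the Programmer's Guide. The code is plain C with nothing ILE-specific in the source. Whether the call is actually inlined depends on the compiler and on the INLINE option you specify, so check the guide for the exact values.

```c
#include <stddef.h>

/* Small helper called once per element: a typical inlining candidate.
   The name and signature are illustrative assumptions only. */
static int scale(int value, int factor)
{
    return value * factor;
}

/* Sums the scaled elements of an array. If the compiler inlines scale(),
   the call inside the loop is replaced by the multiplication itself.
   Because factor is the constant 2 at this call site, the optimizer is
   then free to simplify the expression further. */
int sum_doubled(const int *values, size_t count)
{
    int total = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        total += scale(values[i], 2);
    }
    return total;
}
```

Without inlining, each loop iteration pays for one call to scale(). With inlining, the loop body contains only the arithmetic, which is where the benefit shows up for functions called many times.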

For more details, see the ILE C/C++ Programmer's Guide, Chapter 6: Improving Run-Time Performance.

Benchmark Center

IBM operates facilities specifically designed to prove real-world scalability. These facilities operate on a cost-recovery basis. You can bring in your application and prove that a specific configuration, with a specific level of processor, will perform as required for a customer situation. It's a particularly powerful selling tool, as there is no guesswork involved. Thousands of simulated concurrent "users" can be set up in a laboratory environment.

Compile-time optimization

Optimizing the generated instructions can provide a significant performance benefit. In general, debugging capability diminishes as the optimization level increases. During development, specify a lower level of optimization (for ILE, use the 'OPTIMIZE' compiler option) and raise it only in the final development stage. For released programs, use full optimization.

With full optimization, user variables can be displayed during a debug session but cannot be modified, and a displayed value may not be the variable's current value. In addition, the code that enables the instruction trace and call trace system functions is eliminated from procedure prologue and epilogue routines. Eliminating this code enables the creation of leaf procedures. A leaf procedure is a procedure that contains no calls to other procedures. A call to a leaf procedure is significantly faster than a call to a normal procedure.
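To make the leaf procedure definition concrete, here is a minimal C sketch with hypothetical function names. The first function contains no calls and so fits the definition of a leaf procedure. The second calls the first, so it is not a leaf as written (although an optimizer that inlines the call could turn it into one).

```c
/* Leaf procedure: the body contains no calls to other procedures. */
int clamp_leaf(int value, int low, int high)
{
    if (value < low) {
        return low;
    }
    if (value > high) {
        return high;
    }
    return value;
}

/* Not a leaf as written: it calls clamp_leaf(). */
int clamp_percent(int value)
{
    return clamp_leaf(value, 0, 100);
}
```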

The Performance Explorer

A software toolset available from IBM can help diagnose trouble spots in your programs. The performance measurement keyword ENBPFRCOL allows you to specify whether or not the ILE C or C++ compiler should generate code (sometimes called "performance hooks") into your compiled program or module. These performance hooks enable the Performance Explorer to analyze your programs. The default for this keyword specifies that program entry procedure level performance measurement code be generated into a module or program. Compiling performance collection code into the module and program will allow performance data to be gathered and analyzed. The insertion of the additional collection code will result in slightly larger module/program objects and may affect performance to a small degree.

For more details, see the ILE C/C++ Programmer's Guide, Chapter 6: Improving Run-Time Performance, and Performance Tools for iSeries, Chapter 11: Performance Explorer.

In addition to its support for ILE, the Performance Explorer can capture PASE events: use the Add Performance Explorer Definition (ADDPEXDFN) command to create a definition that collects them.

iDoctor for iSeries provides, among other things, a graphical client interface to analyze Performance Explorer data.