Porting Central

iSeries performance tools for UNIX application developers

Charlie Quigg

Starting with Version 3, Release 1 of the iSeries, an effort was made to incorporate many UNIX-type interfaces into the operating system, OS/400. This trend has continued with additional functions in subsequent releases of OS/400. One of the objectives of providing UNIX-type interfaces for the iSeries is to facilitate porting of applications from a traditional UNIX environment. Once an application is ported, attention turns to performance: a ported application is of less commercial value if it does not perform as well (or nearly as well) as on the source system. This paper discusses performance tools available to assist in determining how well an application performs on the iSeries.

Since the iSeries heritage is not UNIX, the familiar UNIX performance tools are not available. However, the iSeries has a wealth of performance tools and commands that provide similar information. This paper outlines the tools that are available on the iSeries and, where possible, characterizes them compared to UNIX tools. The objective is to give UNIX developers an idea of what iSeries tools to use for various types of analysis. A more in-depth comparison is provided for one tool: the iSeries performance explorer with statistical type data collection is compared to the UNIX prof command.

Overview of examined tools

I started my review by getting an idea of what performance tools are available on each of the respective systems. I quickly learned that on both UNIX and the iSeries there is no shortage of tools with which to analyze performance! Further, I realized that although the respective tools are rather easy to run, the analysis of the output can be quite complex.

Types of performance analysis: Performance tools, irrespective of the operating system, can be categorized as follows:

  • System-level analysis includes surveying hardware utilization, response times and batch run times. These tools identify such things as bottlenecks, project workload growth and system upgrade alternatives. System-level performance optimization may be at the expense of individual applications.
  • Application-level analysis includes job traces, profiles and statistics measured at a transaction or batch job level. This analysis may identify modules or subroutines that are consuming the most resources.
  • Module/subroutine-level analysis is the most detailed. It includes identifying specific areas in code that are 'hot spots' that can be redesigned or recoded to improve efficiency.

UNIX application developers are likely most concerned with application-level and module/subroutine-level performance. For completeness, and since ported UNIX applications will likely share the system with other processes or jobs, discussion is also provided on system-level tools and commands.

NOTE: On the iSeries, commands typically are interactive and provide a snapshot of data; they don't provide any lengthy report. iSeries tools typically run in batch (i.e. background) mode and capture data in a report which is later analyzed. On UNIX this distinction seems less clear, although some commands provide real-time data (e.g. vmstat with no flags) whereas others generate reports more similar to those of iSeries tools (e.g. tprof).

The following table shows the tools and commands I considered:

                         iSeries                  UNIX
System-Level Tools       Performance Monitor      vmstat
                         Performance Advisor      iostat
                         Commands:                sar
                           WRKSYSSTS              ps
                           WRKACTJOB              uptime
                           WRKDSKSTS              cron
                           WRKSYSACT
                           WRKACTGRP
Application-Level and    Trace Job                time
Module-Level Tools       Performance Monitor      timex
                         Performance Explorer     prof
                         Performance Analyzer     gprof
                                                  tprof

Many of the various tools and commands (for both iSeries and UNIX) are rich in function and have several options (e.g. what data to collect, what type of reports to generate). It is beyond the scope of this paper to provide a comprehensive review of all the iSeries tools and commands and their respective counterpart on UNIX, if any. See the iSeries references at the end of this paper for details.

iSeries performance tools

As the purpose of this paper is to provide UNIX developers some familiarity with iSeries performance tools and commands and some understanding of how they are similar or different from UNIX tools, I will briefly describe the iSeries tools and commands listed above and where appropriate draw a comparison to similar UNIX tools. The tools and commands discussed here are available in iSeries Version 3, Release 6 or a later release of OS/400.

NOTE: Many of the iSeries performance tools (as well as most other iSeries commands) may be accessed through a series of menus. To access all iSeries performance-related tools, enter 'go perform' from the command line.

Performance Monitor: The performance monitor assesses the relative amount of resource used by different areas of the iSeries system. The data is collected by submitting a batch job. The performance monitor can collect the following types of data:

  • System data related to all jobs on the system, attached devices, storage pools, communications I/O processors, disk I/O processors, local workstation I/O processors and response time.
  • Communications data including system data mentioned above and statistics on communications protocols, including: X.25, Ethernet, token-ring, ISDN, and SDLC.
  • Database query data including statistics on executable statements.
  • Trace data including internal data from the microcode trace table.

The performance monitor may collect both system-level and application-level (trace) data. The packaging of the performance monitor is as follows: The base operating system (OS/400) contains the commands and programs necessary to collect data. To analyze the collected data, the Performance Tools for OS/400 licensed program must be installed.

Performance Explorer: The performance explorer is a data collection tool that helps the user identify the causes of performance problems that cannot be identified with data collected by the performance monitor. The explorer is most useful for application-level performance analysis.

The performance explorer is a combination of what in previous releases (e.g. V3R1) used to be the Sample Address Monitor (SAM) and the Timing Paging and Statistics Tool (TPST). The performance explorer command interface (for data collection) is included as part of OS/400. The reporting capability is part of the Performance Tools for OS/400 licensed program.

The performance explorer is similar to the performance monitor in that both collect performance data. The main difference is that performance explorer provides a much greater level of detail. Unlike the monitor, the explorer allows you to specify areas of interest.

The performance explorer provides the following types of data collection:

  • Profile type identifies the relative CPU usage for each program or module procedure included in the explorer definition. Data is represented in a histogram of CPU used within the program. Within a module, profile data may be collected detailed to a statement or group of statements.
  • Statistical type collects statistics about one or more jobs on the system. The statistics may be either hierarchical or flattened. A hierarchical structure organizes the statistics into a call tree in which each node represents a procedure. A flattened structure organizes the data into a simple list of procedures, each with its own statistics.
  • Trace type collects very specific information about when and in what order system events occurred. Trace data collection is commonly used to get detailed information about I/O requests.

Running the performance explorer and specifying statistical type data provides roughly the same type of data as is provided with the UNIX prof command. This comparison is examined more closely later in this paper.

Performance Advisor: The advisor analyzes performance data collected with the performance monitor, and it can produce recommendations and conclusions to help improve performance. The analysis includes:

  • Storage pool sizes
  • Activity levels
  • Disk and CPU utilization
  • I/O processor utilization
  • Job exception conditions and excessive resource utilization

The advisor does not make specific recommendations for changing application programs to improve performance. The advisor is most useful for system-level performance analysis.

Comparison to the Performance Explorer: The explorer's main purpose is to collect specific data. To do this, it has its own collection facility. The advisor's role is to assess data collected by the performance monitor. After its analysis, it produces a list of conclusions and recommendations on ways to improve performance. The explorer does not do any analysis. The advisor is included in the Performance Tools for OS/400 licensed program.

Trace Job: There are two ways to trace a job on the iSeries system. One is with the TRCJOB command, which is provided in the base operating system, OS/400. The second method is with the STRJOBTRC command, which is provided as part of the Performance Tools for OS/400 licensed program. Both tools provide the same kind of data with minor differences in the formatting. Trace information is collected in two summary reports and one detailed report. The main objectives of tracing a job are to determine what parts of a job use the most resources and to measure the effect of program changes compared to previously collected data. Tracing an iSeries job provides roughly the same type of data as is provided with the UNIX tprof command.

Performance Analyzer: The performance analyzer is a PC-based tool that provides a graphical representation of the same data as is provided by the job trace. (In fact the performance analyzer uses data collected by the job trace.) The graphical representation requires Client Access for OS/400 to connect the PC to the iSeries. Included in the five graphical views that are presented is a dynamic call graph. This diagram uses color and size to represent execution time performance and call frequency. Components that are called excessively appear as large red rectangles. The performance analyzer is available as IBM PRPQ 5799-PAC.

iSeries Commands: The following iSeries commands are run interactively and provide a snapshot of data.

WRKSYSSTS: The Work with System Status command allows you to work with information about the current status of the system. It displays (among other things) the number of jobs currently in the system, the percentage of machine addresses used, page fault rates and other statistical information related to each storage pool that currently has main storage allocated to it. Information is similar to that supplied by the UNIX vmstat command.

WRKDSKSTS: The Work with Disk Status command shows the system's disk activity and helps you determine the performance capabilities of your system's disk units. Information is similar to that supplied by the UNIX iostat command.

WRKACTJOB: The Work with Active Jobs command measures system performance by looking at aspects such as CPU utilization and response time. Since WRKACTJOB lists all jobs active on the system, its information is similar to that supplied by the UNIX ps command.

WRKSYSACT: The Work with System Activity command allows you to view and collect performance data in a real-time fashion. This data--which consists of CPU utilization, synchronous and asynchronous I/O counts, and more--is reported for any job or task that is currently active on the system. The WRKSYSACT command is packaged as part of the Performance Tools for OS/400 licensed program.

Performance Explorer vs prof

As previously mentioned, the iSeries performance explorer has three types of data collection: profile, statistical, and trace. This section compares the performance explorer with statistical type data to the UNIX prof command. The purpose of both tools is to help spot the 'high use' procedures in an application in hopes of being able to redesign the application to use system resources more efficiently. I will outline the steps I used to collect the UNIX data with the prof command and the steps I used to collect the iSeries data with the performance explorer. I ran the same C language sample program on both systems. For each tool I outline the steps sequentially.

Notes:

  • The purpose of this comparison is to demonstrate similarities in the tools. This example is not intended to provide a performance comparison of the sample program running on the respective systems. That is, I did not attempt to isolate the application, optimize performance of the application, normalize the collected data, or take other steps one would normally use for a performance comparison for a specified application.
  • The UNIX system I used was an IBM RS/6000 running AIX 4.1. The syntax (i.e. flags provided with the command) and semantics (specific data collected with the command) may vary from one UNIX system (and from one release) to another.

Collecting prof data:

  1. I compiled the sample program, filetest.c, with the cc command with the -p flag. (I did not specify an object file name, so cc generated a.out which is used by the prof command.) The -p flag causes the compiler to insert a call to the mcount subroutine into the object code generated for the program.
  2. Call the program (in this case a.out). The calls to mcount inserted by the compiler collect the profile data while the program runs, and when the program ends the data is written to a file, mon.out. The prof command then interprets the profile data in the mon.out file for the specified program object.
  3. Then, to display the object file profile data I simply ran prof without any flags. The prof command displays the following for each call:
    • Percent of execution time
    • Total seconds spent for that call
    • Number of times the function was called
    • Average number of milliseconds per call

See listing 1 for the output produced for the sample program filetest.c.

Collecting Performance Explorer data:

  1. Compile an ILE C program using the CRTBNDC command. Although I did not need it for this sample program, CRTBNDC (and the related CRTCMOD command) has an 'Enable Performance Collection' parameter to specify the type of data collected. This is similar to flags that may be specified with the UNIX cc command.
  2. Use the WRKJOB command to display the job name, user name, and job number. The combination of these three is roughly equivalent to a UNIX process ID. Although it's not necessary, I use this information for the performance explorer definition. It focuses the collection data for this job.
  3. Use the ADDPEXDFN command to add a performance explorer definition. Press PF4 to prompt on the command. I named my definition FILESTATS. You may want to write this down; I have found no easy way to retrieve the definition name if you want to reuse it later. For the type of data collection, specify *STATS. For the job information, use the job name, user name, and job number from the WRKJOB command. Enter an optional description for the definition. You may use default values for the other parameters.
  4. Use the STRPEX command to start the performance explorer. You will need to specify a session ID for this collection; I used FILESSN5. You will also need to specify the definition from step 3 above, e.g. FILESTATS.
  5. Call the program. Here I used 'call quigg/filetest'.
  6. Use the ENDPEX command to end the performance explorer. Specify the session ID to end, e.g. FILESSN5. Optionally, you may specify a member name where the data is to be stored. The member name defaults to the session ID.
  7. Use the PRTPEXRPT command. Although you may not necessarily print the performance explorer report, this command creates an iSeries spool file which may also be browsed.
  8. Use the WRKSPLF command to display or print the report.

Performance Explorer Data Compared to prof: When viewing the data, the first thing I noticed is that the performance explorer report is much longer than that created by the prof command! See listing 2. The first page of the report lists information about the definition used. It includes the following:

  • Name of the collection: FILESSN5
  • Type of the collection: STATISTICS
  • Selected job

The second page of the report lists runtime information, including start time, stop time, and total time. The third page summarizes CPU utilization, including the CPU used by this job. It's 89% for the job which ran the FILETEST program.

Now, I skip ahead to page 9, which lists statistics information similar to that obtained through the prof command. A detailed understanding of the functions used in the respective operating systems is required to truly get an apples-to-apples comparison of the calls. However, even for the casual user there are some similarities in the two reports. Both reports show the following:

  • Number of times a function was called, labeled 'A' in each of the respective listings.
  • Time spent within a function, labeled 'B'. (prof records this time in seconds. The performance explorer records this time in microseconds.)
  • Percent of CPU used for a function, labeled 'C'. Two examples of similar calls are OS/400 *FSREAD and *FSWRITE which correspond to UNIX .read and .write respectively. They are labeled with 'D' and 'E' on the listings.

Methodology

My approach as I started to work on this project was two-fold: to learn what UNIX performance tools are available and widely used and to learn about iSeries performance tools. On the UNIX side, I did this with personal references from UNIX application developers to find out what tools they use. I complemented these references with other sources: a UNIX performance tools video, UNIX man pages, and articles and manuals. This provided me with a good basis for deciding which tools to examine. I tested selected tools. On the iSeries side, I learned about available tools again through personal references, documentation (online and printed) and by testing the tools. Some of the comparative information was obvious to me from reading about and testing the tools. I validated the comparisons by consulting respective UNIX and iSeries developers who were more familiar than I was with the specified tools.

References

  • IBM iSeries Advanced Series Work Management, SC41-3306-01, IBM Corporation (1996); available through IBM branch offices.
  • IBM iSeries Advanced Series Performance Tools/400, SC41-4340-01, IBM Corporation (1996); available through IBM branch offices.
  • Bleeker, Troy, Picture This..., iSeries (magazine), May-June, 1996, p. 36-42.
  • IBM AIX 4.1 man pages (online help text for commands and utilities), 1995.
  • Skill Dynamics Education/Express UNIX Education Library, UNIX Performance Tools video, 1993.
  • IBM AIX Performance Tuning Guide, Versions 3.2 and 4, SC23-2365-04, IBM Corporation (1996); available through IBM branch offices.
  • IBM iSeries Performance Management, Version 3 Release 1, GG24-3723-02, IBM Corporation (1995); available through IBM branch offices.
  • Lynch, Jaqui, A Guide to Diagnosing and Fixing Performance Problems in UNIX, Capacity Management Review, March, 1996, p. 1-10.

Acknowledgments

Many individuals provided assistance with this project. Assistance took many forms, including helping me understand the performance tools, interpreting output from the tools, reviewing this paper for accuracy, and providing moral support. Thanks to the following:

- Stacy Benfield (IBM, Rochester, MN)
- Helen Olson-Williams (IBM, Rochester, MN)
- Rob Soosaar (MKS, Waterloo, Ontario, Canada)
- Johnie Wardlaw (IMA, Irvine, CA)
- Mark Williamson (IBM, Rochester, MN)