Many of us have been around the IBM i and its predecessors for a long time. When a product has been used for many years, it's easy to take some of its attributes and strengths for granted. The ease of use and low ownership costs of DB2 for i (DB2 UDB) are two attributes that many IBM i users take for granted instead of seeing DB2 UDB as a competitive edge.
Ease of use and low administrative costs are obvious requirements that need to be met to classify a relational database system (RDBMS) as a self-managing system. And while DB2 UDB provides graphical interfaces to simplify administrative tasks, its database engine actually automates many tasks to eliminate the need for human intervention. IBM's recently announced Project eLiza lets us step back and focus on the attributes of a self-managing system. Project eLiza's objective is to make the e-business infrastructure (in particular IBM eServer), storage, and software a self-managing system, one that can "take a lickin' and keep on tickin'" with minimal human intervention. DB2 UDB's self-managing features align it well with Project eLiza's initiatives, which I cover here in detail.
The IBM i integration of the DB2 UDB engine eliminates two tasks associated with database products on other server platforms. Other server platforms require that the relational database software be installed on the server. Before the relational database software can be installed, the administrator must first verify that the database software version is compatible with the operating system version installed on the server. With IBM i, DB2 UDB is pre-loaded on the server eliminating the need for product installation. In addition, the DB2 UDB version loaded onto the IBM i always matches the OS/400.
Another configuration step that's not needed on DB2 for i is the allocation of storage for database objects such as tables and indexes. For instance, the Create Table SQL statement is all that is needed to create and allocate space for a table on the IBM i. Other database products requires storage (i.e., table space) to be allocated on a prior step before the Create Table statement can be executed. As data is added to the table and index objects, other databases need to be continually monitored to verify if additional storage needs to manually added to the pre-allocated storage space for a database object. In contrast, DB2 for i automatically allocates additional storage for database objects as it's needed without user intervention. DB2 for i goes one step further by also spreading the storage for a single database object across multiple disk drives - this automatic data spreading eliminates disk hot spots by spreading the load across all disk drives on the systems. Database administrators on other systems have to manually configure the database to get an even distribution of the data across the disk units.
The IBM i database communication servers that are needed for ODBC and JDBC middleware access are started automatically when TCP/IP is activated on the IBM i server - another database configuration step performed by the system itself.
DB2 for i also interrogates the national language setting (English, French, etc) of an IBM i server. Based on this language setting, DB2 UDB automatically adjusts to use the character set, date format, etc that is associated with that language.
As data in a database is changed and deleted the associated indexes are updated to reflect the data changes. Depending on the order that data is added and changed, the underlying data structure (typically based on a binary tree) can become unbalanced. On other systems, an administrator would manually re-balance the underlying data structure by reorganizing the system. On IBM i, the database engine is continually monitoring the underlying index structure and rebalancing the index structure internally when necessary. This allows your applications to continue running without having to take the system down for index reorganization.
Database healing is also needed after an abnormal system termination. Although system failures are not a common occurrence with the IBM i, DB2 for i is fully equipped to address the situation. When a crashed IBM i server is restarted, DB2 UDB will automatically restore the database to a consistent state during the IPL of the system.
To speed database recovery after an abnormal system failure, the IBM i also provides system-managed access path protection (SMAPP). Access paths (i.e., indexes and keyed logical files) need additional protection because if indexes are in flux and the corresponding changes are not written out to disk when the system crashes, then the entire index will have to rebuilt after the system is restarted. Recovery can be delayed substantially if there are just a couple indexes that need to be rebuilt over the largest tables (or physical files). Access paths can be journaled (logged) explicitly by an administrator to prevent the indexes from being rebuilt, but that doesn't fit the model of a self-managed system. With SMAPP activated, the database engine automatically will selectively start journalling only those access paths that it has determined to be at risk and that it determines are worth protecting. This risk determination occurs is based on a user-specified value (set with EDTRCYAP CL command) which identifies an upper limit regarding how much time they would be willing to spend on access path recovery phase of a system restart (IPL). To make this determination, the database engine continually monitors change activity for all indexes on the system. Those indexes that have a high degree of activity that are not yet journalled and are of sufficient size to justify the extra run-time overhead are then automatically protected by starting journaling on their behalf. The system "automatically" protects at-risk access paths instead of relying on a human to intervene and start journaling.
Journaling is the primary method that is used on the IBM i to ensure a safe and fast database recovery after a system crash. IBM i provides another self-healing feature, system-managed journal receivers, to avoid database processing delays and to reduce the administrative requirements of journaling. The journal receiver is the object that actually stores the logged database changes - over time the size of the journal receiver object will grow as more and more changes are made to the data. When the journal receiver approaches the maximum receiver size, than a new receiver object needs to be attached to the journal. If not, a hard error will stop the database processing on your system. The system-managed journal receiver option (MNGRCVR (*SYSTEM) on CRTJRN & CHGJRN commands) prevents this hard error from occurring by automatically attaching a new journal receiver once the existing journal receiver nears the maximum size limit. An administrator no longer has to intervene to check on the receiver size or to attach a new journal receiver, the system handles it.
The cost-based query optimizer of IBM i is a key component of the self-tuning aspects of the DB2 UDB engine. The integration of the IBM i query optimizer's costing algorithms allows it to automatically adjust to system resource changes. This allow new system resources such as additional processing power or additional memory to be fully utilized by the DB2 for i engine to achieve the fastest performance. For instance, if the IBM i capacity on demand capability has been used to enable additional processors on a server, the DB2 optimizer will immediately recognize the new processing power and attempt to fully utilize it.
To benefit from recent database changes (eg, new index available) or system changes (eg, more memory), other databases require a bind operation to be performed by the administrator to force the possible update of an access plan for an SQL statement. Bind operations are usually only required for programs with SQL embedded in a high-level language program. DB2 for i does not even include a bind utility or bind operation since it automatically recognizes these changes and updates the access plan when it's appropriate. This is sometimes referred to as late-binding or automatic binding which is performed on the IBM i for both embedded SQL and dynamic SQL (ODBC, JDBC, etc). DB2 UDB does analyze the changes with some intelligence prior to rebuilding any access plans, so that DB2 UDB is not continuously rebuilding access plans. For example, a memory pool grows in size from 1.2 GB to 1.4 GB is a resource change that will probably not make a significant improvement in the performance of an SQL statement, so in this case DB2 for i decides not to update any existing access plans.
When an SQL statement is executed on the IBM i, it requires both an access plan and an open data path (ODP). The access plan contains information on how the SQL statement will be implemented (eg, table scan) and the ODP is the pipe used to get the data in or out. The IBM i SQL engine automatically tries to improve performance of frequently executed SQL statements on the server by caching their access plans and open data paths. Instead of spending system resources to create an access or ODP, the cached objects are reused as much as possible. Open data paths are cached at the job (or connection) level. Access plans for dynamic SQL interfaces are also cached at the job (or connection) level as well as a system-wide level. The storage areas used for all this caching are automatically allocated and populated by DB2 for i. Most other databases require an administrator to carve out storage for the database to use for caching.
Statistics collection and maintenance is another task automated by DB2 for i to improve database performance. On other databases this task is not automatic and requires manual intervention to create and refresh statistics. Cost-based optimizers rely on index and table statistics to make educated decisions when trying to determine the fastest way to implement a query request. Cardinality (number of distinct values) is a statistic to used by the query optimizer to better understand the data. For instance, if an index exists over the state column the optimizer can use the cardinality statistic to determine how many state values are stored in your database. If all 50 states are represented in your data then that could bias the optimizer a certain way to search the column. A smaller number of state values like 5 could bias the optimizer to search the column in a different way. DB2 for i is unique in that it automatically updates all of the table and index statistics current as rows are being inserted, updated, and deleted. This automatic maintenance ensures that the query optimizer is working with the most current statistical values. Other databases rely on some type of statistics collection utility to refresh the statistic values. If an administrator forgets to perform statistics collection or the statistics collection is not performed often enough, then performance may suffer because the query optimizer will be making decisions based on old data. If the state column only contained 5 state values six months ago, but now contains 40 state values then the optimizer can easily be making the wrong implementation choice if it still thinks the state column only contains 5 distinct values. DB2 for i continual maintenance of statistics ensures that the query optimizer is operating with the latest statistics - without administrator assistance.
Earlier it was pointed out that DB2 for i is able to automatically react to changes system resources such as an additional CPU to improve performance. DB2 UDB also has the ability to dynamically adjust how it's using an existing system resource (memory) in an effort improve database performance. This self-tuning database feature is known as expert cache and activated by changing the paging option for a memory pool from *FIXED to *CALC (CHGSHRPOOL... PAGING(*CALC)). Once the expert cache option has been activated, the DB2 UDB engine monitors and analyzes the access of database objects. If it detects that every row in a table that is being read sequentially, it will increase the internal blocking size to bring larger portions of the table into memory in an effort to reduce the total number of disk operations. The DB2 UDB engine can also detect if a range of rows in the table is being accessed frequently. If that type of data access is detected, then DB2 UDB will keep the memory pages associated with those rows in memory longer. This action once again reduces the number of disk operations which usually results in improved performance. The only administrative requirement is to activate the expert cache - DB2 for i analyzes and adjusts the database I/O access patterns all by itself to tune the performance of your IBM i server.
DB2 UDB stands out from the rest
DB2 UDB distinguishes itself from other database servers with an architecture and design that allows it look after itself instead of relying on the glamour of fancy graphical interfaces. This article should give you a new appreciation for all of the work that DB2 for i does behind the scenes. Behind the scenes work that no one notices is a big part of IBM's Project eLiza. DB2 for i has a huge advantage in support of Project eLiza due to the underlying foundation of the IBM i. And DB2 for i will continue to add more self-configuring, self-healing, and self-tuning features in the future.
This article appeared in the July 2002 issues of IBM i NEWS (link resides outside of ibm.com)