One aspect of the AS/400 that users have come to take for granted is that it doesn't require a traditional database administrator (DBA). Most other relational database systems require a dedicated manager who spends most of his or her time performing complex, low-level system tasks (e.g., allocating disk space for database objects, partitioning and balancing disk drives for optimal performance, re-balancing system index structures). This person must also run utilities to check the integrity of the database and keep database statistics up to date for the query optimizer. The AS/400 performs most of these types of functions automatically.
In recent years, however, AS/400 database features and functions have evolved at a dizzying pace. The increased functionality of DB2 Universal Database (DB2 UDB) for AS/400 requires a reconsideration of historical assumptions about the database. Although it's true that the AS/400 doesn't require a traditional DBA, its new database capabilities present AS/400 shops with a potential need for some type of dedicated database manager.
The Good Ol' Days
Traditionally, an application programmer played the DBA's role in many AS/400 shops. When simple data models were built into an application's design, programmers could easily create database objects as they implemented their applications (instead of passing off requests to a DBA). Due to the AS/400 database's ease of use, scalability, and basic set of features, one person could easily keep track of all the issues and new enhancements associated with the AS/400 database and application development.
All this changed with V3R1, when IBM re-branded the database as DB2 for AS/400 and introduced database constraints (e.g., referential integrity), triggers, and stored procedures. Once implemented, some of these new features — especially triggers and constraints — could have a systemwide impact. Database triggers, for example, give you flexibility for solving business problems by letting you have a program called automatically every time a program or user inserts, deletes, or updates a row in one of your database files. But who determines whether you can afford the overhead of an external call to that trigger program every time the database file is touched? And if some overnight database batch jobs can't afford this extra overhead, who will remove the triggers before those jobs run and reestablish them after they finish?
Many AS/400 programmers can understand and keep track of new database features, but can they do so and still keep up with AS/400 application development? Since V3R1, when IBM introduced ILE to help programmers better modularize their applications, AS/400 application developers have had more choices than simply RPG, COBOL, and DDS. Today, a typical developer must Web-enable existing solutions, investigate Java and Domino on the AS/400, and keep up with the latest tools and methodologies. With so much to do, few AS/400 programmers have time to master DB2 UDB. As the database becomes more independent of applications, IT shops must concentrate database knowledge in one or two people instead of scattering it across a host of application developers and system administrators.
The AS/400 DBA Skill Set
As a DBA, your required database skills vary depending on the types of solutions you use. But, in general, an AS/400 DBA needs some mix of the abilities described below.
I explain these skills from an SQL perspective because SQL is the industry standard for database access and the AS/400's strategic interface for DB2. However, most of my points also apply to the AS/400's native database interface.
Logical and physical data modeling and database design means creating an efficient physical database design from a logical database model and application specifications. Before you code the database definitions, you must design the database properly so applications and solutions can operate on it. Usually, you start by forming a logical data model that covers the entire enterprise. Then, you use this model to determine where new business data fits and to help modelers design a database where different applications can work together. A fundamental part of the data model is the identification of domains (data types); but in the logical model, domains shouldn't be concerned with physical storage properties such as length. Instead, range, precision, and units are more appropriate.
After you normalize the logical database design to tighten it and eliminate redundant data, you must transform it into the physical database design. Here, a DBA must be familiar with how data will be accessed and understand the architecture of the database on which the data will reside. Without getting down to bits and bytes, it's important to understand (at least conceptually) how different data types are implemented on the AS/400. For example, the implementation of variable-length character columns (e.g., VARCHAR) varies widely across databases and has a direct impact on storage and performance. Many application providers new to the AS/400 use VARCHAR columns widely on other databases, but to achieve real space savings with VARCHAR columns in DB2 UDB for AS/400, you should use them only for long descriptions and memo fields. Using VARCHAR for every character-based column adds unnecessary overhead to your database.
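As a sketch of this principle, a hypothetical customer table (all names and lengths here are illustrative, not from the article) might keep short, always-filled columns as fixed-length CHAR and reserve VARCHAR for the one long, often-empty column:

```sql
-- Illustrative only: fixed-length CHAR for short, predictable columns;
-- VARCHAR reserved for the long memo column.
CREATE TABLE APPLIB.CUSTOMER (
  CUSTNO  INTEGER       NOT NULL PRIMARY KEY,
  NAME    CHAR(40)      NOT NULL,
  STATE   CHAR(2)       NOT NULL,
  MEMO    VARCHAR(2000) ALLOCATE(0)
)
```

The ALLOCATE clause (a DB2 UDB for AS/400 extension) controls how many bytes of the variable-length column are stored in the fixed portion of the row; ALLOCATE(0) keeps the rarely used memo data entirely in the overflow area.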
A DBA with data-modeling expertise can improve the basis (i.e., the schema) upon which all applications ultimately rest. Application developers can then simply be "consumers" of subschemas for the data they need to store and access.
Database creation, maintenance, and management means transforming the physical database design into the physical database objects and then managing those objects. As I mentioned earlier, AS/400 DBAs don't need to perform many mundane, low-level tasks because DB2 UDB takes care of them automatically. However, someone must manage the database definition source and verify that all the database objects are created successfully with the proper attributes.
Database objects are no longer limited to tables and indexes; many applications now rely on other database objects such as triggers, constraints, stored procedures, user-defined types, and user-defined functions. All these objects are related, so there are dependencies on the order in which you create them (e.g., you need to create a table before creating a constraint or trigger for it) and the library (or collection) where you create them. You must keep these dependencies and relationships intact as you move applications from development into production. For example, does your production-level stored procedure access the production-level database tables, or is it still working (erroneously) on the development version? Someone must control and verify this process.
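The ordering dependencies above can be sketched with two hypothetical tables (names are illustrative): the parent table and its primary key must exist before a dependent table's referential constraint can name them.

```sql
-- Parent table first, with its primary key in place...
CREATE TABLE APPLIB.ORDERS (
  ORDERNO INTEGER NOT NULL,
  CUSTNO  INTEGER NOT NULL,
  CONSTRAINT ORDERS_PK PRIMARY KEY (ORDERNO)
);

-- ...then the dependent table, whose foreign key references the parent.
CREATE TABLE APPLIB.ORDERLINE (
  ORDERNO INTEGER NOT NULL,
  LINENO  INTEGER NOT NULL,
  CONSTRAINT ORDLINE_FK FOREIGN KEY (ORDERNO)
    REFERENCES APPLIB.ORDERS (ORDERNO)
)
```

Run these statements in the wrong order, or in the wrong library, and the constraint creation fails — which is exactly the kind of process someone must control and verify.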
Because application designs and requirements constantly change, you should dedicate someone to managing changes to database objects after they're created. This maintenance might involve adding, removing, or changing a column for a table and then determining which application programs (if any) must be changed to accommodate the database change. You also need someone to verify that related database objects such as triggers, constraints, and indexes are still intact after database modifications.
Metadata management means collecting, storing, managing, and querying data about data in the database. The core metadata for DB2 UDB is stored in the system-maintained database catalogs, which keep information about all database objects on the AS/400. This metadata is essential for helping someone manage dependencies between database objects by running reports to determine what constraints or indexes are defined over a table. You can also reference the catalogs to determine the data type or length for a particular column. A data warehouse environment requires even more metadata management because information about business data is key to making data easier for end users to find and understand.
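For example, the catalog views in library QSYS2 can answer dependency questions directly (the table and column names queried here are hypothetical):

```sql
-- Which constraints are defined over a given table?
SELECT CONSTRAINT_NAME, CONSTRAINT_TYPE
  FROM QSYS2.SYSCST
 WHERE TABLE_SCHEMA = 'APPLIB' AND TABLE_NAME = 'ORDERS';

-- What is the data type and length of a particular column?
SELECT DATA_TYPE, LENGTH
  FROM QSYS2.SYSCOLUMNS
 WHERE TABLE_NAME = 'ORDERS' AND COLUMN_NAME = 'ORDERNO'
```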
Database performance monitoring, tuning, and management means monitoring database performance and improving it by making changes to the database, application logic, or system configuration. To manage performance, an AS/400 DBA must understand the database engine and query optimizer and know about available tools.
The database engine's internal workings and nuances are critical to performance. For example, creating open data paths (ODPs) with DB2 UDB is very expensive, so an AS/400 DBA must constantly look out for nonreusable ODPs that result in the unnecessary and repetitive creation of an ODP. In V4R4, IBM added the query performance tuner (QAQQINI configuration file), which provides an interface for controlling options that impact the performance behavior of database requests. DB2 UDB for AS/400 does a pretty good job of optimization, but the more you learn about the query optimizer, the better you can tune performance. A DBA must understand all the access methods (e.g., full table scan) the query optimizer can employ and the factors (e.g., memory size, data skew, parallel processing degree) that cause the optimizer to choose one method over another.
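The QAQQINI tuner is just a physical file of option/value pairs, so a DBA can adjust it with ordinary SQL. A hedged sketch (the library name MYLIB is hypothetical):

```sql
-- Each QAQQINI row pairs an option name (QQPARM) with a value (QQVAL).
-- Here, the degree of parallel processing is left to the optimizer.
UPDATE MYLIB.QAQQINI
   SET QQVAL = '*OPTIMIZE'
 WHERE QQPARM = 'PARALLEL_DEGREE'
```

A job is then pointed at that library's copy of the file (e.g., via the CHGQRYA command's QRYOPTLIB parameter) so the options take effect for its queries.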
IBM packages several useful tools with DB2 UDB for AS/400: two database performance monitors (the STRDBMON, or Start Database Monitor, command and the Operations Navigator SQL Performance Monitor), the explain utility (the PRTSQLINF, or Print SQL Information, command), and the optimizer debug messages (the STRDBG, or Start Debug, command). There are also more advanced database performance tools from third parties, such as Database Essentials from Centerfield Technology. All these tools provide information about how a database request was implemented and why, including information about temporary indexes and join order.
Indexing strategy and management means creating and managing indexes for high performance. If your database doesn't have the proper indexes created, query performance can suffer because the AS/400 query optimizer is very dependent on statistics in indexes. Indexes can, for example, tell the optimizer how many distinct values a column contains (i.e., data skew or cardinality) and approximately how many rows a certain search value will return. You want the query optimizer to have enough information to make intelligent decisions about query implementations.
Due to the resources needed to maintain indexes, creating too many of them can cause performance problems. Index maintenance slows down inserts and updates for an index's underlying table, so you must consider maintenance each time you consider creating an index (rather than using an existing one) for query tuning. A DBA must control and track index creation and know when and how to use parallel processing when creating indexes over large tables or maintaining indexes during bulk data loads.
Furthermore, a DBA may now choose between multiple types of database indexing technology. In V4R3, IBM introduced encoded vector indexes (EVIs) to complement the traditional AS/400 indexing technology, which is based on a binary radix tree. EVIs can improve query and report performance, so you should create EVIs over columns with a relatively low number of distinct values. However, EVIs are meant to be complementary; it's not a good idea to simply replace all your existing indexes (and keyed logical files) with EVIs. Instead, a DBA who understands how the query optimizer can use EVIs should determine when to use them rather than traditional AS/400 indexes.
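The two index technologies are even created with different SQL statements. As an illustrative sketch (table and column names are hypothetical), a traditional binary radix index suits selective, ordered access, while an EVI suits a low-cardinality column:

```sql
-- Traditional binary radix index for selective lookups and ordering
CREATE INDEX APPLIB.ORD_CUST_IX
  ON APPLIB.ORDERS (CUSTNO);

-- Encoded vector index over a low-cardinality column; the DISTINCT VALUES
-- clause helps DB2 size the vector's codes appropriately.
CREATE ENCODED VECTOR INDEX APPLIB.ORD_STATE_EVI
  ON APPLIB.ORDERS (STATE)
  WITH 50 DISTINCT VALUES
```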
To be most effective with database performance and index management, I strongly recommend that the DBA attend the "DB2 for i SQL and Query Performance Tuning and Monitoring Workshop" offered by IBM Learning Services (course number S6140).
SQL code reviews means reviewing SQL access and host-language programs and recommending changes for optimization before those programs go live. By tuning poorly written SQL immediately, a DBA can, for example, identify code that will cause ODPs to be non-reusable or queries that may require new indexes to be created. SQL code reviews are proactive performance tuning; they let you transfer a DBA's knowledge of the database engine and optimizer into the development process to nip performance problems in the bud.
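A common finding in such a review is dynamic SQL built with literal values. A hedged sketch (the table name is hypothetical):

```sql
-- Gluing a literal into the statement text makes each value a brand-new
-- statement, which tends to force repeated ODP creation:
SELECT * FROM APPLIB.ORDERS WHERE CUSTNO = 1234;

-- A parameter marker lets the statement be prepared once and executed
-- many times with different values, so the ODP can be reused:
SELECT * FROM APPLIB.ORDERS WHERE CUSTNO = ?
```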
Database security and integrity means designing and implementing a database so that only accurate data is added and only authorized users have access to the data. The DBA should work with the security officer or administrator to develop a security architecture. Then, the security administrator grants and revokes authorities and performs other tasks to control access to database objects.
The increasing popularity of interfaces such as Java Database Connectivity (JDBC) and ODBC, which make it easy for PC clients to access your AS/400 databases, underscores the need for object-level database security. Someone in your shop should be dedicated to managing the security requirements of new database objects such as triggers, stored procedures, and user-defined functions. Although many legacy AS/400 applications use menu-based controls (instead of object security) to control database access, the DBA must specify stored procedures (and, where necessary, appropriate adopted authority) to implement secure access not definable by views or column-level security.
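At the object level, the building blocks are SQL GRANT and REVOKE statements, which map to AS/400 object authorities. A minimal sketch, with hypothetical object and user names:

```sql
-- Remove direct access to the sensitive base table...
REVOKE ALL ON APPLIB.PAYROLL FROM PUBLIC;

-- ...then expose only the columns and rows users actually need,
-- through a view (or a stored procedure that adopts authority).
GRANT SELECT ON APPLIB.PAYSUMMARY TO SALESGRP
```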
Once data is secure, someone must keep it accurate and ensure that new data being added is valid. With commitment control, you can determine whether users can see business transactions in process or completed transactions only. For example, if you transfer $100 from savings to checking, your bank's update program may use commitment control levels to lock the $100 until the transaction is completed. If the bank loses power before your checking account is credited, commitment control could automatically "roll back" your $100 savings withdrawal. This is a simple example, but be aware that a DBA must often determine a level of commitment control that satisfies both the concurrent data-access and performance requirements of an application.
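The bank-transfer example looks like this in SQL run under commitment control (table and account identifiers are hypothetical):

```sql
-- Both updates form one unit of work: they commit together, or a
-- failure before COMMIT rolls both back, so the $100 is never lost.
UPDATE BANK.ACCOUNT SET BALANCE = BALANCE - 100 WHERE ACCTNO = 'SAV-1';
UPDATE BANK.ACCOUNT SET BALANCE = BALANCE + 100 WHERE ACCTNO = 'CHK-1';
COMMIT
```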
Another database feature you can use to improve your data's integrity is database constraints, which let you integrate some of your business rules (e.g., each order must have a unique order number) and enforce them on every system interface that can change your data. Application-enforced integrity, by contrast, is bypassed any time that the application isn't used to change the data (e.g., Data File Utility, ODBC). DB2 UDB for AS/400 offers four types of constraints: primary key, unique, referential integrity, and check constraints. The DBA must implement the constraints, clean up any data-integrity issues they detect, and ensure that they're always in a valid state.
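For instance, a check constraint and a referential constraint on a hypothetical order table (names are illustrative) would be added like this, and each then guards every interface that touches the table:

```sql
-- Business rule: an order line's quantity must be positive.
ALTER TABLE APPLIB.ORDERS
  ADD CONSTRAINT ORD_QTY_CHK CHECK (QUANTITY > 0);

-- Business rule: every order must belong to an existing customer.
ALTER TABLE APPLIB.ORDERS
  ADD CONSTRAINT ORD_CUST_FK FOREIGN KEY (CUSTNO)
    REFERENCES APPLIB.CUSTOMER (CUSTNO)
```

If existing rows violate a new constraint, the constraint is flagged as being in a check-pending state, and the DBA must clean up the offending data before the rule is fully enforced.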
Many AS/400 databases were designed to be updated by a single set of application programs. For such databases, the application performed all integrity checks (e.g., Is a state abbreviation valid?). As systems evolve, however, databases are being accessed through many new mechanisms: ODBC, Java programs, Web pages, and more. Thus, it's important to migrate as many RPG integrity checks as possible into the database itself via constraints and referential integrity. Note that in cases where multiple applications can update the same data, all the applications must explicitly enforce integrity checks: a high-maintenance, error-prone approach that still may not cover all interfaces.
Database backup and recovery means implementing a database backup and recovery strategy based on data volatility and application availability requirements to protect your business from unexpected hardware and software failures. The AS/400 database has extensive backup and recovery facilities, including object-level backups, online backups, journaling (logging), and point-in-time recovery. The DBA must design a process that uses the right combination of tools to back up and recover your database.
Businesses that can live with a couple of hours of downtime can simply use the normal save/restore process. But companies with mission-critical applications require a clustered, high-availability solution that relies on journaling of database objects. Once the design is in place, the DBA helps implement and maintain the backup process, ensuring that all database objects are backed up. This verification process used to be straightforward because you only had to be concerned about the tables (physical and logical files), indexes (access paths), journals, and journal receivers. But now, a DBA must also make sure database objects such as triggers and stored procedures are backed up.
V4R4 further complicates backup and recovery with more objects (e.g., user-defined types and functions) and features such as datalinks, which let you logically link database tables to complex data forms (e.g., digital images) possibly stored in the AS/400 integrated file system (IFS). If a table has a datalink column, the DBA must back up the table and all related objects in the IFS to have a complete backup of that database table. When journaling is needed for a database object, the DBA must verify that the object is being journaled and manage the associated journal receivers. The DBA also must periodically test the database recovery process to verify that the proper controls and backups are in place.
General systems management and networking skills means controlling and configuring system resources that affect database access and database performance. Usually, the system administrator is responsible for systems and network management configurations. The DBA, however, must understand how these configurations can affect the database engine and be able to discuss these issues with the system administrator.
Memory pools, for example, affect database performance directly, so a DBA must be knowledgeable about AS/400 work management to set up these pools properly. I recall one AS/400 shop that tried to solve its ODBC performance problems by adding 4 GB of memory. This additional memory should have paid performance dividends immediately, but because the system manager didn't understand AS/400 work management, he allocated the new memory to the wrong pool.
New database interfaces such as Distributed Relational Database Architecture (DRDA) and ODBC require a DBA who can perform basic networking tasks. What if a user's PC database access stops functioning? The DBA must at least be able to check the state of the database-related host communication servers, and he or she might be responsible for starting those servers. In addition, these new interfaces require configuration of data sources and relational database directory entries to function. Some configuration options tie directly to database performance, so a DBA is the right person to handle these tasks.
Finding the Right Mix
It's easy to see that a wide variety of AS/400 database issues and functions require attention. Yes, companies that have an AS/400 running their business from a backroom can still say, "No DBA required." Such companies don't need to hire someone to make sure their database objects have enough storage or run tools to check the integrity of database structures.
However, AS/400 shops that are developing their own applications, deploying a data warehouse or business intelligence solution, or closely supporting a data-centric packaged solution (e.g., an ERP solution) must address the DBA role in some way. Each company's mix of applications and data is unique, so each company will require a different assortment of the skills I outlined here. Having a DBA with the right mix of these skills for your organization is the key to getting a maximum return from DB2 UDB for AS/400.
This information first appeared in the May 2000 issue of NEWS/400.
Kent Milligan is a DB2 UDB Technology Specialist in PartnerWorld for Developers, AS/400. Kent spent the first eight years of his IBM career working on database enhancements, including parallel database technologies and database constraints, as a member of the DB2 UDB development team in Rochester. He is a regular author and frequent speaker on AS/400 relational database topics.