Skip to main content

Business Intelligence

Indexing and statistics strategies for DB2 UDB for iSeries

Part 1: The Basics

Before describing indexing strategies and options for index creation, it is important to understand: what indexes are, what purposes they serve in DB2 UDB for iSeries, and their relationship to the query optimizer.

Database Indexes Defined
All relational database management systems (RDBMSs) have data structures called indexes. An index in a book allows you to quickly locate information on a specific topic without sequentially paging through the book. Database indexes provide similar benefits by providing a relatively quick method of locating data of interest. Without indexes, the database will probably be forced to perform a full sequential search or scan, accessing every row in the database table. Depending on the size of the tables and the complexity of the query, a full table scan can be a lengthy process, while also consuming a large amount of system resources.

Indexed scans are more efficient than full table scans since the index key values are usually shorter than the length of the database table row. Shorter entries means that more index entries can be stored in a single page. Indexing results in a considerable reduction in the total number of pages that must be processed (I/O requests) in order to locate the requested data. While indexed scans can improve performance, the complexity of the query and the data will determine how effectively the data access can be implemented. Different queries stress the database in unique ways and that is why different index types are needed to cope with ever-changing workloads of users. In addition to simply retrieving data more efficiently, indexes also assist in the ordering, grouping, and joining of data from different tables.

With DB2 UDB for iSeries, there are two kinds of persistent indexes: binary radix tree indexes, which have been available since AS/400 systems began shipping in 1988, and encoded vector indexes (EVIs), which became available in 1998 with OS/400 Version 4 Release 3. Both types of indexes are useful in improving performance for certain kinds of queries. This paper will explain the differences between the indexes and provide advice regarding when and how to use them.

Primary and Secondary Indexes — One and the Same
On platforms that rely on partitioning schemes, the database must distinguish between primary and secondary indexes. Primary indexes are those created with the partitioning key as the primary key. Secondary indexes are built over columns other than the partitioning key. On these platforms, primary indexes provide the majority of data retrieval.

Because of the integrated storage management and single level storage architecture on iSeries servers, there is no data partitioning on a single server; the data is automatically spread across all available disk units. One result is that all indexes are effectively primary indexes. In fact, there is no distinction between primary and secondary indexes. There is also no concept of a clustered index — where the table data is kept in the same physical order as the primary index key(s).

The net result of the integrated and automatic storage management system is that there is no need to consider the physical placement of tables or indexes on an iSeries server.

Binary Radix Tree Indexes
All commercially available RDBMSs use some form of binary tree index. A radix index is a multilevel, hybrid tree structure that allows a large number of key values to be stored efficiently while minimizing access times. A key compression algorithm assists in this process. The lowest level of the tree contains the leaf nodes, which house the address of the rows in the base table that are associated with the key value. The key value is used to quickly navigate to the leaf node with a few simple binary search tests.

fig 1

Thus, a single key value can be accessed quickly with a small number of tests. This quick access is pretty consistent across all key values in the index since the server keeps the depth of the index shallow and the index pages spread across multiple disk units.

The binary radix tree structure is very good for finding a small number of rows because it is able to find a given row with a minimal amount of processing. For example, using a binary radix index over a customer number column for a typical OLTP request, such as "find the outstanding orders for a single customer," will result in fast performance. An index created over the customer number field would be considered the perfect index for this type of query because it allows the database to zero in on the rows it needs and perform a minimal number of I/Os.

In business intelligence environments, database analysts do not always have the same level of predictability. Increasingly, users want ad-hoc access to the detail data underlying their data marts. They might, for example, run a report every week to look at sales data, then "drill down" for more information related to a particular problem area they found in the report. In this scenario, the database analysts cannot write all the queries in advance on behalf of the end users. Without knowing what queries will be run, it is impossible to build the perfect index.

Traditionally, the solution to this dilemma has been to either restrict ad-hoc query capability or define a set of indexes that cover most columns for most queries. With DB2 UDB for iSeries, the optimizer can intelligently use less-than-perfect indexes for many types of queries. But as the size of data warehouses grows into the terabyte range, less than perfect becomes less palatable.

Experts throughout the industry have recognized this less-than-perfect solution, and have developed new types of indexes that can be combined dynamically at run-time to cover a broader range of ad-hoc queries.

Bitmap Indexes — an Industry Solution for Ad-Hoc Queries
The need for newer index technologies has spawned the generation of a variety of similar solutions that can be collectively referred to as "bitmap indexes." The concept of a bitmap index is an array of distinct key values. For each value, the index stores a bitmap, where each bit represents a row in the table. If the bit is set on, then that row contains the specific key value.

fig 2

With this indexing scheme, bitmaps can be combined dynamically using Boolean arithmetic (ANDing / ORing) to identify only those rows that are required by the query. Unfortunately, this improved access comes with a price. In a Very Large Database (VLDB) environment, bitmap indexes can grow to ungainly sizes. In a one billion row table, for example, there will be one billion bits for each distinct value. If the table contains many distinct values, the bitmap index quickly becomes enormous. Usually, RDBMSs rely on some sort of compression algorithm to help alleviate this growth problem.

In addition, maintenance of very large bitmap indexes can be problematic. Every time the database is updated, the system must update each bitmap — a potentially tedious process if there are, say, thousands of unique values in a one billion row table. When adding a new distinct key value, an entire bitmap must be generated. These issues typically result in the database being used as "read only."

Encoded Vector Indexes — an IBM Solution for Bitmap Indexes
Realizing the limitations of bitmap indexes, IBM Research set out to find a better solution. The result is Encoded Vector Indexes (EVIs) — a new, patented indexing technology from IBM. DB2 UDB for iSeries is the first member of the IBM DB2 family to provide EVIs.

An EVI is a data structure that is stored as basically two components: the symbol table and the vector. The symbol table contains a distinct key list, along with statistical and descriptive information about each distinct key value in the index. The symbol table maps each distinct value to a unique code. The mapping of any distinct key value to a 1-, 2-, or 4-byte code provides a type of key compression. Any key value, of any length, can be represented by a small bytecode.

The other component, the vector, contains a bytecode value for each row in the table. This bytecode represents the actual key value found in the symbol table and the respective row in the database table. The bytecodes are in the same ordinal position in the vector, as the row it represents in the table. The vector does not contain any pointer or explicit references to the data in the table.

The optimizer can use the symbol table to obtain statistical information about the data and key values represented in the EVI. If the optimizer decides to use an EVI to process the local selection of the query, the database engine uses the vector to build a dynamic bitmap, which contains one bit for each row in the table. The bits in the bitmap are in the same ordinal position as the row it represents in the table. If the row satisfies the query, the bit is set on. If the row does not satisfy the query, the bit is set off. The database engine can also derive a list of relative row numbers (RRNs) from the EVI. These RRNs represent the rows that match the selection criteria, without the need for a bitmap.

fig 3

As with traditional bitmap indexes, the DB2 UDB dynamic bitmaps or RRN lists can be ANDed and ORed together to satisfy an ad-hoc query. For example, if a user wants to see sales data for a certain region during a specific time period, the database analyst can define an EVI over the Region column and the Quarter column of the database. When the query runs, the database engine will build dynamic bitmaps using the two EVIs and then AND the bitmaps together to produce a bitmap that represents all the local selection (bits turned on for only the relevant rows). This ANDing capability effectively utilizes more than one index to drastically reduce the number of rows that the database engine must retrieve and process.

Since EVIs were created primarily to support business intelligence and ad-hoc query environments, there are EVI creation and maintenance considerations, as well as recommendations — both of which will be covered in later sections.

How the Database Uses Indexes
OS/400 is an object-based operating system. Tables and indexes are objects. Like all objects, information about the object's structure, size, and attributes are contained within the table and index objects. In addition, tables and indexes contain statistical information about the number of distinct values in a column and the distribution of those values in the table. The DB2 UDB for iSeries optimizer uses this information to determine how to best access the requested data for a given query request.

For database designers and analysts coming from other platforms, this means that most of the information about tables and indexes is maintained in the object itself or derived in real time from the database objects. Unless specified explicitly by an analyst, indexes are maintained immediately, so the optimizer always has current information about the database.

Because the information that the optimizer needs is gathered automatically by the database engine, and maintained in the database objects themselves, administrators should never have to manually compile statistics for the optimizer. When the optimizer prepares an access plan for a query, it consults the objects themselves and gathers information about: the size of the object, the relevant indexes built over that object, and the columns that are required to run the query. Like other RDBMSs, the system saves its access plans for reuse whenever possible, but also has the ability to automatically change the access plan if the environment changes. This is sometimes referred to as "late binding."

Most importantly, the database engine can use indexes to identify and retrieve rows from a database table. As in any RDBMS, the optimizer can choose whether or not to use an index to identify and retrieve rows from a table. In general, the optimizer chooses the index that will efficiently narrow the number of rows matching the query selection, as well as for joining, grouping, and ordering operations. Put another way, the index is used to reduce the number of I/Os required to retrieve the data from disk, and to logically group and order the data.

Within Version 5 Release 2 of OS/400, DB2 UDB for iSeries incorporated enhancements to gather and maintain statistical information in new ways. Additional details about these enhancements can be found later in this paper (in Part 4: Statistics Manager, and in Part 5: Statistics Strategies for Performance Tuning).

Indexes and the Optimizer
In order to process a query, the database must build an access plan. Think of the access plan as a recipe, with a list of ingredients and methods for cooking. The optimizer is the component of the database that builds the recipe. The ingredients are the tables and indexes required for the query. The optimizer looks at the ingredients it has available for a given query, estimates which ones are the most cost-effective, and builds a set of instructions on how to use the ingredients. On an iSeries server, lower level database engine components do the cooking.

Since the iSeries optimizer uses cost-based optimization, the more information given about the rows and columns in the database, the better able the optimizer is at creating the best possible (least costly / fastest) access plan for the query. With the information from the indexes, the optimizer can make better choices about how to process the request (local selection, joins, grouping, and ordering).

The primary goal of the optimizer is to choose an implementation that quickly and efficiently eliminates the rows that are not interesting or required to satisfy the request. Normally, query optimization is concerned with trying to find the rows of interest. A proper indexing strategy will assist the optimizer and database engine with this task.

To understand indexing strategy, it is important to understand the science of query optimization and the possible implementation methods. With the limited scope of this paper, the implementation methods will only be covered at a high level, tying the use of indexes to the respective methods.

Here is an overview of the available implementation methods:

  · Selection
        · Table Scan
        · Table Probe*
        · Index Scan* (also known as index selection)
        · Index Probe* (also known as key row positioning)
  · Joining
        · Nested Loop Join via Index *
        · Nested Loop Join via Hashing (also known as hash join)
  · Grouping
        · Grouping via Index*
        · Grouping via Hashing
  · Ordering
        · Ordering via Index*
        · Ordering via Sort

* The method directly or indirectly relies on an index for implementation.

DB2 UDB for iSeries also supports parallelism when the optional OS/400 feature "DB2 Symmetric Multiprocessing" is installed. Parallelism is achieved via multiple tasks or threads that work on part, or the entirety, of the query request. Most, but not all, of the implementation methods are parallel-enabled.

Selection
The main job of an index is to reduce the number of I/Os that the database must perform. This is the first and most important aspect of query optimization. The sooner a row can be eliminated, the faster the request will be. In other words, the fewer number of rows the database engine has to process, the better, since I/O operations are usually the slowest element in the implementation.

For example, look at the following query that is asking for some customer information on orders (where the order was shipped on the first day of the month and the order amount is greater than 1,000):

 SELECT CUSTOMER_NAME, ORDERNUM, ORDERDATE, SHIPDATE, AMOUNT
	    FROM ORDER_TABLE
	    WHERE SHIPDATE IN ('2000-06-01', '2000-07-01', '2000-08-01')
	    AND AMOUNT > 1000

If the table is relatively small, it will not make much difference how the optimizer decides to process the query. The result will return quickly. However, if the table is large, choosing the appropriate access method becomes very important. If the number of rows that satisfy the query is small, it would be best to choose an access method that logically eliminates the rows that do not match, as quickly and efficiently as possible. This is where indexes are valuable.

To decide whether or not an index would help, it is important to have an estimate of the number of rows that satisfy this query. For example, if 90% of the rows satisfy this query, then the best way to access the rows is to perform a full table scan. But if only 1% of the rows satisfy the query, then a full table scan might be very inefficient, resource intensive, and ultimately slower. In this case, an index-based, keyed-access method would be most effective.

On an iSeries server, the optimizer estimates the number of rows that satisfy this query by looking at the query request, table attributes, and information in the indexes. Although the header information in each table contains information about the number of rows in the table, the header information will not tell the optimizer about the number of distinct values in an index or the number of rows that may contain a particular value.

Instead, indexes are used to derive this type of information. Both radix and encoded vector indexes contain information about the number of distinct values in a column and the distribution of values. DB2 UDB for iSeries is one of the few databases that can recognize data skew during optimization.

With radix indexes, the optimizer obtains cost information from the leftmost, contiguous keys in the index. In addition to knowing how many distinct values are in each column, indexes provide cross-column cardinality information. That is, the optimizer can look at the leftmost columns in an index and determine how many distinct permutations of the column values exist in table. To obtain this statistical information, the optimizer requests the database engine to run a "key estimate" of the number of rows (keys) that match the selection criteria. This estimate process is part of the query optimization and uses the index(es) to "count" a subset of the keys — thus providing the optimizer with a good idea of the selectivity of the query.

Since most databases do not have a way to accurately represent the cardinality of columns, the other optimizers may assume that the distinct values are equally distributed throughout the table. For example, if a table contains a column for STATE and the data reflects all 50 states in the United States, most optimizers assume that each state appears equally (the data distribution for any state is1/50th of the total). But as anyone familiar with the population distribution of the United States knows, it is very unlikely that the table will contain as many entries for North Dakota as it does for California.

However, in DB2 UDB for iSeries, an index built over STATE will give the optimizer information about how many rows satisfy 'North Dakota' and how many satisfy 'California.' And if the distributions vary widely, the optimizer may build different access plans based on the actual value of STATE specified in the query request.

The number of distinct values in a key column or composite key can be looked at by using the OS/400 command Display File Description (DSPFD) or using iSeries Navigator to view the properties of the index.

In some instances, the optimizer uses the index for optimization, but chooses to implement the query using a non-indexed method. For example, if the query is going to retrieve a high number of the rows, it may be faster to perform a full table scan. Remember that on an iSeries server, full table scans are highly efficient because of the independent I/O subsystems, parallel I/O technology, and very large memory system. The full table scan would be the fastest (least costly) method to access the requested rows.

Keep in mind, however, that even when a full table scan is the best access method for a given query, the optimizer makes that decision based in part, on what it learns from the indexes. Therefore, it is important to have a set of indexes defined for each table, regardless of whether the indexes are used for data retrieval.

Nested Loop Join via Index
When DB2 UDB for iSeries builds an access plan for a query that uses inner join to access rows from more than one table, it first gathers an estimate for how many rows will be retrieved from each individual table. It then chooses the best access method for each individual table, based on the cost of the various methods available. Based on those estimates and costs, the optimizer then runs through possible variations of the join order, or the sequence in which the tables in the query will be accessed. An access plan is then built that puts the tables in the most cost-effective join order. Unlike some databases, which can only process a join in the order the query is written, DB2 UDB for iSeries will evaluate the possible options and rewrite the query to ensure the best (least costly) join order is used. For other join types, such as left outer join and exception join, the join must be implemented in the order specified in the SQL request.

As expected, the optimizer needs information from the indexes in order to evaluate the join order. Join order is dependent on which tables will retrieve the most rows and the "fan-out" of the join. Join fan-out can be simply defined as the number of expected rows that match a given join value. The optimizer estimates the number of rows retrieved by looking at the indexes. Therefore, it is very important to build indexes over the join columns.

The most common method of join processing is called a nested loop join via index. It applies to queries where there are at least two tables to join together, as in the following example:

	SELECT A.COL1, B.COL2, B.COL3
		FROM TABLE1 A, TABLE2 B	
		WHERE A.FKEY1 = B.PKEY2

With nested loop join via index, a row is read from the first table or dial in the join using any access method (i.e., table scan, index probe, etc.). Then, a join key value is built up to probe into the index of the second table or dial. If a key value is found, then the row is read from the table and returned. The next matching key value is read from the index and the corresponding row is read from the table. This process continues until no matching keys are found in the index. The database engine then reads the next row from the first table or dial and starts the join process for the next key value. The nested loop join is not complete until all of the rows matching the local selection are processed from the first dial. It is important to understand the nested loop process, so that it can be as efficient as possible. Nested loop join via indexes can produce a lot of I/O if there are many matching values in the secondary dials, or high join fan-out.

fig 4

Nested loop join via index requires a radix index over the tables that are being joined (to the secondary dials). If an index does not exist for the join columns in the tables following the first table in the join order, the database might choose to build temporary indexes over these columns to complete a nested loop join. A common performance problem is that the index does not exist over the join columns, so the database must build temporary index(es) to process the query, lengthening query processing time, and requiring more system resources.

Starting in Version 4 Release 5, all nested loop joins via index are processed like index probes for local selection. In fact, when both the join and local selection predicates are present for a given join dial, the optimizer can use all of the columns to probe the radix index. This makes the join much more efficient since it narrows down the rows matching the selection and join criteria with a minimum of I/Os. This technique is called a "multi-key row positioning join." It is very important to have a radix index available that contains both the local selection column(s) and the join column(s) for a given table. If only the join column is present in the index, the database engine must probe the index for the join key value, then read the table and test the local selection values. If the data does not match the local selection, then the probe of the index and random read of the table is wasted.

Indexes are critical to this process because the database engine can use the index instead of reading the base table. In this way, the optimizer can position the portion of the index that contains the relevant keys. For example, if the previous query is modified to:

	SELECT A.COL1, B.COL2, B.COL3
	FROM TABLE1 A, TABLE2 B
	WHERE A.KEY = B.KEY
	AND A.COL1 = 'BLUE'
	AND B.COL2 = 123

Assume that the optimizer chooses to process TABLE2 first and TABLE1 second when implementing the join. If there is an index over TABLE1 with key columns COL1 and KEY, then the optimizer can use the index to locate only those values that contain both "BLUE" and the join key value. This improves performance considerably since it eliminates random reads of TABLE1 to process the local selection. An index over TABLE2 with key columns COL2 and KEY could provide the same advantage.

With V5R1 and earlier releases of OS/400, nested loop joins via index did not use parallelism during the processing of the join. Starting with V5R2, the SQL query engine does allow parallelism for nested loop joins.Symmetrical Multiprocessing (SMP) may be used to create a temporary index if required for a nested loop join. Nested loop joins via hashing can take advantage of parallelism, and do not require an index to perform the join. Joining via a hash table is another join implementation method that uses a hashing algorithm technique to "consolidate" join values together and to locate the data to be joined.

Grouping and Ordering
Other common functions within an SQL request are grouping and ordering. Using the SQL GROUP BY clause, queries will summarize or aggregate a set of rows together. In DB2 UDB for iSeries, the optimizer can use either an index or a hashing algorithm to perform grouping. The method that the optimizer picks is query and system dependent; the optimizer will make its selection based on the nature of the query, the data, and the system resources available.

When a query includes an ORDER BY clause, the database engine will order the result set based on the columns in the ORDER BY clause. In DB2 UDB for iSeries, the optimizer can use either an index or a sort. Therefore, indexes can be used for this function as well. Sometimes, the ORDER BY clause includes columns already used in the selection and grouping clauses, so the optimizer may take advantage of the "by key" processing used for other parts of the query request. The data is processed in order, so to speak.

For both grouping and ordering, the optimizer will cost the various methods available, based on the expected number of rows identified in the local selections and join. The optimizer estimates the number of unique groups based on the information that it finds in the indexes. In the absence of indexes, the optimizer uses a default number of groups and a default number of rows per group. As can be imagined, this estimate might be close, but if it is grossly inaccurate, the optimizer will choose an inefficient access plan. In working with large business intelligence applications where grouping to build aggregates is common, there may be millions of groups or millions of rows within a group. As the size of the database scales upward, it becomes even more important for the optimizer to be able to accurately estimate how many rows are involved in a given query operation. Indexes make that possible.

In general, grouping a large number of rows per group favors hash grouping. Grouping a small number of rows per group favors index grouping.

Another factor is the fact that the index grouping does not use SMP or parallelism to process the rows.

Using an index for grouping or ordering can affect the join order of the query. This is true when the grouping and/or ordering columns are from one table. That table tends to go first in the join order, allowing the database engine to read the row from the first dial in the join by key, thus allowing the grouping and/or ordering to occur naturally. This may not be the best plan for optimal performance for the entire query. One way to help the optimizer is to create radix indexes for the selection and join statistics, and another index for the selection and grouping or ordering statistics. While the database engine cannot use both indexes for implementation, the optimizer would then get a good idea of the selectivity of the query, the join fan out, and the grouping attributes.

Example:	SELECT A.COL1, A.COL2, SUM(B.COL4)
		FROM TABLE1 A, TABLE2 B
		WHERE A.KEY1 = B.KEY2
		AND A.COL3 = 'XYZ'
		GROUP BY A.COL1, A.COL2

Indexes:	CREATE INDEX TABLE1_JOIN_INDEX1 ON TABLE1
			(COL3, KEY1)
		CREATE INDEX TABLE1_GROUPING_INDEX1 ON TABLE1
			(COL3, COL1, COL2)
		CREATE INDEX TABLE2_JOIN_INDEX1 ON TABLE2
			(KEY2)

Another index grouping technique the optimizer can choose is "early exit" on MIN and MAX functions. This is really just another way to employ an index probe with multiple keys, and take advantage of the data ordering via the index. The requirement is to have all of the local selection keys represented in the primary portion of the index followed by the column that is used with the MIN or MAX function. For MAX, the key column should be descending order. By specifying the local selection column(s) as key(s), followed by the column with the MIN or MAX function, the database engine can effectively read the first composite key value that matches the local selection and the MIN or MAX condition. The database engine then moves on, or positions down, to the next value that matches the local selection. This "early exit" routine saves the database from reading and processing all of the rows that match the local selection, searching for the MIN or MAX value.

[Back | Next]