eX5 Memory Advantage: How Add’l Capacity Can Benefit SQL Server DW Workloads
Up to 3TB of memory and enterprise-class availability features make eX5 ideal for DW
The IBM eX5 Memory Advantage: How Additional Memory Capacity on eX5 Can Benefit SQL Server Data Warehouse Workloads
By Roland Mueller
IBM Systems and Technology GroupExecutive Overview
A Data Warehouse (DW) is a type of database designed to archive data for the purposes of reporting and analysis. Data Warehouses are typically used as the back-end data storage for Decision Support Systems (DSS). A DW environment is different from an OLTP (online transaction processing) database in that it is often subjected to ad hoc queries rather than the predefined queries typical of the OLTP database. Queries run against DW databases often scan millions of rows, whereas typical OLTP queries generally scan a handful of records. The DSS I/O pattern is mostly large sequential reads, compared to small random reads with an OLTP workload.
This technical report documents a series of tests that demonstrate the benefits of using large amounts of server memory for a large-scale Decision Support System workload.
For this series of tests, a Microsoft® SQL Server® 2008 300GB data warehouse was used running on an IBM® System x3850 X5 32-core server connected via Fibre Channel to an IBM System Storage™ DS3400 storage controller.
IBM System x3850 X5
The x3850 X5 is built on eX5, the fifth generation of IBM X-Architecture® chipset technology. This new system incorporates the latest in hardware acceleration and scalability providing resource configurations that push the boundaries of x86/x64-based systems while providing industry-leading flexibility. With the release of new multi-core processors from Intel®, the 4-socket IBM System x3850 X5 now offers up to 32 cores or 64 logical processors with Hyper-Threading Technology and up to 1TB of memory per chassis. The memory is highly available, incorporating such IBM OnForever™ features as Memory ProteXion™, Chipkill™ error correction, memory rank sparing, memory scrubbing, and memory mirroring. Other high-availability features, such as hot-swap redundant fans and power supplies (with 220V AC input), and hot-swap HDDs and solid-state drives (SSDs) deliver high levels of reliability and availability to maximize up-time. These kinds of features provide the reliability and high availability needed for the most demanding virtualization requirements.
Compared to the previous-generation server, the x3850 X5 doubles the memory capacity to 64 DIMM slots/1TB of PC3-10600 double data rate 3 (DDR3) memory. (Using the optional 1U MAX5 memory expansion unit, the x3850 X5 has access to an additional 32 DIMM slots, for a total of 1.5TB of memory in only 5U.) This paper will demonstrate how leveraging the memory capabilities of the x3850 X5 can take the performance, efficiency and reliability of your Data Warehouse environment to the next level.
IBM System Storage DS3400
DSS workloads require large bulk reads, so it’s important to maximize throughput and minimize, or eliminate, potential bottlenecks that can occur in the storage system or the storage interconnects. The DS3400 was chosen as the appropriate storage platform because it met the requirements for large capacity, high throughput, and the capability to sustain a high IOPS (I/O operations per second) rate. The DS3400 features a 4Gbps Fibre Channel (FC) interface for either direct-attach storage (DAS) or SAN-ready solutions.
The combination of 12 SAS or SATA 3.5" drives per enclosure provides the option to scale to 5.4TB of storage capacity with 450GB hot-swappable SAS disks or 12.0TB with 1.0TB hot-swappable SATA disks in the first enclosure. The DS3400 is expandable by attaching up to three additional EXP3000 storage enclosures for a total of 21.6TB of storage capacity with 450GB SAS or up to 48.0TB with 1.0TB SATA.
For more information about the DS3400 System Storage controller, visit the following URL:
Microsoft SQL Server 2008
SQL Server 2008 running on Microsoft Windows® Server 2008 enables organizations to build comprehensive, enterprise-scale analytic solutions that deliver intelligence where customers want it.
SQL Server 2008 makes a major advance in Data Warehouse scalability. The database engine contains numerous enhancements designed to improve both absolute performance and multi-processor scaling on decision support workloads.
Decision support workloads are frequently very I/O intensive. The high I/O requirements of these tests were met using DS3400 storage systems connected to the IBM x3850 X5 server using multiple Fibre Channel inter-connects.
Figure 1 illustrates the topology of the Data Warehouse workload test environment. The x3850 X5 server was connected to seven DS3400 storage controllers using fourteen 8Gbps Fibre Channel interconnects (two FC links per controller).
Note: Although the host bus adapters (HBAs) are designed to operate at 8Gbps, they auto-negotiate to 4Gbps due to the 4Gbps FC interface of the DS3400
Figure 1. Test Configuration
The x3850 X5 incorporated four Intel Xeon® X7560 8-core processors (32 cores). All seven PCIe slots were populated with Emulex LPE 12002-M8 8Gb HBAs. The tests were performed with 64GB, 128GB, 256GB, and 512GB of memory.
Figure 2 illustrates the storage configuration. Four RAID-1 arrays were created on each storage controller. Two logical drives were created on each of the RAID-1 arrays. The 300GB test database as well as SQL Server’s TempDB were spread across the 56 logical drives. The SQL log file was on a separate array.
Figure 2. Storage Configuration showing 28 RAID-1 arrays, each with two logical drives
Looking at Graph 3, there is only a slight reduction in query response times for most queries when moving from 64GB to 128GB of server memory. Only queries 3 and 6 show a fair amount of improvement. However, when moving from 128GB of memory to 256GB or 512GB, we start seeing the benefits of the additional memory in terms of much reduced response time on all the queries.
The following queries were executed against the 300GB database:
Query 1 (Q1) - Shipping Priority Query
This query retrieves the shipping priority and potential revenue of the top 10 orders having the largest revenue among those that had not been shipped as of a given date.
Query 2 (Q2) - Local Supplier Volume Query
This query lists the revenue volume that resulted from line-item transactions in which the customer ordering parts and the supplier filling them were both within that nation for a particular year.
Query 3 (Q3) - National Market Share Query
This query determines how the market share of a given nation within a given region has changed over two years for a given part type.
Query 4 (Q4) - Returned Item Reporting Query
This query identifies the top 20 customers who might be having problems with the parts that are shipped to them for a given quarter.
Query 5 (Q5) - Small-Quantity-Order Revenue Query
This query determines how much average yearly revenue would be lost if orders were no longer filled for small quantities of certain parts. This may reduce overhead expenses by concentrating sales on larger shipments.
Query 6 (Q6) - Discounted Revenue Query
This query reports the gross discounted revenue attributed to the sale of selected parts that were shipped by air or delivered in person.
The following graph shows the comparison of query response times for each of the six queries
with 64GB, 128GB, 256GB, and 512GB of server memory.
Figure 3. Test results
Note: The following results use 64GB as the baseline performance.
Query 1 Results
When doubling the memory from 64GB to 128GB we see a slight improvement in query response time. However, when executing the query with 256GB or 512GB, we get a 57% reduction in query response time for both runs.
It appears that the dataset for this query was greater than 128GB and less than 256GB, which is why we see similar performance for the 64GB and 128GB runs and then again for the 256GB and 512GB runs. When executing the query with 256GB and 512GB, the entire dataset could fit into memory. In this case, no further reduction in query response time can be realized by increasing memory from 256GB to 512GB.
Query 2 Results
As in query 1, we see a slight improvement in query response time when doubling memory from 64GB to 128GB, which tells us the dataset was too large to fit into memory. When executing the query with 256GB, we get a 38% reduction in query response time. It appears much of the dataset was able to fit into memory when executing with 256GB. However, not everything fit into memory, because when executing with 512GB, we see a 71% reduction in query response time.
Query 3 Results
In query 3, we see a 26% reduction in query response time when moving from 64GB to 128GB. When we execute the query with 256GB and 512GB, we get 52% and 59% reductions, respectively.
Query 4 Results
Query 4 is almost identical to query 1. Query response time remains the same when executing with both 64GB and 128GB, and then drops 65% when memory is increased from 128GB to 256GB (no change with 512GB).
Query 5 Results
Query 5 displays very linear improvement when more memory is added. When executing the query with 128GB of memory, we see a 19% reduction in query response time. The 256GB run reduces the query response time by 56%. Finally when executed with 512GB of server memory, we see a 93% reduction in query response time.
It appears that the dataset was larger than 256GB but smaller than 512GB. The query was able to execute very quickly on the 512GB test run because the entire dataset resided in memory.
Query 6 Results
Query 6 tells a similar story to query 3. We see a 39% reduction in query response time when executing with 128GB of memory. When the query is executed with 256GB and 512GB, we see a 92% and a 95% reduction, respectively.
I/O is the slowest component in a Data Warehouse configuration. When you run a query against a database, the server needs to process data. Data is stored on the storage, and reads are issued to gather the dataset. On a system with a small amount of memory, current data in memory needs to be evicted to make room for the dataset. If the evicted data is needed later, it must be retrieved again and again.
On a system with a large amount of memory, more of the dataset can remain in memory, and future references to the dataset will be satisfied at very low latencies, reducing the query response times significantly. If achieving faster query response times is your goal, then one way to achieve that goal is to use enough memory to keep most or all of the dataset in memory. This will let you reduce the amount of storage needed for the same I/O bandwidth, which can offset the cost of adding memory. IBM offers multiple effective solutions with the 4-to-8-socket x3850 X5 that supports up to 1TB of memory natively, up to 1.5TB, using a MAX5 expansion unit, and up to 3TB using two chassis with two MAX5 units; or the 2-to-4-socket x3690 X5 server, which supports up to 512GB natively or up to 2TB of memory using two chassis and two MAX5 units; or the HX5 blade server, which supports up to 128GB natively or up to 640TB using two blade servers and two MAX5 units.
The results presented in this paper demonstrate the benefits of running your workload in these large memory configurations.
|For More Information
IBM System x Servers http://ibm.com/systems/x
IBM BladeCenter Server and options http://ibm.com/systems/bladecenter
IBM Systems Director Service and Support Manager http://ibm.com/support/electronic
IBM System x and BladeCenter Power Configurator http://ibm.com/systems/bladecenter/powerconfig
IBM Standalone Solutions Configuration Tool http://ibm.com/servers/eserver/xseries/library/configtools.html
IBM Configuration and Options Guide http://ibm.com/servers/eserver/xseries/cog
IBM ServerProven Program http://ibm.com/servers/eserver/serverproven/compat/us
IBM Technical Support http://ibm.com/server/support
Other IBM Technical Support Resources http://ibm.com/servers/eserver/techsupport.html
© IBM Corporation 2010
IBM Systems and Technology Group
3039 Cornwallis Road
Research Triangle Park, NC 27709
Produced in the USA
All rights reserved.
For a copy of applicable product warranties, write to: Warranty Information, P.O. Box 12195, RTP, NC 27709, Attn: Dept. JDJA/B203. IBM makes no representation or warranty regarding third-party products or services including those designated as ServerProven or ClusterProven. Telephone support may be subject to additional charges. For onsite labor, IBM will attempt to diagnose and resolve the problem remotely before sending a technician.
IBM, the IBM logo, ibm.com, Active Energy Manager, Active Memory, Chipkill, Memory ProteXion, OnForever, Predictive Failure Analysis, System Storage, System x, and X-Architecture are trademarks of IBM Corporation in the United States and/or other countries. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. For a list of additional IBM trademarks, please see http://ibm.com/legal/copytrade.shtml.
Intel, the Intel logo, and Xeon are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
Linux is a registered trademark of Linus Torvalds.
Microsoft, Windows, Windows Server, the Windows logo, and SQL Server are trademarks or registered trademarks of Microsoft Corporation.
Other company, product and service names may be trademarks or service marks of others.
|IBM reserves the right to change specifications or other product information without notice. References in this publication to IBM products or services do not imply that IBM intends to make them available in all countries in which IBM operates. IBM PROVIDES THIS PUBLICATION “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS
OR IMPLIED, INCLUDING THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. Some jurisdictions do not allow disclaimer of express or implied warranties in certain transactions; therefore, this statement may not apply to you.
This publication may contain links to third party sites that are not under the control of or maintained by IBM. Access to any such third party site is at the user's own risk and IBM is not responsible for the accuracy or reliability of any information, data, opinions, advice or statements made on these sites. IBM provides these links merely as a convenience and the inclusion of such links does not imply an endorsement.
Information in this presentation concerning non-IBM products was obtained from the suppliers of these products, published announcement material or other publicly available sources. IBM has not tested these products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
MB, GB and TB = 1,000,000, 1,000,000,000 and 1,000,000,000,000 bytes, respectively, when referring to storage capacity. Accessible capacity is less; up to 3GB is used in service partition. Actual storage capacity will vary based upon many factors and may be less than stated.
Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will depend on considerations such as the amount of multiprogramming in the user’s job stream, the I/O configuration, the storage configuration and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here.
Maximum internal hard disk and memory capacities may require the replacement of any standard hard drives and/or memory and the population of all hard disk bays and memory slots with the largest currently supported drives available. When referring to variable speed CD-ROMs, CD-Rs, CD-RWs and DVDs, actual playback speed will vary and is often less than the maximum possible.