System Level Testing Evolution and Resulting Test Methodology Used to Verify CHARM Functionality on POWER Systems

Enterprise computing environments require constant availability. Time lost to unavailable compute resources can cost financial institutions millions of dollars [1,2]. Lost time can come from unscheduled outages, which originate from influences outside the datacenter and from failing hardware components or software problems within it. Scheduled outages, however, are the larger contributor to enterprise system unavailability and lost time, and routine maintenance for system upgrades, code updates, or repairs is the biggest contributor to scheduled outages [3,4]. Generally, when an unscheduled outage occurs, one or more scheduled outages are then needed to repair the system. Upgrading system hardware capacity can require scheduled outages as well. Systems such as the IBM Power Systems [5] family of servers have robust availability features designed to address both unplanned and planned outages within customers' enterprise computing environments. One of the key maintenance-related feature sets of IBM Power Systems servers is known as CEC Hot Add Repair and Maintenance (CHARM) [6].

Given that enterprise servers rarely experience significant lulls in their utilization, adding capacity to those systems while they are running is very valuable to the customer. However, adding physical processor, memory, or I/O hardware while the machine is running presents a challenge. IBM's POWER enterprise-class servers meet this challenge with CHARM. Additional nodes of compute resources can be added to the machine, and those resources can be dynamically configured for use. Thus, hot node add, hot memory upgrade, and hot I/O drawer add enable Power Systems servers to avoid scheduled outages for capacity upgrades. The second area of server unavailability addressed by CHARM is the repair of hardware in the rare instance of a hardware failure. When hardware in an enterprise computer system fails, the system automatically restarts with the failing components logically isolated from the rest of the system. This allows customers to continue operations and defer the maintenance until a more convenient time. Using the capabilities of hot repair, IBM service personnel can replace the failing hardware while the server is running. The repaired hardware can then be dynamically returned to use by the customer's applications. Hot repair thus allows critical components within a Power Systems server to be repaired with the least impact on the customer's compute availability.

