
 

Software

  
IBM offers a complete portfolio of cluster software to help organizations build, manage and expand cluster environments efficiently using IBM System p™ servers running AIX® or Linux®, IBM System x™ servers running Linux, or a combination of the two. Cluster-ready software from IBM enables collections of IBM servers to behave like a single high-performance system for end users and system administrators.

System management
xCAT
IBM offers the Extreme Cluster Administration Toolkit (xCAT), a scalable distributed computing management and provisioning tool that provides a unified interface for hardware control, discovery, and OS diskful/diskfree deployment. This robust toolkit can be used for the deployment and administration of Linux clusters. Its features are based on user requirements, and take advantage of IBM System x hardware.

xCAT makes simple clusters easy and complex clusters possible, by making it easy to:
  • install a Linux cluster with utilities for installing many machines in parallel
  • manage a Linux cluster with tools for management and parallel operation
  • set up a high-performance computing software stack, including software for batch job submission, parallel libraries, and other software that is useful on a cluster
  • create and manage diskless clusters

xCAT works well with the following cluster types:
  • HPC (High Performance Computing): physics, seismic, computational fluid dynamics, finite element analysis, weather, and other simulations, as well as bioinformatics work
  • HS (Horizontal Scaling): Web farms and similar workloads
  • Administrative: Not a traditional cluster, but a very convenient platform for installing and administering a number of Linux machines
  • Operating Systems: With xCAT's cloning and imaging support, it can be used to rapidly deploy and conveniently manage clusters with compute nodes that run Windows or other operating systems
  • Other: xCAT's modular tool kit approach makes it easily adjustable for building any type of cluster

Moab Cluster Software
New to the Cluster 1350 portfolio is Moab software (Moab Adaptive HPC Suite and Moab Adaptive Computing Suite), intelligent workload management software offered by Cluster Resources, Inc.

Moab provides Web-based workload management, graphical workload and policy administration, and management reporting tools. Organizations will benefit from the ability to provide guaranteed service levels to users and organizations, higher resource utilization rates, and the ability to get more workloads processed with the same resources, resulting in an improved return on investment.

Moab dynamically manages HPC and data center resource pools to meet specific workload needs. Moab automates the behind-the-scenes virtualization and provisioning technologies and features flexible management policies to help ensure that specific user, group, and workload needs are met. Moab can also track resource and energy usage for capacity planning and cost sharing.

Workload-driven Adaptive Infrastructure
Powered by Moab and xCAT
By combining Moab software with xCAT, IBM delivers a robust management solution for HPC and data center computing that is designed to meet clients’ dynamic infrastructure needs.

The powerful and affordable Moab-xCAT solution includes a specific solution offering for HPC environments and another for data centers. Both solutions also offer workload-driven adaptive cloud infrastructure capability.

Moab Adaptive HPC Suite
Serial and Parallel Batch Workloads
Adaptive HPC is the evolution of HPC management that intelligently adapts your cluster to your organization’s changing needs and resource conditions. The adaptive HPC infrastructure:

  • Orchestrates virtualization or stateless provisioning to dynamically adapt the application and OS environment to changing workload needs, and allows users to submit Linux, UNIX®, or Windows HPC workloads to a unified hybrid-OS system
  • Intelligently adapts to surges in HPC workloads and reallocates workloads to avoid and recover from resource failures
  • Optimally consolidates workloads to increase productivity by as much as 30%
  • Automatically powers down idle nodes and applies thermal balancing to reduce energy costs by 10 to 50%
  • Applies rich policies to ensure QoS and SLA delivery to the most important users, groups, and projects
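
The consolidation and power-down behavior in the list above can be pictured with a small sketch. This is not Moab code and does not use any Moab interface; the function name `consolidate`, the uniform node capacity, and the first-fit-decreasing packing rule are all assumptions made for illustration.

```python
from __future__ import annotations


def consolidate(
    loads: dict[str, int], capacity: int, node_names: list[str]
) -> tuple[dict[str, list[str]], list[str]]:
    """Pack workloads onto as few nodes as possible (first-fit decreasing).

    Hypothetical helper: returns the placement per node and the list of
    nodes left empty, which a real system could then power down.
    """
    placement: dict[str, list[str]] = {name: [] for name in node_names}
    free = {name: capacity for name in node_names}
    # Place the largest workloads first so small ones can fill the gaps.
    for workload, demand in sorted(loads.items(), key=lambda kv: kv[1], reverse=True):
        for name in node_names:
            if free[name] >= demand:
                placement[name].append(workload)
                free[name] -= demand
                break
        else:
            raise ValueError(f"no node can host {workload!r}")
    idle = [name for name in node_names if not placement[name]]
    return placement, idle


placement, idle = consolidate(
    {"a": 6, "b": 3, "c": 4, "d": 2},
    capacity=8,
    node_names=["n1", "n2", "n3", "n4"],
)
print(placement)  # workloads packed onto n1 and n2
print(idle)       # n3 and n4 carry no work and are candidates for power-down
```

In this toy run four workloads fit on two of the four nodes, which leaves the other two idle. A production scheduler would also weigh policies, service levels, and thermal data, which this sketch ignores.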

Moab Adaptive Computing Suite (Data Center)
Web, Service, and Transactional Workloads
The Adaptive Data Center allows an organization to consolidate and virtualize its data center to a single shared system. This shared adaptive infrastructure:

  • Intelligently adapts to surges in Web, service and transactional workloads
  • Monitors application, user, project and organizational service levels and dynamically repurposes the computing environment to meet these requirements
  • Provides an advanced workflow engine for business-process automation
  • Reallocates workloads to avoid and recover from resource failures
  • Boosts data center productivity by as much as 300% by eliminating costly and inefficient application, project and resource silos
  • Automatically powers down idle nodes and applies thermal balancing to reduce power costs by 10 to 50%

Parallel System Support Programs (PSSP) for AIX
PSSP is the systems management predecessor to Cluster Systems Management (CSM) and does not support IBM System p servers or AIX 5L™ V5.3 or above. New cluster deployments should use CSM, and existing PSSP clients with software maintenance will be transitioned to CSM at no charge.


Parallel file system
General Parallel File System (GPFS)
GPFS™ is a high-performance cluster file system for AIX, Linux and mixed clusters that provides users with shared access to files spanning multiple disk drives. By dividing individual files into blocks and reading/writing these blocks in parallel across multiple disks, GPFS provides very high bandwidth; in fact, GPFS has won awards and set world records for performance. In addition, GPFS's multiple data paths are designed to eliminate single points of failure, making GPFS extremely reliable. GPFS currently powers many of the world's largest scientific supercomputers and is increasingly used in commercial applications requiring high-speed access to large volumes of data such as digital media, engineering design, business intelligence, financial analysis and geographic information systems. GPFS is based on a shared disk model, providing lower overhead access to disks not directly attached to the application nodes, and using a distributed protocol to provide data coherence for access from any node.
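
The block-striping idea described above can be illustrated with a minimal sketch. This is not GPFS code or its API: the function names, the tiny block size, and the round-robin placement are assumptions chosen for illustration, and in-memory lists stand in for disks.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4  # bytes; hypothetical and tiny so the example is easy to follow


def stripe(data: bytes, num_disks: int) -> list[list[bytes]]:
    """Split data into fixed-size blocks and place them round-robin on disks."""
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), BLOCK_SIZE):
        block_index = offset // BLOCK_SIZE
        disks[block_index % num_disks].append(data[offset:offset + BLOCK_SIZE])
    return disks


def read_striped(disks: list[list[bytes]]) -> bytes:
    """Read every simulated disk concurrently, then restore file order."""
    with ThreadPoolExecutor(max_workers=len(disks)) as pool:
        # Each worker copies the block list of one simulated disk.
        per_disk = list(pool.map(list, disks))
    total_blocks = sum(len(blocks) for blocks in per_disk)
    num_disks = len(per_disk)
    # Block i lives on disk i % num_disks at position i // num_disks.
    ordered = [per_disk[i % num_disks][i // num_disks] for i in range(total_blocks)]
    return b"".join(ordered)


payload = b"cluster file system data"
disks = stripe(payload, num_disks=3)
restored = read_striped(disks)
print(restored == payload)
```

Because consecutive blocks sit on different disks, a large sequential read can keep all disks busy at once, which is the source of the bandwidth gain the paragraph describes. Data coherence between nodes and redundant data paths are outside the scope of this sketch.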


Job scheduling
Tivoli® Workload Scheduler LoadLeveler®
Used for dynamic workload scheduling, Tivoli Workload Scheduler LoadLeveler is a distributed, network-wide job management facility designed to schedule work dynamically so as to maximize resource utilization and minimize job completion time. Jobs are scheduled based on job priority, job requirements, resource availability and user-defined rules to match processing needs with resources. LoadLeveler provides consolidated accounting and reporting and supports IBM servers including IBM System p and System x environments.
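
As a rough picture of scheduling by priority, job requirements and resource availability, consider the sketch below. It is not LoadLeveler code and does not reflect its configuration or job command syntax; the `Job` and `Node` classes, the CPU-only resource model, and the best-fit rule are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int      # higher values are scheduled first
    cpus_needed: int


@dataclass
class Node:
    name: str
    free_cpus: int


def schedule(jobs: list[Job], nodes: list[Node]) -> dict[str, str]:
    """Assign jobs to nodes in priority order; return {job name: node name}.

    Note: this updates each node's free_cpus in place.
    """
    placement: dict[str, str] = {}
    for job in sorted(jobs, key=lambda j: j.priority, reverse=True):
        candidates = [n for n in nodes if n.free_cpus >= job.cpus_needed]
        if not candidates:
            continue  # the job stays queued until resources free up
        # Best fit: pick the node that leaves the fewest CPUs unused.
        target = min(candidates, key=lambda n: n.free_cpus - job.cpus_needed)
        target.free_cpus -= job.cpus_needed
        placement[job.name] = target.name
    return placement


jobs = [Job("render", 5, 4), Job("backup", 1, 2), Job("simulate", 9, 8)]
nodes = [Node("n1", 8), Node("n2", 4)]
plan = schedule(jobs, nodes)
print(plan)
```

Here the two higher-priority jobs consume all available CPUs, so the low-priority job remains queued. User-defined rules, which the paragraph also mentions, could be modeled as extra filters on the candidate list.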


High Performance Math Libraries
Engineering Scientific Subroutine Library (ESSL) and Parallel ESSL
ESSL is a collection of state-of-the-art mathematical subroutines specifically tuned to IBM hardware and offering significant performance improvement to any math-intensive scientific or engineering application. Parallel ESSL extends the function of ESSL to support parallel applications that use the Message Passing Interface included in IBM Parallel Environment. ESSL and Parallel ESSL support C, C++ and Fortran applications.


Parallel application development and execution
Parallel Environment (PE)
Parallel Environment for AIX is a comprehensive development and execution environment for parallel applications (distributed-memory, message-passing applications running across multiple nodes). It is designed to help organizations develop, test, debug, tune and run high-performance parallel applications in C, C++ and Fortran on IBM System p and System x clusters. Parallel Environment runs on AIX V5.2 and V5.3.


High availability (HA)
PowerHA SystemMirror for AIX Standard Edition
The PowerHA Standard Edition is designed to provide high availability for critical business applications and data through system redundancy and failover. PowerHA constantly monitors the status of servers, networks and applications to detect failures or performance degradation and can respond by automatically restarting a troubled application on designated backup hardware, taking care of all network or storage connections in the process. With PowerHA, clients can run clusters of up to 32 nodes running AIX. You can also mix and match system sizes and performance levels as well as network adapters and disk subsystems to satisfy specific application, network and disk performance needs.
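
The monitor-and-restart behavior described above can be sketched in a few lines. This is not PowerHA code or configuration; the `ClusterNode` class, the `fail_over` function, and the per-application backup lists are assumptions made for illustration, and health status is set by hand rather than detected.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ClusterNode:
    name: str
    healthy: bool = True
    apps: set[str] = field(default_factory=set)


def fail_over(
    nodes: list[ClusterNode], backups: dict[str, list[str]]
) -> list[tuple[str, str, str]]:
    """Move apps off unhealthy nodes to their first healthy backup node.

    Returns a list of (app, failed node, takeover node) moves.
    """
    by_name = {n.name: n for n in nodes}
    moves: list[tuple[str, str, str]] = []
    for node in nodes:
        if node.healthy:
            continue
        for app in sorted(node.apps):  # sorted() copies, so the set can change
            target = next(
                (by_name[b] for b in backups.get(app, [])
                 if b in by_name and by_name[b].healthy),
                None,
            )
            if target is None:
                continue  # no healthy backup is designated; the app stays down
            node.apps.discard(app)
            target.apps.add(app)
            moves.append((app, node.name, target.name))
    return moves


a = ClusterNode("a", healthy=False, apps={"db", "web"})
b = ClusterNode("b")
c = ClusterNode("c")
moves = fail_over([a, b, c], {"db": ["b", "c"], "web": ["c"]})
print(moves)
```

A real high-availability product also has to move network addresses and storage connections along with the application, as the paragraph notes; the sketch only tracks which node owns which application.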


Geographic clustering and disaster recovery
PowerHA SystemMirror for AIX Enterprise Edition
The PowerHA Enterprise Edition extends the PowerHA high availability capabilities across geographic sites with remote data mirroring (replication) and failover using this mirrored data; this combination can maintain application and data availability even if an entire site is disabled by a disaster. The PowerHA Enterprise Edition includes support for GLVM and the IBM Storage Server options Global Mirror and Metro Mirror. With V6.1, PowerHA SystemMirror Enterprise Edition also supports EMC SRDF.



 


Tivoli software

IBM Tivoli System Automation for Linux
System Automation for Linux (SA) brings mainframe-like high availability for critical applications to Linux on IBM System z and IBM System x.



Clusters education
Linux curriculum  
Cluster 1600 curriculum  
IBM Redbooks  
Linux Clustering with CSM and GPFS