Hydra Upgraded to 832 CPUs

Posted on: 9:33 AM, February 26, 2009

On February 25, 2009, HYDRA, our IBM p5 cluster, grew from a 640-CPU to an 832-CPU system, an overall 30% increase in computational capacity. The increase reflects the addition of 12 nodes of the same type (16 1.9 GHz Power5+ CPUs per node, 192 CPUs in total) to the original 40. Three of the new nodes came configured with 64 GB of memory each, instead of the standard 32 GB, which should significantly benefit large-memory jobs. Due to capacity limits, however, only 8 of the 12 new nodes could be integrated into the high performance switch (HPS), the interconnect for fast data communication. All in all, 48 nodes, including those with 64 GB of memory, are now on the HPS. The other 4 nodes are connected using only gigabit Ethernet.

What does the new capacity mean for users? For one, the number of nodes allocated to MPI parallel jobs scheduled through the mpi queues has increased from 26 to 34, a jump of about 30% in the resources allocated to them. Serial jobs and parallel jobs of up to 16 ways (OpenMP or even MPI) saw an even greater jump, about 267% (from 24 to 88 cores), through the deployment of the 4 nodes that are on Ethernet only.

To make use of these 4 nodes, users should direct a job to the smp_only queue by setting the #@class = smp_only option in the job file. Because the nodes associated with this queue are not connected to the HPS, a job submitted to it must not specify options that require the HPS, such as those containing the keywords network or bulk_xfer. Also significant from the user's point of view: jobs in the smp_only queue are not preemptible, whereas jobs directed to the smp_normal or smp_long queues are.
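
As an illustration, a job file for the smp_only queue might look roughly like the sketch below. Only the #@class = smp_only line and the absence of the network and bulk_xfer keywords come from this announcement. The job name, output file names, time limit, thread count and the commented-out executable are assumptions chosen for the example, so check the user guide for the values that apply to your job.

```shell
#!/bin/sh
# Sketch of a LoadLeveler job file for the smp_only queue.
# The job name, file names, time limit and thread count below are
# illustrative assumptions, not site-mandated values.
#@job_name = smp_test
#@job_type = serial
#@class = smp_only
#@output = $(job_name).$(jobid).out
#@error = $(job_name).$(jobid).err
#@wall_clock_limit = 01:00:00
# Deliberately no network or bulk_xfer keywords here:
# the smp_only nodes are not connected to the HPS.
#@queue

# Use up to the 16 CPUs of a single node for an OpenMP program.
OMP_NUM_THREADS=16
export OMP_NUM_THREADS
echo "Running with $OMP_NUM_THREADS OpenMP threads"
# ./my_openmp_program    # hypothetical executable; replace with your own
```

A file like this would typically be submitted with llsubmit followed by the file name. The job type is set to serial here on the assumption that a single multi-threaded OpenMP process is being run. An MPI job confined to one of these nodes would need different settings.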

For a host of other technical details relating to the upgrade, check the following sections in our updated user guide: Batch processing and Basic System Information.

Node Roles in the Hydra Cluster