Hydra Cluster Tentative Availability after Planned Maintenance
Dear Hydra users,
We are opening the Hydra cluster for interactive and batch use on a tentative basis.
Hydra has undergone a major reconfiguration intended to make the high-performance I/O subsystem and GPFS more fail-safe. As you may know, Hydra has already permanently lost nine nodes. Two of these served very important I/O and other functions for the entire cluster.
During this latest planned maintenance, two compute nodes were reconfigured to replace the failed I/O nodes. The changes touched the entire I/O hardware and software stack, from DDN all the way up to GPFS. These nodes will carry a lower workload so that they can better serve their I/O and GPFS functions.
NOTE: These replacement nodes are the same age as those that have already failed.
As the changes are pervasive, please notify us if you notice anything out of the ordinary.
Posted on: 1:50 PM, November 8, 2012