Hydra is Off-Line for Maintenance (Update #2)

This is a quick update on the status of Hydra maintenance. Please see this previous announcement and this one for details on the nature and objectives of the current maintenance.

On Fri. 02/03/2012, the DDN 9550 disk storage of Hydra was powered down so that its connectivity could be physically moved from the dead node f1n1 to a new replacement node. When the DDN 9550 was powered back up to resume operation, one of its controllers (Singlet #2) did not reach a stable state and became inaccessible. Several attempts to revive this controller proved fruitless. Finally, DDN, the vendor of this disk storage, suggested shipping a new controller to replace controller 2.

The replacement controller arrived this afternoon, and we are in the process of installing it into the system. This will take the rest of today and tomorrow. We have to validate that the DDN storage system is stable and that it is in the same state it was in when we powered it down last Friday.

After we replace the failed controller and ensure that the disk storage is fully operational, we will resume the healing of the GPFS I/O subsystem of Hydra. At that stage, we may bring the system back with ONLY three I/O servers and open it to our users, while installing the fourth one in parallel.

Stay tuned for developments.

Posted on: 2:30 PM, February 9, 2012