Texas A&M Supercomputing Facility, Texas A&M University

Basic System Information

Hydra is a 52-node, 832-processor IBM Cluster 1600. The processors are IBM 1.9 GHz RISC Power5+ processors, physically packaged and organized 16 to a node. Each node, a p5 575 (p575), is a symmetric multiprocessor (SMP) system with 16 Power5+ processors and 32 gigabytes of shared memory. Of these 32 gigabytes, only 25 are available for user processing; keep that in mind when setting batch job memory limits.

The 52 nodes are housed in five physical frames (racks): ten nodes per rack in frames 1-4 and twelve in frame 5. Only 48 nodes are interconnected via the HPS, the IBM High Performance Switch: all 40 nodes in frames 1-4 and 8 of the nodes in frame 5. All 52 nodes are interconnected with gigabit Ethernet. The cluster uses the HPS for parallel processing and other communication among those 48 nodes. Each of the 48 p575 nodes connects to the HPS network through two adapters. Each adapter attaches to one of the two available subnetworks over which the HPS routes message packets to the other HPS-connected nodes.
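
The figures above can be cross-checked with a few lines of arithmetic. The following is a minimal Python sketch, not a site tool: the helper memory_per_task_gb is hypothetical, and it assumes that the 25 gigabytes of user-available memory on a node are shared evenly among the tasks placed on that node.

```python
# Figures quoted in the system description above.
FRAME_NODE_COUNTS = {1: 10, 2: 10, 3: 10, 4: 10, 5: 12}  # nodes per frame
CPUS_PER_NODE = 16
USER_MEMORY_GB = 25      # user-available part of the 32 GB on each node
HPS_NODE_COUNT = 40 + 8  # all of frames 1-4 plus 8 nodes of frame 5

total_nodes = sum(FRAME_NODE_COUNTS.values())
total_cpus = total_nodes * CPUS_PER_NODE


def memory_per_task_gb(tasks_per_node: int) -> float:
    """Even share of a node's user-available memory (hypothetical helper)."""
    if not 1 <= tasks_per_node <= CPUS_PER_NODE:
        raise ValueError("tasks_per_node must be between 1 and 16")
    return USER_MEMORY_GB / tasks_per_node


print(total_nodes, total_cpus, HPS_NODE_COUNT)  # 52 832 48
print(memory_per_task_gb(16))                   # 1.5625
print(memory_per_task_gb(8))                    # 3.125
```

Under that even-sharing assumption, a job that fills all 16 processors of a node has roughly 1.5 gigabytes per task to work with, which is a useful starting point when choosing a batch memory limit.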

Fig. 1: Power5+ Dual-Core Chip, the Basic Building Block

Fig. 2: The p5 575 Node at a Glance

Fig. 3: p5 575 Internal Schematic

Fig. 4: Node Roles as Configured on Hydra

Fig. 5: DDN Disk RAID Array Connections to Hydra

Advanced Hardware and Software Architecture

The Advanced Cluster Architecture section of the user guide describes the hardware and software architecture of the entire Hydra cluster in much greater detail.

Login Nodes: hydra1.tamu.edu and hydra2.tamu.edu

The staff has named the nodes to reflect their physical location in the five racks. Each of the first four racks holds ten nodes; a fifth rack, added in February 2009, holds 12. A node name is a four- or five-character string of the form f[1-5]n[1-10/12]: f followed by the rack number, then n followed by the node number within that rack. For example, f3n9 is the 9th node in rack 3 and f5n12 is node 12 in rack 5. Node numbers increase from the physical bottom of the rack up, 1-10 in racks 1-4 and 1-12 in rack 5.

Two of the 48 HPS-interconnected nodes, f1n9 and f1n10, are allocated to interactive processing, and logins are enabled only on those two nodes. The internet host names of f1n9 and f1n10 are hydra1 and hydra2, respectively. The remaining nodes are accessible only through LoadLeveler, the batch facility. You can list them with the LoadLeveler command listnodes (llstatus -f %n also works). Even more useful for tracking batch jobs is the listnodeusage command, which lists the nodes that each job runs on. A sample listing follows.

hydra# listnodeusage
Job ID            Owner        Class  Cpus  ST  Node(Tasks,CCpusPerTask)
------            -----   ----------  ----  --  ------------------------
f1n2.154280.0   y0m4156        mpi32    32   R  f1n3(16,1), f1n4(16,1)
f1n2.154281.0   y0m4156        mpi32    32   R  f1n8(16,1), f3n7(16,1)
f1n2.154724.0   c0s2008     smp_long     8   R  f3n9(8,1)
f1n2.154781.0   hhp0872        mpi32    32   R  f3n4(16,1), f4n7(16,1)
f1n2.155218.0   q0s1711        mpi32    32   R  f1n7(16,1), f2n8(16,1)
f1n2.155232.0      link        mpi64    58   R  f2n3(10,1), f2n6(16,1), f3n10(16,1), f4n5(16,1)
f1n2.155844.0      ryan    geo_group    32   R  f3n2(16,1), f3n5(16,1)
f1n2.156041.0   georget     cs_group     4   R  f2n2(4,1)
f1n2.156042.0   georget     cs_group     8   R  f4n6(8,1)
f1n2.156043.0   georget     cs_group     8   R  f4n2(8,1)
f1n9.153824.0   y0m4156        mpi32    32   R  f3n1(16,1), f4n9(16,1)
f1n9.153825.0   y0m4156        mpi32    32   R  f2n1(16,1), f2n10(16,1)
f1n9.153826.0   y0m4156        mpi32    32   R  f4n3(16,1), f4n4(16,1)
f1n9.154325.0   hhp0872        mpi32    32   R  f3n3(16,1), f3n6(16,1)
f1n9.154763.0   q0s1711        mpi32    32   R  f1n6(16,1), f2n7(16,1)
f1n9.155270.0    rivera        mpi32    32   R  f3n8(16,1), f4n8(16,1)
f1n9.155372.0   yubofan        mpi32    32   R  f2n4(16,1), f4n1(16,1)
f1n9.155586.0   georget     cs_group     4   R  f1n5(4,1)
f1n9.155589.0   georget     cs_group     4   R  f4n10(4,1)
f1n9.155590.0   georget     cs_group     8   R  f2n9(8,1)

Total                                  486
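
A listing like this one is easy to post-process. The following Python sketch shows one way to split a job line into its fields and validate the node names against the f[1-5]n[1-10/12] scheme described above. It is an illustration only: the function names parse_node_name and parse_job_line are hypothetical, the column layout is inferred from the sample listing rather than from any documented format, and the check that the Cpus column equals the sum of tasks times CPUs per task is an assumption that happens to hold for the sample.

```python
import re

# Racks 1-4 hold nodes 1-10; rack 5 holds nodes 1-12 (from the text above).
NODES_PER_FRAME = {1: 10, 2: 10, 3: 10, 4: 10, 5: 12}
NODE_NAME = re.compile(r"f(\d+)n(\d+)")
# One placement entry, e.g. "f3n10(16,1)" -> node, tasks, CPUs per task.
PLACEMENT = re.compile(r"(f\d+n\d+)\((\d+),(\d+)\)")


def parse_node_name(name):
    """Return (frame, node) for a name such as 'f3n9', or raise ValueError."""
    match = NODE_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a node name: {name!r}")
    frame, node = int(match.group(1)), int(match.group(2))
    if frame not in NODES_PER_FRAME or not 1 <= node <= NODES_PER_FRAME[frame]:
        raise ValueError(f"no such node: {name!r}")
    return frame, node


def parse_job_line(line):
    """Split one job line of the listing into a dictionary of fields."""
    fields = line.split(None, 5)  # five columns, then the placement list
    if len(fields) != 6:
        raise ValueError(f"unexpected line layout: {line!r}")
    job_id, owner, job_class, cpus, state, placement = fields
    cpus_by_node = {}
    for name, tasks, cpus_per_task in PLACEMENT.findall(placement):
        parse_node_name(name)  # validates the name; raises on a bad one
        cpus_by_node[name] = int(tasks) * int(cpus_per_task)
    return {"job_id": job_id, "owner": owner, "class": job_class,
            "cpus": int(cpus), "state": state, "cpus_by_node": cpus_by_node}


# Three job lines copied from the sample listing above.
SAMPLE = """\
f1n2.154724.0   c0s2008     smp_long     8   R  f3n9(8,1)
f1n2.155232.0      link        mpi64    58   R  f2n3(10,1), f2n6(16,1), f3n10(16,1), f4n5(16,1)
f1n9.155590.0   georget     cs_group     8   R  f2n9(8,1)
"""

jobs = [parse_job_line(line) for line in SAMPLE.splitlines()]
for job in jobs:
    # Assumed invariant: per-node CPU counts add up to the Cpus column.
    assert sum(job["cpus_by_node"].values()) == job["cpus"]
print(sum(job["cpus"] for job in jobs))  # 74
```

Applied to all twenty job lines of the listing, the same per-job sum reproduces the total of 486 shown on the last line.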