Mathematical Libraries
Last update on Friday, 01-Feb-2008 15:54:13 CST.
Using numerical libraries is the easiest way to achieve high performance in your application. On cosmos.tamu.edu there is a variety of libraries that have been designed and tuned specifically for the Altix/Itanium-2® architecture, delivering considerable performance improvements over freely available or general-purpose numerical codes.

Some of the libraries have been extended with OpenMP directives to provide parallel execution, so the benefits of multiprocessing can be obtained just by linking with the SMP library and setting the OMP_NUM_THREADS environment variable to the number of processors required. Note, however, that SMP libraries have limited scalability on NUMA platforms like the Altix because of the unavoidably higher latencies of remote memory accesses. In general, this means that setting the number of threads to a value higher than 12 to 16 will not give further performance improvements, and it is even likely to increase the run time.

If you need a special-purpose numerical library not listed here, you can email us your request and we will consider installing it on cosmos.tamu.edu.

The Math Kernel Library for the Itanium architecture contains the following subroutines:

First, you will need to choose and load an MKL module. Run the "module avail" command to list all the available environment modules (including MKL).
To load the latest MKL module:
Please consult the modules page for more information about using modules for environment management.

Linking MKL with the Intel Compiler

After loading an MKL module, you need to link your program with the MKL libraries:
where subset refers to a specific part of the library, e.g., -lmkl_lapack or -lmkl_solver. $MKL_PATH is defined by the MKL module that you loaded previously. Note that there is no separate subset for FFT routines. You may need to add "-lguide -lpthread" to take advantage of the OpenMP-threaded subroutines.

Additional Information

There is additional MKL documentation with a detailed description of the library and examples of use. Product features and additional documentation are available from Intel.

SGI Scientific Computing Software Library (SCSL)

The Scientific Computing Software Library (SCSL; see man scsl) from SGI has been ported and optimized for the Itanium-2 architecture. It delivers performance similar, and in some cases superior, to that of the Intel MKL. The SCSL covers the following areas:

Available SCSL Routines
The SCSL routines can be linked and loaded by using either the -lscs or the -lscs_mp option. To link with the SCSL library, add the following flag when linking:
The second option (-lscs_mp) gives you access to the OpenMP multi-processor (i.e., multi-threaded) version of the SCSL library. Note that you must use version 7.x or later of the Intel Compilers to link against the latest release (1.5.1) of SCSL on cosmos.tamu.edu.

Note: When linking to SCSL with -lscs, the default integer size is 4 bytes (32 bits). Another version of SCSL is available in which integers are 8 bytes (64 bits). This version gives users access to larger memory sizes. It can be loaded by using either the -lscs_i8 or the -lscs_i8_mp option. A program can use only one of the two versions; 4-byte integer and 8-byte integer library calls cannot be mixed.

Additional Information

For further reference, you can access the SCSL documentation resources at SGI.

SGI Scientific Computing Software Library routines for Distributed Shared Memory (SDSM)

The SGI Scientific Computing Software Library for Distributed Shared Memory (SDSM) is the multi-processor version of SCSL. SDSM contains the following routines.
The SDSM routines can be loaded by using the -lsdsm option when linking your application. The required SCSL and MPI libraries will automatically be included. To link with the SDSM library, add the following flags when linking:

Linking with -lsdsm enables the distributed shared memory routines (e.g., ScaLAPACK) and links with the SCSL library and the Message Passing Toolkit (SGI's implementation of MPI) as needed. The second method of linking differs from the first in that PBLAS calls to BLAS routines will be made to the OpenMP parallel version of the library (libscs_mp.so). This allows hybrid parallelism, which may reduce time to solution for some applications. Users of the hybrid approach are encouraged to carefully review the Message Passing Toolkit (MPT) documentation (see man mpi) to determine the best mechanisms for launching such hybrid jobs.

Note: When linking to SDSM with -lsdsm, the default integer size is 4 bytes (32 bits). There is currently no version of SDSM available with a default integer size of 8 bytes (64 bits). Note that you must use version 7.x or later of the Intel Compilers to link against the default version of SDSM and SCSL on cosmos.tamu.edu.
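
As a rough illustration of the choices described on this page, the shell sketch below picks an SCSL link flag from the integer size and threading mode, and splits a processor allocation between MPI ranks and OpenMP threads for a hybrid job. The helper names scsl_flag and threads_per_rank are hypothetical and not part of SCSL, SDSM, or MPT. The cap of 16 threads is an assumption taken from the upper end of the 12 to 16 range mentioned at the top of this page, and the 64-processor, 8-rank job is an invented example. The sketch only prints values; it does not invoke a compiler or launch a job.

```shell
#!/bin/sh
# Sketch only: both helper functions are hypothetical illustrations.

# Print the SCSL link flag for an integer size in bytes (4 or 8)
# and a threading mode ("serial" or "mp").
scsl_flag() {
    int_bytes=$1
    threading=$2
    case "$int_bytes:$threading" in
        4:serial) echo "-lscs" ;;
        4:mp)     echo "-lscs_mp" ;;
        8:serial) echo "-lscs_i8" ;;
        8:mp)     echo "-lscs_i8_mp" ;;
        *)
            echo "unsupported combination: $int_bytes $threading" >&2
            return 1
            ;;
    esac
}

# Print the number of OpenMP threads per MPI rank for a hybrid job,
# given the total processor count and the number of MPI ranks.
# The result is clamped to the range 1..16 (assumed cap, see above).
threads_per_rank() {
    total_procs=$1
    mpi_ranks=$2
    if [ "$mpi_ranks" -lt 1 ]; then
        echo "need at least one MPI rank" >&2
        return 1
    fi
    threads=$((total_procs / mpi_ranks))
    if [ "$threads" -lt 1 ]; then
        threads=1
    fi
    if [ "$threads" -gt 16 ]; then
        threads=16
    fi
    echo "$threads"
}

# Example: 64 processors shared by 8 MPI ranks, 4-byte integers,
# OpenMP-threaded BLAS.
OMP_NUM_THREADS=$(threads_per_rank 64 8)
export OMP_NUM_THREADS
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"      # prints OMP_NUM_THREADS=8
echo "SCSL link flag: $(scsl_flag 4 mp)"     # prints SCSL link flag: -lscs_mp
```

Because 4-byte and 8-byte integer library calls cannot be mixed, a build script would normally call scsl_flag once and reuse the result for every link step. Since SDSM has no 8-byte integer version, only the 4-byte flags are relevant when -lsdsm is also used.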