Texas A&M Supercomputing Facility, Texas A&M University

Math Kernel Library

Last modified: Friday October 25, 2013 11:10 AM

The Intel Math Kernel Library (MKL) is a library of optimized, threaded math routines, including BLAS, LAPACK, sparse solvers, fast Fourier transforms, and vector math, for the latest Intel architectures. This page shows how to use MKL on Eos, with examples that demonstrate common usage.

Environment

Before using the MKL library, you need to load the MKL module file:

    module load intel/mkl

This command loads the default MKL library, which is the recommended one to use. It sets several MKL environment variables that are used in compiling and linking, such as $MKLROOT, $MKLPATH, and $MKLINCLUDE.

The Intel MKL library can be used with various compilers, including the Intel, GNU, and PGI compilers. However, each compiler family requires different options. The discussion below is based on the Intel compilers.

How To Link

Linking to the MKL library can be quite involved if all possible usage scenarios are considered. This user guide focuses only on dynamic linking with the -mkl compiler option, and also covers the commonly used BLAS95/LAPACK95 static libraries.

The general forms for linking to the MKL library with the -mkl flag, optionally adding the MKL BLAS95/LAPACK95 Fortran libraries, are as follows:

    ifort myprog.f   -mkl[=lib] [options] [-lmkl_blas95_lp64] [-lmkl_lapack95_lp64] ...
    icc   myprog.c   -mkl[=lib] [options] ...
    icpc  myprog.cpp -mkl[=lib] [options] ...

The flag -mkl[=lib] tells the compiler which part of MKL to link to, where lib can be one of the three values shown in the table below.

Value       Meaning
parallel    Tells the compiler to link using the threaded part of MKL.
            This is the default if the option is specified with no lib
            (-mkl=parallel is equivalent to -mkl).

            The threaded part of MKL includes multithreaded BLAS, LAPACK,
            FFT, etc. The environment variable OMP_NUM_THREADS
            must be set to control the number of threads at run time for
            the threaded MKL library.
sequential  Tells the compiler to link using the non-threaded part of MKL,
            which includes sequential BLAS, LAPACK, FFT, etc.
cluster     Tells the compiler to link using the cluster part and the
            sequential part of MKL. The cluster part of MKL includes
            distributed FFT (CDFT), ScaLAPACK, and other sub-libraries for
            distributed computing. The Intel MPI library is required
            when -mkl=cluster is used.

NOTE: On Eos, the -mkl flag works only with compiler versions ≥ 12.1.4.319 and MKL versions ≥ 10.3.11.339.


Dynamic linking is the preferred way of linking to the MKL library. In most cases, the -mkl flag provides all that you need.

Example 1 Link to the sequential part of MKL.

    ifort example.f -mkl=sequential -o example.exe

Example 2 Link to the threaded part of MKL.

    icc example.c -mkl=parallel -o example.exe

Example 3 Link to the cluster part of MKL, whether CDFT or ScaLAPACK. For this to compile, you must load the intel/mpi module first.

    mpiifort example.f -mkl=cluster -o example.exe

When -mkl=cluster is used, the non-cluster part of MKL that gets linked is the sequential one. To use the cluster part and the threaded part of MKL at the same time, every library must be linked explicitly.

Example 4 Link to ScaLAPACK and the threaded part of MKL.

    mpiicpc example.cpp -openmp -I${MKLINCLUDE} -L${MKLPATH} -lmkl_scalapack_lp64 \
    -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lmkl_blacs_intelmpi_lp64      \
    -lpthread -o example.exe

Link with BLAS95/LAPACK95

The -mkl flag makes dynamic linking to the right part of the MKL library easy. However, not all commonly used libraries are shared libraries, and those that are not cannot be linked dynamically. Specifically, the MKL BLAS Fortran 95 and LAPACK Fortran 95 libraries are static and have no dynamic counterparts, so they must be named explicitly on the command line. The -mkl flag can still be combined with the BLAS95/LAPACK95 libraries to keep the link line short.

Example 5 Link to the sequential part of MKL and BLAS95.

    ifort example.f -mkl=sequential -lmkl_blas95_lp64 -o example.exe

Example 6 Link to the threaded part of MKL and LAPACK95.

    ifort example.f -mkl -lmkl_lapack95_lp64 -o example.exe

When a code is linked with a static library, the code must appear before the library on the command line so that the symbols it references can be resolved from the library, as in the examples above. To allow an arbitrary order between the library and the source code, use -Wl,--start-group archives -Wl,--end-group. The -Wl flag tells the compiler to pass the option that follows the comma to the linker. --start-group and --end-group are linker options that enclose the archives between them (which can be libraries, source files, and object files) as a group; the files in the group are searched repeatedly until no new undefined references are created.

Example 7 Link to the threaded part of MKL and LAPACK95 in arbitrary order. Notice that the source file is placed after the static library.

    ifort -mkl -Wl,--start-group -lmkl_lapack95_lp64 example.f -Wl,--end-group -o example.exe 

ScaLAPACK

The MKL library implements routines from the ScaLAPACK package for distributed-memory architectures. ScaLAPACK solves dense and banded linear systems, least squares problems, eigenvalue problems, and singular value problems. It is built on top of BLAS, LAPACK, and BLACS (Basic Linear Algebra Communication Subprograms). BLACS is a set of routines that provide a linear-algebra-oriented message-passing interface for a wide range of distributed-memory platforms. Except for a few supporting utility routines implemented in C, the majority of ScaLAPACK routines are implemented in Fortran 77.

Before calling a ScaLAPACK routine, the programmer has to set up the process grid and distribute all global matrices manually over it. In ScaLAPACK, block cyclic distribution is used for dense matrices and block distribution is used for banded matrices.

In general, four basic steps are required to call a ScaLAPACK routine.

  1. Initialize the process grid
  2. Distribute the matrix on the process grid
  3. Call ScaLAPACK routine
  4. Release the process grid

A pseudo program that calls a ScaLAPACK dense linear solver (pdgesv) is shown below:

    ! Step 1: initialize the process grid working environment

      call blacs_pinfo
      call blacs_setup
      call blacs_gridinit
      call blacs_gridinfo

    ! Step 2: distribute the data if they are not in place on each process

      if (i am the root process) then
          ! send data to each non-root process
          do i=0, np-1
            if (i.NE.root) call dgesd2d
          end do
      else
          call dgerv2d   ! non-root process receives data
      endif
      call descinit      ! create descriptors for the global matrix

    ! Step 3: call the ScaLAPACK routine

      call pdgesv

    ! Step 4: release the process grid

      call blacs_gridexit
      call blacs_exit

Complete sample programs in Fortran 90 and C can be downloaded from here: mypdgesvdriver.f90 and mypdgesvdriver.c. These programs have been tested on Eos with the Intel compilers and the Intel MPI library.

    [pingluo@login001]$ mpiifort -mkl=cluster mypdgesvdriver.f90 -o mypdgesvdriver_f.exe
    [pingluo@login001]$ mpiicc -mkl=cluster mypdgesvdriver.c -o mypdgesvdriver_c.exe
    [pingluo@login001]$ mpirun -np 6 ./mypdgesvdriver_c.exe
    0.00000000 -0.16666667 -0.50000000 0.16666667 
    0.50000000 0.00000000 0.00000000 
    0.00000000 0.00000000 

Further Information

Our examples show the basic use of linking to the dynamic MKL libraries and the BLAS95/LAPACK95 libraries with the Intel compilers. For other usage, such as static linking, the Single Dynamic Library, linking with MPICH2, or compiling with the GNU or PGI compilers, please consult the Intel MKL Link Line Advisor.

We also show an example (in C and Fortran) of how to program with ScaLAPACK, the distributed linear algebra package provided in MKL. For more examples of programming with MKL routines, see the files in $MKLROOT/examples on Eos.

For a complete reference of MKL, check the Intel MKL Reference Manual.