Compiling and Running CUDA Programs

Under Construction

Last modified: Thursday April 11, 2013 11:46 AM

Miscellaneous

Mixing CUDA with MPI

CUDA+MPI code is compiled with different compilers depending on the language it uses. For CUDA C/C++, we use nvcc with the corresponding MPI library. For CUDA Fortran, we use the PGI compilers with the corresponding MPI library. The PGI compilers are also used to compile OpenACC+MPI code. The examples below cover all three cases.

Case 1: CUDA C/C++ with MPI

The OpenMPI library compiled with the Intel compiler is compatible with gcc and can be used with nvcc. The usage is shown below. The key is to pass the locations of the MPI include files (-I) and the MPI libraries (-L), and to link against the MPI library with -l mpi.

module load intel/compilers
module load openmpi
nvcc -I ${O_MPI_ROOT}/include/ -L ${O_MPI_ROOT}/lib/ -l mpi [options] test.cu
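If the openmpi module has not been loaded, ${O_MPI_ROOT} expands to nothing and nvcc is silently pointed at /include/ and /lib/. The following is a minimal sketch of a guard against that, assuming bash; nvcc_mpi_command is a hypothetical helper name, not part of CUDA or OpenMPI, and it only prints the command line rather than running it.

```shell
#!/bin/bash
# Hypothetical helper: print the nvcc command line for a CUDA C/C++ + MPI source.
# O_MPI_ROOT is expected to be set by "module load openmpi", as in the text above.
nvcc_mpi_command() {
    local src=$1
    shift
    if [ -z "${O_MPI_ROOT:-}" ]; then
        echo "O_MPI_ROOT is not set; load the openmpi module first" >&2
        return 1
    fi
    # Any remaining arguments are passed through as extra nvcc options.
    echo nvcc -I "${O_MPI_ROOT}/include/" -L "${O_MPI_ROOT}/lib/" -l mpi "$@" "$src"
}
```

Calling nvcc_mpi_command test.cu -O2 prints the full command, which can be inspected and then run by hand.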

nvcc treats test.cu as C++. To keep the host code strictly C, we have to separate the host code from the device code, compile them separately, and then link the object files, as illustrated below. Any function in device.cu that host.c calls must be declared extern "C", because nvcc compiles it as C++.

module load intel/compilers
module load openmpi
mpicc  [options] -c host.c
nvcc -I ${O_MPI_ROOT}/include/ -L ${O_MPI_ROOT}/lib/ -l mpi [options] -c device.cu
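mpicc host.o device.o -lcudart -lstdc++ -o test.exe

The final link needs the CUDA runtime (-lcudart) and, because device.cu is compiled as C++, usually the C++ standard library (-lstdc++). Add -L with the CUDA library directory if the linker cannot find libcudart.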
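The three-step split build can be wrapped in a small script so that a failed compile stops before the link. This is a sketch under stated assumptions: bash, O_MPI_ROOT set by the openmpi module, and libcudart on the linker's search path. run, build_split and DRY_RUN are hypothetical names introduced for illustration; with DRY_RUN=1 the commands are only printed.

```shell
#!/bin/bash
# Hypothetical helper: print a command, and execute it unless DRY_RUN=1.
run() {
    printf '%s\n' "$*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# Hypothetical helper: compile C host code with mpicc, device code with nvcc,
# then link both objects with mpicc.
build_split() {
    local host_src=$1 device_src=$2 exe=$3
    local host_obj="${host_src%.c}.o"
    local device_obj="${device_src%.cu}.o"
    run mpicc -c "$host_src" -o "$host_obj" || return 1
    # Only the include path matters for a compile-only (-c) step.
    run nvcc -I "${O_MPI_ROOT}/include/" -c "$device_src" -o "$device_obj" || return 1
    # Assumption: the CUDA runtime and C++ standard library are found by the linker.
    run mpicc "$host_obj" "$device_obj" -lcudart -lstdc++ -o "$exe" || return 1
}
```

For example, DRY_RUN=1 followed by build_split host.c device.cu test.exe lists the three commands without invoking any compiler.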

Case 2: CUDA Fortran with MPI

module load pgi/compilers
module load openmpi/pgi
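mpif90 [-Mcuda] [options] test.cuf

The -Mcuda flag is optional when the source file has the .cuf extension, which the PGI compilers recognize as CUDA Fortran.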

Case 3: OpenACC with MPI

module load pgi/compilers
module load openmpi/pgi
mpif90 -acc [options] test.f90
mpicc -acc [options] test.c
mpicxx -acc [options] test.cpp
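The wrapper and flag choices in Cases 2 and 3 depend only on the source file extension, which can be summarized in a short dispatch function. This is a sketch assuming bash and the standard OpenMPI wrapper names mpif90, mpicc and mpicxx; mpi_gpu_compile_command is a hypothetical helper that prints the command rather than running it.

```shell
#!/bin/bash
# Hypothetical helper: choose the MPI wrapper and GPU flag from the file extension.
mpi_gpu_compile_command() {
    local src=$1
    shift
    case "$src" in
        *.cuf) echo mpif90 -Mcuda "$@" "$src" ;;   # CUDA Fortran
        *.f90) echo mpif90 -acc "$@" "$src" ;;     # OpenACC, Fortran
        *.c)   echo mpicc -acc "$@" "$src" ;;      # OpenACC, C
        *.cpp) echo mpicxx -acc "$@" "$src" ;;     # OpenACC, C++
        *)     echo "unrecognized source type: $src" >&2; return 1 ;;
    esac
}
```

Extra compiler options are passed after the file name, for example mpi_gpu_compile_command test.f90 -O2.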

Compiling OpenCL