Compiling and Running CUDA Programs
Last modified: Wednesday April 10, 2013 3:24 PM
Compiling CUDA Fortran with PGI Compilers
PGI and NVIDIA have worked in cooperation to develop CUDA Fortran, an analog to NVIDIA's CUDA C. CUDA Fortran includes a Fortran 2003 compiler and tool chain for programming NVIDIA GPUs using Fortran. The CUDA Fortran compiler is a part of the PGI Fortran compilers that are available on Eos. Similar to CUDA C/C++, programming with CUDA Fortran usually follows the same five-step sequence.
Sample CUDA Fortran Source Code
The CUDA Fortran features in the following sample program are the attributes(global) prefix, the predefined variables blockidx, blockdim and threadidx, the cudafor module, the device attribute, and the <<<...>>> kernel launch syntax. Notice that in the kernel code, a and n both have the value attribute, a keyword introduced in Fortran 2003 that specifies that the value of the actual argument is passed to the dummy argument. Without the attribute, the addresses of a and n would be passed to the dummy arguments, because by default Fortran passes variables by reference. This would cause an error, since the device cannot access host memory.
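```fortran
! The kernel is placed in a module (the name daxpy_mod is chosen here) so
! that the caller sees an explicit interface, which the value attribute requires.
module daxpy_mod
contains
  attributes(global) subroutine daxpy( n, a, x, y )
    real*8, dimension(*) :: x, y
    real*8, value :: a
    integer, value :: n
    integer :: i
    ! Global 1-based index of this thread
    i = (blockidx%x-1) * blockdim%x + threadidx%x
    ! Guard against threads that fall beyond the end of the arrays
    if( i <= n ) y(i) = a * x(i) + y(i)
  end subroutine
end module

program testdaxpy
  use cudafor
  use daxpy_mod
  implicit none
  real*8, dimension(1024) :: x, y
  real*8, device, dimension(1024) :: dx, dy
  real*8 :: a              ! must match the kernel's real*8 dummy argument
  integer :: n
  n = 1024
  call random_number(x)
  call random_number(y)
  call random_number(a)
  dx = x                   ! copy the host arrays to the device
  dy = y
  call daxpy<<<n/64, 64>>>( n, a, dx, dy )
  y = dy                   ! copy the result back to the host
end program
```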
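The index arithmetic in the kernel can be checked with plain integer math. The following is a minimal sketch in shell, not CUDA Fortran code; the helper names global_index and grid_size are hypothetical and exist only for this illustration. It also shows why the guard `if( i <= n )` matters: n/64 uses integer division, so it covers every element only when n is a multiple of 64. Otherwise the block count is usually rounded up, and the guard masks off the extra threads in the last block.

```shell
#!/bin/sh
# Hypothetical helpers (not part of CUDA Fortran) that mirror the
# kernel's index arithmetic. Block and thread indices are 1-based.

# i = (blockidx%x - 1) * blockdim%x + threadidx%x
global_index() {    # args: blockidx blockdim threadidx
    echo $(( ($1 - 1) * $2 + $3 ))
}

# Number of blocks needed to cover n elements, rounded up.
grid_size() {       # args: n blockdim
    echo $(( ($1 + $2 - 1) / $2 ))
}

global_index 1 64 1       # first thread of block 1   -> 1
global_index 16 64 64     # last thread of block 16   -> 1024
grid_size 1024 64         # 1024 is a multiple of 64  -> 16
grid_size 1000 64         # 1000/64 truncates to 15; rounding up gives 16
```

With n = 1000 and 16 blocks of 64 threads, threads 1001 through 1024 are launched but skipped by the guard.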
Before compiling a CUDA Fortran program, the following module file must be loaded:
```
module load pgi/compilers
```
The command line for invoking the PGI CUDA Fortran compiler is
```
pgfortran [options] -o cuda_fortran.exe cuda_fortran.cuf ...
```
or
```
pgfortran -Mcuda [options] -o cuda_fortran.exe file1 file2 ...
```
if none of the source files has a .cuf or .CUF extension.
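The extension rule above can be expressed as a small shell helper for use in a build script. This is a sketch only: the function name mcuda_flag_for is hypothetical, and it encodes just the rule as stated here (add -Mcuda when no source file ends in .cuf or .CUF). The commands are printed rather than executed, so the sketch runs without the PGI compilers loaded.

```shell
#!/bin/sh
# mcuda_flag_for: hypothetical helper, not a PGI tool.
# Prints "-Mcuda" when none of the given source files ends in
# .cuf or .CUF, and prints nothing otherwise.
mcuda_flag_for() {
    for f in "$@"; do
        case "$f" in
            *.cuf|*.CUF) return 0 ;;   # a CUDA Fortran extension was found
        esac
    done
    echo "-Mcuda"
}

# Dry run: show the command lines that would be used.
echo pgfortran $(mcuda_flag_for daxpy.f90) -o daxpy.exe daxpy.f90
echo pgfortran $(mcuda_flag_for daxpy.cuf) -o daxpy.exe daxpy.cuf
```

The first line printed contains -Mcuda and the second does not.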
Common Options
The table below lists some mandatory and optional options for compiling and linking CUDA Fortran programs and objects.

| Option | Description |
|---|---|
| -Mcuda[=option[,option,...]] | Enable CUDA Fortran for files that do not end in .cuf. Required when linking CUDA Fortran objects. |
| -tp=nehalem-64 | Set the target processor type to 64-bit Intel Nehalem/Westmere. |
| -O[level] | Set the compiler optimization level. level can be 0, 1, 2, 3, 4, 5. Commonly used levels are 2 and 3. |
| -fast | Choose generally optimal flags for the target platform. |
| -Minfo[=option[,option,...]] | Emit useful compilation information to stderr. |
| -v | Verbose mode; print the command line for each tool before it is executed. |
Examples:
```
pgfortran -o daxpy.exe daxpy.cuf
pgfortran -Mcuda -v -o daxpy.exe daxpy.cuf
pgfortran -Mcuda -Minfo -O3 -fast -o daxpy.exe daxpy.f90
pgfortran -Mcuda=4.2,ptxinfo -o daxpy.exe daxpy.cuf
```
The last example prints the resource usage for each kernel routine, as reported by the PTX assembler.
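The -Mcuda=4.2,ptxinfo form in the last example follows the bracket notation -Mcuda[=option[,option,...]]: the flag alone, or the flag followed by = and a comma-separated list of suboptions. A build script might assemble it as in the following sketch. The function name mcuda_option is hypothetical, and the helper only joins strings; it does not check that a suboption is valid for the installed compiler.

```shell
#!/bin/sh
# mcuda_option: hypothetical helper, not a PGI tool.
# Joins its arguments into a single -Mcuda flag.
mcuda_option() {
    if [ "$#" -eq 0 ]; then
        echo "-Mcuda"
    else
        old_ifs=$IFS
        IFS=,                  # "$*" joins arguments with the first character of IFS
        echo "-Mcuda=$*"
        IFS=$old_ifs
    fi
}

mcuda_option                   # -> -Mcuda
mcuda_option 4.2 ptxinfo       # -> -Mcuda=4.2,ptxinfo
```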