Texas A&M Supercomputing Facility, Texas A&M University

Compiling and Running CUDA Programs

Last modified: Wednesday April 10, 2013 3:24 PM

Compiling CUDA Fortran with PGI Compilers

PGI and NVIDIA cooperated to develop CUDA Fortran, an analog to NVIDIA's CUDA C. CUDA Fortran includes a Fortran 2003 compiler and tool chain for programming NVIDIA GPUs in Fortran. The CUDA Fortran compiler is part of the PGI Fortran compilers available on Eos. As with CUDA C/C++, programming in CUDA Fortran usually follows the same five-step sequence.

Sample CUDA Fortran Source Code

In the following sample program, the features related to CUDA Fortran include the attributes(global) prefix on the kernel, the device attribute on dx and dy, the cudafor module, and the <<<...>>> launch syntax. Notice that in the kernel code, a and n both have the value attribute, a keyword introduced in Fortran 2003 that specifies that the value of the actual argument is passed to the dummy argument. Without this attribute, the addresses of a and n would be passed, because Fortran passes variables by reference by default. That would cause an error, since the device cannot access host memory.


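        ! The kernel is placed in a module so that the host program sees an
        ! explicit interface, which Fortran requires for value dummy arguments.
        module daxpy_mod
        contains
        attributes(global) subroutine daxpy( n, a, x, y )
        real*8, dimension(*) :: x,y
        real*8, value :: a
        integer, value :: n
        integer :: i
        ! Global 1-based index of this thread
        i = (blockidx%x-1) * blockdim%x + threadidx%x
        if( i <= n ) y(i) = a * x(i) + y(i)
        end subroutine
        end module

        program testdaxpy
        use cudafor
        use daxpy_mod
        implicit none
        real*8, dimension(1024) :: x, y
        real*8, device, dimension(1024) :: dx, dy
        ! a must be real*8 to match the kernel's dummy argument
        real*8 :: a
        integer :: n

        n = 1024

        call random_number(x)
        call random_number(y)
        call random_number(a)

        ! Copy the host arrays to the device
        dx = x
        dy = y
        ! Launch n/64 blocks of 64 threads each
        call daxpy<<<n/64, 64>>>( n, a, dx, dy )

        ! Copy the result back to the host
        y = dy

        end program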
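
The sample launches exactly n/64 blocks, which covers every element only because n = 1024 is a multiple of 64. The sketch below is one possible variant, not part of the original sample. It rounds the block count up so that the i <= n guard in the kernel deals with the leftover threads, checks the launch status, and compares the result against a host computation. The names daxpy_sketch_mod, testdaxpy_sketch, tpb, nblocks, yref, and maxerr, and the size n = 1000, are illustrative assumptions.

```fortran
module daxpy_sketch_mod
contains
  ! Kernel: one thread per element; the guard skips threads past n.
  attributes(global) subroutine daxpy( n, a, x, y )
    real*8, dimension(*) :: x, y
    real*8, value :: a
    integer, value :: n
    integer :: i
    i = (blockidx%x-1) * blockdim%x + threadidx%x
    if( i <= n ) y(i) = a * x(i) + y(i)
  end subroutine
end module

program testdaxpy_sketch
  use cudafor
  use daxpy_sketch_mod
  implicit none
  integer, parameter :: n = 1000      ! assumed size, not a multiple of tpb
  integer, parameter :: tpb = 64      ! threads per block
  real*8, dimension(n) :: x, y, yref
  real*8, device, dimension(n) :: dx, dy
  real*8 :: a, maxerr
  integer :: nblocks, istat

  call random_number(x)
  call random_number(y)
  call random_number(a)
  yref = a * x + y                    ! host reference result

  dx = x                              ! host-to-device copies
  dy = y

  nblocks = (n + tpb - 1) / tpb       ! round up so every element is covered
  call daxpy<<<nblocks, tpb>>>( n, a, dx, dy )

  istat = cudaGetLastError()          ! picks up launch-time errors
  if( istat /= cudaSuccess ) then
    print *, 'kernel launch failed: ', cudaGetErrorString(istat)
    stop
  end if

  y = dy                              ! device-to-host copy
  maxerr = maxval( abs( y - yref ) )
  print *, 'max abs error = ', maxerr
end program
```

With n = 1000 and 64 threads per block, the launch uses 16 blocks (1024 threads), so the last 24 threads fall outside the array and are skipped by the guard. The reported error may be small but nonzero, since the device and host can round the multiply-add differently.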

Before compiling a CUDA Fortran program, the following module file must be loaded:

    module load pgi/compilers

The command line for invoking the PGI CUDA Fortran compiler is

    pgfortran [options] -o cuda_fortran.exe cuda_fortran.cuf ...

or, if none of the source files has a .cuf or .CUF extension,

    pgfortran -Mcuda [options] -o cuda_fortran.exe file1 file2 ...
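
A source file can also be written so that it builds with or without CUDA Fortran enabled. The sketch below is an illustration under two assumptions: that the compiler defines the _CUDA preprocessor symbol when CUDA Fortran is enabled, and that the file is run through the preprocessor (for example because it has an uppercase extension such as .F90 or .CUF). The file name and program name are hypothetical.

```fortran
! which_build.F90 (hypothetical file name)
program which_build
#ifdef _CUDA
  use cudafor
#endif
  implicit none
#ifdef _CUDA
  integer :: istat, ndev
  ! Only compiled when CUDA Fortran is enabled
  istat = cudaGetDeviceCount(ndev)
  print *, 'CUDA Fortran enabled; devices found: ', ndev
#else
  ! Host-only fallback when CUDA Fortran is not enabled
  print *, 'CUDA Fortran not enabled; host-only build'
#endif
end program
```

Under these assumptions, compiling this file with -Mcuda should select the first branch, and compiling it without -Mcuda should select the host-only branch.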

Common Options

The table below lists some required and optional flags for compiling and linking CUDA Fortran programs and objects.

Option                          Description
-Mcuda[=option[,option,...]]    Enable CUDA Fortran for files not ending in .cuf. Required when linking CUDA Fortran objects.
-tp=nehalem-64                  Target 64-bit Intel Nehalem/Westmere processors.
-O[level]                       Set the compiler optimization level. level can be 0, 1, 2, 3, 4, 5. Commonly used levels are 2 and 3.
-fast                           Choose generally optimal flags for the target platform.
-Minfo[=option[,option,...]]    Emit useful compilation information to stderr.
-v                              Verbose mode; print the command line for each tool before it is executed.


Examples

    pgfortran -o daxpy.exe daxpy.cuf
    pgfortran -Mcuda -v -o daxpy.exe daxpy.cuf
    pgfortran -Mcuda -Minfo -O3 -fast -o daxpy.exe daxpy.f90
    pgfortran -Mcuda=4.2,ptxinfo -o daxpy.exe daxpy.cuf

The last example prints the resource usage of each kernel routine, as reported by the PTX assembler.