Of necessity, this subsection gives only a summary account of exceptions. For more information, consult the following man pages: fp_trap, fp_any_xcp, fp_enable, fp_iop_snan, fp_sh_info, sigaction, dbx, pdbx, and idebug. Another invaluable resource on FPEs is listed at the end of the following section.
Floating-Point Exceptions (FPEs)
When computations go awry in your program, you may notice incorrect numbers in some output fields, even though your program continues to execute. Sometimes you may notice strings like INF and NaN in fields where only numbers should be; these indicate certain kinds of floating-point exceptions (FPEs). INF means "infinity" and NaN means "not a number." Sometimes it's hard to find where these FPEs occur in your code, but you must find and fix them. Salting your code with print statements is hit-or-miss and invasive. If you believe you have only a few FPEs, you are well advised to use a debugger like dbx, which will often point at the first FPE in your core file. But if you have many FPEs, weeding them out in this manner can be tedious.
An alternative and reliable method is called "trapping." By trapping, we mean setting a trap at your program's runtime that gets tripped when an FPE occurs, after which the program execution follows a prescribed course of your choice. This course is referred to as "handling" the error, where the handling you choose may cause the program to abort, print a diagnostic message, or provide a traceback. Since trapping and handling require extra processor time, you should remove trapping/handling subroutine calls and compiler options after you have removed your program's FPEs. Trapping is something you need to use as a first step in debugging, but not in production.
There are six conditions that can cause FPEs as defined by the IEEE standard for floating-point arithmetic. All of these conditions concern floating-point operations except integer overflow (although integer overflow is also traced as an FPE).
|TRP_INVALID|Invalid Operation Summary|
|TRP_DIV_BY_ZERO|Divide by Zero|
By default these conditions are "masked" and do not cause a floating-point exception. Instead, a default value is substituted for the result of the operation, and the program continues silently. To identify the sources of these conditions, trapping must be enabled. There are two ways to do this on the IBM systems: (1) call the fp_trap library routine, or (2) specify the -qsigtrap and -qflttrap options in your compile and load statements.
The fp_trap routine gives a very precise location of the exception, but it significantly slows down the execution of your code. It also requires you to modify your source to insert the call. Users generally get sufficient trapping information on the IBM systems by specifying -qsigtrap and -qflttrap in both their compile statement and their load statement, and these options have a smaller impact on performance than the fp_trap routine.
The -qflttrap option identifies which exceptions to trap while -qsigtrap identifies a piece of code that will be called to handle the trapped exceptions. If -qsigtrap isn't used, the default behavior for trapped exceptions identified by -qflttrap is to dump core and exit.
When used, the -qsigtrap option by default installs a handler that generates a floating-point exception trace, reporting the exception type and the call stack at the time of the exception, and then exits the program.
All of the above statements are equally applicable to Fortran, C, and C++.
In the example below, we ask that all FPEs be intercepted, and we enable trapping with the EN parameter to -qflttrap.
hydra% xlf -o prog.exe -g prog.f90 -qflttrap=OV:UND:ZERO:INV:INEX:EN -qsigtrap
hydra% prog.exe
For a detailed treatment of this topic please refer to "Error trapping for multiprocessor systems: A primer". The FPE section of our user guide has been extracted from this primer, with minor modifications.