Debugging
This subsection gives only a brief summary of debugging from core files. IBM's "General Programming Concepts: Writing and Debugging Programs" guide is a good resource for these and other topics.
dbx, pdbx
The serial command-line IBM debugger is called dbx. The parallel command-line IBM debugger is called pdbx.
Debugging in Batch Jobs
If the problem arises only after the program has run for a long time, or only when it uses a large amount of memory, interactive debugging is not practical. In that case, capture the core file and run the debugger in a batch job.
Saving the Core File
To save the core file, have the job check whether a file named "core" exists once your program terminates, and copy it to permanent storage if it does.
Example Job File (captures the core file)
#@ shell = /bin/ksh
#@ comment = Core Debug
#@ initialdir = $(home)/tests/project1/
#@ job_name = progDebug
#@ error = $(job_name).o$(schedd_host).$(jobid).$(stepid)
#@ output = $(job_name).o$(schedd_host).$(jobid).$(stepid)
#@ resources = ConsumableCpus(1) ConsumableMemory(500mb)
# Specify 50 minutes of wallclock time for the duration of the job
#@ wall_clock_limit = 00:50:00
#@ node = 1
#@ tasks_per_node = 1
#@ notification = always
#@ queue
cd $TMPDIR
cp $HOME/prog.exe .
./prog.exe
if [ -f core ] ; then
    cp core $HOME
fi
cp prog.out $HOME
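If several failing runs are expected, the core-saving step can be wrapped in a small function so that a new core does not overwrite an earlier one. The following is a minimal sketch in POSIX shell syntax (which ksh also accepts); the function name save_core and its tag argument are hypothetical and are not part of the job file above.

```shell
# Hypothetical helper (not part of the original job file):
# copy a core file, if one exists, to a destination directory
# under a name that includes a caller-supplied tag.
# Usage: save_core <run_dir> <dest_dir> <tag>
save_core() {
    run_dir=$1
    dest_dir=$2
    tag=$3
    if [ -f "$run_dir/core" ]; then
        # The tag keeps cores from different runs apart.
        cp "$run_dir/core" "$dest_dir/core.$tag"
        echo "saved $dest_dir/core.$tag"
        return 0
    fi
    echo "no core file in $run_dir"
    return 1
}
```

In a job script this could be called right after the program runs, for example: save_core "$TMPDIR" "$HOME" "$(date +%Y%m%d%H%M%S)". The debugger job would then need to copy the tagged file name instead of plain "core".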
Running the Debugger in Batch Mode
Create a debugger command file (called "dbx.commands").
Example "dbx.commands" File
where
dump .
print xsect, temp, pres, vel, coeff
quit
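The command file can also be generated from the shell, which is convenient when the list of variables to inspect changes between runs. This is a sketch only: the function name write_dbx_commands is hypothetical, and it writes one "print" line per variable instead of the single comma-separated "print" used in the example above.

```shell
# Hypothetical helper: write a dbx command file for post-mortem inspection.
# $1 = output file name; remaining arguments = variable names to print.
write_dbx_commands() {
    cmd_file=$1
    shift
    {
        echo "where"      # stack traceback at the point of failure
        echo "dump ."     # names and values of all active variables
        for var in "$@"; do
            echo "print $var"
        done
        echo "quit"       # leave dbx so the batch job can finish
    } > "$cmd_file"
}
```

For the variables used in this section, the call would be: write_dbx_commands dbx.commands xsect temp pres vel coeff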
The batch job should be similar to the following:
#@ shell = /bin/ksh
#@ comment = Core Debug
#@ initialdir = $(home)/tests/project1/
#@ job_name = progDebug
#@ error = $(job_name).o$(schedd_host).$(jobid).$(stepid)
#@ output = $(job_name).o$(schedd_host).$(jobid).$(stepid)
#@ resources = ConsumableCpus(1) ConsumableMemory(500mb)
# Specify 50 minutes of wallclock time for the duration of the job
#@ wall_clock_limit = 00:50:00
#@ node = 1
#@ tasks_per_node = 1
#@ notification = always
#@ queue
cd $TMPDIR
cp $HOME/prog.exe $HOME/core $HOME/dbx.commands .
dbx -c dbx.commands prog.exe core
The job's output file (named by the "output" keyword in the job file) will contain the results of the "where", "dump", and "print" commands given to "dbx". As you learn more, you can add further commands to the "dbx.commands" file to pinpoint the cause of the error.
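When extending the command file between runs, new commands have to go before the final "quit", or dbx will exit without reaching them. One way to do this from the shell is sketched below; the function name add_dbx_command and the variable niter are hypothetical and used only for illustration.

```shell
# Hypothetical helper: insert a new dbx command ahead of the final "quit".
# Usage: add_dbx_command <command_file> "<dbx command>"
add_dbx_command() {
    cmd_file=$1
    new_cmd=$2
    tmp_file="$cmd_file.tmp"
    # Keep every existing line except "quit"; grep exits non-zero when
    # nothing is left over, which is harmless here.
    grep -v '^quit$' "$cmd_file" > "$tmp_file" || true
    echo "$new_cmd" >> "$tmp_file"
    echo "quit" >> "$tmp_file"
    mv "$tmp_file" "$cmd_file"
}
```

For example, add_dbx_command dbx.commands "print niter" would add a print of the assumed variable niter; resubmitting the debugger job then reports its value along with the earlier output.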