Dedicated Use & Batch Policies
Batch system policies are approved by the Steering Committee, firstname.lastname@example.org, and may occasionally change to reflect changing needs and load conditions. We appreciate your adherence to the policies below. A little care on your part in doing certain things right goes a long way toward keeping our compute servers running efficiently and fairly. To maintain fairness and efficiency we will, very reluctantly, on occasion terminate jobs prematurely. The subsection Job Termination By Staff lists common reasons for which the staff terminates a job.
All requests for dedicated machine use require the approval of the steering committee. To initiate the process, please send e-mail to the steering committee at email@example.com. Assuming approval, arrangements must also be made in consultation with the staff. Every other Tuesday, when machine maintenance is also scheduled, is a strongly preferred day. Otherwise, machine load conditions will be a significant factor in selecting the preferred day for such an event. Please always give at least two weeks' notice. The maximum processing time per request is also a steering committee decision.
Job Termination By Staff
The High Performance Research Computing staff reserves the right to terminate batch jobs when one or a combination of the following occurs:
- Use by your program of a larger number of CPUs than its parallel efficiency warrants.
- Use by your program of a smaller number of CPUs than that specified through the batch system. This is a particularly unacceptable practice since it wastes resources that might otherwise be used by others. The batch system sets aside resources, but it knows nothing about the actual number of CPUs that your program will use.
- Submitting jobs with an artificially large wall-clock or CPU-time limit.
- Use/abuse of a special access queue to run a job that could very well run in one of the common queues.
- Excessive I/O with large files, which in turn overwhelms memory due to excessive file caching.
- Any use of large amounts of disk and/or memory that causes a significant disruption to the smooth operation of the system.
- Delayed file transfers whose source or destination host is remote.
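
The first item in the list above turns on parallel efficiency. As a rough illustration only, the sketch below uses the common textbook definition: speedup is the one-CPU run time divided by the run time on N CPUs, and efficiency is that speedup divided by N. The function name and the timings are hypothetical, and this page does not define a specific efficiency threshold, so none is assumed here.

```python
def parallel_efficiency(t_serial: float, t_parallel: float, n_cpus: int) -> float:
    """Return speedup / n_cpus, where speedup = t_serial / t_parallel.

    t_serial   -- wall-clock time of the run on one CPU
    t_parallel -- wall-clock time of the same run on n_cpus CPUs
    """
    if t_serial <= 0 or t_parallel <= 0 or n_cpus < 1:
        raise ValueError("timings must be positive and n_cpus must be >= 1")
    return (t_serial / t_parallel) / n_cpus


# Hypothetical wall-clock timings in seconds, keyed by CPU count.
# Replace these with measurements from short test runs of your own program.
timings = {1: 1000.0, 4: 280.0, 16: 125.0}

for n_cpus, seconds in timings.items():
    eff = parallel_efficiency(timings[1], seconds, n_cpus)
    print(f"{n_cpus:>3} CPUs: efficiency = {eff:.2f}")
```

With these made-up numbers the efficiency falls as CPUs are added, which is the pattern to watch for before requesting a larger CPU count through the batch system.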