UC San Diego

Overview of TSCC Accounting

[Image: PDAF nodes]

The Triton Shared Computing Cluster (TSCC) supports five queues for job submission: hotel, condo, pdafm, home, and glean. Users must have an account against which the accounting system charges execution time.

The base unit for all billable TSCC queues is the Service Unit, or SU. Allocations are defined in terms of the number of SUs available to each account, and accounts may be shared among users.

Charges on condo, hotel, and PDAF nodes (i.e., through the condo, hotel, pdafm, and home queues) are calculated in SUs, measured per processing core per hour. All core-hours are equivalent: 1 core-hour = 1 Service Unit.

Jobs are allocated on a per–core basis, and only allocated cores will be charged. Hotel and condo nodes have 16 cores, while PDAF nodes have 32. Jobs can request fewer cores than the node maximum and be charged accordingly, rather than being charged for the entire node. The scheduler may allocate independent jobs for different users simultaneously on the same node in order to optimize queue waits.
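The per-core charging described above can be sketched with hypothetical numbers: a job holding 4 of a hotel node's 16 cores for 10 hours is charged only for the cores it actually holds, not for the whole node.

```shell
# Hypothetical example of per-core charging (1 core-hour = 1 SU).
cores=4      # cores requested (out of 16 on a hotel node)
hours=10     # hours the job holds those cores
echo "$(( cores * hours )) SUs charged"   # prints: 40 SUs charged
```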

The glean queue has no SU charges associated with it. However, it has a low priority and jobs submitted to it are subject to termination at any time.

The following general policies are in effect for scheduling jobs on TSCC:

  • PDAF nodes have 512GB of memory. Due to system overhead, slightly less than this is available for jobs, approximately 2GB less per 128GB of configured memory. The hotel and most condo nodes have 64GB of memory; some condo nodes have 128GB. 
  • Wallclock limits are in effect for all queues. The maximum walltime for a TSCC hotel job is 168 hours, while the condo queue has an eight-hour limit. Please see the Running Jobs page for details.
  • You may submit a special request to the TSCC Mailing List for jobs that require more walltime than is available in any queue to which you have access.
  • Jobs are allowed to access an amount of the total node memory equivalent to the proportion of cores allocated. Jobs attempting to access more memory than this will be terminated by the system.
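The proportional-memory rule in the last bullet can be illustrated with hypothetical numbers: on a 64GB, 16-core hotel node, a 4-core job may address one quarter of the node's memory.

```shell
# Hypothetical example: memory ceiling proportional to allocated cores.
node_mem_gb=64    # total memory on a hotel node
node_cores=16     # cores on a hotel node
job_cores=4       # cores this job requested
echo "$(( node_mem_gb * job_cores / node_cores )) GB available to the job"
```

A job exceeding this ceiling (16 GB here) would be terminated by the system.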

Job Charging on TSCC

How are jobs charged to my account?

User accounts are set up to charge jobs against a user-specific default account. If preferred, the default can be set to charge against another account, such as a project or shared account based on a TSCC purchase or allocation.

If a job is submitted against an account whose balance is below the estimated SUs needed to run it, the job will be deferred until the account is replenished. The qstat -f command will report a message similar to:

cannot debit job account - no funds

Once the account balance is adjusted, the job will run without being resubmitted: it enters the idle state when the scheduler rechecks balances and is then scheduled normally.

How do I check my account balance?

You can check your account balance and status by running gbalance -u <username>. This will show the balance of all accounts that you can charge jobs to.

TSCC Account Charges and Processor/Memory Requests

To specify the account to be charged, use the -A option. It is recommended in all job submission scripts and qsub commands, so that the account to be charged is unambiguous:

#PBS -A <account name>
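A minimal job script putting this together might look as follows; the account name, resource sizes, and program are hypothetical placeholders, not TSCC defaults.

```shell
#!/bin/bash
#PBS -q hotel                 # queue to submit to
#PBS -A abc123                # hypothetical account to charge
#PBS -l nodes=1:ppn=4         # 1 node, 4 cores on that node
#PBS -l walltime=10:00:00     # requested wall time (10 hours)

# Hypothetical job body: run a program from the submission directory.
cd "$PBS_O_WORKDIR"
./my_program
```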

Memory requests apply to all nodes combined; core (ppn) requests are per node.

The maximum processors value for a queue is the maximum total number of processors that any single job can request. Since the hotel queue's resources_max.proc value is 128, these commands would be allowed:

qsub -q hotel -l nodes=128:ppn=1
qsub -q hotel -l nodes=8:ppn=16

but these would block:

qsub -q hotel -l nodes=129:ppn=1
qsub -q hotel -l nodes=9:ppn=16

Requests for resources exceeding the available maximums will be deferred and retried by the scheduler. After a limited number of retries, they will be put on hold and require administrator intervention.

Requests for more than the maximum number of nodes will not be rejected, as the scheduler makes no assumptions regarding future node availability. Requests that do not specify a memory size will be given the default amount of memory per node (approximately 4GB/core for hotel and small condo nodes, 8GB/core for large condo nodes, and 16GB/core for PDAF nodes).

Account Charge Pre-verification

Before a job can be scheduled, the system verifies available credits in the user account. It does not actually charge the account at this time, but SUs (CPU-hour credits) equal to the estimated charges must be available. The system estimates these charges from values in the job script; for hotel queue requests, the formula is:

#CPUs per node x #nodes x walltime (in hours)
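As a worked example with hypothetical numbers, a hotel job requesting 2 nodes with 16 cores each for 24 hours must have at least this many SUs available before it can be scheduled:

```shell
# Hypothetical pre-verification estimate: cores per node x nodes x hours.
ppn=16       # cores requested per node
nodes=2      # nodes requested
hours=24     # requested walltime, in hours
echo "$(( ppn * nodes * hours )) SUs must be available"   # prints: 768 SUs must be available
```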

If a job runs more than five minutes beyond its requested wall time, it will be canceled by the system. Jobs whose actual charges exceed the SUs available in the account will not be canceled; instead, the account carries a negative balance that can be credited later.
