HPC Getting Started: Quotas
Some of our HPC resources have quotas restricting the amount of the resource that is available to any one researcher or group. This protects everyone from one person overusing a resource and depriving the other researchers.
Home directories on the cluster have a default quota of 1 TB (1000 GB). In certain cases, a somewhat larger quota can be set, but one terabyte should be enough for most researchers. This quota is much larger than those at most other HPC sites. Please use this space responsibly! If everyone tried to fill up a TB, we would run out of space. Home directory quotas are not currently enforced by the filesystem, but they will be in the future.
To display your home directory quota, use the quota command.
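Because the home directory quota is not yet enforced by the filesystem, it can be useful to estimate your own usage. The following is a minimal Python sketch, not a site-provided tool: the names directory_size, quota_fraction, and HOME_QUOTA_BYTES are hypothetical, and the figure it reports is the summed apparent file size, which may differ from what the quota command counts. The quota command remains the authoritative source.

```python
import os
import stat

# Default home directory quota stated above: 1 TB, counted as 1000 GB.
HOME_QUOTA_BYTES = 1000 * 10**9


def directory_size(path):
    """Return the summed apparent size, in bytes, of regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                # lstat does not follow symbolic links
                info = os.lstat(os.path.join(root, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(info.st_mode):
                total += info.st_size
    return total


def quota_fraction(used_bytes, quota_bytes=HOME_QUOTA_BYTES):
    """Return usage as a fraction of the quota (1.0 means the quota is full)."""
    return used_bytes / quota_bytes
```

Calling directory_size(os.path.expanduser("~")) walks your whole home directory, which can take a while on a large tree, so run it sparingly.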
Scratch directories on the cluster do not have quotas, and each job can use as much scratch space as it needs. However, you must still be careful, because the scratch space is shared by all users on the system. If a runaway job fills up the scratch filesystem, most running jobs will crash.
Warnings: There are no backups for scratch. Files should not be left in scratch for more than 30 days. Older files may be purged without warning.
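To see which of your scratch files are at risk, you can list those older than 30 days. The sketch below is one way to do that in Python. It assumes file age is measured by last-modification time, which this page does not specify, and the function name find_old_files is hypothetical. Pass it the path of your own scratch directory.

```python
import os
import time

SECONDS_PER_DAY = 24 * 60 * 60


def find_old_files(root, max_age_days=30, now=None):
    """Return sorted paths of files under root last modified more than max_age_days ago."""
    if now is None:
        now = time.time()
    cutoff = now - max_age_days * SECONDS_PER_DAY
    old = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                # Assumption: age is judged by modification time
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if mtime < cutoff:
                old.append(path)
    return sorted(old)
```

Copy anything you still need out of scratch before it reaches the 30-day mark, since there are no backups to restore it from.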
The batch system enforces limits on the number of nodes a researcher or research group may use at any given time. There is a soft limit, which is enforced when jobs from other users are waiting in the queue. There is also a hard limit, which applies when no other jobs are waiting. The hard limit is higher, so you can use more nodes when no one else needs them. Under normal circumstances you will not get more nodes than the hard limit, no matter how many are free. The idea is to keep the nodes as busy as possible without letting a small number of users 'hog' all the nodes. It is a tricky balancing act!
Basic compute node quotas are:
16 nodes per user when other eligible jobs are waiting (the soft limit).
64 nodes per user when no other eligible jobs are waiting (the hard limit).
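The way the soft and hard limits interact can be illustrated with a small Python model. This is a simplified sketch of the rule as described on this page, not the scheduler's actual algorithm. The function names are hypothetical, group node quotas are ignored, and the numbers are the current per-user defaults.

```python
SOFT_LIMIT_NODES = 16  # per user, when other eligible jobs are waiting
HARD_LIMIT_NODES = 64  # per user, when no other eligible jobs are waiting


def active_node_limit(others_waiting):
    """Return the per-user node limit that applies right now."""
    return SOFT_LIMIT_NODES if others_waiting else HARD_LIMIT_NODES


def job_can_start(nodes_in_use, nodes_requested, others_waiting):
    """Return True if starting the job keeps the user within the active limit."""
    return nodes_in_use + nodes_requested <= active_node_limit(others_waiting)
```

For example, a user already running on 12 nodes who requests 8 more would be held back while other jobs are waiting (20 exceeds 16), but could start if the queue were otherwise empty (20 is within 64).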
When several researchers are working on a project, a group node quota may be applied. The actual limit depends on the number of researchers working in the group and the characteristics of their jobs.
The per-user hard and soft limits and the group node quotas are subject to change.
Exceeding your quota
When you submit a job and there are not enough nodes available under your quota, the job waits in the queue until one or more of your earlier jobs finish. As usual, whenever a job doesn't start executing soon after you submit it, use the checkjob command to see why it is waiting. For example:
BLOCK MSG: job 12345 violates active SOFT MAXNODE limit of 16
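If you monitor many jobs, you may want to pick the limit out of a message like this automatically. The Python sketch below parses a line of that form. The pattern is inferred only from the single sample line above, so other checkjob messages may not match it, and the function name parse_block_message is hypothetical.

```python
import re

# Assumption: the message layout matches the sample line shown above.
BLOCK_MSG_PATTERN = re.compile(
    r"job (?P<job_id>\S+) violates active (?P<kind>SOFT|HARD) "
    r"(?P<policy>\w+) limit of (?P<limit>\d+)"
)


def parse_block_message(line):
    """Return the job id, limit kind, policy name, and limit value, or None if no match."""
    match = BLOCK_MSG_PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["limit"] = int(fields["limit"])  # convert the limit to a number
    return fields
```

Applied to the sample line, this yields job id 12345, kind SOFT, policy MAXNODE, and a limit of 16, which tells you the job is waiting on the 16-node soft limit rather than on a shortage of free nodes.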