techstaff:slurm

Differences

This shows you the differences between two versions of the page.

techstaff:slurm [2015/12/07 16:58] kauffman
techstaff:slurm [2015/12/29 15:14] kauffman
Line 1: Line 1:
-====== DRAFT | Peanut Job Submission Cluster | DRAFT ======
+====== DRAFT | Peanut Job Submission Cluster ======
  
 We are currently **alpha** testing and gauging user interest in a cluster of machines that allows for the submission of long-running compute jobs. Think of these machines as a dumping ground for discrete computing tasks that might have been rude or disruptive to execute on the main (shared) shell servers (i.e., linux1, linux2, linux3).
Line 37: Line 37:
  
 ==== Storage ====
-Shared scratch storage is being planned, but not yet available. Techstaff hopes to have this done in time for the winter quarter.
+There is slow scratch space mounted at ''%%/scratch%%''. It is a ZFS pool of 10x 2TB 7200RPM SAS drives connected via an LSI 9211-8i, arranged as 5 mirrored VDEVs, which is similar to RAID10. The server's uplink is 1G Ethernet.
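
As a rough back-of-the-envelope check on the ''%%/scratch%%'' description, the sketch below estimates usable capacity and the ceiling the 1G uplink places on transfer speed. It assumes each mirrored VDEV is a two-way mirror (10 drives across 5 VDEVs) and ignores ZFS metadata overhead and protocol overhead, so real numbers will be somewhat lower. The function names are illustrative only.

```python
# Rough estimate for the /scratch pool: figures come from the description
# above; the two-way-mirror layout is an assumption (10 drives / 5 VDEVs).
DRIVES = 10
DRIVE_TB = 2
VDEVS = 5
UPLINK_GBIT = 1


def usable_tb(drives, drive_tb, vdevs):
    """Usable capacity of a pool of equal-width mirror VDEVs, ignoring ZFS overhead."""
    if drives % vdevs != 0:
        raise ValueError("drives must divide evenly across VDEVs")
    # A mirror VDEV stores one drive's worth of data, however many copies it keeps.
    return vdevs * drive_tb


def uplink_mb_per_s(gbit):
    """Theoretical line rate in decimal megabytes per second."""
    return gbit * 1000 / 8


print(usable_tb(DRIVES, DRIVE_TB, VDEVS))  # 10
print(uplink_mb_per_s(UPLINK_GBIT))        # 125.0
```

In other words, roughly 10 TB is shared by all users, and a single client cannot read or write faster than about 125 MB/s over the network, whatever the disks can deliver.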
  
 ==== Utilization Dashboard ====
Line 43: Line 43:
  
 ==== Partitions / Queues ====
-As of December, 2015 we have partitions in our cluster.
+As of December 2015, we will have at least 2 partitions in our cluster: 'debug' and 'general'. Any other partition is not guaranteed and will serve a specific purpose.
  
 ^ Partition Name ^ Description ^
Line 145: Line 145:
  
  
-===== Monitoring Jobs =====
+====== Monitoring Jobs ======
  
 ''%%squeue%%'' and ''%%sacct%%'' are two different commands that allow you to monitor job activity in SLURM. ''%%squeue%%'' is the primary and most accurate monitoring tool since it queries the SLURM controller directly. ''%%sacct%%'' gives you similar information for running jobs, and can also report on previously finished jobs, but because it accesses the SLURM database, there are some circumstances when the information is not in sync with squeue.
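
To make the split between the two tools concrete, here is a minimal Python sketch that builds one ''%%squeue%%'' call (live view from the controller) and one ''%%sacct%%'' call (accounting database, including finished jobs). The helper names, the example user, and the chosen ''%%--format%%'' fields are illustrative assumptions rather than a site convention. The commands only run if the SLURM client tools are on the ''%%PATH%%''.

```python
import shutil
import subprocess


def squeue_cmd(user):
    """Live queue state for one user, answered by the SLURM controller."""
    return ["squeue", "--user", user]


def sacct_cmd(job_id):
    """Accounting record for one job; also works after the job has finished."""
    return ["sacct", "--jobs", str(job_id),
            "--format=JobID,JobName,State,Elapsed"]


def run_if_available(cmd):
    """Run cmd and return its stdout, or None when the tool is not installed."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


# Example (hypothetical user and job ID):
#   print(run_if_available(squeue_cmd("alice")))
#   print(run_if_available(sacct_cmd(12345)))
```

If the two disagree about a running job, prefer the ''%%squeue%%'' output, since the database behind ''%%sacct%%'' can lag behind the controller.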
Line 156: Line 156:
  
  
-===== Interactive Jobs =====
+====== Interactive Jobs ======
  
 Though batch submission is the best way to take full advantage of the compute power in the job submission cluster, foreground, interactive jobs can also be run.
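
One common way to start a foreground session is ''%%srun%%'' with a pseudo-terminal. The sketch below only assembles that command line; the default of the 'debug' partition and the ''%%bash%%'' shell are assumptions for illustration, and the exact invocation recommended for this cluster may differ.

```python
import shlex


def interactive_cmd(partition="debug", shell="bash"):
    """Build an srun command that opens an interactive shell on a compute node."""
    # --pty attaches a pseudo-terminal so the shell behaves like a normal login.
    return ["srun", "--partition", partition, "--pty", shell]


print(shlex.join(interactive_cmd()))  # srun --partition debug --pty bash
```

Typing the printed command on a submit host requests an allocation and drops you into a shell once resources are available; exiting the shell ends the job.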
/var/lib/dokuwiki/data/pages/techstaff/slurm.txt · Last modified: 2021/01/06 16:13 by kauffman
